network bridge with libvirt not working

Postby opticalc » 2017/05/18 22:31:03

I'm trying to set up a KVM environment to play around with, but cannot seem to get networking to pass traffic. My CentOS 7 bare-metal server's connection to my LAN is on enp3s0; its config is rather bare:

Code: Select all

NM_CONTROLLED="no"
TYPE=Ethernet
BOOTPROTO=none
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=enp3s0
DEVICE=enp3s0
ONBOOT=yes


Here is the XML file that I pass to virsh net-define; I then net-start the network:

Code: Select all

<network>
  <name>insideBridge</name>
  <bridge name="inside0" stp="on" delay="2" />
  <mac address='00:16:3E:00:00:00'/>
  <forward mode="route" dev="enp3s0"/>
  <ip address="192.168.10.253" netmask="255.255.255.0" />
</network>


After I start the network in virsh, ifconfig on the host shows the interface inside0 with IP 192.168.10.253, but I cannot ARP or ping anything on my inside network. I'm using the docs from https://libvirt.org/formatnetwork.html. Not sure what I need to do. I even tried turning on IP forwarding in the kernel.

If I put an IP directly on enp3s0, the host can communicate with my LAN.

Postby jlehtone » 2017/05/19 10:49:46

Less is more. In this case "doing less" ...

The default libvirt installation defines a network "default" that forwards to the physical network in NAT mode. Furthermore, it provides DHCP and DNS for the virtual machines.

The network that you have defined forwards to the physical network in route mode. No DHCP.

What they have in common is that both are connected by routing. There are three other options:
1) Isolated, no connection to physical network.
2) Bridged connection.
3) Passthrough. (A VM uses physical NIC of the host directly, without a "network")

I presume that you know the cheap consumer "routers" that home users tend to have?
A small box that has a "WAN" port and at least one "LAN" port or WiFi AP.

Your host must route. It must become a router.
The 192.168.10.253 on "inside0" is the LAN-port of your host computer.
Where is your WAN-port? The enp3s0 must have an IP that lets your host be a member of the physical network (that you call "LAN"?) Ideally, the DHCP server of the LAN does provide it for the enp3s0.
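
As a minimal sketch of that last point, assuming the host uses the same ifcfg-style network scripts shown in the first post, enp3s0 could ask the DHCP server of the LAN for its address. The scratch directory OUT_DIR is an assumption for illustration; on the host the file would live in /etc/sysconfig/network-scripts/.

```shell
#!/bin/sh
# Sketch: ifcfg file that lets enp3s0 obtain its address via DHCP.
# OUT_DIR is a scratch directory (assumption), not the real config path.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
IFCFG="$OUT_DIR/ifcfg-enp3s0"

cat > "$IFCFG" <<'EOF'
NM_CONTROLLED="no"
TYPE=Ethernet
BOOTPROTO=dhcp
IPV6INIT=no
NAME=enp3s0
DEVICE=enp3s0
ONBOOT=yes
EOF

echo "wrote $IFCFG"
```

Compared with the config in the first post, only BOOTPROTO changes from none to dhcp.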

Your insideBridge is in mode "route". Do the members of LAN know that subnet 192.168.10.0/24 is behind the "CentOS baremetal router"?

In NAT mode the members of LAN do not know that 192.168.10.0/24 exists; they merely communicate with a "baremetal CentOS" computer.

Your insideBridge does not offer DHCP for the virtual machines. You have to configure each VM statically.


Tip: 'virt-manager' is a GUI management tool for libvirt.
One can run it on a baremetal server that has no X11, if one tunnels (via ssh) the output to an X11 server elsewhere.
One can also run it on another machine (that has X11) and merely connect to the remote libvirtd (via ssh).
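
Both approaches can be sketched roughly as follows. The host name kvmhost and the root login are placeholders (assumptions), and the script only prints the commands unless APPLY=yes is set.

```shell
#!/bin/sh
KVM_HOST="${KVM_HOST:-kvmhost}"   # placeholder host name (assumption)
KVM_USER="${KVM_USER:-root}"      # placeholder login (assumption)

# Option 1: run virt-manager on the server, display it on the local X11 server.
TUNNEL_CMD="ssh -X $KVM_USER@$KVM_HOST virt-manager --no-fork"
# Option 2: run virt-manager locally and talk to the remote libvirtd over ssh.
REMOTE_CMD="virt-manager --connect qemu+ssh://$KVM_USER@$KVM_HOST/system"

for cmd in "$TUNNEL_CMD" "$REMOTE_CMD"; do
    if [ "${APPLY:-no}" = "yes" ]; then
        $cmd
    else
        echo "would run: $cmd"
    fi
done
```

The same qemu+ssh URI should also work with virsh --connect for those who prefer the CLI.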

Postby opticalc » 2017/05/19 13:15:17

Sorry, I guess I could have provided a bit more info. Just to get started: I already have a DHCP server on my LAN, and I don't want the KVM host or any of its interfaces/bridges to be the DHCP server. So in virsh I had already destroyed and undefined that initial virbr0. I make it a point to use the CLI anywhere I can, so I have not been doing too much in virt-manager.

My plan is to work on the enp2s0/WAN part later, after I get the initial comms on the LAN side working. The overall goal is to have pfsense in a VM that does all the routing, but first I have to figure the LAN out. From what I read in the docs at that link, the forward mode I need should be route. Is that right? By all accounts that has to be it, but then it seems something else is wrong.

Postby jlehtone » 2017/05/19 16:51:56

We need some clear names to understand what is going on.

The enp3s0 of the host is connected to a physical subnet that is outside of the host. Let's call that PhyLAN.
The inside0 is a virtual network inside the host. Address range 192.168.10.0/24. Let's call that VirLAN.

The PhyLAN and VirLAN are separate subnets.

You have a DHCP in PhyLAN. It should give IP config for the enp3s0.

The virtual machines will be connected to VirLAN.
The VirLAN does not have DHCP and thus the VM's do not get IP config from anywhere.

There is no route to internet (yet). You intend to set the host to be the "gateway out" for the PhyLAN. Can your DHCP server send the IP of enp3s0 as "default route" for the members of PhyLAN? It should, already, because the VirLAN is "outside" of PhyLAN.

On the host you have to override that, because its "gateway out" has to be the interface of the pfsense VM that is connected to the VirLAN. On the pfsense VM you have to add a static route to the PhyLAN subnet via 192.168.10.253, while the default route of pfsense will be via its "WAN-port".

All other VMs in VirLAN should have the same routes as the host.

The route to internet from members of PhyLAN is then via the host (a router) that forwards to member "pfsense" of VirLAN (another router).
The route to internet from members of VirLAN (and the host) is via member "pfsense" of VirLAN.
The route to PhyLAN from all members of VirLAN (pfsense included) is via 192.168.10.253.


That is all a bit complex, but doable. Are you sure that is really what you want?


Option B. Bridging.
1. Define no networks on the libvirt. Create no VirLAN at all.
2. Create a bridge interface on the host. Manually. Enslave the enp3s0 into that bridge. Let the bridge get an IP from the DHCP of the PhyLAN.
3. For each VM, enslave their interface to the same bridge. This will make them members of the PhyLAN.

Each VM will get IP config from the DHCP of the PhyLAN.
No static routes are required.
The DHCP server must tell everyone that the default route is the IP address that it assigns to the pfsense's interface.
The pfsense must ignore that default route and use the one offered by the DHCP server that is on the WAN side (behind enp2s0).
The host shall do no routing.
There will be only one subnet.

Simpler, isn't it?
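
Steps 2 and 3 could look roughly like this with CentOS ifcfg scripts. The bridge name br0 and the scratch directory OUT_DIR are assumptions; the real files belong in /etc/sysconfig/network-scripts/, and the interface snippet goes into each VM's libvirt domain XML (for example via virsh edit).

```shell
#!/bin/sh
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"   # scratch directory (assumption)
BRIDGE="${BRIDGE:-br0}"              # hypothetical bridge name

# Step 2a: the bridge itself gets its IP config from the DHCP server of the PhyLAN.
cat > "$OUT_DIR/ifcfg-$BRIDGE" <<EOF
DEVICE=$BRIDGE
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0
NM_CONTROLLED=no
EOF

# Step 2b: enp3s0 carries no IP of its own; it is only a port of the bridge.
cat > "$OUT_DIR/ifcfg-enp3s0" <<EOF
DEVICE=enp3s0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
BRIDGE=$BRIDGE
NM_CONTROLLED=no
EOF

# Step 3: interface element for a VM's domain XML, attaching it to the same bridge.
cat > "$OUT_DIR/vm-interface.xml" <<EOF
<interface type='bridge'>
  <source bridge='$BRIDGE'/>
  <model type='virtio'/>
</interface>
EOF

echo "wrote example files to $OUT_DIR"
```

After copying the ifcfg files into place, a restart of the network service should bring the bridge up with enp3s0 attached.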

Postby opticalc » 2017/05/19 17:50:57

Yes, OK, I'll have to fess up here: I already got my home network working just like you describe in option B.

Code: Select all

baremetal host has a cable modem connected to enp2s0 and ive got ifcfg scripts set up for enp2s0 to be on br2, no ip configured for br2

baremetal host has a switch connected to enp3s0 and ive got ifcfg scripts set up for enp3s0 to be on br3, static ip 192.168.10.253 configured for br3

my VMs for me to play around with have 1 interface defined with br3

the pfsense vm is defined with 2 interfaces br2 and br3; it has a dhcp client for the WAN side (pfsense see a 10gigabit connection on vtnet0 for the WAN) and dhcp server on the LAN side (pfsense see a 10gigabit connection on vtnet1 for the LAN).

all my physical PCs on the LAN as well as VMs on the LAN now get DHCP and can route to internet via pfsense VM


But this setup won't let me play with (learn) networking within libvirt, which is a primary goal of this setup. I'll read through your write-up some more to see if I can figure out why it's not working. Thanks much for the detail.

Postby jlehtone » 2017/05/19 20:12:41

Okay,

Create the additional bridge "inside0" with libvirt. Set it in NAT mode.
Set it to have DHCP of its own (libvirt will start a 'dnsmasq' process for that "network").
Set the subnet to be something other than 192.168.10.0/24.
For example 192.168.20.0/24, with the host as 192.168.20.1 and VMs in range 192.168.20.100--192.168.20.200.
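
A network definition along those lines might look like the sketch below. The network name insideBridge is reused from the first post (an assumption that it is still wanted), the XML is written to a temporary file, and virsh is only invoked when APPLY=yes is set.

```shell
#!/bin/sh
NET_XML="${NET_XML:-$(mktemp)}"   # temporary file for the definition (assumption)

cat > "$NET_XML" <<'EOF'
<network>
  <name>insideBridge</name>
  <bridge name="inside0" stp="on" delay="0"/>
  <forward mode="nat"/>
  <ip address="192.168.20.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.20.100" end="192.168.20.200"/>
    </dhcp>
  </ip>
</network>
EOF

if [ "${APPLY:-no}" = "yes" ] && command -v virsh >/dev/null 2>&1; then
    virsh net-define "$NET_XML"        # register the network with libvirt
    virsh net-start insideBridge       # create inside0 and start dnsmasq
    virsh net-autostart insideBridge   # bring it up whenever libvirtd starts
else
    echo "dry run: network XML written to $NET_XML"
fi
```

Leaving out the dev attribute on the forward element keeps the network independent of which host interface currently faces the LAN.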

Check that libvirt enables routing in kernel when it starts that network. If not, then set it explicitly.
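
One way to make that check, as a small sketch that reads the kernel setting directly from /proc:

```shell
#!/bin/sh
FORWARD_FILE=/proc/sys/net/ipv4/ip_forward

# Read the current state; fall back to "unknown" if the file is not readable.
if [ -r "$FORWARD_FILE" ]; then
    state=$(cat "$FORWARD_FILE")
else
    state=unknown
fi

case "$state" in
    1) echo "IPv4 forwarding is enabled" ;;
    0) echo "IPv4 forwarding is disabled; enable with: sysctl -w net.ipv4.ip_forward=1" ;;
    *) echo "could not read $FORWARD_FILE" ;;
esac
```

A value set with sysctl -w does not survive a reboot; a line net.ipv4.ip_forward = 1 in /etc/sysctl.conf would make it persistent.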

Add a VM for testing.

Check the firewall rules of the host that traffic is allowed appropriately. Libvirt and firewalld should do that by default.


If that setup works, then change the forward mode from NAT to route and set the DHCP in pfsense to send one static route to all clients:
to 192.168.20.0/24 via 192.168.10.253

(Check firewall again for this setup.)
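
To try the routed setup from a single Linux machine on the 192.168.10.0/24 LAN before touching the DHCP options in pfsense, the same route can be added by hand. A small sketch; it only prints the command unless APPLY=yes is set, and a route added this way does not survive a reboot.

```shell
#!/bin/sh
# On the host, the mode change itself would be done with: virsh net-edit <network>
# (replace <forward mode="nat"/> with <forward mode="route"/>), followed by
# net-destroy and net-start of that network.

VIRLAN_SUBNET="192.168.20.0/24"   # subnet of the libvirt network
HOST_LAN_IP="192.168.10.253"      # IP of the host on the physical LAN (br3)

# Static route for one LAN client: reach the VMs via the host.
ROUTE_CMD="ip route add $VIRLAN_SUBNET via $HOST_LAN_IP"

if [ "${APPLY:-no}" = "yes" ]; then
    $ROUTE_CMD
else
    echo "would run: $ROUTE_CMD"
fi
```

If a ping from that client to a VM in 192.168.20.0/24 then works, the remaining step is to have pfsense hand the route out to everyone.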

Postby opticalc » 2017/05/19 22:09:42

OK, got it.

Well, I now suspect that what I tried at the beginning didn't work because of firewalld. I implemented what you said, and now my new VM guest can get DHCP from the new bridge I created and can ping its default gateway (the new bridge on the baremetal host), but it cannot ping out past that.

I did have net.ipv4.ip_forward=0, and adding the bridge in libvirt did change it to 1, but I don't think my firewalld did everything it was supposed to:

Code: Select all

[root@localhost xml]# firewall-cmd --state
running
[root@localhost xml]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2017-05-18 08:18:24 CDT; 1 day 8h ago
     Docs: man:firewalld(1)
 Main PID: 671 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─671 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

May 18 08:18:23 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
May 18 08:18:24 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
May 18 16:16:41 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp2s0' is not in any zone
May 18 16:16:43 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp3s0' is not in any zone
May 18 16:28:42 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp2s0' is not in any zone
May 18 16:28:44 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp3s0' is not in any zone
May 18 17:04:11 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp2s0' is not in any zone
May 18 17:04:13 localhost.localdomain firewalld[671]: ERROR: UNKNOWN_INTERFACE: 'enp3s0' is not in any zone
[root@localhost xml]# firewall-cmd --get-default-zone
public
[root@localhost xml]# firewall-cmd --get-active-zones
public
  interfaces: br2 br3
[root@localhost xml]# sudo firewall-cmd --zone=public --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: br2 br3
  sources:
  services: dhcpv6-client ssh
  ports:
  protocols:
  masquerade: no
  forward-ports:
  sourceports:
  icmp-blocks:
  rich rules:

[root@localhost xml]#


So it looks like firewalld didn't do anything for my new bridge? The two listed there, br2 and br3, are the ones I created outside of libvirt.

Edit to add:
Possibly the install of convirt I did previously (which failed with their infamous index out of bounds error) caused my firewalld issues? My /etc/sysctl.conf has this:

Code: Select all

# Added by convirt-tool :Skip firewall for bridge traffic
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Postby jlehtone » 2017/05/20 10:32:17

You can see the active rules that are in the kernel (netfilter) with:

Code: Select all

iptables -S
iptables -t mangle -S
iptables -t nat -S

Hint: grep inside0 (if that is the name of the bridge) from the output and then look for other rules that seem to relate to what you see.
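
The hint can be wrapped into a small loop over the three tables. This is only a sketch: it assumes the bridge is named inside0 (override with BRIDGE=...) and needs root to actually read the rules.

```shell
#!/bin/sh
BRIDGE="${BRIDGE:-inside0}"

# Keep only the rules that mention the bridge as a whole word.
rules_for_bridge() {
    grep -w -- "$BRIDGE"
}

if command -v iptables >/dev/null 2>&1; then
    for table in filter mangle nat; do
        echo "== $table =="
        iptables -t "$table" -S 2>/dev/null | rules_for_bridge
    done
else
    echo "iptables not found; nothing to inspect"
fi
```

An empty section for a table means no rule in it names the bridge, which is itself a useful clue.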


The net.bridge.bridge-nf-call* are not necessary. Originally, bridged traffic never entered netfilter. At some point the kernel was changed to filter bridged traffic as well.
http://ebtables.netfilter.org/br_fw_ia/br_fw_ia.html has some flowcharts.

However, at least in Red Hat, the net.bridge.bridge-nf-call* = 0 config appeared quite soon to maintain the old "unfiltered" behaviour.
When virtualization started to boom, this became even more important, cutting "unnecessary" overhead. This config had some issues though: the /etc/sysctl.conf was processed before the kernel had the net.bridge.bridge-nf-call* keys.

On 7.3 the kernel has changed again. The module that would enable filtering bridged traffic is not loaded by default. Hence no need to set those variables.
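
Whether those keys exist on a given host can be checked directly. A minimal sketch, assuming the keys show up under /proc/sys/net/bridge only while the bridge filtering module is loaded:

```shell
#!/bin/sh
NF_KEY=/proc/sys/net/bridge/bridge-nf-call-iptables

if [ -e "$NF_KEY" ]; then
    bridge_filter=present
    # 1 = bridged traffic is passed through iptables, 0 = it is not.
    echo "bridge-nf-call-iptables = $(cat "$NF_KEY")"
else
    bridge_filter=absent
    echo "no net.bridge.bridge-nf-call* keys; bridge filtering module is probably not loaded"
fi
```

If the keys are absent, the three lines that convirt added to /etc/sysctl.conf have nothing to act on.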


When your VM's talk with PhyLAN devices via br3, that is bridged traffic. The br2 bridges too.

When a VM in inside0 talks with PhyLAN devices, that is not bridged. That is where the baremetal host routes between inside0 network and br3 network.

Postby opticalc » 2017/05/21 03:02:06

OK, so those firewall rules aren't the problem, but I think I've got it narrowed down a good bit.

I noticed that after boot there's no static IP on my inside network bridge (now br1 and eth1, after loading CentOS 6.9). However, if I use virsh to destroy and then start the br1 network, and then run service network restart, everything just comes together and works perfectly. So I looked in /var/log/messages, and I'm pretty sure this is the issue:

Code: Select all

May 20 21:17:30 kvmhost libvirtd: Could not find keytab file: /etc/libvirt/krb5.tab: No such file or directory
May 20 21:17:30 kvmhost dnsmasq[2329]: no interface with address 192.168.10.252
May 20 21:17:30 kvmhost dnsmasq[2329]: FAILED to start up
May 20 21:17:30 kvmhost dnsmasq[2330]: no interface with address 192.168.100.105
May 20 21:17:30 kvmhost dnsmasq[2330]: FAILED to start up
May 20 21:17:30 kvmhost kernel: Ebtables v2.0 registered
May 20 21:17:32 kvmhost kernel: device vnet0 entered promiscuous mode
May 20 21:17:32 kvmhost kernel: device vnet1 entered promiscuous mode
May 20 21:17:34 kvmhost kernel: device vnet0 left promiscuous mode
May 20 21:17:34 kvmhost kernel: br0: port 2(vnet0) entering disabled state
May 20 21:17:34 kvmhost kernel: device vnet1 left promiscuous mode
May 20 21:17:34 kvmhost kernel: br1: port 2(vnet1) entering disabled state
May 20 21:17:35 kvmhost kernel: lo: Disabled Privacy Extensions
May 20 21:17:36 kvmhost kernel: device vnet0 entered promiscuous mode
May 20 21:17:36 kvmhost kernel: device vnet1 entered promiscuous mode
May 20 21:17:38 kvmhost kernel: device vnet0 left promiscuous mode
May 20 21:17:38 kvmhost kernel: br0: port 2(vnet0) entering disabled state
May 20 21:17:38 kvmhost kernel: device vnet1 left promiscuous mode
May 20 21:17:38 kvmhost kernel: br1: port 2(vnet1) entering disabled state


My guess is that those dnsmasq messages are the clue. After I use virsh to destroy and then start the br1 network, then run service network restart, everything starts working, and here is what got put into /var/log/messages:

Code: Select all

May 20 21:42:41 kvmhost kernel: device eth1 left promiscuous mode
May 20 21:42:41 kvmhost kernel: br1: port 1(eth1) entering disabled state
May 20 21:42:48 kvmhost kernel: device br1-nic entered promiscuous mode
May 20 21:42:48 kvmhost kernel: br1: starting userspace STP failed, starting kernel STP
May 20 21:42:48 kvmhost dnsmasq[3162]: started, version 2.48 cachesize 150
May 20 21:42:48 kvmhost dnsmasq[3162]: compile time options: IPv6 GNU-getopt DBus no-I18N DHCP TFTP "--bind-interfaces with SO_BINDTODEVICE"
May 20 21:42:48 kvmhost dnsmasq[3162]: reading /etc/resolv.conf
May 20 21:42:48 kvmhost dnsmasq[3162]: using nameserver 208.67.220.220#53
May 20 21:42:48 kvmhost dnsmasq[3162]: using nameserver 208.67.222.222#53
May 20 21:42:48 kvmhost dnsmasq[3162]: read /etc/hosts - 3 addresses
May 20 21:42:48 kvmhost dnsmasq[3162]: read /var/lib/libvirt/dnsmasq/br1.addnhosts - 0 addresses
May 20 21:43:00 kvmhost kernel: device eth0 left promiscuous mode
May 20 21:43:00 kvmhost kernel: br0: port 1(eth0) entering disabled state
May 20 21:43:02 kvmhost kernel: lo: Disabled Privacy Extensions
May 20 21:43:03 kvmhost kernel: r8169 0000:02:00.0: eth0: link down
May 20 21:43:03 kvmhost kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
May 20 21:43:03 kvmhost kernel: device eth0 entered promiscuous mode
May 20 21:43:03 kvmhost kernel: r8169 0000:03:00.0: eth1: link down
May 20 21:43:03 kvmhost kernel: r8169 0000:03:00.0: eth1: link down
May 20 21:43:03 kvmhost kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
May 20 21:43:03 kvmhost kernel: device eth1 entered promiscuous mode
May 20 21:43:05 kvmhost kernel: r8169 0000:03:00.0: eth1: link up
May 20 21:43:05 kvmhost kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
May 20 21:43:05 kvmhost kernel: br1: port 2(eth1) entering listening state
May 20 21:43:07 kvmhost kernel: br1: port 2(eth1) entering learning state
May 20 21:43:11 kvmhost kernel: br1: port 2(eth1) entering forwarding state


It looks like maybe the issue is that dnsmasq is trying to do some work before libvirt gets things set up? I did verify that no rcX.d script is running dnsmasq, and also verified that if I disable autostart for all of libvirt's bridges, then dnsmasq doesn't get run at boot, so it looks like libvirt is running it.

And I did find this: https://www.redhat.com/archives/libvirt-users/2012-September/msg00070.html which seems to be just what I'm running into. Not sure if there's anything I can do about it.
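
For reference, the manual sequence that gets everything working after boot, written as a small script. It is only a sketch of the workaround described above, not a fix for the ordering problem; it prints the commands unless APPLY=yes is set, and NET_NAME defaults to the br1 network.

```shell
#!/bin/sh
NET_NAME="${NET_NAME:-br1}"

# Run a command, or just print it when APPLY is not "yes".
run() {
    if [ "${APPLY:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

run virsh net-destroy "$NET_NAME"   # tear down the libvirt network and its dnsmasq
run virsh net-start "$NET_NAME"     # recreate the bridge, start dnsmasq again
run service network restart         # CentOS 6: re-apply the ifcfg configuration
```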

Postby jlehtone » 2017/05/21 09:37:30

Yes, libvirtd starts dnsmasq processes as necessary.

What does the definition of "inside0" look like now?

On one installation the "default" has:

Code: Select all

# cat /etc/libvirt/qemu/networks/default.xml
<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh net-edit default
or other application using the libvirt API.
-->

<network>
  <name>default</name>
  <uuid>a592447d-3924-4c9e-8335-f58a401d4006</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:12:34:56'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>