Networking issues with VMware interfaces, after yum upgrade

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/22 19:03:30

Hi everyone

I'm Muneer from Switzerland. I just subscribed to this forum because I urgently need help from the CentOS geeks ;-)
We use CentOS boxes on VMware virtual servers with nginx (Plus) as reverse proxies and load balancers.
Until now everything worked great. Unfortunately, not anymore since the last yum upgrade! Now none of the VMware NICs are accessible from the related servers.

Just a simple overview: 3 VMs with CentOS and nginx, with Windows 2016 servers behind them, spread across 3 network zones:

DMZ ----> Reverse-Proxy / LB Zone 1 ----> Frontend-Servers (Windows)
                                or -----> LB Zone 2 -----> Backend-Servers (Windows)
                                                 or -----> LB Zone 3 -----> SQL Servers


The Windows servers, which belong to several different customers, are in separate VLANs, but the LBs (Zones 1 to 3) are shared by all customers.

Here is what we have on the working env:

CentOS: 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

NICs (VMware):

Code:

# ifconfig | grep -A1 ens

ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.100.171  netmask 255.255.255.0  broadcast 10.0.100.255
--
ens193: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.13.16  netmask 255.255.255.0  broadcast 10.12.13.255
--
ens194: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.15  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens224: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.12.15  netmask 255.255.255.0  broadcast 10.12.12.255
--
ens224:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.12.17  netmask 255.255.255.0  broadcast 10.12.12.255
--
ens224:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.37  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens225: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.13.16  netmask 255.255.255.0  broadcast 10.11.13.255
--
ens256: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.12.15  netmask 255.255.255.0  broadcast 10.11.12.255
--
ens256:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.12.17  netmask 255.255.255.0  broadcast 10.11.12.255
--
ens256:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.36  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens257: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.13.16  netmask 255.255.255.0  broadcast 10.10.13.255
All of them are reachable by telnet from every frontend server on the specific port (see netstat below).
The reason we have several NICs is routing to the VLANs: to make a VLAN reachable, we added a dedicated NIC for it on the VM.
Besides that, we used virtual interfaces (:0, :1) to separate traffic coming from different environments, so we can avoid if-statements in the nginx conf.
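For readers unfamiliar with the :0 / :1 notation: such an alias usually lives in its own ifcfg file. A minimal sketch matching the ens224:0 address from the ifconfig output above (the exact options in the real files may differ; ONPARENT and BOOTPROTO here are illustrative):

```shell
# Sketch of /etc/sysconfig/network-scripts/ifcfg-ens224:0
# IPADDR/NETMASK mirror the ifconfig output above; other options are examples.
DEVICE=ens224:0
BOOTPROTO=none
IPADDR=10.12.12.17
NETMASK=255.255.255.0
ONPARENT=yes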

Code:

# ip route

default via 10.0.100.1 dev ens192 proto static metric 100
default via 10.12.13.1 dev ens193 proto static metric 101
default via 10.10.12.1 dev ens194 proto static metric 102
default via 10.12.12.1 dev ens224 proto static metric 103
default via 10.11.13.1 dev ens225 proto static metric 104
default via 10.11.12.1 dev ens256 proto static metric 105
default via 10.10.13.1 dev ens257 proto static metric 106
10.0.100.0/24 dev ens192 proto kernel scope link src 10.0.100.171 metric 100
10.10.12.0/24 dev ens194 proto kernel scope link src 10.10.12.15 metric 100
10.10.12.0/24 dev ens224 proto kernel scope link src 10.10.12.37 metric 101
10.10.12.0/24 dev ens256 proto kernel scope link src 10.10.12.36 metric 102
10.10.13.0/24 dev ens257 proto kernel scope link src 10.10.13.16 metric 100
10.11.12.0/24 dev ens256 proto kernel scope link src 10.11.12.15 metric 100
10.11.13.0/24 dev ens225 proto kernel scope link src 10.11.13.16 metric 100
10.12.12.0/24 dev ens224 proto kernel scope link src 10.12.12.15 metric 100
10.12.13.0/24 dev ens193 proto kernel scope link src 10.12.13.16 metric 100

# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
10.10.13.26              ether   00:50:56:bc:5b:1d   C                     ens257
10.0.100.172             ether   00:50:56:bc:4d:7f   C                     ens192
10.12.12.131             ether   00:50:56:bc:83:c0   C                     ens224
10.11.12.132             ether   00:50:56:bc:dd:d5   C                     ens256
10.11.12.131             ether   00:50:56:bc:8f:26   C                     ens256
10.12.12.111             ether   00:50:56:bc:4c:dd   C                     ens224
10.11.13.15              ether   00:50:56:bc:de:55   C                     ens225
10.12.13.26              ether   00:50:56:bc:41:19   C                     ens193
10.11.13.26              ether   00:50:56:bc:6f:b6   C                     ens225
10.10.12.25              ether   00:50:56:bc:5b:b5   C                     ens194
10.10.12.47              ether   00:50:56:bc:5b:b5   C                     ens194
10.12.12.16              ether   00:50:56:bc:47:4f   C                     ens224
10.12.12.27              ether   00:50:56:bc:7c:85   C                     ens224
10.11.12.25              ether   00:50:56:bc:72:e0   C                     ens256
10.11.12.251             ether   00:50:56:bc:3b:12   C                     ens256
10.12.13.5               ether   00:50:56:bc:b5:7c   C                     ens193
10.11.13.5               ether   00:50:56:bc:de:55   C                     ens225
gateway                  ether   cc:03:d9:02:b6:00   C                     ens192
10.10.12.46              ether   00:50:56:bc:5b:b5   C                     ens194
10.10.12.46              ether   00:50:56:bc:72:e0   C                     ens256
10.12.13.15              ether   00:50:56:bc:b5:7c   C                     ens193
10.10.12.47              ether   00:50:56:bc:7c:85   C                     ens224
10.11.12.27              ether   00:50:56:bc:72:e0   C                     ens256
10.12.12.25              ether   00:50:56:bc:7c:85   C                     ens224
10.12.12.251             ether   00:50:56:bc:7a:f4   C                     ens224

# netstat -an | grep -w LISTEN

tcp        0      0 10.10.12.57:80          0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.56:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.25:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.15:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:80           0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:80           0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.7:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:2233            0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:4443         0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:4443         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:4443         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:4444         0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:4445         0.0.0.0:*               LISTEN
As I wrote, until now everything worked like a charm.
This weekend we upgraded to

3.10.0-862.2.3.el7.x86_64 #1 SMP Wed May 9 18:05:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

and now these interfaces aren't reachable anymore from the Windows servers in front of the LBs! No ping, no telnet.
BUT from one LB to the next it still works?!

For example: I can telnet to 10.12.12.15 from LB Zone 1, but not from one of the Windows servers in Zone 1, and therefore the whole application fails.

- There is no blocking firewalld / iptables running, because we have pfSense instances in front of them. (No changes were made on those.)

Code:

# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
- Also, no changes on the VMware side.

The only change made on the OS was the yum upgrade.

I also tried removing all NICs and their configs before upgrading, then adding them back afterwards.
Still the same result.
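As an aside for anyone comparing before/after states: yum keeps a transaction history, so the exact package set an upgrade touched can be listed afterwards. A sketch (transaction IDs and interface to the history vary by yum version):

```shell
# Show recent yum transactions, then the details of the most recent one
yum history list
yum history info last

# Which kernel is running vs. which kernels are installed
uname -r
rpm -q kernel
```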

On the non-working env it looks like this:

Code:

# ifconfig | grep -A1 ens
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.100.171  netmask 255.255.255.0  broadcast 10.0.100.255
--
ens193: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.13.16  netmask 255.255.255.0  broadcast 10.12.13.255
--
ens194: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.15  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens224: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.12.15  netmask 255.255.255.0  broadcast 10.12.12.255
--
ens224:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.12.12.17  netmask 255.255.255.0  broadcast 10.12.12.255
--
ens224:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.37  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens225: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.13.16  netmask 255.255.255.0  broadcast 10.11.13.255
--
ens256: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.12.15  netmask 255.255.255.0  broadcast 10.11.12.255
--
ens256:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.12.17  netmask 255.255.255.0  broadcast 10.11.12.255
--
ens256:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.12.36  netmask 255.255.255.0  broadcast 10.10.12.255
--
ens257: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.13.16  netmask 255.255.255.0  broadcast 10.10.13.255
Seems still the same.

But there are some changes in the routing table: different metric values, and the two routes marked with XXX were missing before.

Code:

# ip route

default via 10.0.100.1 dev ens192 proto static metric 100
default via 10.12.13.1 dev ens193 proto static metric 101
default via 10.10.12.1 dev ens194 proto static metric 102
default via 10.12.12.1 dev ens224 proto static metric 103
default via 10.11.13.1 dev ens225 proto static metric 104
default via 10.11.12.1 dev ens256 proto static metric 105
default via 10.10.13.1 dev ens257 proto static metric 106
10.0.100.0/24 dev ens192 proto kernel scope link src 10.0.100.171 metric 100
10.10.12.0/24 dev ens194 proto kernel scope link src 10.10.12.15 metric 102
10.10.12.0/24 dev ens224 proto kernel scope link src 10.10.12.37 metric 103
10.10.12.0/24 dev ens256 proto kernel scope link src 10.10.12.36 metric 105
10.10.13.0/24 dev ens257 proto kernel scope link src 10.10.13.16 metric 106
10.11.12.0/24 dev ens256 proto kernel scope link src 10.11.12.15 metric 105
10.11.12.0/24 dev ens256 proto kernel scope link src 10.11.12.17 metric 105  XXX
10.11.13.0/24 dev ens225 proto kernel scope link src 10.11.13.16 metric 104
10.12.12.0/24 dev ens224 proto kernel scope link src 10.12.12.15 metric 103
10.12.12.0/24 dev ens224 proto kernel scope link src 10.12.12.17 metric 103  XXX
10.12.13.0/24 dev ens193 proto kernel scope link src 10.12.13.16 metric 101
I don't think this should cause such an issue, but it's strange somehow.

Besides that, the ARP cache looks pretty different:

Code:

# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
10.11.12.27              ether   00:50:56:bc:72:e0   C                     ens256
10.0.100.172             ether   00:50:56:bc:4d:7f   C                     ens192
gateway                  ether   cc:03:d9:02:b6:00   C                     ens192
10.10.12.47              ether   00:50:56:bc:7c:85   C                     ens224
10.12.12.131             ether   00:50:56:bc:83:c0   C                     ens224
10.11.13.26              ether   00:50:56:bc:6f:b6   C                     ens225
10.12.13.26              ether   00:50:56:bc:41:19   C                     ens193
10.11.12.25              ether   00:50:56:bc:72:e0   C                     ens256
10.11.12.132             ether   00:50:56:bc:dd:d5   C                     ens256
10.12.12.27              ether   00:50:56:bc:7c:85   C                     ens224
10.12.12.25              ether   00:50:56:bc:7c:85   C                     ens224
10.10.12.25              ether   00:50:56:bc:5b:b5   C                     ens194
10.10.13.26              ether   00:50:56:bc:5b:1d   C                     ens257
10.10.12.46              ether   00:50:56:bc:72:e0   C                     ens256
10.11.12.131             ether   00:50:56:bc:8f:26   C                     ens256
The proxy could bind the ports without an issue:

Code:

# netstat -an | grep -w LISTEN
tcp        0      0 10.12.12.5:4444         0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:4445         0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.57:80          0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.56:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.25:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.15:80          0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:80           0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:80           0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.7:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:1433         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:2233            0.0.0.0:*               LISTEN
tcp        0      0 10.10.12.5:4443         0.0.0.0:*               LISTEN
tcp        0      0 10.11.12.5:4443         0.0.0.0:*               LISTEN
tcp        0      0 10.12.12.5:4443         0.0.0.0:*               LISTEN
tcp6       0      0 :::2233                 :::*                    LISTEN
I'm pretty lost, and really have no clue.

Does anybody have or had a similar issue?
Any hint would be highly appreciated!

Thanks in advance!

HotC
Last edited by HotChocolate on 2018/05/22 21:29:53, edited 1 time in total.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Networking issues with VMware interfaces, after yum upgrade

Post by TrevorH » 2018/05/22 19:32:55

Did you have the VMware Guest Additions (or whatever they're called) installed before? They'll need reinstalling for each kernel update.
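A quick way to see which flavour of tools is installed (a sketch; on CentOS 7 the usual choice is the open-vm-tools package, while the classic tarball VMware Tools rebuilds its modules per kernel — note the vmxnet3 driver itself ships with the kernel):

```shell
# Packaged open-vm-tools in use?
rpm -q open-vm-tools
systemctl status vmtoolsd

# Or the classic tarball install? (checking the default VMware install location)
ls /usr/bin/vmware-config-tools.pl 2>/dev/null && echo "tarball VMware Tools present"

# Which driver actually backs a given NIC (ens192 as an example):
ethtool -i ens192
```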
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are dead, do not use them.
Use the FAQ Luke

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Re: Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/22 20:46:56

Hi Trevor

Thanks for the quick reply!
Yes they are/were installed. Good point. Will check and reinstall.
Kind regards.

HotC

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Re: Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/22 21:36:23

TrevorH wrote:Did you have the VMware Guest Additions (or whatever they're called) installed before? They'll need reinstalling for each kernel update.
Hi again
I've tried, but still the same :-(

hunter86_bg
Posts: 2019
Joined: 2015/02/17 15:14:33
Location: Bulgaria
Contact:

Re: Networking issues with VMware interfaces, after yum upgrade

Post by hunter86_bg » 2018/05/23 03:48:02

Have you made a tcpdump on one of the CentOS VMs, in order to check if packets are received/sent?

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Re: Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/23 10:56:28

hunter86_bg wrote:Have you made a tcpdump on one of the CentOS VMs, in order to check if packets are received/sent?
Hi hunter86_bg

Not yet. You're right, it should have been done earlier. :roll:
You can see that packets were received, but none were sent.

Here are the results, taken on one LB, of telnet and ping from one Windows server (10.12.21.111):

Env before upgrade / OK (1st sequence from telnet, closed after success / 2nd from 4 pings):

Code:

# tcpdump -vv -i any src host 10.12.21.111 or dst host 10.12.21.111
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

12:34:46.129420 IP (tos 0x2,ECT(0), ttl 127, id 29367, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51568 > stynetvm03v.sbsdc.net.http: Flags [SEW], cksum 0xe9ec (correct), seq 156455364, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:34:46.129438 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    stynetvm03v.sbsdc.net.http > 10.12.21.111.51568: Flags [S.], cksum 0x3fb2 (incorrect -> 0x70f9), seq 3675606141, ack 156455365, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0

12:34:46.129933 IP (tos 0x0, ttl 127, id 29368, offset 0, flags [DF], proto TCP (6), length 40)
    10.12.21.111.51568 > stynetvm03v.sbsdc.net.http: Flags [.], cksum 0x1fda (correct), seq 1, ack 1, win 1026, length 0

12:34:58.113637 IP (tos 0x0, ttl 127, id 29377, offset 0, flags [DF], proto TCP (6), length 41)
    10.12.21.111.51561 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [.], cksum 0x9e71 (correct), seq 1767297605:1767297606, ack 3210347483, win 1026, length 1

12:34:58.113656 IP (tos 0x0, ttl 64, id 32409, offset 0, flags [DF], proto TCP (6), length 52)
    stynetvm03v.sbsdc.net.ms-sql-s > 10.12.21.111.51561: Flags [.], cksum 0x3fb2 (incorrect -> 0xf3fa), seq 1, ack 1, win 296, options [nop,nop,sack 1 {0:1}], length 0

12:35:08.532785 IP (tos 0x0, ttl 127, id 29388, offset 0, flags [DF], proto TCP (6), length 41)
    10.12.21.111.51568 > stynetvm03v.sbsdc.net.http: Flags [P.], cksum 0x1cd1 (correct), seq 1:2, ack 1, win 1026, length 1: HTTP

12:35:08.532807 IP (tos 0x0, ttl 64, id 57690, offset 0, flags [DF], proto TCP (6), length 40)
    stynetvm03v.sbsdc.net.http > 10.12.21.111.51568: Flags [.], cksum 0x3fa6 (incorrect -> 0x22f6), seq 1, ack 2, win 229, length 0

----------------------------------------------------------------------------------

12:35:22.620982 IP (tos 0x0, ttl 127, id 29403, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 57, length 40

12:35:22.620999 IP (tos 0x0, ttl 64, id 38543, offset 0, flags [none], proto ICMP (1), length 60)
    stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 57, length 40

12:35:23.634830 IP (tos 0x0, ttl 127, id 29404, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 58, length 40

12:35:23.634843 IP (tos 0x0, ttl 64, id 39272, offset 0, flags [none], proto ICMP (1), length 60)
    stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 58, length 40

12:35:24.647676 IP (tos 0x0, ttl 127, id 29405, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 59, length 40

12:35:24.647690 IP (tos 0x0, ttl 64, id 39316, offset 0, flags [none], proto ICMP (1), length 60)
    stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 59, length 40

12:35:25.659432 IP (tos 0x0, ttl 127, id 29410, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 60, length 40

12:35:25.659447 IP (tos 0x0, ttl 64, id 40103, offset 0, flags [none], proto ICMP (1), length 60)
    stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 60, length 40
Env after upgrade / NOK (1st sequence from telnet, waited until it timed out / 2nd from 4 pings):

Code:

# tcpdump -vv -i any src host 10.12.21.111 or dst host 10.12.21.111
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

12:22:50.654351 IP (tos 0x2,ECT(0), ttl 127, id 27816, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51513 > stynetvm03v.sbsdc.net.http: Flags [SEW], cksum 0xab1b (correct), seq 3167870285, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:22:53.667970 IP (tos 0x2,ECT(0), ttl 127, id 27817, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51513 > stynetvm03v.sbsdc.net.http: Flags [SEW], cksum 0xab1b (correct), seq 3167870285, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:22:59.676549 IP (tos 0x0, ttl 127, id 27826, offset 0, flags [DF], proto TCP (6), length 48)
    10.12.21.111.51513 > stynetvm03v.sbsdc.net.http: Flags [S], cksum 0xbfea (correct), seq 3167870285, win 8192, options [mss 1460,nop,nop,sackOK], length 0

12:23:19.191908 IP (tos 0x2,ECT(0), ttl 127, id 27840, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51514 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0xf61a (correct), seq 420032717, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:21.062697 IP (tos 0x2,ECT(0), ttl 127, id 27846, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51515 > stynetvm03v.sbsdc.net.microsoft-ds: Flags [SEW], cksum 0x4131 (correct), seq 1674747592, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:22.161066 IP (tos 0x2,ECT(0), ttl 127, id 27849, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51516 > stynetvm03v.sbsdc.net.netbios-ssn: Flags [SEW], cksum 0x34ab (correct), seq 3885443258, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:24.074125 IP (tos 0x2,ECT(0), ttl 127, id 27850, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51515 > stynetvm03v.sbsdc.net.microsoft-ds: Flags [SEW], cksum 0x4131 (correct), seq 1674747592, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:25.175733 IP (tos 0x2,ECT(0), ttl 127, id 27857, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51516 > stynetvm03v.sbsdc.net.netbios-ssn: Flags [SEW], cksum 0x34ab (correct), seq 3885443258, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:30.076728 IP (tos 0x0, ttl 127, id 27862, offset 0, flags [DF], proto TCP (6), length 48)
    10.12.21.111.51515 > stynetvm03v.sbsdc.net.microsoft-ds: Flags [S], cksum 0x5600 (correct), seq 1674747592, win 8192, options [mss 1460,nop,nop,sackOK], length 0

12:23:31.189966 IP (tos 0x0, ttl 127, id 27863, offset 0, flags [DF], proto TCP (6), length 48)
    10.12.21.111.51516 > stynetvm03v.sbsdc.net.netbios-ssn: Flags [S], cksum 0x497a (correct), seq 3885443258, win 8192, options [mss 1460,nop,nop,sackOK], length 0

----------------------------------------------------------------------------------

12:23:32.142176 IP (tos 0x0, ttl 127, id 27864, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 53, length 40

12:23:36.795741 IP (tos 0x2,ECT(0), ttl 127, id 27869, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51517 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0x0e93 (correct), seq 1251665600, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:37.088900 IP (tos 0x0, ttl 127, id 27870, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 54, length 40

12:23:42.086300 IP (tos 0x0, ttl 127, id 27876, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 55, length 40

12:23:43.303670 IP (tos 0x2,ECT(0), ttl 127, id 27879, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51520 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0xfd8f (correct), seq 4228269652, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:46.310948 IP (tos 0x2,ECT(0), ttl 127, id 27884, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51520 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0xfd8f (correct), seq 4228269652, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:47.088505 IP (tos 0x0, ttl 127, id 27885, offset 0, flags [none], proto ICMP (1), length 60)
    10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 56, length 40

12:23:47.276260 IP (tos 0x2,ECT(0), ttl 127, id 27886, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51521 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0xf5ba (correct), seq 2921438221, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

12:23:50.286309 IP (tos 0x2,ECT(0), ttl 127, id 27891, offset 0, flags [DF], proto TCP (6), length 52)
    10.12.21.111.51521 > stynetvm03v.sbsdc.net.ms-sql-s: Flags [SEW], cksum 0xf5ba (correct), seq 2921438221, win 8192, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
What the hell?!
Seems the incoming packets are the same, but no replies at all.
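As a hedged aside for readers debugging the same symptom: on a multi-homed host with several default routes, one kernel setting that can produce exactly "request visible in tcpdump, no reply ever sent" is strict reverse-path filtering, so its value is worth comparing between the two kernels (a sketch; this only reads the current state):

```shell
# Print the reverse-path-filter mode for each interface.
# 0 = off, 1 = strict (drops traffic whose reply would leave via another NIC),
# 2 = loose. Compare the values on the working and non-working kernels.
for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
    printf '%-50s %s\n' "$f" "$(cat "$f")"
done
```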

hunter86_bg
Posts: 2019
Joined: 2015/02/17 15:14:33
Location: Bulgaria
Contact:

Re: Networking issues with VMware interfaces, after yum upgrade

Post by hunter86_bg » 2018/05/23 18:58:30

As I'm reading this from a phone I might have missed something, but I noticed that before the update TCP offloading is working, while after the update it is switched off.
Can you run 'ethtool' (maybe with the -k option) to check for any differences? Even try to enable TCP offloading after the update.
Also, compare the modules used before and after the update. What type of NIC did you set for the VMs?
Edit: Another reason for not replying could be a missing default gateway or problems with the ARP table.
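The comparison suggested above can be sketched like this (assuming an output file copied over from the still-working node; the feature name in the last command is only an example, and only features not marked [fixed] can be toggled):

```shell
# On the upgraded node: capture current offload settings for one NIC
ethtool -k ens192 > /tmp/offload-after.txt

# /tmp/offload-before.txt was captured on the working node and copied here
diff /tmp/offload-before.txt /tmp/offload-after.txt

# Try re-enabling a single offload feature for testing (example feature name)
ethtool -K ens192 tx-checksum-ip-generic on

# Driver and module backing the NIC
ethtool -i ens192
```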

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Re: Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/23 21:04:49

hunter86_bg wrote:As I'm reading this from a phone I might have missed something, but I noticed that before the update TCP offloading is working, while after the update it is switched off.
Can you run 'ethtool' (maybe with the -k option) to check for any differences? Even try to enable TCP offloading after the update.
Also, compare the modules used before and after the update. What type of NIC did you set for the VMs?
Edit: Another reason for not replying could be a missing default gateway or problems with the ARP table.
Hi again

Thanks for the good advice!

I've compared them and found several changes:

Differences on modules after upgrade

Code:

removed:
-----------
Module                  Size  Used by
edac_core              58151  1 sb_edac

added:
---------
Module                  Size  Used by
crc_t10dif             12912  1 sd_mod
ip6_tables             26912  1 ip6table_filter
iptable_filter         12810  0
nfnetlink              14490  2 nfnetlink_log,nfnetlink_queue
nfnetlink_log          17892  0
nfnetlink_queue        18197  0
shpchp                 37047  0

changed:
--------------
ip_tables: "Used by" went from 0 to 1 (now used by iptable_filter)
Differences on NIC settings after upgrade

Code:

removed:
------------
tx-mpls-segmentation: off [fixed]

added:
---------
rx-udp_tunnel-port-offload: off [fixed]

changed:
-----------
none
VMware NIC Adapter Type: VMXNET 3

At first I thought it might be the change on the ip_tables module, but it's now just used by iptable_filter, and iptable_filter itself isn't used by anything.

I've compared the packet traffic again, and the behaviour is very strange.
Pinging from one Windows server to the CentOS host (LB), and vice versa, shows the following.

Before upgrade:
tcpdump on LB (CentOS + nginx), ping FROM Win IIS (10.12.21.111)

Code:

20:36:34.822777  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 24, length 40
20:36:34.822837 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 76: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 24, length 40
20:36:35.841832  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 25, length 40
20:36:35.841847 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 76: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 25, length 40
20:36:36.848556  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 26, length 40
20:36:36.848568 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 76: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 26, length 40
20:36:37.864182  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 27, length 40
20:36:37.864192 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 76: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo reply, id 1, seq 27, length 40
tcpdump on LB (CentOS + nginx), ping TO Win IIS (10.12.21.111)

Code:

20:38:01.525962 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 48091, seq 5, length 64
20:38:01.526132  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 100: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo reply, id 48091, seq 5, length 64
20:38:02.526056 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 48091, seq 6, length 64
20:38:02.526222  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 100: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo reply, id 48091, seq 6, length 64
20:38:03.526095 Out 00:50:56:bc:1d:f3 (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 48091, seq 7, length 64
20:38:03.526275  In 00:50:56:bc:a3:2c (oui Unknown) ethertype IPv4 (0x0800), length 100: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo reply, id 48091, seq 7, length 64
Both directions are OK.

After upgrade:

tcpdump on LB (CentOS + nginx), ping FROM Win IIS (10.12.21.111)

Code:

20:36:50.880427  In 00:50:56:bc:9e:1d (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 73, length 40
20:36:55.633095  In 00:50:56:bc:9e:1d (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 74, length 40
20:37:00.640047  In 00:50:56:bc:9e:1d (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 75, length 40
20:37:05.636915  In 00:50:56:bc:9e:1d (oui Unknown) ethertype IPv4 (0x0800), length 76: 10.12.21.111 > stynetvm03v.sbsdc.net: ICMP echo request, id 1, seq 76, length 40
tcpdump on LB (CentOS + nginx), ping TO Win IIS (10.12.21.111)

Code:

20:37:33.643802 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 1, length 64
20:37:34.643899 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 2, length 64
20:37:35.643978 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 3, length 64
20:37:36.644007 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 4, length 64
20:37:37.644054 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 5, length 64
20:37:38.644196 Out 00:50:56:bc:5d:ad (oui Unknown) ethertype IPv4 (0x0800), length 100: stynetvm03v.sbsdc.net > 10.12.21.111: ICMP echo request, id 47982, seq 6, length 64
As you can see, the CentOS node can either receive or send, but not both.
Do you think the mentioned changes could cause this kind of behaviour?
I'm very surprised that an upgrade purely from stable repos would lead to this mess.
I mean, it makes little sense to upgrade if we have to apply changes at that level every time.
On the other hand, we can't avoid upgrading until the release goes EOL.

hunter86_bg
Posts: 2019
Joined: 2015/02/17 15:14:33
Location: Bulgaria
Contact:

Re: Networking issues with VMware interfaces, after yum upgrade

Post by hunter86_bg » 2018/05/24 03:39:57

Can you check with 'lspci -v' which module is in use by the adapter?
It seems that there is a bug, but I'm not sure whether it's in VMware or CentOS.
The only option left is to open a bug report.

HotChocolate
Posts: 7
Joined: 2018/05/22 14:36:40

Re: Networking issues with VMware interfaces, after yum upgrade

Post by HotChocolate » 2018/05/24 10:29:23

hunter86_bg wrote:Can you check with 'lspci -v' which module is in use by the adapter?
It seems that there is a bug, but I'm not sure whether it's in VMware or CentOS.
The only option left is to open a bug report.
Hi hunter86_bg

Ok, I ran lspci -s <slot> -vv on all of them.

The only difference is:

Before upgrade:
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: fff00000-000fffff
Prefetchable memory behind bridge: fff00000-000fffff
Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B+
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Kernel modules: shpchp

After upgrade:
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: fff00000-000fffff
Prefetchable memory behind bridge: fff00000-000fffff
Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Kernel modules: shpchp

FastB2B- instead of FastB2B+

No clue if this can cause the issue. Is there a way to revert?
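On the revert question: the previous kernel normally stays installed after a yum upgrade, so one low-risk fallback while investigating is to boot the older GRUB entry. A sketch (menu indices vary per host; 1 is usually the second-newest kernel):

```shell
# List GRUB menu entries with their indices (0 = newest kernel first)
awk -F\' '/^menuentry / {print i++ ": " $2}' /boot/grub2/grub.cfg

# Select the previous (pre-862) kernel for the next boot, then verify and reboot
grub2-set-default 1
grub2-editenv list   # confirm saved_entry
reboot
```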
