Tg3 Network Link Unstable

Posted: 2017/12/06 00:59:35
by Takx
Hello guys,

I will try to summarize it all (sorry for the long post, but I have already tried lots of things without success):

Server: HP DL380 GEN9 with Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) on CentOS 7.3.

Using KVM + oVirt for virtualization.

I have 4 VMs for now (all created from scratch, no clones), but when I bring up the virtual interface (in oVirt) of the 4th machine, the interface starts flapping up/down and becomes unstable.

It looks related to STP (on the bridge) or to the tg3 driver. I also tried the basics: changing the cable and the switch port.

Code:

[root@ ~]# dmesg -e -H | tail
[Dez 5 22:26] tg3 0000:02:00.2 eno3: Link is down
[  +0,000235] ovirtmgmt: port 1(eno3) entered disabled state
[Dez 5 22:27] tg3 0000:02:00.2 eno3: Link is up at 1000 Mbps, full duplex
[  +0,000011] tg3 0000:02:00.2 eno3: Flow control is off for TX and off for RX
[  +0,000003] tg3 0000:02:00.2 eno3: EEE is enabled
[  +0,000036] ovirtmgmt: port 1(eno3) entered blocking state
[  +0,000002] ovirtmgmt: port 1(eno3) entered forwarding state
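
To put a number on the flapping, the "Link is down" events can be counted per interface. This is a minimal sketch, assuming the tg3 message format shown in the log above; the function name `count_link_down` is hypothetical and not part of any tool:

```shell
#!/bin/sh
# Count "Link is down" events per interface in dmesg-style text read from stdin.
# Assumes the tg3 message format seen in this thread:
#   "... tg3 <pci-address> <ifname>: Link is down"
count_link_down() {
    awk '/tg3 .*: Link is down/ {
             # The interface name is the field just before "Link", minus the trailing colon.
             for (i = 1; i <= NF; i++)
                 if ($i == "Link" && $(i+1) == "is" && $(i+2) == "down") {
                     name = $(i-1); sub(/:$/, "", name); count[name]++
                 }
         }
         END { for (n in count) print n, count[n] }'
}

# Usage on the affected host (not run here):
#   dmesg | count_link_down
```

Comparing the count before and after bringing up the extra VM interface would show whether the flapping really starts with that VM.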

Searching Google, I found someone with a similar problem:

http://centosfaq.org/centos/tg3-network-link-unstable/

It is kind of old, but I tried its solution of forcing a newer tg3 driver. I am not sure I did the right thing, because I forced the RHEL 7.4 package (kmod-tg3-3.137s-1.rhel7u4.x86_64):
https://support.hpe.com/hpsc/swd/public ... =4184#tab3


Current driver:

Code:

[root@ ~]# modinfo tg3
filename:       /lib/modules/3.10.0-693.5.2.el7.x86_64/weak-updates/tg3/tg3.ko
firmware:       tigon/tg3_tso5.bin
firmware:       tigon/tg3_tso.bin
firmware:       tigon/tg3.bin
version:        3.137s
license:        GPL
description:    Broadcom Tigon3 ethernet driver
author:         David S. Miller (davem@redhat.com) and Jeff Garzik (jgarzik@pobox.com)
rhelversion:    7.4
srcversion:     2E77010351D2112D6C92521


I also partially tried this solution: viewtopic.php?t=44838

But I only stopped NetworkManager and set the IPs in the interface files (ifcfg-eno1 for hypervisor management and ifcfg-ovirtmgmt for VM traffic).

Code:

[root@ ~]# systemctl status NetworkManager
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Ter 2017-12-05 20:41:44 -02; 2h 4min ago
     Docs: man:NetworkManager(8)
 Main PID: 1482 (code=exited, status=0/SUCCESS)

Bridge Interface Script:

Code:

[root@ ~]# cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
# Generated by VDSM version 4.19.37-1.el7.centos
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=static
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
DNS1=10.41.24.17
DNS2=10.41.22.176
MTU=1500
HWADDR=94:18:82:7b:b3:8e
IPADDR=10.40.198.34
GATEWAY=10.40.196.1
NETMASK=255.255.252.0

Interfaces:

Code:

[root@ ~]# ifconfig eno3
eno3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 94:18:82:7b:b3:8e  txqueuelen 1000  (Ethernet)
        RX packets 1589679  bytes 1019123832 (971.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 767636  bytes 135496224 (129.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 16

[root@ ~]# ifconfig ovirtmgmt
ovirtmgmt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.40.198.34  netmask 255.255.252.0  broadcast 10.40.199.255
        inet6 fe80::9618:82ff:fe7b:b38e  prefixlen 64  scopeid 0x20<link>
        ether 94:18:82:7b:b3:8e  txqueuelen 1000  (Ethernet)
        RX packets 1512163  bytes 982532263 (937.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 735268  bytes 130518582 (124.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@ ~]#

Bridge config and STP (all default):

Code:

[root@ ~]# brctl show ovirtmgmt
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt               8000.9418827bb38e       no              eno3
                                                        vnet0
                                                        vnet1
                                                        vnet2
                                                        vnet3
[root@ ~]# brctl showstp ovirtmgmt
ovirtmgmt
 bridge id              8000.9418827bb38e
 designated root        8000.9418827bb38e
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay             0.00                 bridge forward delay       0.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  13.43
 flags


eno3 (1)
 port id                8001                    state                forwarding
 designated root        8000.9418827bb38e       path cost                  4
 designated bridge      8000.9418827bb38e       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vnet0 (2)
 port id                8002                    state                forwarding
 designated root        8000.9418827bb38e       path cost                100
 designated bridge      8000.9418827bb38e       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vnet1 (3)
 port id                8003                    state                forwarding
 designated root        8000.9418827bb38e       path cost                100
 designated bridge      8000.9418827bb38e       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vnet2 (4)
 port id                8004                    state                forwarding
 designated root        8000.9418827bb38e       path cost                100
 designated bridge      8000.9418827bb38e       message age timer          0.00
 designated port        8004                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vnet3 (5)
 port id                8005                    state                forwarding
 designated root        8000.9418827bb38e       path cost                100
 designated bridge      8000.9418827bb38e       message age timer          0.00
 designated port        8005                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

[root@ ~]#



I still have the problem and am out of ideas. Any suggestions?

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 07:32:04
by TrevorH
That looks like a physical problem to me. Did you change cables, switch ports etc?

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 11:40:49
by Takx
TrevorH wrote: That looks like a physical problem to me. Did you change cables, switch ports etc?
Yeah, I tried all that.


I will try changing the server interface now, from eno3 to eno2 (I will have to change lots of things, since the IP is reserved by MAC address).

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 16:48:20
by Takx
I changed the bridge interface to eno4, and it is still the same problem.

It is probably driver related, because the failure only happens when I have 3 or more VMs on the interface.

Are those drivers made by Broadcom or by the Linux community? I need to report this.

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 19:56:55
by TrevorH
By Red Hat.

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 20:13:02
by Takx
Do you know how (and where) I can officially report it to them (and what technical info they need)?

I will also get in touch with HP.

Thanks.

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 20:27:12
by TrevorH
If you have a RHEL subscription then you can raise a support ticket. If you don't then you can raise a ticket on bugzilla.redhat.com but support there is not official and it depends on who it gets assigned to whether or not they'll do much about it. Some people are great and fix problems just because they exist, others don't.

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 21:59:14
by Takx
@TrevorH

Do you know how I can roll back to the original (default) network driver? I already used yum to uninstall the rpm package I installed; is that all?

The mapper (device-mapper: table: 253:5: multipath: error getting device) is having problems mapping the interface eno3 after a system reboot, but the HP BIOS says interface 3 is healthy. Is it possible that the interface is actually dead?

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 22:17:10
by TrevorH
If you yum remove the HP package and reboot then it should be using the distro driver.
Takx wrote: The mapper (device-mapper: table: 253:5: multipath: error getting device) is having problems mapping
Is that the actual error message? Everything about that says you have a device mapper error to do with multipath and that's about talking to SAN/NAS type stuff.
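
To confirm which tg3 module the kernel will pick up after removing the HP package, one option is to look at the path reported by `modinfo -n tg3`. The following is only a sketch, assuming the usual layout where in-tree modules sit under `/lib/modules/<kernel>/kernel/` and add-on kmod packages land under `extra/`, `updates/` or `weak-updates/`; the helper name `module_origin` is hypothetical:

```shell
#!/bin/sh
# Classify a kernel module path as the distribution's in-tree driver or an add-on.
# Assumption: in-tree modules live under .../kernel/, while add-on kmod packages
# typically install under extra/, updates/ or weak-updates/.
module_origin() {
    case "$1" in
        */weak-updates/*|*/extra/*|*/updates/*) echo "add-on" ;;
        */kernel/*)                             echo "distro" ;;
        *)                                      echo "unknown" ;;
    esac
}

# Usage on the affected host (not run here):
#   rpm -qa | grep -i tg3               # is any kmod-tg3 package still installed?
#   module_origin "$(modinfo -n tg3)"   # where does the module come from now?
```

With the `weak-updates/tg3/tg3.ko` path from the modinfo output earlier in the thread, this prints `add-on`; once the package is removed and the host rebooted, it should report `distro` if the stock module is back in use.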

Re: Tg3 Network Link Unstable

Posted: 2017/12/06 22:26:15
by Takx
Yeah, I researched a little more; it is probably complaining about my NFS mount, which uses the eno3 interface IP (the one that is not appearing).

I booted into the HP low-level shell and could use interface 3 normally, with ping working. I also booted the portable Windows script environment inside the HP embedded applications, and it worked there too.

I guess it is a driver problem, but I have no idea how to solve it. Can I force the default CentOS driver to be installed again?