Kernel 3.10.0-957 network connection failed

Issues related to configuring your network
silvio
Posts: 67
Joined: 2008/11/10 13:06:03

Post by silvio » 2018/11/21 15:04:41

Hi,

After switching to kernel 3.10.0-957 from the CR repo, I lost my network connection.
I see no error messages in the logs, the devices are up, and (I think) they have the correct settings.

We use a bonding device in mode 1 (active-backup) with two NICs. If I reboot into kernel 3.10.0-862.14.4, the system works normally.
My settings:

bond0 device:

DEVICE=bond0
TYPE=Bond
ONBOOT=yes
BONDING_MASTER=yes
BOOTPROTO=none
NM_CONTROLLED=no
IPADDR=10.234.16.51
NETMASK=255.255.255.0
GATEWAY=10.234.16.1
NAME=bond0
USERCTL=no
BONDING_OPTS="mode=1 miimon=100"

first slave:

TYPE=Ethernet
BOOTPROTO=none
NM_CONTROLLED=no
USERCTL=no
NAME=enp4s0f0
DEVICE=enp4s0f0
ONBOOT=yes
MASTER=bond0
SLAVE=yes

second slave:

TYPE=Ethernet
BOOTPROTO=none
NM_CONTROLLED=no
USERCTL=no
NAME=enp4s0f1
DEVICE=enp4s0f1
ONBOOT=yes
MASTER=bond0
SLAVE=yes

/proc/net/bonding/bond0:

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp4s0f0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: enp4s0f0
MII Status: up
Speed: 2000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: b4:99:ba:ac:4a:d2
Slave queue ID: 0

Slave Interface: enp4s0f1
MII Status: up
Speed: 2000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: b4:99:ba:ac:4a:d6
Slave queue ID: 0

ethtool for all devices:

Settings for bond0:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 2000Mb/s
Duplex: Full
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Link detected: yes
Settings for enp4s0f0:
Supported ports: [ Backplane ]
Supported link modes: 10000baseKR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 2000Mb/s
Duplex: Full
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x00002000 (8192)
hw
Link detected: yes
Settings for enp4s0f1:
Supported ports: [ Backplane ]
Supported link modes: 10000baseKR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 2000Mb/s
Duplex: Full
Port: Other
PHYAD: 1
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x00002000 (8192)
hw
Link detected: yes

route:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.234.16.1     0.0.0.0         UG    0      0        0 bond0
10.234.16.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0

ip link show:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff
3: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff
4: enp4s0f4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d4 brd ff:ff:ff:ff:ff:ff
5: enp4s0f5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d8 brd ff:ff:ff:ff:ff:ff
6: enp4s0f6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d5 brd ff:ff:ff:ff:ff:ff
7: enp4s0f7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff

lspci:

04:00.0 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.1 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.2 Fibre Channel: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE CNA (rev 02)
04:00.3 Fibre Channel: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE CNA (rev 02)
04:00.4 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.5 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.6 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.7 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)

Does anyone have an idea what I can check, or what changed between these two kernel versions? I see some changes in the kernel changelog, but they should not change the behavior.

Silvio

Post by silvio » 2018/11/22 13:25:51

Some additional statistics:

ip -statistics link show:

2: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
0 0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
188244 4482 0 0 0 0
3: enp4s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
0 0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
84 2 0 0 0 0
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether b4:99:ba:ac:4a:d2 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
0 0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
188328 4484 0 0 0 0

I can see that the system is sending packets but does not receive any replies on bond0.
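
The same "TX counts up, RX stays at zero" check can be scripted. This is a minimal Python sketch, assuming the standard Linux sysfs counters under /sys/class/net/<iface>/statistics. The helper names read_counters and transmit_only are made up for illustration, and the sample numbers are copied from the output above.

```python
from pathlib import Path


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for one interface, read from sysfs."""
    stats = Path(sysfs_root) / iface / "statistics"
    rx = int((stats / "rx_packets").read_text())
    tx = int((stats / "tx_packets").read_text())
    return rx, tx


def transmit_only(counters):
    """Return the interfaces that have sent packets but received none."""
    return [name for name, (rx, tx) in counters.items() if tx > 0 and rx == 0]


# (rx_packets, tx_packets) taken from the `ip -statistics link show` output.
observed = {
    "enp4s0f0": (0, 4482),
    "enp4s0f1": (0, 2),
    "bond0": (0, 4484),
}

if __name__ == "__main__":
    print(transmit_only(observed))
```

On a live system, read_counters("bond0") would supply the tuple instead of the hard-coded values. An interface that stays in the transmit-only list after some traffic points at the receive path rather than at the IP configuration.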


ip neigh show with 3.10.0-957 kernel:
10.234.16.1 dev bond0 FAILED
10.234.16.11 dev bond0 FAILED

ip neigh show with 3.10.0-862.14.4 kernel:
10.234.16.1 dev bond0 lladdr 44:03:a7:4a:14:47 REACHABLE
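
A FAILED entry means neighbor (ARP) resolution got no answer, which matches the zero RX counters. To compare the two kernels automatically, the output can be parsed. This is a rough Python sketch that assumes the simple line layout shown above (address first, state last). parse_neighbors is a hypothetical helper, not part of any tool.

```python
def parse_neighbors(text):
    """Parse `ip neigh show` output into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        entry = {"ip": fields[0], "state": fields[-1], "dev": None, "lladdr": None}
        # `dev` and `lladdr` are keywords followed by their value.
        for key in ("dev", "lladdr"):
            if key in fields[1:-1]:
                entry[key] = fields[fields.index(key, 1) + 1]
        entries.append(entry)
    return entries


new_kernel = """10.234.16.1 dev bond0 FAILED
10.234.16.11 dev bond0 FAILED"""
old_kernel = "10.234.16.1 dev bond0 lladdr 44:03:a7:4a:14:47 REACHABLE"

if __name__ == "__main__":
    failed = [e["ip"] for e in parse_neighbors(new_kernel) if e["state"] == "FAILED"]
    print("unresolved neighbors:", failed)
    print("old kernel:", parse_neighbors(old_kernel))
```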

Post by silvio » 2018/11/23 10:23:59

To narrow down the problem, I deleted the bonding device and created a single network device with one of the NICs.
The problem is the same: with the old kernel the network is usable; with the new kernel there is no network connection, but also no error messages.

Silvio

toracat
Site Admin
Posts: 7518
Joined: 2006/09/03 16:37:24
Location: California, US

Post by toracat » 2018/11/24 17:13:45

Which driver is your device using? Output from lspci -vv will tell you. Also lspci -nn will show the device IDs.
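
Besides lspci, the driver bound to an interface can be read from sysfs, where /sys/class/net/<iface>/device/driver is a symlink to the driver directory. This is a small Python sketch of that lookup. driver_name is an illustrative helper, and virtual interfaces such as bond0 or lo typically have no such link.

```python
import os


def driver_name(iface, sysfs_root="/sys/class/net"):
    """Return the kernel driver bound to iface, or None if there is no driver link."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.islink(link):
        return None  # typically a virtual interface (bond, loopback)
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    root = "/sys/class/net"
    if os.path.isdir(root):  # only present on Linux
        for iface in sorted(os.listdir(root)):
            print(iface, driver_name(iface, root))
```

`ethtool -i <iface>` reports the same driver name along with its version and firmware.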

Post by silvio » 2018/11/26 13:31:19

toracat wrote:
2018/11/24 17:13:45
Which driver is your device using? Output from lspci -vv will tell you. Also lspci -nn will show the device IDs.
Hi,

The NICs in the blades use the be2net driver.

04:00.0 Ethernet controller [0200]: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE [19a2:0700] (rev 02)
04:00.1 Ethernet controller [0200]: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE [19a2:0700] (rev 02)

04:00.0 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
Subsystem: Hewlett-Packard Company NC551i Dual Port FlexFabric 10Gb Adapter
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 27
NUMA node: 0
Region 1: Memory at fdff0000 (32-bit, non-prefetchable)
Region 2: Memory at fdfc0000 (64-bit, non-prefetchable)
Region 4: Memory at fdfa0000 (64-bit, non-prefetchable)
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
Capabilities: [48] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=1 offset=00002000
PBA: BAR=1 offset=00003000
Capabilities: [c0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <16us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <1us, L1 <16us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP- SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [194 v1] Device Serial Number b4-99-ba-ff-fe-ac-4a-d2
Kernel driver in use: be2net
Kernel modules: be2net

04:00.1 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
Subsystem: Hewlett-Packard Company NC551i Dual Port FlexFabric 10Gb Adapter
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 45
NUMA node: 0
Region 1: Memory at fdf90000 (32-bit, non-prefetchable)
Region 2: Memory at fdf60000 (64-bit, non-prefetchable)
Region 4: Memory at fdf40000 (64-bit, non-prefetchable)
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
Capabilities: [48] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=1 offset=00002000
PBA: BAR=1 offset=00003000
Capabilities: [c0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <16us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <1us, L1 <16us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP- SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [194 v1] Device Serial Number b4-99-ba-ff-fe-ac-4a-d2
Kernel driver in use: be2net
Kernel modules: be2net


Silvio

Post by silvio » 2018/12/04 10:21:36

Same situation with kernel 3.10.0-957.1.3.

Post by toracat » 2018/12/04 18:00:45

When going from 7.5 to 7.6, I see several patches for be2net in the kernel. It is hard to tell if one (or more) of them caused the problem you are seeing. There is absolutely no error anywhere?

milan.zelenka
Posts: 1
Joined: 2019/01/16 14:38:33

Post by milan.zelenka » 2019/01/16 14:53:27

This happens because Red Hat has deprecated this network card since RHEL 7.2. It is described in this article: https://access.redhat.com/solutions/3723091 (requires RH login).

In short:

This NIC is marked as deprecated since rhel-7.2 as per the release notes: https://access.redhat.com/documentation ... ctionality

The solution is to blacklist the lpfc kernel module.
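
For reference, blacklisting a module is normally done with a `blacklist` line in a file under /etc/modprobe.d/. The Python sketch below only renders that line for the lpfc module (the name used in the BZ#1664067 entry quoted at the end of this thread). The helper blacklist_conf and the file name in the comment are assumptions, not taken from the Red Hat article.

```python
def blacklist_conf(module):
    """Render modprobe.d lines that stop `module` from being loaded automatically."""
    return (
        f"# Prevent automatic loading of the {module} driver\n"
        f"blacklist {module}\n"
    )


if __name__ == "__main__":
    # Hypothetical target file: /etc/modprobe.d/blacklist-lpfc.conf
    print(blacklist_conf("lpfc"), end="")
```

If the module is also packed into the initramfs, the image may need to be regenerated (for example with `dracut -f`) before the blacklist takes effect at boot. Since lpfc is the Emulex Fibre Channel driver, this workaround presumably costs the FC/SAN functions of the adapter.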

Post by toracat » 2019/01/28 18:39:13

That is it. Thanks for posting the links.

Post by silvio » 2019/03/19 10:22:30

With kernel 3.10.0-957.10.1, both the network and the Fibre Channel adapter work, and I have network and SAN connectivity.

"* Regression in lpfc and the CNE1000 (BE2 FCoE) adapters that no longer initialize (BZ#1664067)"


Silvio
