No SAN connection with the latest kernel

silvio
Posts: 29
Joined: 2008/11/10 13:06:03

Post by silvio » 2018/05/18 07:51:13

Hi,

We have some bl465c G7 blades with Emulex CNAs.
These devices use the be2net and lpfc drivers for network and FC connections.
I reinstalled one of the blades from a 1511 ISO (I know it's old, but the DVD is in the bladecenter and installing systems over Ilom is a ...) and everything works.
After a yum upgrade to the latest version, I lost the connection to the SAN.
In the boot log I can see this:

May 17 17:52:52 blade-server-16 kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-862.2.3.el7.x86_64 root=UUID=c02e6a02-88ed-4757-a57d-b36a581b85ea ro rhgb quiet LANG=de_DE.UTF-8
...
May 17 17:52:55 blade-server-16 kernel: lpfc 0000:04:00.2: 0:1412 Failed to set up driver resource.
May 17 17:52:55 blade-server-16 kernel: lpfc 0000:04:00.2: Driver probe function unexpectedly returned 16
May 17 17:52:55 blade-server-16 kernel: lpfc 0000:04:00.3: 0:1412 Failed to set up driver resource.
May 17 17:52:55 blade-server-16 kernel: lpfc 0000:04:00.3: Driver probe function unexpectedly returned 16
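
The return value in these messages can be read more easily with a small script. The following is a minimal sketch, assuming the value 16 corresponds to a standard errno number. That is a guess: the driver returned it as a positive value, which the kernel itself flags as unexpected. The function name `decode_probe_failures` is hypothetical and only for illustration.

```python
import errno
import re

# Matches the kernel message "Driver probe function unexpectedly returned N"
# for the lpfc driver and captures the PCI address and the return value.
PROBE_RE = re.compile(
    r"lpfc (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Driver probe function unexpectedly returned (?P<rc>-?\d+)"
)


def decode_probe_failures(log_text):
    """Return (pci_address, return_code, errno_name) for each probe failure.

    Assumption: the return code is an errno value. If it is not a known
    errno number, the name is reported as "unknown".
    """
    results = []
    for line in log_text.splitlines():
        match = PROBE_RE.search(line)
        if match:
            rc = abs(int(match.group("rc")))
            name = errno.errorcode.get(rc, "unknown")
            results.append((match.group("pci"), rc, name))
    return results
```

Fed the output of journalctl -k or dmesg, this lists each failing PCI function. Under the errno assumption, 16 maps to EBUSY, which would hint that a resource the driver wanted was already claimed, but that reading is unconfirmed.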

I don't know if this only happens with this kernel version, because I switched directly from 3.10.0-514 to 3.10.0-862.2.3.
Has anyone seen the same problem and found a solution?

lspci:
04:00.0 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.1 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.2 Fibre Channel: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE CNA (rev 02)
04:00.3 Fibre Channel: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE CNA (rev 02)
04:00.4 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.5 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.6 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)
04:00.7 Ethernet controller: Emulex Corporation OneConnect OCe10100/OCe10102 Series 10 GbE (rev 02)


Silvio


Re: No SAN connection with the latest kernel

Post by silvio » 2018/05/28 12:20:49

I tested all official kernels between 3.10.0-514 and 3.10.0-862.2.3, and the problem starts with 3.10.0-862.
The system is now running 3.10.0-693.21.1, which should be the last CentOS 7.4 kernel.
In the kernel changelog I see a lot of changes between these two versions, but I found no information about my problem.

Silvio

TrevorH
Forum Moderator
Posts: 23686
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: No SAN connection with the latest kernel

Post by TrevorH » 2018/05/28 12:56:32

If you have time and inclination to debug this then you could try experimenting with the lpfc_log_verbose parameter that modinfo lpfc says it supports. Perhaps using that you can get more information from it about what it was trying to do at the time...

Any fix will need to come from Red Hat though, so if you have more information, raise a ticket on bugzilla.redhat.com.


Re: No SAN connection with the latest kernel

Post by silvio » 2018/05/28 13:38:33

I hope I can check it in ~3 weeks, after the SAN migration :-).


Re: No SAN connection with the latest kernel

Post by silvio » 2018/10/18 13:27:28

TrevorH wrote:
2018/05/28 12:56:32
If you have time and inclination to debug this then you could try experimenting with the lpfc_log_verbose parameter that modinfo lpfc says it supports. Perhaps using that you can get more information from it about what it was trying to do at the time...

After a while I finally had a little time.
I set up another system with the current kernel (3.10.0-862.14.4.el7.x86_64) and it shows the same problem.
For debugging I created an lpfc.conf file in modprobe.d with this option:
options lpfc lpfc_log_verbose=0xffff

After a reboot I see ... nothing.
Only the same messages:

[ 4.738655] lpfc 0000:04:00.2: 0:1412 Failed to set up driver resource.
[ 4.739364] lpfc 0000:04:00.2: Driver probe function unexpectedly returned 16
[ 4.869673] lpfc 0000:04:00.3: 0:1412 Failed to set up driver resource.
[ 4.870340] lpfc 0000:04:00.3: Driver probe function unexpectedly returned 16

I was not sure whether 0xffff is the correct value, so I also tried lpfc_log_verbose=1, but it changed nothing.

Any idea what I did wrong with my config file?

Silvio

With rmmod lpfc I crash the kernel ...
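
Since unloading the module is not an option here, one non-invasive check is whether the option from modprobe.d reached the loaded module at all. The following is a minimal sketch, assuming lpfc_log_verbose is exposed under /sys/module/lpfc/parameters/, as readable module parameters normally are. The helper names `read_module_param` and `option_applied` are hypothetical. If the value turns out to be unchanged, one possible explanation (not confirmed in this thread) is that lpfc is loaded from the initramfs, which would not contain the new lpfc.conf until it is rebuilt.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the current value of a loaded module's parameter as a string.

    Returns None if the module is not loaded or the parameter is not
    exposed or not readable.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


def option_applied(module, param, expected, sysfs_root="/sys/module"):
    """Compare the live parameter value with the expected one numerically.

    Both values are parsed with base auto-detection, so "0xffff" and
    "65535" compare as equal.
    """
    raw = read_module_param(module, param, sysfs_root)
    if raw is None:
        return False
    return int(raw, 0) == int(expected, 0)


if __name__ == "__main__":
    # Value taken from the modprobe.d file shown above.
    print(option_applied("lpfc", "lpfc_log_verbose", "0xffff"))
```

The `sysfs_root` argument exists only so the functions can be tried against a test directory; on a real system the default path is used.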
