Centos 7.2 has missing enclosure end device links


Post by krisd » 2017/02/27 17:53:30

Problem in a nutshell: the kernel "device" links are missing under the /sys/class/enclosure/*/SLOT.../ directories for some HBAs.
Is there some other way to determine which /dev/sd* names map to which slots in an enclosure?

Background:
I'm running CentOS 7.2 with kernel 3.10.0-327.36.3.el7.x86_64.
I have one enclosure populated with many HDDs (not released yet, so I can't give more details) and four LSI 9300 HBAs, each connecting to 2 separate ESMs (SAS expanders connecting to separate drive ports) in the enclosure.

Normally, the Linux ses driver populates links under /sys/class/enclosure/*/SLOT . . . which map each of the enclosure slots to a kernel device. For example:

# ls -d1 /sys/class/enclosure/*/*
/sys/class/enclosure/10:0:4:0/components
/sys/class/enclosure/10:0:4:0/device
/sys/class/enclosure/10:0:4:0/power
/sys/class/enclosure/10:0:4:0/SLOT 00,7PG0TNJC
/sys/class/enclosure/10:0:4:0/SLOT 01,7PG276NR
/sys/class/enclosure/10:0:4:0/SLOT 02,7PG2HTZR
/sys/class/enclosure/10:0:4:0/SLOT 03,7PG2E5GR
. . .
/sys/class/enclosure/1:0:4:0/SLOT 03,7PG2E5GR
/sys/class/enclosure/1:0:4:0/SLOT 04,7PG2HHBR
/sys/class/enclosure/1:0:4:0/SLOT 05,7PG26PHR
/sys/class/enclosure/1:0:4:0/SLOT 06,7PG17V6C
/sys/class/enclosure/1:0:4:0/SLOT 07,7PG17W2C
. . .
The directory names appear to match the element descriptors reported by the enclosure:

# sg_ses /dev/sg188 -p ed
  XXX  XXXXX  XXX
  Primary enclosure logical identifier (hex): 5000ccab03000480
Element Descriptor In diagnostic page:
  generation code: 0x0
  element descriptor by type list
    Element type: Array device slot, subenclosure id: 0 [ti=0]
      Overall descriptor: <empty>
      Element 0 descriptor: SLOT 00,7PG0TNJC
      Element 1 descriptor: SLOT 01,7PG276NR
      Element 2 descriptor: SLOT 02,7PG2HTZR
      Element 3 descriptor: SLOT 03,7PG2E5GR
      Element 4 descriptor: SLOT 04,7PG2HHBR
. . .
So I'm guessing that is where the text comes from. Different enclosures also populate this text differently.

Under each of these "SLOT XX,SN" directories there should be a "device" link, which points back to the SCSI drive in that slot. From searching the Internet, this appears to be the recommended way to discover which drives are in which slots, especially when building RAID sets.

# ls -d1 /sys/class/enclosure/10\:0\:4\:0/SLOT*/device/block/*
/sys/class/enclosure/10:0:4:0/SLOT 00,7PG0TNJC            /device/block/sdha
/sys/class/enclosure/10:0:4:0/SLOT 01,7PG276NR            /device/block/sdgz
/sys/class/enclosure/10:0:4:0/SLOT 02,7PG2HTZR            /device/block/sdgu
/sys/class/enclosure/10:0:4:0/SLOT 03,7PG2E5GR            /device/block/sdgp
/sys/class/enclosure/10:0:4:0/SLOT 04,7PG2HHBR            /device/block/sdgl
. . .
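The lookup shown in the listing above can be wrapped in a small function. This is only a sketch: the function name map_slots is made up for illustration, and the SLOT* glob matches this enclosure's element descriptors (other enclosures name their slots differently, so the pattern would need adjusting). It assumes nothing beyond the sysfs layout shown above.

```shell
#!/bin/sh
# Print "enclosure<TAB>slot<TAB>/dev/sdX" for every slot that has a
# "device" link with a block device under it.
# The body is a subshell "( ... )" so its variables stay local.
# The root directory is a parameter (hypothetical convenience) so the
# function can be pointed at a copy of the tree; the default is sysfs.
map_slots() (
    root="${1:-/sys/class/enclosure}"
    for slot_dir in "$root"/*/SLOT*; do
        [ -d "$slot_dir" ] || continue      # glob matched nothing
        encl=$(basename "$(dirname "$slot_dir")")
        slot=$(basename "$slot_dir")
        for blk in "$slot_dir"/device/block/*; do
            [ -e "$blk" ] || continue       # no "device" link or no disk
            printf '%s\t%s\t/dev/%s\n' "$encl" "$slot" "$(basename "$blk")"
        done
    done
)
```

Calling map_slots with no argument reads /sys/class/enclosure. Slots without a "device" link are silently skipped, so on the affected HBAs the output would simply be empty for those enclosure instances.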
The problem is:
When only two HBAs from one host are attached to the enclosure, the device entries are populated correctly. But when four LSI 9300 HBAs are connected to the enclosure (2 to one ESM, 2 to the other), one of the HBA connections to each ESM is missing the "device" links.

# ls -d1 /sys/class/enclosure/*
/sys/class/enclosure/10:0:4:0
/sys/class/enclosure/1:0:4:0
/sys/class/enclosure/11:0:4:0
/sys/class/enclosure/9:0:4:0

# ls -d1 /sys/class/enclosure/*/SLOT*/device
/sys/class/enclosure/10:0:4:0/SLOT 00,7PG0TNJC            /device
/sys/class/enclosure/10:0:4:0/SLOT 01,7PG276NR            /device
/sys/class/enclosure/10:0:4:0/SLOT 02,7PG2HTZR            /device
. . .
/sys/class/enclosure/1:0:4:0/SLOT 00,7PG0TNJC            /device
/sys/class/enclosure/1:0:4:0/SLOT 01,7PG276NR            /device
/sys/class/enclosure/1:0:4:0/SLOT 02,7PG2HTZR            /device
/sys/class/enclosure/1:0:4:0/SLOT 03,7PG2E5GR            /device
. . .
Here the 'device' links are missing for enclosure connections 11:0:4:0 and 9:0:4:0.
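To make the gap easier to see, the slot directories with and without a "device" link can be counted per enclosure instance. A minimal sketch, assuming the same SLOT* naming as above; the function name count_slot_links is hypothetical. Note that a slot with no drive installed also has no "device" link, so a non-zero "missing" count alone does not prove the bug.

```shell
#!/bin/sh
# For each enclosure instance, count slot directories that have a
# "device" link and those that do not.
# Subshell body keeps the variables local to the function.
count_slot_links() (
    root="${1:-/sys/class/enclosure}"
    for encl in "$root"/*; do
        [ -d "$encl" ] || continue          # glob matched nothing
        linked=0
        missing=0
        for slot_dir in "$encl"/SLOT*; do
            [ -d "$slot_dir" ] || continue
            if [ -e "$slot_dir/device" ]; then
                linked=$((linked + 1))
            else
                missing=$((missing + 1))    # empty slot or missing link
            fi
        done
        printf '%s linked=%d missing=%d\n' \
            "$(basename "$encl")" "$linked" "$missing"
    done
)
```

Comparing the counts for two enclosure instances that see the same ESM should show whether one path has links for every populated slot while the other has none.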

The question is: is this a CentOS bug (seems like it), something related to how the enclosure operates, or is there some other (multipath?) setting I'm missing?

I found a bug report describing a problem in Red Hat 7.3 that sounds like it may be related (although that problem apparently didn't exist in 7.2). The fix is not planned until 7.4:
https://bugzilla.redhat.com/show_bug.cgi?id=1394089
I am unaware of any other method to map drive slots to kernel devices so that code can reliably exercise particular slots.
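One avenue that may be worth checking, offered as an untested assumption for this hardware: the SAS transport class exposes bay_identifier and enclosure_identifier attributes for end devices under /sys/class/sas_device, and these are filled in by the HBA driver rather than by the ses driver. Whether the values are meaningful for this enclosure and HBA firmware is not confirmed here. The function name map_bays and the exact path from an end device down to its block device are assumptions of this sketch.

```shell
#!/bin/sh
# Print "enclosure_identifier<TAB>bay<TAB>/dev/sdX" for each SAS end
# device that reports a bay number.
# Assumption: block devices appear under
#   <end_device>/device/target*/*/block/
# Subshell body keeps the variables local to the function.
map_bays() (
    root="${1:-/sys/class/sas_device}"
    for dev in "$root"/end_device-*; do
        [ -r "$dev/bay_identifier" ] || continue
        # Reading can fail if the driver does not support the attribute.
        bay=$(cat "$dev/bay_identifier" 2>/dev/null) || continue
        encl_id=$(cat "$dev/enclosure_identifier" 2>/dev/null)
        for blk in "$dev"/device/target*/*/block/*; do
            [ -e "$blk" ] || continue       # not a disk (e.g. the SES device)
            printf '%s\t%s\t/dev/%s\n' "$encl_id" "$bay" "$(basename "$blk")"
        done
    done
)
```

If the bay numbers line up with the slot numbers in the element descriptors for the HBAs that do have "device" links, that would suggest the same attribute can be trusted on the HBAs that lack them.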

I'm unsure how to debug this further. Any ideas? (Sorry for the long post.)


Re: Centos 7.2 has missing enclosure end device links

Post by yaplej » 2017/10/19 17:51:46

Not sure if your issue ever got resolved, but I'm now running 7.4 and cannot see anything in /sys/class/enclosure/.
viewtopic.php?f=49&t=64678

It looks like my enclosure is listed, but /sys/class/enclosure/ is not being populated for some reason.

sg_ses /dev/sg59 -p ed
  DELL      MD1200            1.06
  Primary enclosure logical identifier (hex): 500c04f27b7e6a00
Element Descriptor In diagnostic page:
  generation code: 0x0
  element descriptor by type list
    Element type: Array device slot, subenclosure id: 0 [ti=0]
      Overall descriptor: <empty>
      Element 0 descriptor: Slot 0
      Element 1 descriptor: Slot 1
      Element 2 descriptor: Slot 2
      Element 3 descriptor: Slot 3
      Element 4 descriptor: Slot 4
      Element 5 descriptor: Slot 5
      Element 6 descriptor: Slot 6
      Element 7 descriptor: Slot 7
      Element 8 descriptor: Slot 8
      Element 9 descriptor: Slot 9
      Element 10 descriptor: Slot 10
      Element 11 descriptor: Slot 11
    Element type: Power supply, subenclosure id: 0 [ti=1]
      Overall descriptor: <empty>
      Element 0 descriptor: Power Supply 1
      Element 1 descriptor: Power Supply 2
    Element type: Cooling, subenclosure id: 0 [ti=2]
      Overall descriptor: <empty>
      Element 0 descriptor: Fan 0 in Power Supply 1
      Element 1 descriptor: Fan 1 in Power Supply 1
      Element 2 descriptor: Fan 0 in Power Supply 2
      Element 3 descriptor: Fan 1 in Power Supply 2
    Element type: Temperature sensor, subenclosure id: 0 [ti=3]
      Overall descriptor: <empty>
      Element 0 descriptor: SIM 0 Temperature Sensor
      Element 1 descriptor: SIM 1 Temperature Sensor
      Element 2 descriptor: BP 0 Temperature Sensor
      Element 3 descriptor: BP 1 Temperature Sensor
    Element type: Audible alarm, subenclosure id: 0 [ti=4]
      Overall descriptor: <empty>
      Element 0 descriptor: Buzzer
    Element type: Enclosure services controller electronics, subenclosure id: 0 [ti=5]
      Overall descriptor: <empty>
      Element 0 descriptor: SIM 0
      Element 1 descriptor: SIM 1
    Element type: Enclosure, subenclosure id: 0 [ti=6]
      Overall descriptor: <empty>
      Element 0 descriptor: Enclosure
    Element type: Language, subenclosure id: 0 [ti=7]
      Overall descriptor: <empty>
      Element 0 descriptor: Language
    Element type: Voltage sensor, subenclosure id: 0 [ti=8]
      Overall descriptor: <empty>
      Element 0 descriptor: Input Voltage Sensor 0
      Element 1 descriptor: Input Voltage Sensor 1
    Element type: Voltage sensor, subenclosure id: 0 [ti=9]
      Overall descriptor: <empty>
      Element 0 descriptor: 12V Sensor 0
      Element 1 descriptor: 12V Sensor 1
    Element type: Voltage sensor, subenclosure id: 0 [ti=10]
      Overall descriptor: <empty>
      Element 0 descriptor: 5V Sensor 0
      Element 1 descriptor: 5V Sensor 1
    Element type: Current sensor, subenclosure id: 0 [ti=11]
      Overall descriptor: <empty>
      Element 0 descriptor: Input Current Sensor 0
      Element 1 descriptor: Input Current Sensor 1
    Element type: Current sensor, subenclosure id: 0 [ti=12]
      Overall descriptor: <empty>
      Element 0 descriptor: 12V Current Sensor 0
      Element 1 descriptor: 12V Current Sensor 1
    Element type: Current sensor, subenclosure id: 0 [ti=13]
      Overall descriptor: <empty>
      Element 0 descriptor: 5V Current Sensor 0
      Element 1 descriptor: 5V Current Sensor 1
    Element type: Simple subenclosure, subenclosure id: 0 [ti=14]
      Overall descriptor: <empty>
      Element 0 descriptor: Simple Subenclosure
    Element type: vendor specific [0x80], subenclosure id: 0 [ti=15]
      Overall descriptor: <empty>
      Element 0 descriptor: Input Power Sensor 0
      Element 1 descriptor: Input Power Sensor 1
    Element type: vendor specific [0x80], subenclosure id: 0 [ti=16]
      Overall descriptor: <empty>
      Element 0 descriptor: 12V Power Sensor 0
      Element 1 descriptor: 12V Power Sensor 1
    Element type: vendor specific [0x80], subenclosure id: 0 [ti=17]
      Overall descriptor: <empty>
      Element 0 descriptor: 5V Power Sensor 0
      Element 1 descriptor: 5V Power Sensor 1
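When sg_ses can talk to the enclosure but /sys/class/enclosure/ stays empty, a first check might be whether the ses driver is present and bound to the enclosure's SCSI device. The sketch below is an assumption-laden starting point, not a diagnosis: it relies on SCSI peripheral device type 13 meaning "enclosure services", on /sys/module/ses existing when ses is loaded as a module (a built-in driver may not show up there), and on a "driver" link being present when a driver is bound. The function name check_ses is hypothetical.

```shell
#!/bin/sh
# Report whether the ses module is visible and list SCSI devices of
# peripheral type 13 (enclosure services) with their binding state.
# The sysfs root is a parameter so the function can be tested on a copy.
# Subshell body keeps the variables local to the function.
check_ses() (
    sys="${1:-/sys}"
    if [ -d "$sys/module/ses" ]; then
        echo "ses module: loaded"
    else
        echo "ses module: not found under /sys/module"
    fi
    for type_file in "$sys"/class/scsi_device/*/device/type; do
        [ -r "$type_file" ] || continue
        [ "$(cat "$type_file")" = "13" ] || continue
        dev_dir=$(dirname "$type_file")
        hctl=$(basename "$(dirname "$dev_dir")")
        if [ -e "$dev_dir/driver" ]; then
            state="driver bound"
        else
            state="no driver bound"
        fi
        printf '%s enclosure device, %s\n' "$hctl" "$state"
    done
)
```

If the module is not found, loading it with modprobe ses and re-checking /sys/class/enclosure/ would be the obvious next step.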
