LVM2 on top of software raid 5 + failed disk = data corruption?


LVM2 on top of software raid 5 + failed disk = data corruption?

Post by dm133 » 2008/12/16 18:37:14

Hi everyone.

Background.....
-We have a server running CentOS 4.7, kernel 2.6.9-78.0.5.ELsmp.
-It has six 300 GB SAS drives.
-The drives are cut into partitions that are added to mdadm software raid arrays.
-/boot is raid 1 with an ext3 filesystem.
-/ is raid 1 with an ext3 filesystem.
-swap is raid 0 (the previous admin did this; we will fix it eventually).
-We have one small raid 0 array for fast access, formatted ext3.
-Finally, we have one rather large 1.3 TB raid 5 array that holds our users' files. This array has LVM2 volumes on top of it, all of which are ext3.
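
If it helps picture the layout, the whole stack shows up with the usual commands (nothing exotic here):
cat /proc/mdstat
pvscan
lvscan
df -hT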


The other day the messages log started recording complaints about "bad segments" on sda. A smartctl -t long /dev/sda confirmed that the drive was no longer in pristine shape, so we decided to replace the disk before things got worse.
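
For anyone curious, something along these lines should pull up the self-test log and the overall health verdict (I'm going from memory on the exact flags, so double-check them on your version):
smartctl -l selftest /dev/sda
smartctl -H /dev/sda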

The procedure we took is as follows...

back up the failing disk's partition table:
sfdisk -d /dev/sda > /etc/partitions.sda
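
it's worth a quick look at the dump to be sure it captured every partition before touching the disk:
cat /etc/partitions.sda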

fail and remove the disk's partitions from all of their arrays:
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
...
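
to be sure we hit every array that had a piece of sda, this sort of thing helps (md0 is just one example from our layout):
cat /proc/mdstat
mdadm --detail /dev/md0 | grep sda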

turn off swap, since the swap array includes the failing disk, then stop the swap array:
swapoff -a
mdadm -S /dev/md4

tell the OS to kill the drive for hot swap and pull the bad drive:
echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi

insert new drive and tell the OS to find it:
echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi

create the partition table on the new disk:
sfdisk /dev/sda < /etc/partitions.sda
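
a quick diff against the saved dump should come back empty if the restore took:
sfdisk -d /dev/sda | diff /etc/partitions.sda -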

add the drive back to all of its arrays:
mdadm /dev/md0 --add /dev/sda1
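
then watch the resync until it finishes (md5 below is just a stand-in for whichever array is rebuilding):
watch -n 60 cat /proc/mdstat
mdadm --detail /dev/md5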

re-create the swap array, format it, turn swap back on:
mdadm --create --verbose /dev/md4 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
mkswap /dev/md4
swapon -a
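
and a quick check that swap really came back:
swapon -s
free -m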



We then kicked back and watched /proc/mdstat. The arrays finished rebuilding with no errors and we could access all the data, so we went home and went to bed. However, sometime during the night one of our LVM volumes re-mounted itself read-only. Upon trying to re-mount it, the machine hard-locked and we had to reboot it. After the reboot the machine wanted to fsck all but two of the LVM volumes (all the filesystems that were on raid but had no LVM on top were in happy, steady states). There were so many inode errors that we decided to do a full system restore from backups.
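
In hindsight, one thing worth trying right after a rebuild finishes is having md verify the parity. Newer md drivers expose this through sysfs; I'm not sure the stock CentOS 4 kernel does, and md5 here is again just a stand-in for the raid 5 array:
echo check > /sys/block/md5/md/sync_action
cat /sys/block/md5/md/mismatch_cnt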

If it's helpful, the errors that messages logged when the first LVM volume remounted itself read-only are below. We got similar errors when trying to fsck all but two of our LVM volumes.

Dec 13 02:37:40 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #16335409: rec_len % 4 != 0 - offset=0, inode=3325888793, rec_len=36875, name_len=128
Dec 13 02:37:40 farrell kernel: Aborting journal on device dm-0.
Dec 13 02:37:42 farrell kernel: ext3_abort called.
Dec 13 02:37:42 farrell kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
Dec 13 02:37:42 farrell kernel: Remounting filesystem read-only
Dec 13 02:45:08 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #10028449: rec_len % 4 != 0 - offset=0, inode=3453369297, rec_len=43883, name_len=176
Dec 13 02:45:57 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #17728222: rec_len % 4 != 0 - offset=0, inode=4248137906, rec_len=41903, name_len=120
Dec 13 02:48:01 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #37765142: rec_len % 4 != 0 - offset=0, inode=189228823, rec_len=2857, name_len=49
Dec 13 02:50:17 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #10027806: rec_len % 4 != 0 - offset=0, inode=2494885026, rec_len=6898, name_len=164
Dec 13 02:51:37 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #17089522: rec_len % 4 != 0 - offset=0, inode=3224714177, rec_len=24635, name_len=56
Dec 13 03:22:44 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #31687011: rec_len % 4 != 0 - offset=0, inode=934687742, rec_len=46451, name_len=57
Dec 13 03:37:43 farrell kernel: EXT3-fs error (device dm-3): ext3_readdir: bad entry in directory #1033375: directory entry across blocks - offset=0, inode=2770636924, rec_len=62712, name_len=4
Dec 13 03:37:43 farrell kernel: Aborting journal on device dm-3.
Dec 13 03:37:43 farrell kernel: ext3_abort called.
Dec 13 03:37:43 farrell kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:37:43 farrell kernel: Remounting filesystem read-only
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1442857: rec_len % 4 != 0 - offset=0, inode=732481902, rec_len=52555, name_len=181
Dec 13 03:52:21 farrell kernel: Aborting journal on device dm-7.
Dec 13 03:52:21 farrell kernel: ext3_abort called.
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1442857: rec_len % 4 != 0 - offset=0, inode=732481902, rec_len=52555, name_len=181
Dec 13 03:52:21 farrell kernel: Aborting journal on device dm-7.
Dec 13 03:52:21 farrell kernel: ext3_abort called.
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:52:21 farrell kernel: Remounting filesystem read-only
Dec 13 03:53:08 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1606426: rec_len % 4 != 0 - offset=0, inode=3306786808, rec_len=1430, name_len=20


So now my question is this: did something I do (or didn't do) cause the unrecoverable ext3 corruption (i.e., how do your drive-replacement methods compare to mine?), or did we just have some bad luck?


Thanks for your help
Dustin

PS: I'm starting to wonder if this might be related to a kernel bug; Googling turned up some mentions of that with similar setups. Let me know if you've ever experienced anything similar and whether it turned out to be a bug.


Re: LVM2 on top of software raid 5 + failed disk = data corruption?

Post by dm133 » 2009/01/02 17:29:30

Dang. Over 300 views and no responses.

Since I posted here I ran a bonnie++ test on a machine with similar hardware and the exact same software setup, hoping to find out whether the corruption was caused by some sort of software bug. After three days of torture it completed just fine.
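
The run was along these lines (paths, sizes and repeat count are only illustrative, not the exact invocation):
bonnie++ -d /srv/test -s 16384 -n 128 -u nobody -x 20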

I guess it could still be hardware related or software related (the controller cards, and hence their kernel modules, differ between the two machines). If things blow up again, or whenever we retire the machine with the original problem, I'll do some more tests to try to track it down; but in the meantime I'm beginning to think I'll never know the root cause.

If anybody out there has done drive replacements in a similar setup, could you at least let me know if my methods were sound? I'd like to be able to rule out the human factor (as much as possible). As far as I can tell from other tests and more reading, I did things right, but it's hard not to dwell on it and constantly second-guess yourself when something like this happens :).

hope everyone is having a good new year...
Dustin
