Unable to boot with degraded RAID 1

tsotrilos
Posts: 4
Joined: 2018/02/20 10:02:40

Unable to boot with degraded RAID 1

Post by tsotrilos » 2019/07/11 11:50:02

Hello everyone.

I just installed a CentOS 7 hypervisor. During setup I configured the system with software RAID 1.

Unfortunately, after the installation finished, I tested booting with a degraded RAID by removing
one of my drives (2x 1TB). The system starts to boot but drops into emergency mode no matter which hard drive
I try to boot from. With both hard drives connected everything works like a charm and the two drives
sync fully and successfully.

My setup is like this:

fstab:

Code: Select all

#
# /etc/fstab
# Created by anaconda on Tue Jul  9 06:32:55 2019
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=532be9b5-923b-41ba-b854-b1c93e2a5b6f /                       xfs     defaults        0 0
UUID=eb49e5ff-28bb-4845-8870-b799a2935eef /boot                   xfs     defaults        0 0
UUID=24d13d6e-be6d-44d3-8891-9ef1c1ef2056 /var/lib/libvirt        ext4    defaults        1 2
UUID=b584edd5-c96a-45ad-9f42-41d659f79930 swap                    swap    defaults        0 0
/proc/mdstat:

Code: Select all

Personalities : [raid1]
md124 : active raid1 sdb5[1] sda5[0]
      891288576 blocks super 1.2 [2/2] [UU]
      bitmap: 2/7 pages [8KB], 65536KB chunk

md125 : active raid1 sdb1[1] sda1[0]
      41937920 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdb2[1] sda2[0]
      16776192 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sdb3[1] sda3[0]
      1047552 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
blkid:

Code: Select all

/dev/md125: LABEL="root" UUID="532be9b5-923b-41ba-b854-b1c93e2a5b6f" TYPE="xfs"
/dev/sdb1: UUID="8fbceb1d-51c8-49d2-9fbe-2f562a66fe5c" UUID_SUB="66ebb960-2909-2eb5-92c9-8570cd486430" LABEL="localhost.localdomain:root" TYPE="linux_raid_member"
/dev/sda1: UUID="8fbceb1d-51c8-49d2-9fbe-2f562a66fe5c" UUID_SUB="3811ac75-f64e-961e-2ec8-134173ee71ac" LABEL="localhost.localdomain:root" TYPE="linux_raid_member"
/dev/sda2: UUID="5b9a6010-ab22-c13f-6efa-faa32a4cb59a" UUID_SUB="a7715781-4911-73b9-c997-dde88de90eff" LABEL="localhost.localdomain:swap" TYPE="linux_raid_member"
/dev/sda3: UUID="432a8914-eb80-296d-316d-127bca1f72f9" UUID_SUB="ae413f79-5c9a-3831-107d-c5f880a80e32" LABEL="localhost.localdomain:boot" TYPE="linux_raid_member"
/dev/sda5: UUID="da76e980-429a-4e41-b89c-12d208bc162e" UUID_SUB="58c36e6d-fcbe-a6f6-48fd-5af7153e6e45" LABEL="var_lib_libvirt" TYPE="linux_raid_member"
/dev/sdb2: UUID="5b9a6010-ab22-c13f-6efa-faa32a4cb59a" UUID_SUB="d8227f49-629b-f7b9-1011-439f01de5d52" LABEL="localhost.localdomain:swap" TYPE="linux_raid_member"
/dev/sdb3: UUID="432a8914-eb80-296d-316d-127bca1f72f9" UUID_SUB="94821ec4-e264-5476-cd7d-257ad7c56011" LABEL="localhost.localdomain:boot" TYPE="linux_raid_member"
/dev/sdb5: UUID="da76e980-429a-4e41-b89c-12d208bc162e" UUID_SUB="d5f6ce08-e98e-d75b-791c-42b2b732413e" LABEL="var_lib_libvirt" TYPE="linux_raid_member"
/dev/md127: LABEL="boot" UUID="eb49e5ff-28bb-4845-8870-b799a2935eef" TYPE="xfs"
/dev/md126: LABEL="swap" UUID="b584edd5-c96a-45ad-9f42-41d659f79930" TYPE="swap"
/dev/md124: LABEL="vms" UUID="24d13d6e-be6d-44d3-8891-9ef1c1ef2056" TYPE="ext4"
/etc/mdadm.conf:

Code: Select all

# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md/boot level=raid1 num-devices=2 UUID=432a8914:eb80296d:316d127b:ca1f72f9
ARRAY /dev/md/root level=raid1 num-devices=2 UUID=8fbceb1d:51c849d2:9fbe2f56:2a66fe5c
ARRAY /dev/md/swap level=raid1 num-devices=2 UUID=5b9a6010:ab22c13f:6efafaa3:2a4cb59a
ARRAY /dev/md/var_lib_libvirt level=raid1 num-devices=2 UUID=da76e980:429a4e41:b89c12d2:08bc162e
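As a cross-check, the UUIDs in those ARRAY lines can be compared against what mdadm itself reports on the running system, for example:

Code: Select all

# Print ARRAY lines for the currently assembled arrays; the UUIDs should
# match the ones in /etc/mdadm.conf above
mdadm --detail --scan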
lsblk:

Code: Select all

NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0 931.5G  0 disk
├─sda1      8:1    0    40G  0 part
│ └─md125   9:125  0    40G  0 raid1 /
├─sda2      8:2    0    16G  0 part
│ └─md126   9:126  0    16G  0 raid1 [SWAP]
├─sda3      8:3    0     1G  0 part
│ └─md127   9:127  0  1023M  0 raid1 /boot
├─sda4      8:4    0     1K  0 part
└─sda5      8:5    0 850.1G  0 part
  └─md124   9:124  0   850G  0 raid1 /var/lib/libvirt
sdb         8:16   0 931.5G  0 disk
├─sdb1      8:17   0    40G  0 part
│ └─md125   9:125  0    40G  0 raid1 /
├─sdb2      8:18   0    16G  0 part
│ └─md126   9:126  0    16G  0 raid1 [SWAP]
├─sdb3      8:19   0     1G  0 part
│ └─md127   9:127  0  1023M  0 raid1 /boot
├─sdb4      8:20   0     1K  0 part
└─sdb5      8:21   0 850.1G  0 part
  └─md124   9:124  0   850G  0 raid1 /var/lib/libvirt
When the system drops into emergency mode, I can see warnings at the top of the screen:

Warning: /dev/disk/by-id/md-uuid 432a8914-eb80-296d-316d-127bca1f72f9 does not exist
Warning: /dev/disk/by-id/md-uuid 5b9a6010-ab22-c13f-6efa-faa32a4cb59a does not exist
Warning: /dev/disk/by-id/md-uuid 8fbceb1d-51c8-49d2-9fbe-2f562a66fe5c does not exist
Warning: /dev/disk/by-uuid 532be9b5-923b-41ba-b854-b1c93e2a5b6f does not exist
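
From the dracut emergency shell the state of the arrays can at least be inspected; a minimal check, assuming mdadm is available in the initramfs, would be something like:

Code: Select all

# Inside the dracut emergency shell
cat /proc/mdstat                  # which arrays exist and in what state
mdadm --detail /dev/md125         # detail for the root array, if it was created
ls /dev/disk/by-id/ | grep md     # which md-uuid symlinks actually exist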

I found a similar post on this forum, but unfortunately I was not able to resolve my problem with it,
as that issue did not match mine.

I am pretty new to CentOS 7 so please go easy on me :)

Thank you in advance

tsotrilos
Posts: 4
Joined: 2018/02/20 10:02:40

Re: Unable to boot with degraded RAID 1

Post by tsotrilos » 2019/07/15 10:28:12

Update:

So I managed to work around the issue and I am now able to boot from a single drive.
I will call the drives A and B to avoid confusion.
To do this, I had to run the commands below while the system was in emergency mode:

While booted with drive B

Code: Select all

mdadm --run /dev/md125
mdadm --run /dev/md126
mdadm --run /dev/md127
ctrl+d
After this the system booted just fine with one hard drive.

The RAID arrays were, of course, in degraded mode.

lsblk:

Code: Select all

NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0 931.5G  0 disk
├─sda1      8:1    0    40G  0 part
│ └─md127   9:127  0    40G  0 raid1 /
├─sda2      8:2    0    16G  0 part
│ └─md126   9:126  0    16G  0 raid1 [SWAP]
├─sda3      8:3    0     1G  0 part
│ └─md125   9:125  0  1023M  0 raid1 /boot
├─sda4      8:4    0     1K  0 part
└─sda5      8:5    0 850.1G  0 part
  └─md124   9:124  0   850G  0 raid1 /var/lib/libvirt
sdb         8:16   0 931.5G  0 disk
├─sdb1      8:17   0    40G  0 part
├─sdb2      8:18   0    16G  0 part
│ └─md126   9:126  0    16G  0 raid1 [SWAP]
├─sdb3      8:19   0     1G  0 part
├─sdb4      8:20   0     1K  0 part
└─sdb5      8:21   0 850.1G  0 part
To rebuild the arrays I did this:

Code: Select all

mdadm --add /dev/md124 /dev/sdb5
mdadm: re-added /dev/sdb5
mdadm --add /dev/md127 /dev/sdb1
mdadm: re-added /dev/sdb1
mdadm --add /dev/md125 /dev/sdb3
mdadm: re-added /dev/sdb3
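A simple way to follow the rebuild afterwards is just to refresh mdstat every few seconds, e.g.:

Code: Select all

# Re-run "cat /proc/mdstat" every 5 seconds to watch the resync progress
watch -n 5 cat /proc/mdstat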
The RAID re-synced successfully:

Code: Select all

Personalities : [raid1]
md124 : active raid1 sdb5[1] sda5[0]
      891288576 blocks super 1.2 [2/2] [UU]
      bitmap: 0/7 pages [0KB], 65536KB chunk

md125 : active raid1 sdb3[1] sda3[0]
      1047552 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdb2[1] sda2[0]
      16776192 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sdb1[1] sda1[0]
      41937920 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>


Well... it was going to take almost 11 hours to fully sync the arrays, so I raised the resync speed
limit with:
echo 100000 > /proc/sys/dev/raid/speed_limit_min
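
The same setting can also be changed through sysctl, which I believe is equivalent (either way it does not persist across reboots unless it is added to /etc/sysctl.conf):

Code: Select all

# Equivalent to the echo above: raise the minimum resync speed (KB/s per device)
sysctl -w dev.raid.speed_limit_min=100000
# The matching upper limit, in case that needs raising as well
sysctl dev.raid.speed_limit_max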

After this I tried to boot from the other hard drive (A)... but with no luck.
While in emergency mode I ran the same commands

Code: Select all

mdadm --run /dev/md125
mdadm --run /dev/md126
mdadm --run /dev/md127
ctrl+d
but this time, after logging out of the emergency shell, I got stuck in an emergency mode loop.

The difference I noticed is that when I booted the system from the B hard drive, every command
replied with "md*** added", but when I booted the system from drive A my commands produced
no output at all.
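
One thing that could still be checked from the emergency shell on drive A, assuming mdadm works there at all, is the RAID metadata on that drive's partitions:

Code: Select all

# Examine the md superblocks on drive A's members and compare event counts
mdadm --examine /dev/sda1 /dev/sda2 /dev/sda3
# Check whether the arrays were assembled at all, even as inactive
cat /proc/mdstat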

Any ideas?
