So... I started with two blank 6TB hard drives. Ahead of time, I preallocated a 512MiB FAT32 partition on sdb to act as a place to later put a duplicate copy of the EFI partition that I was going to let the CentOS 7 installer create on sda (since EFI partitions can't be mirrored via RAID). From within the CentOS installer, I created the EFI partition and then set up RAID arrays for /boot, swap, /, and /home (ext4 for all but swap, and no LVM). The installation went fine, and afterwards I used mdadm -E and the links in /dev/disk/by-uuid to check that things were the way I wanted them; about my only surprise was that the md0, md1, etc. I had specified during installation had instead turned into md124, md125, etc.
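For reference, my sanity checks looked roughly like the following (the partition numbers here are just for illustration; my actual layout may map differently):

```shell
# Do the member partitions on both disks agree about their arrays?
mdadm -E /dev/sda2 /dev/sdb2    # e.g. the /boot array members
mdadm -E /dev/sda4 /dev/sdb4    # e.g. the / array members

# Overall array state and names (this is where md124, md125, etc. showed up)
cat /proc/mdstat

# Which UUIDs do the by-uuid symlinks resolve to?
ls -l /dev/disk/by-uuid/
```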
Having done that, I used dd to copy the real EFI partition (sda1) onto sdb1. I changed the UUID on sdb1, set its flags to match sda1, and then used efibootmgr to create a boot entry for its copy of shimx64.efi. My reasoning was that, in the event of a complete sda failure, I could still boot into my OS via the EFI loader on sdb, then use mdadm to fail and remove the sda partitions from their respective arrays, add a blank replacement disk with an identical partition table, and then use mdadm again to rebuild the arrays.
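In case it matters, the commands I used were along these lines (the partition number in the mdadm recovery example is a placeholder, and the mlabel serial is just an example value):

```shell
# Mirror the ESP by hand (sda1 -> sdb1); both partitions are the same size
dd if=/dev/sda1 of=/dev/sdb1 bs=4M conv=fsync

# Give the copy its own FAT volume serial so the two ESPs are distinguishable
# (DEADBEEF is a placeholder serial)
mlabel -N DEADBEEF -i /dev/sdb1 ::

# Mark sdb1 as an ESP, same as sda1
parted /dev/sdb set 1 esp on

# Add a firmware boot entry pointing at the copied shim
efibootmgr -c -d /dev/sdb -p 1 -L "CentOS (sdb)" -l '\EFI\centos\shimx64.efi'

# Planned recovery after swapping in a blank, identically partitioned sda:
# mdadm /dev/md126 --add /dev/sda4    # resync to the new member starts automatically
```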
Oh, one final thing: I added the nofail flag to the installer-created fstab entry for mounting the EFI partition in /boot/efi, because I didn't want that missing partition to break the boot process in the event of an sda failure. All the other fstab entries used the RAID UUIDs, so I figured they'd still mount properly if sda went south (which apparently was an incorrect assumption, as you'll see).
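The resulting fstab entry for the ESP looked something like this (UUID is a placeholder; the key addition is nofail):

```
UUID=XXXX-XXXX  /boot/efi  vfat  umask=0077,shortname=winnt,nofail  0 0
```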
Now it was time to test what I had done. I disconnected both drives, removed sdb, and used a physical HD duplicator to create a clone of it on a blank drive. I replaced the original sdb with the clone and booted with only that one drive attached (i.e. to simulate a total sda failure). The CentOS EFI entry wouldn't work (obviously, since it was pointing at sda1), but the sdb1 entry I made with efibootmgr did work; I got to grub fine, and I even got a bit of the graphical CentOS loading screen. But then things died, and I was dropped into emergency mode.
The relevant messages I got on screen are as follows:
Code:
[ OK ] Reached target Basic System.
[ TIME ] Timed out waiting for device dev-disk-by\x2duuid-(uuid of md126, i.e. root partition).
[ TIME ] Timed out waiting for device dev-disk-by\x2duuid-(uuid of md126, i.e. root partition).
[ DEPEND ] Dependency failed for File System Check on /dev/disk/by-uuid/(uuid of md126).
[ DEPEND ] Dependency failed for /sysroot.
[ DEPEND ] Dependency failed for Initrd Root File System.
[ DEPEND ] Dependency failed for Reload Configuration from the Real Root.
Anyway, that's where I'm at. It's almost as if initrd doesn't register md126's UUID as valid if both disks aren't present. Obviously there's something I'm missing, but nothing obvious is jumping out at me. Any help would be appreciated.
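If it would help with diagnosis, I can poke at things from the emergency shell or from another machine; I'm assuming the relevant checks would be something like:

```shell
# Does the initramfs carry an mdadm.conf, and does it list the arrays' UUIDs?
lsinitrd /boot/initramfs-$(uname -r).img -f etc/mdadm.conf

# From the emergency shell: will the arrays assemble and run degraded?
mdadm --assemble --scan --run
cat /proc/mdstat
```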