Removing drive from Centos 7 server makes the drive unbootable

General support questions
blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/23 12:41:10

A couple of years ago I created a minimal Centos 7 server on a Lenovo Ideapad s100, which I used successfully as an NGINX-based home webserver. I kept the original Windows 10 hard drive from the Lenovo and used a spare 500GB Hitachi 2.5" drive to install Centos on the laptop. I used 'yum upgrade' regularly to keep the system fine-tuned.

One day in April 2018 I took the Hitachi drive out of the Lenovo and replaced it with the original Windows hard drive. The system booted up perfectly and I downloaded and installed a major Windows update onto the original disk drive.

When I put the Hitachi drive back into the Lenovo it refused to boot up, saying there was a problem accessing the grub loader. I booted the laptop from a rescue USB drive and followed instructions I found online for repairing the grub loader. The instructions didn't work and I put the Hitachi drive to one side as it had a couple of gigs of important data on it.

I got a spare 2.5" Seagate drive, put it in the Lenovo and installed a minimal Centos 7.5 system on it, plus the NGINX software. It worked fine until I removed the Seagate drive from the Lenovo while trying to rescue data from the Hitachi drive. When I put the Seagate drive back into the Lenovo it came up with the same grub loader error as before with the Hitachi drive. Over the next few days I got the same grub loader error whenever I removed a working Centos hard drive from the Lenovo for any reason. It didn't matter whether I used HDD or SSD drives.

Thinking it might be a problem related to the Lenovo I purchased a second-hand ASUS F550C laptop, removed the Windows 10 hard drive and installed a minimal Centos 7.5 system on a spare Western Digital 2.5" drive in the ASUS laptop. I booted the machine up and down a few times to prove Centos was working. Then I took the Western Digital drive out of the ASUS laptop and put the Windows drive back in. Windows booted up OK. I removed the Windows drive and popped the Western Digital drive back into the ASUS laptop and got the familiar grub loader error.

This problem has probably been undetected for a long time because once a Centos system is up and running there is usually no need to remove the hard drive. But what I have proved, using two different laptops and disk drives from several manufacturers, is that removing a working hard drive with Centos on it - even for as few as two minutes - renders the drive unbootable.

Has anyone got any thoughts on this problem? It has cost me a lot of time and professional embarrassment having my home webserver, which is a development machine, out of action for almost a month while I searched for a workaround.

desertcat
Posts: 843
Joined: 2014/08/07 02:17:29
Location: Tucson, AZ

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by desertcat » 2018/07/23 22:00:50

That is indeed odd. Did you check your BIOS? It almost sounds like a BIOS problem: if you put a drive in, it has no problem, but the moment you take it out and install another drive with an OS on it, the BIOS still expects the previous drive, does not find it, and generates the error. I'd put the CentOS drive in, but before booting CentOS I'd go into the BIOS and see which drive it sees. If the drives do not match, that is your problem. The setting should be under the boot section of the BIOS; if the drive does not match, there should be an option to change it to the correct one. Once you do that, I suspect you will be able to boot CentOS.

Other than that.... no idea.

blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/24 12:58:19

You might be onto something desertcat, by suspecting the BIOS.

To test your theory, here's what I did this afternoon.

1. I removed the Western Digital Windows 10 SSD drive from the ASUS laptop. I couldn't fiddle with the Lenovo laptop because it is now a production webserver.

2. I built a minimal Centos 7.5 system on the ASUS laptop using a Western Digital 2.5" 500gig drive. I built the system from a USB iso drive.

3. I logged onto the ASUS laptop and looked at a few files, created a /usb directory, and ran a few diagnostics. Everything worked fine.

4. I rebooted the ASUS laptop several times, using a combination of 'reboot' and 'shutdown -h now' commands. I looked at the BIOS entries and everything was OK.

5. I removed the WD 2.5" drive from the ASUS laptop for about 5 minutes, then put it back and to my surprise the Centos system booted up.

6. I rebooted the ASUS laptop and looked at the BIOS entries. Everything was fine.

7. I removed the Centos drive and put back the Windows 10 drive. The ASUS booted up OK and I spent about 15 minutes working on the Windows machine.

8. I removed the Windows drive and put in the Centos drive. The ASUS laptop wouldn't boot up.

The Lenovo differs from the ASUS in this respect: the Lenovo will attempt to boot from a corrupted Centos drive and report any problems found, whereas the ASUS drops into BIOS mode as soon as it finds a problem with the primary boot device. Beyond that there is no way of discovering why the drive wouldn't boot up.

This does not prove that this is strictly a BIOS problem because I can remove the Windows drive from the ASUS laptop as often as I like and the BIOS will always recognise the Windows drive when I put it back into the laptop. Removing a Centos drive becomes problematical if a non-Centos drive is introduced, as demonstrated above.

Having worked with Linux since 1985 in my professional life I know that I could create a Linux server drive on machine A and run it on machines B, C or D without encountering any problems. Linux would always load up the correct drivers for whatever physical machine it was booted on.

So far 40 people have read this post and only one has replied. Does this mean the issue is going to end up in the 'too hard' basket, or is there a way to escalate these problems?

desertcat
Posts: 843
Joined: 2014/08/07 02:17:29
Location: Tucson, AZ

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by desertcat » 2018/07/25 20:10:51

I bow down to you -- I started on computers in 1990 and was a DOS junkie until Windows '95 came out, hated it, and switched to Linux about 2000.

As to your question, it would be interesting to swap the *same* drives within the *same* machine. I can't remember when it started (with M$, of course), but you can no longer take a drive out of Machine "A" and stick it in Machine "B". I have a 250 GB SSD with CentOS 7.4 on it that I took out of my workstation, which I upgraded to a 500 GB SSD, and stuck into a backup machine. Will it read it? Of course not. But on my workstation I have a 1 TB HDD with Fedora 20 on it and the 500 GB SSD with CentOS 7.5. If I want to boot the HDD I can select it in the BIOS, at which point it becomes the DEFAULT until I go back into the BIOS and select the SSD, at which point that becomes the DEFAULT.

I've tried to swap drives between two different machines. The most recent case was my gateway server, which has a defunct CD-ROM/DVD drive, and on which I tried to install CentOS 5.xx when the HDD died. So I installed the OS on a new drive in my workstation and, once I was sure it was working, put it in the gateway, which would not read it. I then put the drive back into the workstation, which had no problem with it. I solved the problem by reinstalling the new HDD in the gateway server, moving the DVD drive from the workstation to the gateway server and hooking it up, installing the OS to the HDD in place, and then returning the DVD drive to my workstation. It was a giant PITA to do it that way, but that was the ONLY way. In your case, as long as the two drives are from the *same* machine, i.e. configured on the same machine, I suspect you can boot whichever drive you want by selecting it in the BIOS and making that drive the DEFAULT.

As to your other question... probably there are other people out there who have the same problem, but have no answers -- they are as mystified as you are. Most replies are from people who have had the problem you described, and were not afraid to hack the problem. TrevorH is "The Man", but even he does not know everything (though damn near everything). If you solve your problem please post your results and how you did it. Enquiring minds want to know!! Sorry I can't be of bigger help. With luck it might give you some ideas to test.

Desert Cat

blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/27 02:59:56

Hi Desertcat,

I started my IT career in 1963, programming a General Electric 225 computer at a brewery in Melbourne, Australia. In June 1967 I had the good fortune to be transferred to GE's computer division headquarters in Phoenix AZ for 3 months. I went to a rodeo in Flagstaff on 4th July 1967 and visited Tucson a couple of times on weekend drives from my base in Phoenix. I also took in Nogales, the Grand Canyon, and Las Vegas, sleeping in my car to save money.

I don't have the time or resources to continue testing for a solution to this problem, especially as I've learned how to avoid the problem. I'm surprised that Centos doesn't have a more formal method for logging bugs. If it has I'm not aware of it. Coming onto a forum and hoping someone with appropriate knowledge and experience will provide the correct answer is not an industry-wide approach to bug fixing.

Thanks for your time however.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by TrevorH » 2018/07/27 08:36:49

Are the problem machines using UEFI rather than legacy BIOS?
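
If the machine can be booted into any Linux environment (the installed system or a rescue USB), one way to answer this without going into the firmware setup screens is to look for the /sys/firmware/efi directory, which the kernel only exposes when it was started through UEFI. A minimal sketch in Python; the function name firmware_mode and its sys_root parameter are made up for this example:

```python
import os


def firmware_mode(sys_root="/sys"):
    """Return 'UEFI' if the running Linux system was booted through UEFI,
    otherwise 'BIOS' (legacy/CSM boot).

    sys_root exists only so the check can be pointed at a test directory;
    on a real system leave it at the default.
    """
    efi_dir = os.path.join(sys_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "BIOS"


if __name__ == "__main__":
    print(firmware_mode())
```

Note that this reports how the currently running system was booted, so a rescue USB started in legacy mode would report BIOS even on a UEFI-capable laptop.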

There is bugs.centos.org for reporting bugs. However, the only ones that will get fixed by reporting there are problems that are specific to CentOS. In all other respects, CentOS aims to be bug-for-bug compatible with the upstream RHEL, so if it's a bug in RHEL then that same bug should also exist in CentOS. To get problems in RHEL fixed you have to report them on bugzilla.redhat.com.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/30 10:17:23

Trevor,

To answer your question I rebooted the Lenovo and dropped it into BIOS mode. It's using AHCI. I don't want to experiment with this machine by changing the BIOS because it is a production machine.

I removed the Windows 10 drive from the Asus machine, put in the 500gig drive with a minimal Centos 7.5 on it and booted the machine up. It immediately dropped into BIOS mode, meaning it couldn't find a valid boot sector on the primary drive. The SATA controller parameter was set to AHCI. I changed it to IDE but the machine still wouldn't boot up with the Centos drive in it.

I put the Windows 10 drive back into the Asus and attempted to boot up with the SATA controller parameter still set to IDE. Windows detected a problem with the drive and fired up an Automatic Drive Recovery routine. I killed that by removing the battery.

I switched the SATA controller parameter back to AHCI and booted up again. Windows said it wanted to check and repair the drive, which I consented to. The repair operation took about 10 seconds. Everything is OK again and I'm using the Asus laptop to write this post.
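
For anyone wanting to check the "valid boot sector" question directly rather than inferring it from the firmware's behaviour, the first two sectors of the drive can be inspected from a rescue system. A rough sketch in Python, assuming 512-byte logical sectors; describe_boot_sectors is a made-up helper name and /dev/sda is only an example device. It reports whether the MBR boot signature (bytes 0x55 0xAA at offset 510) and a GPT header (the ASCII text "EFI PART" at the start of the second sector) are present. Finding both does not prove the drive is bootable, it only rules out a wiped partition table.

```python
def describe_boot_sectors(first_sectors: bytes, sector_size: int = 512) -> dict:
    """Inspect the first two sectors of a disk.

    first_sectors: at least 2 * sector_size bytes read from the start of the disk.
    sector_size:   logical sector size; 512 is an assumption, not read from the disk.
    """
    # A valid MBR (including the protective MBR on a GPT disk) ends with 0x55 0xAA.
    mbr_signature = first_sectors[510:512] == b"\x55\xaa"
    # The GPT header lives in LBA 1 and starts with the signature "EFI PART".
    gpt_header = first_sectors[sector_size:sector_size + 8] == b"EFI PART"
    return {"mbr_signature": mbr_signature, "gpt_header": gpt_header}


if __name__ == "__main__":
    # Reading a raw disk needs root; /dev/sda is a hypothetical device name.
    with open("/dev/sda", "rb") as disk:
        print(describe_boot_sectors(disk.read(1024)))
```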

blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/30 10:33:39

Hi Trevor,

Looking at the System Summary information for the Asus laptop, it's using UEFI.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by TrevorH » 2018/07/30 10:55:41

If you sign up for the free RHEL developer subscription, you can download a free copy of the RHEL iso. There are some known bugs in the CentOS rebuild of the current grub2 packages so it might be a useful test to discover if this problem also exists in RHEL or not. You are not the first person to report that removing a drive and putting it back on a UEFI system renders it unbootable.
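
One commonly cited explanation for this behaviour, which nobody in this thread has confirmed for these laptops, is that some UEFI firmware drops a boot entry from NVRAM when the disk it points to is missing, so the CentOS entry is gone once another drive has been booted. If that is what happened, the entry can be recreated from a rescue system with efibootmgr. The sketch below only builds the command line so it can be reviewed before anything is run; the disk, partition number and loader path are assumptions for illustration (the loader path is where the CentOS 7 shim is normally installed on the EFI system partition), and efibootmgr_create_args is a made-up helper name.

```python
import shlex


def efibootmgr_create_args(disk, esp_partition, label="CentOS",
                           loader=r"\EFI\centos\shimx64.efi"):
    """Build an efibootmgr command that creates a new UEFI boot entry.

    disk:          disk holding the EFI system partition, e.g. /dev/sda (assumed)
    esp_partition: partition number of the EFI system partition (assumed)
    loader:        loader path relative to the ESP root (assumed default)
    """
    return [
        "efibootmgr", "--create",
        "--disk", disk,
        "--part", str(esp_partition),
        "--label", label,
        "--loader", loader,
    ]


if __name__ == "__main__":
    args = efibootmgr_create_args("/dev/sda", 1)
    # Print only; run the command by hand as root once the values are verified.
    print(" ".join(shlex.quote(a) for a in args))
```

Running efibootmgr -v first shows which entries the firmware currently has, which would confirm or rule out the missing-entry theory before anything is changed.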

blackmagic
Posts: 13
Joined: 2009/02/05 04:04:49
Location: Brisbane Australia

Re: Removing drive from Centos 7 server makes the drive unbootable

Post by blackmagic » 2018/07/30 15:11:09

Thanks Trevor. I'll check the RHEL option out in the next few days.
