Upgrade to CentOS 7.5 - Hardware Error

taylorkh
Posts: 534
Joined: 2010/11/24 15:08:33
Location: North Carolina, USA

Post by taylorkh » 2018/05/23 17:50:39

Today I upgraded my Dell Precision 3620 Workstation from CentOS 7.4 to 7.5. Almost everything seems to have gone smoothly except for some hardware error messages at boot. They flash by at boot time too fast to read, but after logging in to the graphical environment I can switch to another terminal (e.g. Ctrl-Alt-F3) and see them there. The FIRST time I booted after the upgrade I was able to find the messages in dmesg, but not on subsequent boots, although they are still displayed at boot time (?). Here they are as captured from dmesg:

Code:

[    0.922830] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1527094816 SOCKET 0 APIC 0 microcode ba
[    0.922970] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 7: ee0000000040110a
[    0.923072] mce: [Hardware Error]: TSC 0 ADDR fef200c0 MISC 3880010086 
[    0.923499] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1527094816 SOCKET 0 APIC 0 microcode ba
[    0.923664] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 8: ee0000000040110a
[    0.923774] mce: [Hardware Error]: TSC 0 ADDR fef1ce80 MISC 43880010086 
[    0.924164] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1527094816 SOCKET 0 APIC 0 microcode ba
[    0.924364] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 9: ee0000000040110a
[    0.924557] Key type trusted registered
[    0.924518] mce: [Hardware Error]: TSC 0 ADDR fef1ff00 MISC 3880010086 
[    0.924971] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1527094816 SOCKET 0 APIC 0 microcode ba

As I said, things SEEM to be running OK. The upgrade did not even break VMware, which commonly happens during point upgrades.

Can anyone translate this into something humanly readable? Is it something to worry about?

TIA,

Ken

p.s. Below is a link to the getinfo data from my computer.

http://pastebin.centos.org/783656/
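
For readers who want a rough translation of the status word in the log above, here is a minimal sketch that splits an IA32_MCi_STATUS value into its architectural flag bits and its two 16-bit code fields. The bit positions are the ones defined in the machine-check chapter of the Intel SDM; the function name and the output layout are made up for illustration, and the sketch does not try to interpret the error codes themselves. A dedicated tool such as mcelog is the usual way to get a full decode.

```python
# Sketch: decode the architectural flag bits of an IA32_MCi_STATUS value.
# Bit positions follow the Intel SDM machine-check chapter; the function
# name and the returned layout are illustrative assumptions.

FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled when it was logged
    59: "MISCV",  # the MISC register is valid
    58: "ADDRV",  # the ADDR register is valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check status word into flags and code fields."""
    return {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


if __name__ == "__main__":
    # Status value reported for banks 7, 8 and 9 in the dmesg capture.
    info = decode_mci_status(0xEE0000000040110A)
    print(info["flags"])
    print(hex(info["model_specific_code"]), hex(info["mca_error_code"]))
```

For the value in the log, the EN bit is clear while VAL, OVER, UC, MISCV, ADDRV and PCC are set. What that combination means on this particular machine is a question for the vendor documentation or mcelog, not something this sketch answers.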

Re: Upgrade to CentOS 7.5 - Hardware Error

Post by taylorkh » 2018/05/25 22:58:44

I don't know what it meant but I did fix it - after breaking the computer. I figured I needed to upgrade the BIOS. I was reluctant to do so. The first of these machines (Dell Precision 3620 Workstation) which I purchased had a video issue - a conflict between the Intel on-board video and my nVidia card, I think. Tech support said to upgrade the BIOS even though the machine was brand new. Bricked it. Sent it back and ordered the one I have now.

I called tech support to verify that the warranty was in effect and to ask whether they would fix it if (when) the BIOS upgrade hosed the computer. "We will take care of any problems." Ran the update - computer would not load the OS. Grub came up and I could choose the kernel - then nothing. After a couple of hours with tech support and second level tech support I planned on a reinstall of CentOS 7.5. I told the second level support person I was going to do one more thing after we got off the phone: disconnect all cables, open the box and pop the coin cell battery (and take the opportunity to blow the dust off the heat sinks).

I did that, replaced the coin cell, booted, was prompted to go to BIOS setup and set the date/time which I did. Exited setup and it booted fine. The error messages are gone. A couple of lessons learned...

1 - When upgrading BIOS - pop the coin cell to allow the BIOS to clear its mind.
2 - With all the "fixes" for Spectre and other CPU-related vulnerabilities... It seems the hardware vendors and software vendors are in a sort of arms race. Might be worth looking into BIOS upgrades more often so that one party's fix does not break the other party's fix.
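
One low-effort way to follow up on the second lesson is to note the installed BIOS version and CPU microcode revision before and after each upgrade. The sketch below assumes a Linux system that exposes `/sys/class/dmi/id/bios_version`, `/sys/class/dmi/id/bios_date` and a `microcode` field in `/proc/cpuinfo` (as x86 kernels do); the helper names are made up for illustration.

```python
# Sketch: report BIOS version/date and CPU microcode revision on Linux.
# Helper names are illustrative; the paths are the standard sysfs/procfs ones.
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set:
    """Return the distinct microcode revisions found in /proc/cpuinfo text."""
    return set(re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, re.MULTILINE))


def read_text(path: str) -> str:
    """Read a small text file, or return 'unavailable' if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "unavailable"


if __name__ == "__main__":
    print("BIOS version:", read_text("/sys/class/dmi/id/bios_version"))
    print("BIOS date:   ", read_text("/sys/class/dmi/id/bios_date"))
    print("microcode:   ", sorted(microcode_revisions(read_text("/proc/cpuinfo"))))
```

The microcode value printed here corresponds to the `microcode ba` field in the mce lines from the first post, so a change after a BIOS or kernel update is easy to spot.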

Admins - Please feel free to stick a fork in this thread. It is SOLVED.

Thanks,

Ken
