PC crashes

RHF91
Posts: 7
Joined: 2018/04/21 20:38:48

PC crashes

Post by RHF91 » 2018/04/21 21:16:10

Dear gentlemen,
I am just starting out with Linux, so please excuse my ignorance. Many thanks in advance for your help.

My PC is brand new, and the only thing I have added on top of the GNOME release is Thunderbird.
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-693.21.1.el7.x86_64
Architecture: x86-64

My problem is that it crashes, generally when it is sitting idle.
The error message I got (even before I installed Thunderbird) is:
WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:300 dev_watchdog+0x242/0x250-abort_
This problem should not be reported (it is likely a known problem). A kernel problem occurred, but your kernel has been tainted (flags:GW). Kernel maintainers are unable to diagnose tainted reports.

I have been searching the web for an explanation of the problem.
I tend to think that it is due to my graphics card (AMD R5230-SL-2GD3-L), because a white stripe appears on the desktop from time to time. The monitor itself is fine: I still use it under Windows during this transition period, with no problem.
This also matches what I have read on some sites, but I am not sure at all.

I tried to find out whether the graphics card driver is up to date, but did not find a clear way to check.
I found a page where I could apparently get the latest version, but I am not used to downloading and installing drivers on Linux.
I am afraid of making the situation worse!
https://support.amd.com/en-us/kb-articl ... Notes.aspx

I only managed to see the error message quoted above once, and could not find it again from the terminal.

Again thank you very much in advance for your help.
Greetings from Paris, France
Roberto

desertcat
Posts: 843
Joined: 2014/08/07 02:17:29
Location: Tucson, AZ

Re: PC crashes

Post by desertcat » 2018/04/22 21:42:06

PC Crashes are a fact of life... unfortunately.

Troubleshooting them is a giant PITA. There are a couple of things you can try.

A good place to start is to check both /var/log/Xorg.0.log and /var/log/dmesg to see if there is something out of whack.
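As a rough sketch of what "something out of whack" can look like in those two logs: Xorg marks errors with (EE) and warnings with (WW), and the kernel log can be filtered for a few trouble keywords. The function names `xorg_problems` and `kernel_problems` and the keyword list are assumptions for illustration, not a complete filter.

```shell
# Print Xorg log lines flagged as errors (EE) or warnings (WW).
# Anchoring on the leading "[ timestamp ]" skips the marker legend
# printed near the top of the log.
xorg_problems() {
    local log="${1:-/var/log/Xorg.0.log}"
    grep -E '^\[[ 0-9.]+\] \((EE|WW)\)' "$log"
}

# Print kernel log lines that mention common trouble keywords.
# Reads the given file if one is passed, otherwise the live dmesg output.
kernel_problems() {
    local pattern='error|fail|warn|timed out'
    if [ -n "${1:-}" ]; then
        grep -i -E "$pattern" "$1"
    else
        dmesg | grep -i -E "$pattern"
    fi
}

# Usage:
#   xorg_problems                      # scans /var/log/Xorg.0.log
#   kernel_problems /var/log/dmesg     # boot-time kernel messages
#   kernel_problems                    # current kernel ring buffer
```

A dev_watchdog warning is normally accompanied by a "NETDEV WATCHDOG ... transmit queue N timed out" line, which the "timed out" keyword is meant to catch.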

"from time to time a white stripe."
I've had this problem occur a few times. As crazy as this may sound, try swapping out your SATA cable. For reasons unknown they sometimes go south, and the fix is simply to swap the cable for a new one.

Second, try removing Thunderbird via yum remove... and reboot the machine. This will tell you if you have a software problem. You can always reinstall it to replicate the problem.

Third, if you have one lying about the place, try another graphics card. This will tell you if you have a hardware problem.

If the problem persists, try downloading a clean copy of CentOS 7.4, check that the MD5sum or equivalent checksum matches, then do a reinstall. Your:

WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:300 dev_watchdog+0x242/0x250-abort_
This problem should not be reported (it is likely a known problem). A kernel problem occurred, but your kernel has been tainted (flags:GW). Kernel maintainers are unable to diagnose tainted reports.


...Almost sounds like you *might* have a corrupted kernel.
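For what it is worth, "GW" is the kernel's taint flag string. As far as the kernel documentation describes it, G means no proprietary module is loaded and W means a kernel warning was issued earlier, so the flags record status rather than damage to the kernel image. The sketch below decodes a subset of the taint bits; `decode_taint` is a hypothetical helper, and only some of the documented bits are covered.

```shell
# Decode a kernel taint bitmask into flag letters (subset of the documented bits).
decode_taint() {
    local value flags
    value="$1"
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
        return 0
    fi
    # bit 0: proprietary module loaded (P); the kernel prints G when it is clear
    if [ $(( value & 1 )) -ne 0 ]; then flags="P"; else flags="G"; fi
    # bit 1: a module was force-loaded
    if [ $(( value & 2 )) -ne 0 ]; then flags="${flags}F"; fi
    # bit 4: a machine check exception occurred
    if [ $(( value & 16 )) -ne 0 ]; then flags="${flags}M"; fi
    # bit 7: the kernel has oopsed
    if [ $(( value & 128 )) -ne 0 ]; then flags="${flags}D"; fi
    # bit 9: a kernel warning was issued
    if [ $(( value & 512 )) -ne 0 ]; then flags="${flags}W"; fi
    # bit 12: an out-of-tree module was loaded
    if [ $(( value & 4096 )) -ne 0 ]; then flags="${flags}O"; fi
    echo "$flags"
}

# Usage on a live system:
#   decode_taint "$(cat /proc/sys/kernel/tainted)"
```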

RHF91
Posts: 7
Joined: 2018/04/21 20:38:48

Re: PC crashes

Post by RHF91 » 2018/08/22 14:28:27

Dear gentlemen

Since the problem in April, I have switched to a more recent graphics card and reinstalled CentOS.
VLC is the only extra program installed.
In the meantime I have learnt several things about Linux, but the crash problem is still there, and I cannot work out where it comes from just by looking at /var/log/messages.
Can you please help?

Problem symptoms :
At startup, an error message appears while loading: "fast TSC calibration failed"
The computer may then work fine for several hours, or for just a few minutes, before crashing, even with no program running.

There are two sorts of problems:
Either
(1) the computer stops and restarts automatically on its own,
or
(2) the screen, keyboard and mouse freeze (there is no way to enter any command; sometimes the mouse pointer moves but clicking has no effect; the on-screen clock stops). In this case I have no option other than forcing the computer off by holding down the power button.

Computer characteristics:
Asus PRIME X370-A
AMD Ryzen 7 1700 (3.0 GHz)
DDR4 G.Skill Ripjaws V, red, 2 x 4 GB, 2400 MHz, CAS 15
Nvidia KFA2 GeForce GT 1030, 2 GB – Nvidia driver for this card installed
Initially had kernel 3.10.0-862.el7.x86_64
Now: 3.10.0-862.9.1.el7.x86_64

I have attached /var/log files for the two cases:

1) Computer restarts on its own
In this case the PC was unattended, and when I came back at 9:30 I could see that it had stopped and restarted on its own!
(From the log, I believe this happened between Aug 5 09:20 and 09:27.)
Attached file:
"Messages-2018 08 05.txt"
When I log in at the terminal, I also get this message:
[rfranco@Host-001 ~]$ su root
Password:
ABRT has detected 1 problem(s). For more info run: abrt-cli list --since 1533236275
[root@Host-001 rfranco]# ^C
[root@Host-001 rfranco]# abrt-cli list --since 1533236275
id 83813f86e61897d5651815b1532636c30019ecc0
reason: mce: [Hardware Error]: Machine check events logged
time: Sun 05 Aug 2018 09:27:59 AM CEST
cmdline: BOOT_IMAGE=/vmlinuz-3.10.0-862.9.1.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8
uid: 0
Directory: /var/spool/abrt/oops-2018-08-05-09:27:58-922-0
[root@Host-001 rfranco]#


2) Computer screen, keyboard and mouse frozen
The computer was idle with only a LibreOffice file open.
Everything froze, the mouse pointer included. The clock stopped at Aug 22 15:33:42.
I manually forced the machine off and restarted it at 15:45.
Attached file:
"messages all frozen.txt"
In this case there was no ABRT message in the terminal.
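For the first case, the directory that abrt-cli reports holds the details of the problem as plain files, one per field. A minimal sketch for reading a few of them follows; `show_problem` is a hypothetical helper, and which element files actually exist depends on the problem type, so it prints only those that are present.

```shell
# Print selected elements of an ABRT problem directory.
# Each element is a plain file named after its field.
show_problem() {
    local dir item
    dir="$1"
    for item in reason kernel cmdline backtrace; do
        if [ -r "$dir/$item" ]; then
            printf '== %s ==\n' "$item"
            cat "$dir/$item"
            printf '\n'
        fi
    done
}

# Usage (as root), with the directory shown by abrt-cli:
#   show_problem /var/spool/abrt/oops-2018-08-05-09:27:58-922-0
#
# The --since value is a Unix timestamp; GNU date can render it:
#   date -d @1533236275
```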

Many thanks in advance to whoever may help me
Best regards
Roberto

PS: I am not sure the attachments were uploaded. Every time I add a file, a green indicator signals the upload, and then it ends with a warning sign (?!)

RHF91
Posts: 7
Joined: 2018/04/21 20:38:48

Re: PC crashes

Post by RHF91 » 2018/08/22 14:56:14

I cannot find a way to upload my attachments.
I tried both on my Windows PC and on the CentOS machine.

RHF91
Posts: 7
Joined: 2018/04/21 20:38:48

Re: PC crashes

Post by RHF91 » 2018/08/22 15:02:58

How can I attach the two txt files?
Where else could I drop them?

northpoint
Posts: 107
Joined: 2016/05/23 11:57:12

Re: PC crashes

Post by northpoint » 2018/08/22 15:04:14

I'm running the Asus Prime x370 board with my Ryzen x1800.

Shouldn't you be running the latest kernel (4.whatever) for your Ryzen?

My kernel:

Linux max 4.14.8-041408-generic #201712200555 SMP Wed Dec 20 10:57:38 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

I think that might clear up a lot of your ills.
Ryzen x1800 * Asus x370 Pro * CentOS 7.4 64bit / Icewarp /

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: PC crashes

Post by TrevorH » 2018/08/22 16:06:57

reason: mce: [Hardware Error]: Machine check events logged
That's a bit of a clue. Try looking at your mcelog log file or running mcelog.
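A quick way to see whether the system log recorded anything around those events is to search it for machine-check wording. This is only a sketch: `mce_events` is a hypothetical helper, and /var/log/messages is assumed to be where the kernel messages end up.

```shell
# List log lines that mention machine check events.
mce_events() {
    local log="${1:-/var/log/messages}"
    grep -i -E 'mce:|machine check|hardware error' "$log"
}

# Usage (as root):
#   mce_events                      # scans /var/log/messages
#   mce_events /var/log/dmesg       # scans another log file
```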

tunk
Posts: 1204
Joined: 2017/02/22 15:08:17

Re: PC crashes

Post by tunk » 2018/08/22 17:31:41

Also, 3.10.0-862.9.1.el7.x86_64 isn't the newest kernel.
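One way to check this on the machine itself is to compare the running kernel against the installed kernel packages. A minimal sketch, assuming GNU sort (for version ordering) and an RPM-based system; `newest_version` and `kernel_status` are hypothetical helper names.

```shell
# Return the highest version string among the arguments,
# using GNU sort's version ordering.
newest_version() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# Report whether the running kernel is the newest one installed.
kernel_status() {
    local running newest
    running="$(uname -r)"
    newest="$(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel | sort -V | tail -n 1)"
    if [ "$running" = "$newest" ]; then
        echo "running the newest installed kernel: $running"
    else
        echo "running $running, newest installed is $newest"
    fi
}

# Usage:
#   kernel_status
#   yum check-update kernel    # asks the repositories for a newer package
```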

desertcat
Posts: 843
Joined: 2014/08/07 02:17:29
Location: Tucson, AZ

Re: PC crashes

Post by desertcat » 2018/08/25 10:51:18


"Computer characteristics :
Asus PRIME X370-A
AMD Ryzen 7 1700 (3.0 GHz)
DDR4 G.Skill Ripjaws V, red, 2 x 4 GB, 2400 MHz, CAS 15
Nvidia KFA2 GeForce GT 1030, 2 GB – Nvidia driver for this card installed"

I am drooling!!! This is almost *EXACTLY* the machine I would love to build if I ever get the money, though the mobo I'm looking at is the ASUS X370-PRO, not the "A" version. A number of people have had problems getting CentOS -- though not necessarily Linux in general -- to run on your particular mobo and Ryzen combo, while others had no problems at all. You might want to Google this.

While I don't think you need to overclock the memory, you might want to go into the BIOS settings and set the memory speed so that it reads 2400 MHz, not 2133 MHz.
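The speed the memory is actually running at can also be read from Linux without entering the BIOS. A minimal sketch, assuming dmidecode is installed; `memory_speeds` is a hypothetical filter, and the field labels it matches differ between dmidecode releases, so treat the pattern as an assumption.

```shell
# Reduce "dmidecode -t memory" output (read from stdin) to the speed lines.
# "Speed" is the module's rated speed; the "Configured ... Speed" line is
# what the firmware actually set. Older dmidecode releases call it
# "Configured Clock Speed", newer ones "Configured Memory Speed".
memory_speeds() {
    grep -E '^[[:space:]]*(Speed|Configured (Clock|Memory) Speed):' |
        sed 's/^[[:space:]]*//'
}

# Usage (as root):
#   dmidecode -t memory | memory_speeds
```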

One of my concerns about this CPU/mobo combo is that it is set up as UEFI and you have no choice. CentOS 7, in my experience, runs better as a Legacy BIOS OS. Check Chapter 3 of your manual, which deals with BIOS settings. On pages 3-26 and 3-27 there is something called "Secure Boot". Under that you have two choices, [Windows UEFI mode] and [Other OS]. The default is [Windows UEFI mode]:

Secure Boot
This item allows you to configure the Windows Secure Boot settings and manage its keys to protect the system from unauthorized access and malwares during POST.

OS Type [Windows UEFI mode]
Allows you to select your installed operating system.

[Windows UEFI mode]
This item allows you to select your installed operating system. Execute the Microsoft Secure Boot check. Only select this option when booting on Windows UEFI mode or other Microsoft Secure Boot compliant OS.

[Other OS]
Get the optimized function when booting on Windows non-UEFI mode. Microsoft Secure Boot only supports Windows UEFI mode

Of relevance here is did you install this under UEFI or as a Legacy BIOS OS?
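If it is not clear how the system was installed, the running system can answer: a kernel booted through UEFI exposes the directory /sys/firmware/efi, and one booted in legacy mode does not. A minimal sketch; `boot_mode` is a hypothetical helper, and its optional argument exists only so the check can be pointed at another sysfs root.

```shell
# Report whether the running system was booted via UEFI or legacy BIOS.
boot_mode() {
    local sysroot="${1:-/sys}"
    if [ -d "$sysroot/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "Legacy BIOS"
    fi
}

# Usage:
#   boot_mode
```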

I'm running the previous ASUS generation mobo: M5A97 R 2.0 w/ AMD FX6300 CPU and 32 GB of DDR3 RAM. Sometimes on booting I have a frozen mouse, and other times, for no reason at all, I get a frozen mouse *and* keyboard (a much beloved IBM Model M 101-key PS/2 keyboard). The solution is simply to reboot the machine, and it might not reoccur for several months -- what can I say, E. coli happens.

Hope this helps

RHF91
Posts: 7
Joined: 2018/04/21 20:38:48

Re: PC crashes

Post by RHF91 » 2018/09/18 12:55:40

Dear Gentlemen

Thanks for your various answers, which, I admit, raise further questions for me!

================
To northpoint :
My CentOS is 7.5.1804, normally the latest. Since my last post I tried to install the latest kernel, 3.10.0-862.11.6.el7, but I never managed to get it working (the boot process gets stuck before it reaches the CentOS login screen, where I enter my password). So I am still running 3.10.0-862.9.1.el7.x86_64.
Do I understand correctly that you are using CentOS 7.4 with the 4.14.8 kernel?
Is it a kernel from the ELRepo repository?
================
To TrevorH :
I installed mcelog via # yum install mcelog, but I do not get any file under /var/log when using tail or grep (# tail -f /var/log/mcelog OR # grep -i "hardware error" /var/log/mcelog). Is there something else I need to do?
The mcelog binary is actually at /usr/sbin/mcelog.

[root@Host-001 sbin]# /usr/sbin/mcelog --version
mcelog mcelog-144-8.94d853b2ea81.el7
[root@Host-001 sbin]#

[root@Host-001 sbin]# more /var/log/mcelog
/var/log/mcelog: No such file or directory
[root@Host-001 sbin]# /usr/sbin/mcelog [mcelogdevice]
mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
CPU is unsupported
[root@Host-001 sbin]# /usr/sbin/mcelog --client
mcelog: client connect: No such file or directory
mcelog: client command write: Transport endpoint is not connected
mcelog: client read: Invalid argument
mcelog: client connect: No such file or directory
mcelog: client command write: Transport endpoint is not connected
mcelog: client read: Invalid argument
[root@Host-001 sbin]# /usr/sbin/mcelog --daemon
CPU is unsupported
[root@Host-001 sbin]#

[root@Host-001 rfranco]# grep -i "hardware errors" /var/log/mcelog
grep: /var/log/mcelog: No such file or directory
[root@Host-001 rfranco]# grep -i "hardware errors" /usr/sbin/mcelog
[root@Host-001 rfranco]#
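The mcelog error above names the alternative itself: the edac_mce_amd kernel module. A minimal sketch of checking for it, assuming the module ships with the running kernel; `module_loaded` is a hypothetical helper that reads /proc/modules, and where decoded events show up afterwards (the kernel log) is likewise an assumption to verify.

```shell
# Succeed if the named kernel module appears in the module list.
# The list file defaults to /proc/modules, whose first column is the module name.
module_loaded() {
    local name list
    name="$1"
    list="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Usage (as root):
#   module_loaded edac_mce_amd || modprobe edac_mce_amd
#   dmesg | grep -i -E 'mce|edac'
```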

================
To desertcat :
I have searched the web a lot for tips on my configuration but unfortunately have not found anything relevant yet.
In the BIOS, under "Ai Tweaker", I changed the memory frequency from the "Auto" default to DDR4-2400 MHz.
This removed neither the "fast TSC calibration failed" message nor my problems.
To answer your question: I am running CentOS under a legacy BIOS.
About rebooting the machine: this is what I am forced to do, but the only way is to hold the power button for a few seconds, and I believe this should be avoided as much as possible. Unfortunately my problem occurs at least once a day.
================

Many thanks in advance for your replies to come
Regards
Roberto
