CentOS 7 hanging

Post by wawayue » 2017/07/09 23:03:57

Hi,
OK, I have used CentOS for years and have had few if any problems whatsoever.

Recently one machine's behavior changed. This is a Dell server with remote management enabled.
(This machine does use swap.)

At around 12 pm it appears to go dead, although it is still running (this does not happen every day, and yesterday it was around 2 pm).
There is no screen, no keyboard, and no network connection other than ping.

Logging into Dell remote management shows nothing wrong with the machine: disks, RAM, and CPU are all "normal". The only way out of this situation is a hard power off and on again.

Recently another machine has started doing exactly the same. This one is an IBM server; the same rules apply, but the timing is more random.
Here is a sample from the messages file. The system just died at 14:01.

I did a cold reboot at 17:58
Jul 9 14:01:01 compute-01 systemd: Created slice user-0.slice.
Jul 9 14:01:01 compute-01 systemd[1]: Created slice user-0.slice.
Jul 9 14:01:01 compute-01 systemd[1]: Starting user-0.slice.
Jul 9 14:01:01 compute-01 systemd: Starting user-0.slice.
Jul 9 14:01:01 compute-01 systemd[1]: Started Session 266 of user root.
Jul 9 14:01:01 compute-01 systemd[1]: Starting Session 266 of user root.
Jul 9 14:01:01 compute-01 systemd: Started Session 266 of user root.
Jul 9 14:01:01 compute-01 systemd: Starting Session 266 of user root.
Jul 9 14:01:01 compute-01 systemd[1]: Removed slice user-0.slice.
Jul 9 14:01:01 compute-01 systemd: Removed slice user-0.slice.
Jul 9 14:01:01 compute-01 systemd[1]: Stopping user-0.slice.
Jul 9 14:01:01 compute-01 systemd: Stopping user-0.slice.
Jul 9 17:58:16 compute-01 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="1416" x-info="http://www.rsyslog.com"] start
Jul 9 17:58:16 compute-01 rsyslogd-2027: imjournal: fscanf on state file `/var/lib/rsyslog/imjournal.state' failed [try http://www.rsyslog.com/e/2027 ]
Jul 9 17:58:16 compute-01 rsyslogd: imjournal: ignoring invalid state file
Jul 9 17:57:51 compute-01 journal: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
Jul 9 17:57:51 compute-01 kernel: microcode: microcode updated early to revision 0x1c, date = 2015-02-26
Jul 9 17:57:51 compute-01 kernel: Initializing cgroup subsys cpuset
.......
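
To compare hang times across both machines, the silent stretches in the messages file can be found with a short script. This is only a rough sketch: `find_gaps` is a hypothetical helper name, the 30-minute threshold is arbitrary, and it assumes the traditional syslog timestamp format shown above (which has no year, so the year is passed in) and an English locale for month names.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2017, threshold=timedelta(minutes=30)):
    """Return (last_seen, next_seen) pairs where the log was silent
    for longer than `threshold`.

    Assumes each entry starts with 'Mon D HH:MM:SS'. The year is not in
    the log, so it is supplied by the caller (an assumption).
    """
    gaps = []
    prev = None
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            ts = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            # Continuation lines and elisions carry no timestamp; skip them.
            continue
        # Early-boot journal lines can be stamped slightly earlier than the
        # rsyslogd start line; a negative difference never exceeds the
        # threshold, so those are ignored naturally.
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts))
        prev = ts
    return gaps


# Example usage (path is the CentOS default; adjust as needed):
#     with open("/var/log/messages") as f:
#         for start, end in find_gaps(f):
#             print(f"silent from {start} to {end}")
```

Each pair gives the last entry before the silence and the first entry after the reboot, so the first element is the approximate time of the hang.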

This other IBM machine has Webmin installed. With this machine I can still connect to Webmin, but it immediately reports "file not found",
which would indicate some sort of storage disconnect. (This machine does not use swap.)

As I stated previously, the hardware remote consoles show no disk or hardware errors, no bad sectors, and no disk retries,
and there is nothing in any log file suggesting problems reading or writing disks before this happens.
