[Solved - sort of] Machine shutting itself down randomly

AnotherBigAl
Posts: 7
Joined: 2018/05/15 07:21:51

[Solved - sort of] Machine shutting itself down randomly

Post by AnotherBigAl » 2018/05/15 07:25:10

I have an HP P3 machine that now will not run for more than 45 to 90 minutes at a time. It has run physically unattended for many years, so I thought it might be a cooling issue. When I looked inside it was indeed very dirty, with the heatsinks for both processors blocked up. They were cleaned in place (with a hand puffer), as was the rest of the interior. The machine and I were earthed during this process. When I restarted it with the case open I noticed that one of the CPU fans was noisy, so I removed it and have ordered a replacement. Of course, I have also had to temporarily remove the associated CPU.

The machine still runs quite satisfactorily with the one CPU but crashes, as it did before the clean-up, after a time that varies between 45 minutes and 1 1/4 hours, and I have no idea why. Can anyone offer any suggestions on where to begin troubleshooting this problem, please?
Last edited by AnotherBigAl on 2018/05/16 07:52:31, edited 1 time in total.

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Machine shutting itself down randomly

Post by TrevorH » 2018/05/15 10:12:56

Start by booting and running memtest86+ overnight and seeing if it finds any RAM problems.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

AnotherBigAl
Posts: 7
Joined: 2018/05/15 07:21:51

Re: Machine shutting itself down randomly

Post by AnotherBigAl » 2018/05/15 11:19:10

It will not run for long enough even to make one pass

avij
Retired Moderator
Posts: 3046
Joined: 2010/12/01 19:25:52
Location: Helsinki, Finland

Re: Machine shutting itself down randomly

Post by avij » 2018/05/15 11:39:26

Heat does components no good. Perhaps your motherboard or power supply is damaged (look for leaking capacitors and such). There's a very good chance that you will need to buy replacement parts. You may also need to consider scrapping the entire system and getting a replacement.

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Machine shutting itself down randomly

Post by TrevorH » 2018/05/15 11:40:04

Then my first thought would be a dodgy power supply. That or overheating. Does this machine have an HP iLO or any sort of hardware log?
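
If there turns out to be no hardware log to consult, a crude software stand-in is a heartbeat file: the time of the last entry tells you roughly when the machine died. This is only a minimal sketch, assuming a Python 3 interpreter is available on the installed system; the function names and the log path you pass in are made up for illustration.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line and force it to disk so it survives a power cut."""
    stamp = time.strftime("%Y/%m/%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as log:
        log.write(stamp + "\n")
        log.flush()
        os.fsync(log.fileno())  # push the line past the OS cache
    return stamp


def last_heartbeat(path):
    """Return the final timestamp line in the log, or None if the log is empty."""
    with open(path) as log:
        lines = [line.strip() for line in log if line.strip()]
    return lines[-1] if lines else None


def run(path, interval=60):
    """Write a heartbeat every `interval` seconds until the machine goes down."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Start `run()` with a path of your choosing right after boot, and after the next shutdown read the file back with `last_heartbeat()`. Comparing that time with the boot time gives the uptime to within one interval.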

AnotherBigAl
Posts: 7
Joined: 2018/05/15 07:21:51

Re: Machine shutting itself down randomly

Post by AnotherBigAl » 2018/05/15 11:52:23

I was wrong about the time memtest takes to run. It has gone through one cycle without any errors and is now into its second. I hadn't thought about the memory. The next time it powers off I will remove the one stick (512MB) and clean the contacts with isopropyl alcohol.

I will also swap out the PSU and see. As far as HP iLO is concerned, I am assuming it has one, but I do not know anything at all about it, not even how to access it.

Thank you all for your thoughts. They are appreciated.

AnotherBigAl
Posts: 7
Joined: 2018/05/15 07:21:51

Re: Machine shutting itself down randomly

Post by AnotherBigAl » 2018/05/15 11:55:32

At present the machine is running with the side off, so it has plenty of ventilation. Ambient air temperature is around 18 deg C, humidity around 69%.

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Machine shutting itself down randomly

Post by TrevorH » 2018/05/15 12:16:08

If an HP P3 is what Google finds for me, a dual Pentium III, then it's pretty ancient and the capacitor issue could well be present. Get a torch, shine it on the motherboard and look at all the capacitors to see if any have brown stains on them or are bulging out. If they are then you need a new motherboard, and if you need that then you can probably get similar computing capacity brand new fairly cheaply.

tunk
Posts: 1204
Joined: 2017/02/22 15:08:17

Re: Machine shutting itself down randomly

Post by tunk » 2018/05/15 13:53:37

If memtest86+ runs for hours without crashing, then it suggests that MB, CPU and RAM are OK.
Could it be a HD problem? You could also run stresslinux or similar to stress the CPU.
Also, a drop of oil in the fan bearing often does wonders.
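
If a dedicated stress image will not boot, a rough substitute is to spin every core from a script. This is only a minimal sketch, assuming a Python 3 interpreter is available on the box (not a given on something this old). `burn` and `stress` are names made up here, and it exercises the CPU only, so it is no replacement for a proper stress tool.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin on integer arithmetic for roughly `seconds`; return the loop count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
        _ = count * count % 7  # throwaway work to keep the core busy
    return count


def stress(seconds, workers=None):
    """Run one burn() per worker process and return each worker's loop count."""
    workers = workers or multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Calling `stress(600)` from under an `if __name__ == "__main__":` guard loads every core the OS reports for about ten minutes. If the machine powers off well inside its usual 45 minutes, that points towards heat or power draw under load.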

AnotherBigAl
Posts: 7
Joined: 2018/05/15 07:21:51

Re: Machine shutting itself down randomly

Post by AnotherBigAl » 2018/05/16 07:51:58

Well, I successfully ran Memtest for 7 hours 19 minutes. During that time it completed 17 runs without any errors, so from that I deduced that there is no problem with the memory. I would like to say that this also applies to the motherboard, hard disk drive and power supply, but they are hardly stressed by Memtest. I then tried to run Stresslinux but the machine is too old and the program did not recognise any of the components – well, not in my inexperienced hands.
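
For what it is worth, the arithmetic on those figures can be sketched as below (Python is used purely for illustration). The 90 minute figure is the longest uptime quoted in the first post, and the comparison is an inference from the numbers, not a diagnosis.

```python
# Figures quoted in the post above.
run_minutes = 7 * 60 + 19   # 7 hours 19 minutes of Memtest
passes = 17                 # completed runs, no errors

minutes_per_pass = run_minutes / passes

# Longest uptime reported before a shutdown under the installed OS.
longest_uptime_minutes = 90
survival_ratio = run_minutes / longest_uptime_minutes

print(f"{minutes_per_pass:.1f} minutes per pass")
print(f"{survival_ratio:.1f}x the longest normal uptime")
```

So a pass takes about 26 minutes, and under Memtest the machine stayed up nearly five times longer than it ever did under the installed system. That suggests the shutdowns are tied to something Memtest does not exercise, such as the disk or a heavier load on the CPU or PSU.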

So TrevorH, I have decided to take your implied advice and scrap the machine. It has served me faithfully since new, but the spec is not up to any supported version of CentOS (it has been running 5), so it is time to pull the plug.

If it is of any interest, while Memtest was running I took various temperatures using an infra-red guided hand-held thermometer. The readings are not very accurate, as it was not possible to get to a lot of the HDD or PSU surfaces, but here are the results that I obtained:

Rear (connection end) of HDD: 30 deg C
PSU: 30 deg C
Motherboard (generally): around 31.9 deg C
Motherboard components: 39 deg C
Memory – some chips 40 deg C, others around 25 deg C
PSU and BIOS heatsinks (as best as I could get it): 40 deg C
And just out of interest, my screen surface: 31 deg C
This last reading is the same as my other flat screens when they are active.

I have learnt some new things from this post so thank you all for your input. It has not been entirely wasted. I will mark the thread as solved and start saving up for something a little better.

Cheers
