NFS Cluster getting weekly errors

Post by hspinks » 2018/09/06 03:07:59

Greetings All,

I am running an NFS server in a Linux cluster and I am getting the following error from Pacemaker:
pcs status
Cluster name: nfs_cluster1
Stack: corosync
Current DC: lastsfile03 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Wed Sep 5 22:02:11 2018
Last change: Tue Aug 28 22:01:15 2018 by hacluster via crmd on lastsfile03

2 nodes configured
7 resources configured

Online: [ lastsfile02 lastsfile03 ]

Full list of resources:

 disk_fencing1 (stonith:fence_scsi): Started lastsfile02
 Resource Group: nfsgroup
     my_lvm (ocf::heartbeat:LVM): Started lastsfile02
     nfsshare (ocf::heartbeat:Filesystem): Started lastsfile02
     nfs-daemon (ocf::heartbeat:nfsserver): Started lastsfile02
     nfs-root (ocf::heartbeat:exportfs): Started lastsfile02
     nfs_ip (ocf::heartbeat:IPaddr2): Started lastsfile02
     nfs-notify (ocf::heartbeat:nfsnotify): Started lastsfile02

Failed Actions:
* nfs-daemon_monitor_10000 on lastsfile03 'not installed' (5): call=72, status=complete, exitreason='No init script or systemd unit file detected for nfs server',
    last-rc-change='Wed Sep 5 01:00:07 2018', queued=0ms, exec=0ms

This issue also caused the cluster to fail over. I am getting this error about every other week. The cluster nodes run in a Microsoft Hyper-V environment on a Microsoft failover cluster, which uses an EMC SAN storage array for storage.

Has anyone seen this error, or does anyone know what may be causing it or how to fix it?

Post by hunter86_bg » 2018/09/06 19:50:12

Most probably a missing package is causing it. To confirm, you need to debug the resource agent script and check what's going on, which means stopping your NFS service, so downtime is inevitable.
There are 2 options:
1. Debug the resource:
- migrate the resource group to the problematic server
- stop only the nfs-daemon resource
- start the resource with "debug-start" and check the output
2. If option 1 doesn't give enough clues, you can debug further:
- migrate the resource group to the problematic server
- stop only the nfs-daemon resource
- follow the procedure defined here
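
Option 1 could look roughly like the sketch below. It assumes the resource and node names from the pcs status output in the first post (nfsgroup, nfs-daemon, lastsfile03). The run wrapper and the debug_start_nfs function are hypothetical helpers for illustration; by default they only print the pcs commands, and they execute them only when DRY_RUN=0 is set.

```shell
# Sketch only: prints each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper wrapping the three steps of option 1.
debug_start_nfs() {
  group="$1"; resource="$2"; node="$3"
  # Migrate the resource group to the problematic node.
  run pcs resource move "$group" "$node"
  # Stop only the failing resource.
  run pcs resource disable "$resource"
  # Start it by hand and show the agent's full output.
  run pcs resource debug-start "$resource" --full
}

# Example (dry run): debug_start_nfs nfsgroup nfs-daemon lastsfile03
```

When you are done, "pcs resource enable nfs-daemon" should bring the resource back, and "pcs resource clear nfsgroup" should remove the location constraint that the move created. Keep in mind that in a group, stopping one member normally also stops the members listed after it.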

Just a side note: why do you start the IP after the NFS server and not before it?

Post by TrevorH » 2018/09/06 19:54:24

$ rpm -qf /usr/lib/systemd/system/nfs-server.service
nfs-utils-1.3.0-0.54.el7.x86_64

Is that installed?
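
The exit reason in the failed action says that no systemd unit file was detected, so besides the rpm query it may be worth checking that systemd itself lists the unit on both nodes. A minimal sketch, assuming the unit name nfs-server.service from the command above; unit_listed is a hypothetical helper, not part of any package:

```shell
# Hypothetical helper: reads `systemctl list-unit-files` output on stdin and
# succeeds only if the given unit name appears in the first column.
unit_listed() {
  awk -v unit="$1" '$1 == unit { found = 1 } END { exit !found }'
}

# Usage on each node (not run here):
#   rpm -q nfs-utils
#   systemctl list-unit-files --no-pager | unit_listed nfs-server.service && echo "unit found"
```

If the unit is listed on both nodes now, that would suggest the detection only fails at certain moments, which is where the timestamp of the failed action could help narrow things down.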

Post by hspinks » 2018/11/14 11:32:46

I am still getting this error. I am running CentOS 7 in a Windows Hyper-V environment. I could reduce the downtime if I had a list of the required packages. I do have nfs-utils-1.3.0-0.54.el7.x86_64 installed. The help is much appreciated, all!

Post by hunter86_bg » 2018/11/14 18:42:30

You may enable tracing of the resources in order to get more info.
Another approach is to enable debugging in the script itself (I think it was somewhere in /usr/lib/ocf).
The easiest check is to run

rpm -qa | sort
on both nodes and inspect any differences in the installed packages.
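
One way to do that comparison is sketched below. It assumes you have saved the package list of each node to a file first; the file names and the pkg_only_in_one function are hypothetical and only there for illustration.

```shell
# Hypothetical helper: prints the packages that appear in only one of the
# two lists. Entries unique to the second file are indented with a tab.
pkg_only_in_one() {
  a=$(mktemp); b=$(mktemp)
  # comm needs both inputs sorted the same way, so sort again in the C locale.
  LC_ALL=C sort "$1" > "$a"
  LC_ALL=C sort "$2" > "$b"
  LC_ALL=C comm -3 "$a" "$b"
  rm -f "$a" "$b"
}

# Usage (not run here):
#   on lastsfile02: rpm -qa | sort > pkgs.lastsfile02.txt
#   on lastsfile03: rpm -qa | sort > pkgs.lastsfile03.txt
#   then, with both files on one machine:
#   pkg_only_in_one pkgs.lastsfile02.txt pkgs.lastsfile03.txt
```

No output would mean both nodes have the same packages at the same versions, in which case a missing package is probably not the cause.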
