Greetings All,
I am running an NFS server in a Linux cluster and I am getting the following error from pacemaker:
pcs status
Cluster name: nfs_cluster1
Stack: corosync
Current DC: lastsfile03 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Wed Sep 5 22:02:11 2018
Last change: Tue Aug 28 22:01:15 2018 by hacluster via crmd on lastsfile03
2 nodes configured
7 resources configured
Online: [ lastsfile02 lastsfile03 ]
Full list of resources:
disk_fencing1 (stonith:fence_scsi): Started lastsfile02
Resource Group: nfsgroup
my_lvm (ocf:LVM): Started lastsfile02
nfsshare (ocf:Filesystem): Started lastsfile02
nfs-daemon (ocf:nfsserver): Started lastsfile02
nfs-root (ocf:exportfs): Started lastsfile02
nfs_ip (ocf:IPaddr2): Started lastsfile02
nfs-notify (ocf:nfsnotify): Started lastsfile02
Failed Actions:
* nfs-daemon_monitor_10000 on lastsfile03 'not installed' (5): call=72, status=complete, exitreason='No init script or systemd unit file detected for nfs server',
last-rc-change='Wed Sep 5 01:00:07 2018', queued=0ms, exec=0ms
This issue also caused the cluster to fail over. I am getting this error about every other week. The nodes are installed as guests in a Microsoft Hyper-V environment that runs on a Microsoft failover cluster and uses an EMC SAN storage array for storage.
Has anyone seen this error, or does anyone know what may be causing it or how to fix it?
NFS Cluster getting weekly errors
Re: NFS Cluster getting weekly errors
Most probably a missing package is causing it. To find out, you need to debug the resource agent script and check what is going on. That means stopping your NFS service, so downtime is inevitable.
There are 2 options:
1. Debug the resource by:
- migrate the resource group to the problematic server
- stop only the nfs-daemon resource
- start the resource with "debug-start" and check the output
2. If option 1 doesn't give enough clues, you can debug further by:
- migrate the resource group to the problematic server
- stop only the nfs-daemon resource
- follow the procedure defined here
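Option 1 could look roughly like the sketch below on a CentOS 7 pcs installation. The resource, group and node names are taken from the pcs status output in the first post. The DRY_RUN switch and the run helper are hypothetical additions for illustration, so that the commands can be reviewed before anything is actually stopped. Keep in mind that debug-start acts on the node where it is typed, and that disabling nfs-daemon also stops the group members ordered after it.

```shell
#!/bin/sh
# Sketch of debug option 1. Names come from the "pcs status" output above.
# DRY_RUN and run() are illustrative helpers, not part of pcs.
GROUP=nfsgroup
RESOURCE=nfs-daemon
NODE=lastsfile03            # node that reported 'not installed'
DRY_RUN=${DRY_RUN:-1}       # default: only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

debug_nfs_daemon() {
    run pcs resource move "$GROUP" "$NODE"           # migrate the group
    run pcs resource disable "$RESOURCE"             # stop nfs-daemon (later group members stop too)
    run pcs resource debug-start "$RESOURCE" --full  # start it by hand and show the agent's output
    # When finished, hand control back to the cluster:
    run pcs resource enable "$RESOURCE"
    run pcs resource clear "$GROUP"                  # drop the constraint left by "move"
}

debug_nfs_daemon
```

In practice it is probably safer to type the printed commands one at a time on the problematic node and check pcs status between steps, rather than setting DRY_RUN=0 and running them back to back.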
Just a side note: why do you start the IP after the NFS server and not before it?
Re: NFS Cluster getting weekly errors
$ rpm -qf /usr/lib/systemd/system/nfs-server.service
nfs-utils-1.3.0-0.54.el7.x86_64
Is that installed?
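To answer that on both nodes, and to see whether systemd can actually see the unit, something like the following sketch may help. How the nfsserver agent performs its detection is not shown in this thread, so treating `systemctl list-unit-files` as the relevant check is an assumption. has_nfs_unit and check_node are hypothetical helper names.

```shell
#!/bin/sh
# has_nfs_unit (hypothetical helper): reads "systemctl list-unit-files"
# style output on stdin and succeeds if nfs-server.service is listed.
has_nfs_unit() {
    grep -q '^nfs-server\.service[[:space:]]'
}

# check_node (hypothetical helper): run it locally on each cluster node.
check_node() {
    rpm -q nfs-utils      # is the package installed?
    rpm -V nfs-utils      # lists package files that are missing or changed
    if systemctl list-unit-files --type=service --no-legend | has_nfs_unit; then
        echo "nfs-server.service is visible to systemd"
    else
        echo "nfs-server.service is NOT visible to systemd"
    fi
}
```

Running check_node on lastsfile02 and on lastsfile03 and comparing the two outputs should show whether the nodes differ. A changed configuration file reported by rpm -V is usually harmless; a missing unit file would not be.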
Re: NFS Cluster getting weekly errors
I am still getting this error. I am running CentOS 7 as a guest in a Windows Hyper-V environment. I could reduce the downtime if I had a list of the required packages. I do have nfs-utils-1.3.0-0.54.el7.x86_64 installed. Any help is much appreciated!
Re: NFS Cluster getting weekly errors
You may enable tracing of the resources in order to get more info.
Another approach is to enable debugging in the script itself (I think it was somewhere in /usr/lib/ocf).
The easiest check is to run
rpm -qa | sort
on both nodes and inspect any differences in the installed packages.
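One way to compare the package sets of the two nodes is sketched below. pkg_diff is a hypothetical helper, and the ssh lines in the usage comment assume that ssh access to both nodes is available.

```shell
#!/bin/sh
# pkg_diff (hypothetical helper): given two files holding "rpm -qa" output,
# print the packages that appear in only one of them.
pkg_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Sort both lists in the C locale so that comm sees a consistent order.
    LC_ALL=C sort "$1" > "$a"
    LC_ALL=C sort "$2" > "$b"
    # comm -3 hides lines common to both files: unindented lines exist only
    # in the first file, tab-indented lines exist only in the second.
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage:
#   ssh lastsfile02 'rpm -qa' > /tmp/pkgs.lastsfile02
#   ssh lastsfile03 'rpm -qa' > /tmp/pkgs.lastsfile03
#   pkg_diff /tmp/pkgs.lastsfile02 /tmp/pkgs.lastsfile03
```

An empty result means both nodes have identical package sets, in which case a missing package is unlikely to be the cause.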