How to cleanup "Failed Actions" without restarting resources

Post by z_haseeb » 2017/12/15 10:19:38

To trigger a failover, I killed the process of an application that is clustered (HA) with pacemaker/pcsd. The application restarted successfully on the same node, and I have had no problems so far. But when I run the pcs resource cleanup command to clear the "Failed Actions", all resources restart. Please advise how I can clean up the Failed Actions without restarting all resources.

Post by hunter86_bg » 2017/12/15 18:48:56

What is your fencing mechanism?
According to an unverified solution, such an issue can be observed if your stonith device provides unfencing.
If that is the case for you, the issue is being investigated by Red Hat in Bugzilla #1427648 for RHEL 7 and #1427643 for RHEL 6.

Post by z_haseeb » 2017/12/18 15:27:17

Fencing is disabled in our setup.

Post by TrevorH » 2017/12/18 16:27:49

You need fencing. You cannot run a cluster without it safely.

Post by z_haseeb » 2017/12/19 04:22:44

TrevorH wrote:You need fencing. You cannot run a cluster without it safely.
We can run a cluster without fencing. Fencing is not a prerequisite for a cluster, although it is a safety concern. Furthermore, I am using the cluster for an application (not back-end data) with more than one corosync heartbeat link over bonded NICs, so the chance of split brain is low.
The discussion is going in another direction. My question is very simple: I don't want all resources to restart when I clear the Failed Actions messages.

Post by TrevorH » 2017/12/19 09:24:33

z_haseeb wrote:Fencing is not a prerequisite for a cluster
That is not what the Pacemaker devs think, and they are about to make fencing configuration mandatory. Fencing is not optional.

Post by z_haseeb » 2017/12/19 10:25:58

Dear TrevorH, thanks for your reply, but it does not seem that my query would be resolved even if I implemented fencing.

Post by hunter86_bg » 2017/12/19 17:27:08

To simply reset the fail count, you can use:

Code:

pcs resource failcount reset <resource id> [node]

That said, fencing is not only recommended but effectively "mandatory". You can look at sbd fencing (aka "poison pill"), which requires only a shared device. If you don't have a shared device, you can use a CentOS iSCSI target.
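
For the fail count reset above, it can help to look at the current fail count before resetting it. The following is only a rough sketch, assuming the pcs 0.9 argument order used in this thread (newer pcs releases may expect a different form for the node argument). The names my_resource and node1, the run helper and the DRY_RUN switch are hypothetical and exist only for illustration. By default the script just prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: show, then reset, the fail count of one resource on one node.
# DRY_RUN=1 (the default) prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = resource id, $2 = node name (both placeholders chosen by the caller)
check_and_reset_failcount() {
    resource="$1"
    node="$2"
    # Inspect the current fail count first
    run pcs resource failcount show "$resource" "$node"
    # Then reset it for that resource on that node
    run pcs resource failcount reset "$resource" "$node"
}

check_and_reset_failcount my_resource node1
```

Run it with DRY_RUN=0 on a cluster node to execute the commands for real. If the show command reports no fail count, the reset has nothing to do.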

Post by z_haseeb » 2018/01/03 07:48:25

I was not able to reset the Failed Actions with the commands suggested below.

hunter86_bg wrote:In order to simply reset the fail count, you can use:

Code:

pcs resource failcount reset <resource id> [node]

pcs resource failcount reset REDIS_monitor_45000 XXXCOM2
No failcounts needed resetting

pcs resource failcount reset REDIS XXXCOM2
No failcounts needed resetting

Post by hunter86_bg » 2018/01/06 18:16:55

z_haseeb wrote:I was not able to reset the Failed Actions with the commands suggested below.
No failcounts needed resetting
It seems the issue is not in the fail count.
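
One thing that may be worth trying, as an untested sketch: on the pcs 0.9 series, pcs resource cleanup takes an optional resource id and a --node option, so the cleanup can be limited to the resource that actually failed rather than every resource in the cluster. Whether this avoids the restarts reported here is an assumption and has not been confirmed in this thread. Newer pcs releases may spell the node argument differently. REDIS and XXXCOM2 are the names from the earlier post. The run helper and the DRY_RUN switch are hypothetical and only there so the command can be previewed safely.

```shell
#!/bin/sh
# Sketch only: clean up the failed-action history of a single resource on a
# single node. DRY_RUN=1 (the default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = resource id, $2 = node name
cleanup_one_resource() {
    # Assumes the pcs 0.9 syntax: pcs resource cleanup [<resource id>] [--node <node>]
    run pcs resource cleanup "$1" --node "$2"
}

cleanup_one_resource REDIS XXXCOM2
```

With DRY_RUN left at its default this only prints the command. Set DRY_RUN=0 on a cluster node to execute it, then check the cluster status to see whether the Failed Actions entry is gone and whether any other resource was touched.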
