[SOLVED] clvmd never starts on second node

Posted: 2018/02/05 23:03:54
by hunter86_bg
Hey guys, I have been playing with clustered volume groups, and it seems that clvmd starts successfully on the first node but never on the second one.
Do you have any ideas?
The strangest thing here is the last 2 lines:

Code:

фев 06 00:51:19 kalinsg02 clvm(clvmd)[1720]: INFO: clvmd is not running
фев 06 00:51:19 kalinsg02 clvm(clvmd)[1739]: INFO: clvmd is not running
фев 06 00:51:19 kalinsg02 clvm(clvmd)[1745]: INFO: Starting /usr/sbin/clvmd:
фев 06 00:51:19 kalinsg02 kernel: dlm: Using SCTP for communications
фев 06 00:51:19 kalinsg02 kernel: sctp: Hash tables configured (bind 256/256)
фев 06 00:51:19 kalinsg02 kernel: dlm: connecting to 1
фев 06 00:52:11 kalinsg02 crmd[1273]:   notice: High CPU load detected: 1.340000
фев 06 00:52:41 kalinsg02 crmd[1273]:   notice: High CPU load detected: 1.600000
фев 06 00:52:48 kalinsg02 lrmd[1270]:  warning: clvmd_start_0 process (PID 1635) timed out
фев 06 00:52:48 kalinsg02 lrmd[1270]:  warning: clvmd_start_0:1635 - timed out after 90000ms
фев 06 00:52:48 kalinsg02 crmd[1273]:    error: Result of start operation for clvmd on kalinsg02: Timed Out
фев 06 00:52:49 kalinsg02 clvm(clvmd)[1990]: INFO: PID file (pid:1752 at /var/run/resource-agents/clvmd-clvmd.pid) created for clvmd.

Here is the config:

Code:

 Clone: dlmd-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: dlmd (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=10 start-delay=0 timeout=20 (dlmd-monitor-interval-10)
               start interval=0s timeout=90 (dlmd-start-interval-0s)
               stop interval=0s timeout=100 (dlmd-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Operations: monitor interval=30 timeout=90 (clvmd-monitor-interval-30)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)

Re: clvmd never starts on second node

Posted: 2018/02/07 00:44:16
by hunter86_bg
OK, I think I found it.
The issue was firewalld blocking port 21064/sctp. Once I added that port to the high-availability service, everything went back to normal.
I've opened a bug report in Bugzilla.
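
For anyone landing here with the same start timeout, the change described above might look roughly like the sketch below. It is not necessarily the exact set of commands used in this thread. It assumes a firewalld version that accepts sctp as a port protocol; the run() helper and the DRY_RUN variable are illustrative names only and are not part of firewalld.

```shell
#!/bin/sh
# Sketch only: add the dlm port for SCTP to firewalld's high-availability service.
# run() and DRY_RUN are hypothetical helpers for this example, not firewalld features.
DLM_PORT=21064                 # port reported as blocked in this thread
DLM_PROTO=sctp                 # the log shows "dlm: Using SCTP for communications"
FW_SERVICE=high-availability   # service name mentioned in this thread

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add the port to the permanent service definition, then reload the firewall.
open_dlm_port() {
    run firewall-cmd --permanent --service="$FW_SERVICE" --add-port="$DLM_PORT/$DLM_PROTO"
    run firewall-cmd --reload
}

open_dlm_port
```

By default this only prints the two firewall-cmd commands. To actually apply them, it would be run as root on every cluster node with DRY_RUN=0.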

Re: clvmd never starts on second node

Posted: 2018/02/08 05:31:45
by hunter86_bg
Yesterday I found out that Red Hat does not support clusters that combine a redundant ring (RRP) with dlm: dlm switches to SCTP (instead of TCP) as soon as it detects that multihoming (RRP) is on.
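
A quick way to check whether a node is in this situation could be sketched as follows. The paths are the usual defaults and are assumptions about the setup, as is the mapping of the value in the dlm configfs protocol file (0 for TCP, 1 for SCTP). The function names has_rrp and dlm_proto_name are made up for this example.

```shell
#!/bin/sh
# Sketch: report whether corosync has a redundant ring configured and which
# transport the dlm kernel module is currently using.
# Paths and the 0/1 mapping are assumptions, see the note above.

# Succeeds if the given corosync.conf sets rrp_mode to active or passive.
has_rrp() {
    conf=${1:-/etc/corosync/corosync.conf}
    [ -r "$conf" ] || return 1
    grep -Eq '^[[:space:]]*rrp_mode:[[:space:]]*(active|passive)' "$conf"
}

# Translate the numeric value from the dlm configfs "protocol" file to a name.
dlm_proto_name() {
    case "$1" in
        0) echo tcp ;;
        1) echo sctp ;;
        *) echo unknown ;;
    esac
}

proto_file=/sys/kernel/config/dlm/cluster/protocol

if has_rrp; then
    echo "corosync: redundant ring configured"
else
    echo "corosync: no redundant ring found"
fi

if [ -r "$proto_file" ]; then
    echo "dlm protocol: $(dlm_proto_name "$(cat "$proto_file")")"
else
    echo "dlm protocol: not available (is dlm running?)"
fi
```

If it reports a redundant ring together with sctp, the node matches the combination described above.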