glusterd service not starting

General support questions
penguinpages
Posts: 42
Joined: 2015/07/21 13:58:05

glusterd service not starting

Post by penguinpages » 2019/09/26 11:46:31

I am trying to debug a new system build: CentOS 7 plus the EPEL repo. I am trying to join it to two other CentOS 7 nodes in an existing cluster, and noticed that the glusterd service would not start.
I am trying to track down where that comes from. I believe the "add peer" command ran successfully, but the other nodes could not see this node, and that is when I noticed the service was not started.

The gluster versions were different between the nodes, even though I validated that the yum repositories were the same on all three nodes. <scratches head>

I removed and reinstalled the packages. The packages now match, so there must be some user error somewhere...


The glusterd service still will not start.

#######################

[root@medusa yum.repos.d]# systemctl start glusterd
Job for glusterd.service failed because the control process exited with error code. See "systemctl status glusterd.service" and "journalctl -xe" for details.
[root@medusa yum.repos.d]# journalctl -xe
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.timer:12] Failed to parse calendar specification, ignoring: Sun *-*-1..7 1:00:00
Sep 26 07:22:30 svr1.acme.com systemd[1]: mdcheck_start.timer lacks value setting. Refusing.
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:30 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/firstboot-graphical.service:14] Support for option SysVStartPriority= has been removed and it is ignored
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.timer:12] Failed to parse calendar specification, ignoring: Sun *-*-1..7 1:00:00
Sep 26 07:22:31 svr1.acme.com systemd[1]: mdcheck_start.timer lacks value setting. Refusing.
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/firstboot-graphical.service:14] Support for option SysVStartPriority= has been removed and it is ignored
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.timer:12] Failed to parse calendar specification, ignoring: Sun *-*-1..7 1:00:00
Sep 26 07:22:31 svr1.acme.com systemd[1]: mdcheck_start.timer lacks value setting. Refusing.
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 svr1.acme.com systemd[1]: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:33 svr1.acme.com polkitd[2463]: Registered Authentication Agent for unix-process:7932:3463460 (system bus name :1.206 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/Au
Sep 26 07:22:33 svr1.acme.com systemd[1]: Starting GlusterFS, a clustered file-system server...
-- Subject: Unit glusterd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit glusterd.service has begun starting up.
Sep 26 07:22:33 svr1.acme.com systemd[1]: glusterd.service: control process exited, code=exited status=1
Sep 26 07:22:33 svr1.acme.com systemd[1]: Failed to start GlusterFS, a clustered file-system server.
-- Subject: Unit glusterd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit glusterd.service has failed.
--
-- The result is failed.
Sep 26 07:22:33 svr1.acme.com systemd[1]: Unit glusterd.service entered failed state.
Sep 26 07:22:33 svr1.acme.com systemd[1]: glusterd.service failed.
Sep 26 07:22:33 svr1.acme.com polkitd[2463]: Unregistered Authentication Agent for unix-process:7932:3463460 (system bus name :1.206, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disco

##############

[root@medusa yum.repos.d]# tail /var/log/messages
Sep 26 07:22:31 medusa systemd: [/usr/lib/systemd/system/mdcheck_start.timer:12] Failed to parse calendar specification, ignoring: Sun *-*-1..7 1:00:00
Sep 26 07:22:31 medusa systemd: mdcheck_start.timer lacks value setting. Refusing.
Sep 26 07:22:31 medusa systemd: [/usr/lib/systemd/system/mdcheck_start.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:31 medusa systemd: [/usr/lib/systemd/system/mdcheck_continue.service:14] Invalid environment assignment, ignoring: MDADM_CHECK_DURATION='"6 hours"'
Sep 26 07:22:33 medusa systemd: Starting GlusterFS, a clustered file-system server...
Sep 26 07:22:33 medusa systemd: glusterd.service: control process exited, code=exited status=1
Sep 26 07:22:33 medusa systemd: Failed to start GlusterFS, a clustered file-system server.
Sep 26 07:22:33 medusa systemd: Unit glusterd.service entered failed state.
Sep 26 07:22:33 medusa systemd: glusterd.service failed.
Sep 26 07:30:01 medusa systemd: Started Session 71 of user root.
[root@medusa yum.repos.d]#

## Check SELinux not stopping things...
[root@medusa yum.repos.d]# cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

######## Odd entries in /var/log/glusterfs/glusterd.log
[root@medusa yum.repos.d]# tail /var/log/glusterfs/glusterd.log
[2019-09-26 11:22:33.056352] I [rpc-clnt.c:1000:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2019-09-26 11:22:33.085626] E [MSGID: 106408] [glusterd-peer-utils.c:117:glusterd_peerinfo_find_by_hostname] 0-management: error in getaddrinfo: Name or service not known
[Unknown error -2]
[2019-09-26 11:22:33.108454] E [MSGID: 101075] [common-utils.c:3590:gf_is_local_addr] 0-management: error in getaddrinfo: Name or service not known

[2019-09-26 11:22:33.108594] E [MSGID: 106187] [glusterd-store.c:4817:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2019-09-26 11:22:33.108692] E [MSGID: 101019] [xlator.c:715:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2019-09-26 11:22:33.108730] E [MSGID: 101066] [graph.c:362:glusterfs_graph_init] 0-management: initializing translator failed
[2019-09-26 11:22:33.108747] E [MSGID: 101176] [graph.c:725:glusterfs_graph_activate] 0-graph: init failed
[2019-09-26 11:22:33.109315] W [glusterfsd.c:1500:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x55af8195d91d] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x163) [0x55af8195d7c3] -->/usr/sbin/glusterd(cleanup_and_exit+0x6b) [0x55af8195cceb] ) 0-: received signum (-1), shutting down
[root@medusa yum.repos.d]#
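
The error-level entries in a glusterd log can be pulled out on their own, which makes a failure like the getaddrinfo one easier to spot among the informational lines. A minimal sketch, assuming the log format shown above (a bracketed timestamp followed by a one-letter severity); the function name gluster_errors is made up for illustration:

```shell
# Print only error (E) entries from a GlusterFS log file.
# Assumes each entry starts with "[timestamp] <severity letter> ".
gluster_errors() {
    grep -E '^\[[^]]+\] E ' "$1"
}

# Example (path taken from the post above):
# gluster_errors /var/log/glusterfs/glusterd.log
```

Continuation lines that do not start with a timestamp, such as "[Unknown error -2]", are skipped by this filter.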

##



[glusterd-peer-utils.c:117:glusterd_peerinfo_find_by_hostname] 0-management: error in getaddrinfo: Name or service not known


I am trying to find the root cause so I learn a bit more about gluster, rather than just "wipe and rebuild".
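
One way to follow up on the getaddrinfo error is to check whether every hostname glusterd has stored for its peers still resolves on this node. A rough sketch, assuming the peer definitions live under /var/lib/glusterd/peers as key=value files with hostname1=... entries (the usual layout, but worth verifying on your own system); both function names are hypothetical:

```shell
# List the hostnames recorded in glusterd's peer files.
# Assumption: files in the directory contain lines like "hostname1=thor".
list_peer_hostnames() {
    dir=${1:-/var/lib/glusterd/peers}
    cat "$dir"/* 2>/dev/null | sed -n 's/^hostname[0-9]*=//p' | sort -u
}

# Report any recorded peer hostname that this node cannot resolve.
check_peer_resolution() {
    list_peer_hostnames "$1" | while read -r host; do
        getent hosts "$host" >/dev/null || echo "unresolvable: $host"
    done
}

# Example: check_peer_resolution /var/lib/glusterd/peers
```

Any name reported as unresolvable would be a candidate for an /etc/hosts or DNS fix. This only covers peer files, not the brick definitions that the "resolve brick failed in restore" message refers to.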


Re: glusterd service not starting

Post by penguinpages » 2019/09/26 15:23:26

<< Update >>
Looking through bash_history, I think I found what I did to cause the version difference: I ran yum install -y centos-release-gluster, which added /etc/yum.repos.d/CentOS-Gluster-6.repo, whereas the base OS and the other systems only have /etc/yum.repos.d/CentOS-Gluster-5.repo.

[root@medusa yum.repos.d]# systemctl start glusterd
Job for glusterd.service failed because the control process exited with error code. See "systemctl status glusterd.service" and "journalctl -xe" for details.
[root@medusa yum.repos.d]# journalctl -xe
Sep 26 11:06:46 medusa.penguinpages.local yum[8806]: Updated: glusterfs-api-devel-6.5-1.el7.x86_64
Sep 26 11:06:48 medusa.penguinpages.local yum[8806]: Updated: glusterfs-server-6.5-1.el7.x86_64
Sep 26 11:06:48 medusa.penguinpages.local yum[8806]: Updated: glusterfs-rdma-6.5-1.el7.x86_64
Sep 26 11:06:48 medusa.penguinpages.local systemd[1]: Reloading.
Sep 26 11:06:48 medusa.penguinpages.local systemd[1]: Reloading.
Sep 26 11:06:49 medusa.penguinpages.local systemd[1]: Reloading.
Sep 26 11:06:49 medusa.penguinpages.local systemd[1]: Stopping System Logging Service...
-- Subject: Unit rsyslog.service has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit rsyslog.service has begun shutting down.
Sep 26 11:06:49 medusa.penguinpages.local rsyslogd[8853]: [origin software="rsyslogd" swVersion="8.24.0-41.el7_7" x-pid="8853" x-info="http://www.rsyslog.com"] exiting on signal 15.
Sep 26 11:06:49 medusa.penguinpages.local systemd[1]: Stopped System Logging Service.
-- Subject: Unit rsyslog.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit rsyslog.service has finished shutting down.
Sep 26 11:06:49 medusa.penguinpages.local systemd[1]: Starting System Logging Service...
-- Subject: Unit rsyslog.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit rsyslog.service has begun starting up.
Sep 26 11:06:49 medusa.penguinpages.local rsyslogd[8947]: [origin software="rsyslogd" swVersion="8.24.0-41.el7_7" x-pid="8947" x-info="http://www.rsyslog.com"] start
Sep 26 11:06:49 medusa.penguinpages.local systemd[1]: Started System Logging Service.
-- Subject: Unit rsyslog.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit rsyslog.service has finished starting up.
--
-- The start-up result is done.
Sep 26 11:07:15 medusa.penguinpages.local polkitd[2463]: Registered Authentication Agent for unix-process:8960:4811716 (system bus name :1.276 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/Au
Sep 26 11:07:15 medusa.penguinpages.local polkitd[2463]: Unregistered Authentication Agent for unix-process:8960:4811716 (system bus name :1.276, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disco
Sep 26 11:07:20 medusa.penguinpages.local polkitd[2463]: Registered Authentication Agent for unix-process:8966:4812207 (system bus name :1.277 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/Au
Sep 26 11:07:20 medusa.penguinpages.local systemd[1]: Starting GlusterFS, a clustered file-system server...
-- Subject: Unit glusterd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit glusterd.service has begun starting up.
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: glusterd.service: control process exited, code=exited status=1
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: Failed to start GlusterFS, a clustered file-system server.
-- Subject: Unit glusterd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit glusterd.service has failed.
--
-- The result is failed.
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: Unit glusterd.service entered failed state.
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: glusterd.service failed.
Sep 26 11:07:21 medusa.penguinpages.local polkitd[2463]: Unregistered Authentication Agent for unix-process:8966:4812207 (system bus name :1.277, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disco
[root@medusa yum.repos.d]# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2019-09-26 11:07:21 EDT; 13min ago
Docs: man:glusterd(8)
Process: 8972 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=1/FAILURE)
Tasks: 0

Sep 26 11:07:20 medusa.penguinpages.local systemd[1]: Starting GlusterFS, a clustered file-system server...
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: glusterd.service: control process exited, code=exited status=1
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: Failed to start GlusterFS, a clustered file-system server.
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: Unit glusterd.service entered failed state.
Sep 26 11:07:21 medusa.penguinpages.local systemd[1]: glusterd.service failed.
[root@medusa yum.repos.d]# rpm -qa |grep gluster
glusterfs-server-6.5-1.el7.x86_64
glusterfs-api-6.5-1.el7.x86_64
glusterfs-libs-6.5-1.el7.x86_64
glusterfs-6.5-1.el7.x86_64
glusterfs-api-devel-6.5-1.el7.x86_64
python2-gluster-6.5-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.1.x86_64
centos-release-gluster6-1.0-1.el7.centos.noarch
glusterfs-extra-xlators-6.5-1.el7.x86_64
glusterfs-fuse-6.5-1.el7.x86_64
glusterfs-cli-6.5-1.el7.x86_64
glusterfs-client-xlators-6.5-1.el7.x86_64
glusterfs-rdma-6.5-1.el7.x86_64
glusterfs-devel-6.5-1.el7.x86_64
[root@medusa yum.repos.d]#


So even after putting back the "version 6" gluster repo and removing the version 5 one, glusterd still will not start.
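
To confirm whether a node really has a mix of Gluster versions installed, the package list can be reduced to its distinct versions. A small sketch, assuming an RPM-based system; distinct_versions is an illustrative helper name, and the rpm query in the comment is one way it might be fed:

```shell
# Read "name version" pairs on stdin and print each distinct version once.
distinct_versions() {
    awk 'NF >= 2 { print $2 }' | sort -u
}

# Example on an RPM-based node:
# rpm -qa --qf '%{NAME} %{VERSION}\n' 'glusterfs*' | distinct_versions
```

More than one line of output on a single node would suggest mixed packages. Running the same query on each node gives a quick way to compare versions across the cluster.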


Re: glusterd service not starting

Post by penguinpages » 2019/09/28 02:10:26

Update:

I resolved this only by removing all gluster packages and then clearing out the leftover folders. Even after a reinstall, I saw that files had been left behind.


I also gave up on getting the gluster5 repo (which matches the other two nodes) to work, and am now working with v6 on this node. I am investigating the upgrade path and will post how that goes (once I back up my VMs :)

cd /etc/yum.repos.d/
mv CentOS-Gluster-6.repo /tmp



mv /var/lib/glusterd /tmp
mv /var/log/glusterfs /tmp
yum install -y centos-release-gluster glusterfs-server ansible gdeploy
systemctl enable glusterd.service
systemctl start glusterd
systemctl status glusterd


[root@medusa ~]# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2019-09-27 21:54:02 EDT; 7min ago
Docs: man:glusterd(8)
Process: 10067 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 10068 (glusterd)
Tasks: 9
CGroup: /system.slice/glusterd.service
└─10068 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Sep 27 21:54:02 medusa.penguinpages.local systemd[1]: Starting GlusterFS, a clustered file-system server...
Sep 27 21:54:02 medusa.penguinpages.local systemd[1]: Started GlusterFS, a clustered file-system server.


[root@thor ~]# gluster peer probe medusast
peer probe: success.
You have new mail in /var/spool/mail/root
[root@thor ~]# gluster peer probe medusast.penguinpages.local
peer probe: success.
[root@thor ~]#


I know this is not the full root cause, but plodding forward...
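
The cleanup in this post moved the old state directories straight into /tmp, which may be cleaned out periodically. A slightly more cautious sketch moves them into a timestamped backup directory instead. The helper name stash_dir and the backup location are assumptions for illustration, and glusterd should be stopped before its state is touched:

```shell
# Move a directory into a timestamped backup folder instead of deleting it.
# Usage: stash_dir <source-dir> <backup-root>
stash_dir() {
    src=$1
    backup_root=$2
    [ -d "$src" ] || { echo "nothing to stash: $src" >&2; return 1; }
    dest="$backup_root/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_root" && mv "$src" "$dest" && echo "$dest"
}

# Example (run as root, with glusterd stopped; /root/gluster-backup is a made-up location):
# systemctl stop glusterd
# stash_dir /var/lib/glusterd /root/gluster-backup
# stash_dir /var/log/glusterfs /root/gluster-backup
```

Keeping the old /var/lib/glusterd around this way also leaves the stale peer and volume files available for later inspection.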
