Postfix stuck. Postfix/Cleanup timeout on cleanup socket error

xxthegonzxx
Posts: 2
Joined: 2011/12/12 18:13:37
Location: Los Angeles

Post by xxthegonzxx » 2011/12/12 19:16:45

Hi all,

I'm not too familiar with Postfix, but our server seems to stop sending email, usually only on weekends. The rest of the week it's fine, but around Saturday or Sunday morning it gets stuck. I usually come in on Monday, type "mailq", and find hundreds, if not a thousand or so, emails stuck in the incoming queue. I have to restart the postfix service for all the emails to get sent.

I was hoping someone a little more experienced than me could point me in the right direction and explain what's going on, as I'm not that familiar with Postfix yet. I'm sure I'm not the first to have this issue, and I've searched online but found nothing specific that could be causing it. This is what "cat /var/log/maillog | grep warning" shows:

Dec 11 05:09:19 batch-ca4-02 postfix/cleanup[31691]: warning: timeout on cleanup socket while reading input attribute name
Dec 11 05:31:27 batch-ca4-02 postfix/cleanup[31691]: warning: 8A2993E3003B: read timeout on cleanup socket
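For anyone who wants more than a plain grep, here is a minimal sketch that pulls the timestamp, PID, queue ID (when present) and message out of cleanup warnings like the two above. The function name and regex are illustrative assumptions based only on these two sample lines, not on any Postfix tool.

```python
import re

# Assumed layout, inferred from the two sample lines above:
# "<Mon D HH:MM:SS> <host> postfix/cleanup[<pid>]: warning: [<QUEUEID>: ]<message>"
WARNING_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"postfix/cleanup\[(?P<pid>\d+)\]: warning: "
    r"(?:(?P<qid>[0-9A-F]{6,}): )?(?P<msg>.*)$"
)


def cleanup_warnings(lines):
    """Return (timestamp, pid, queue_id_or_None, message) for each cleanup warning."""
    found = []
    for line in lines:
        match = WARNING_RE.match(line.rstrip("\n"))
        if match:
            found.append((match["ts"], match["pid"], match["qid"], match["msg"]))
    return found
```

Passing it the lines of /var/log/maillog (for example an open file object) gives a list that can be grouped by PID or by hour to see whether the timeouts cluster around a particular time.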

Here is my postconf -n:

alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
command_directory = /usr/sbin
config_directory = /etc/postfix
daemon_directory = /usr/libexec/postfix
debug_peer_level = 2
html_directory = no
in_flow_delay = 10s
inet_interfaces = localhost
mail_owner = postfix
mailq_path = /usr/bin/mailq.postfix
manpage_directory = /usr/share/man
mydestination = $myhostname, localhost.$mydomain, localhost
mydomain = usaepay.com
myorigin = $mydomain
newaliases_path = /usr/bin/newaliases.postfix
queue_directory = /var/spool/postfix
readme_directory = /usr/share/doc/postfix-2.3.3/README_FILES
relayhost = 192.168.x.x
sample_directory = /usr/share/doc/postfix-2.3.3/samples
sendmail_path = /usr/sbin/sendmail.postfix
setgid_group = postdrop
unknown_local_recipient_reject_code = 550

Can anyone please help? If I need to submit more info, please let me know. Thank you!

pschaff
Retired Moderator
Posts: 18276
Joined: 2006/12/13 20:15:34
Location: Tidewater, Virginia, North America

Re: Postfix stuck. Postfix/Cleanup timeout on cleanup socket error

Post by pschaff » 2011/12/13 21:50:35

Welcome to the CentOS fora. Please see the recommended reading for new users linked in my signature.

Are you up to date on CentOS 5.7 with "yum update"? Anything else relevant in /var/log/messages or other log files?

KermitDaFragger
Posts: 195
Joined: 2009/09/11 19:23:05
Location: the Netherlands

Re: Postfix stuck. Postfix/Cleanup timeout on cleanup socket error

Post by KermitDaFragger » 2011/12/13 23:42:00

I doubt this is a Postfix-specific problem. The symptom just seems to manifest itself in Postfix. It sounds more like a resource exhaustion problem. Anything special in '/etc/security/limits.conf', perhaps? Or maybe the number of open connections is limited by iptables?
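A quick way to see whether limits.conf sets anything at all is to list only its active lines. The following is a rough sketch; the helper name is hypothetical, and it assumes the usual convention that everything after a '#' is a comment.

```python
def active_lines(text):
    """Return the non-blank, non-comment lines of a limits.conf-style file."""
    result = []
    for raw in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line:
            result.append(line)
    return result
```

Reading '/etc/security/limits.conf' and passing its contents to active_lines() returns an empty list when the file is entirely commented out.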

xxthegonzxx
Posts: 2
Joined: 2011/12/12 18:13:37
Location: Los Angeles

Re: Postfix stuck. Postfix/Cleanup timeout on cleanup socket error

Post by xxthegonzxx » 2012/04/10 20:01:01

Hi guys, I'm sorry for getting back to you so late. It looks like I didn't have email notifications set up, and I had some other projects thrown at me. In the meantime I found a workaround: I wrote a script that restarts postfix if the mailq count gets above 10.
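A workaround of this kind could look roughly like the sketch below. It is not the actual script from this post: the function names are made up, it assumes mailq ends its output with a summary line of the form "-- N Kbytes in M Requests." (and prints no such line when the queue is empty), and it assumes "service postfix restart" is the right restart command on the host.

```python
import re
import subprocess

# Assumed mailq summary line, e.g. "-- 23 Kbytes in 5 Requests."
SUMMARY_RE = re.compile(r"^-- \d+ Kbytes in (\d+) Requests?\.$", re.MULTILINE)


def queued_message_count(mailq_output):
    """Return the queue size reported by mailq, or 0 if no summary line is found."""
    match = SUMMARY_RE.search(mailq_output)
    return int(match.group(1)) if match else 0


def restart_if_backed_up(threshold=10, run=subprocess.run):
    """Restart postfix when more than `threshold` messages are queued.

    Returns True if a restart was issued. `run` is injectable so the
    function can be exercised without touching a real mail system.
    """
    output = run(["mailq"], capture_output=True, text=True).stdout
    if queued_message_count(output) > threshold:
        # Assumption: SysV-style service command, as on CentOS 5.
        run(["service", "postfix", "restart"])
        return True
    return False
```

Calling restart_if_backed_up() from a root cron job every few minutes would reproduce the behaviour described, though it only hides the underlying stall.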

Pschaff, there are no other relevant messages in /var/log/messages.

KermitDaFragger, I think you're right about it being a resource exhaustion problem. I checked /etc/security/limits.conf and found nothing: it is entirely commented out (I'm pretty sure we don't use it). Your comment about iptables limiting the number of open connections made me look further, but I saw no limit on port 25 connections or anything else that looked out of place.

These emails are generated by our other database server and sent out in "batches". Then THIS server forwards them to an MX relay server, so I believe these emails are all sent locally. Could this be some sort of local SMTP connection issue? I keep coming across articles and forum posts suggesting that it might be a socket connection limit: possibly once SMTP runs out of socket connections, Postfix freezes or something like that. Unfortunately I am not sure. I can provide more details but need some guidance, please.


I did find this a little strange (/var/log/maillog):

Apr 8 04:03:58 batch-ca4-02 postfix/smtpd[6742]: E00353E3003A: client=localhost[127.0.0.1]
Apr 8 04:03:58 batch-ca4-02 postfix/cleanup[6748]: E00353E3003A: message-id=
Apr 8 04:03:58 batch-ca4-02 postfix/qmgr[32470]: E00353E3003A: from=, size=1020, nrcpt=1 (queue active)
Apr 8 04:03:58 batch-ca4-02 postfix/smtpd[6742]: disconnect from localhost[127.0.0.1]
Apr 8 04:03:58 batch-ca4-02 postfix/smtp[6488]: E00353E3003A: to=, relay=192.168.x.x[192.168.x.x]:25, delay=0.08, delays=0.04/0/0/0.0
Apr 8 04:03:58 batch-ca4-02 postfix/qmgr[32470]: E00353E3003A: removed
Apr 8 04:08:49 batch-ca4-02 postfix/smtpd[7060]: connect from localhost[127.0.0.1]
Apr 8 04:08:49 batch-ca4-02 postfix/smtpd[7060]: 5FA6B3E3003A: client=localhost[127.0.0.1]
Apr 8 04:08:49 batch-ca4-02 postfix/cleanup[7063]: 5FA6B3E3003A: message-id=
Apr 8 04:08:49 batch-ca4-02 postfix/qmgr[32470]: 5FA6B3E3003A: from=, size=831, nrcpt=1 (queue active)
Apr 8 04:08:49 batch-ca4-02 postfix/smtpd[7060]: disconnect from localhost[127.0.0.1]
Apr 8 04:08:49 batch-ca4-02 postfix/smtp[7064]: 5FA6B3E3003A: to=, relay=192.168.x.x[192.168.x.x]:25, delay=0.15, delays=0.06/0.01/0.03
Apr 8 04:08:49 batch-ca4-02 postfix/qmgr[32470]: 5FA6B3E3003A: removed
Apr 8 04:08:51 batch-ca4-02 postfix/smtpd[7060]: connect from localhost[127.0.0.1]
Apr 8 04:08:51 batch-ca4-02 postfix/smtpd[7060]: C74FF3E3003A: client=localhost[127.0.0.1]
Apr 8 04:08:51 batch-ca4-02 postfix/cleanup[7063]: C74FF3E3003A: message-id=
Apr 8 04:08:51 batch-ca4-02 postfix/qmgr[32470]: C74FF3E3003A: from=, size=830, nrcpt=1 (queue active)
Apr 8 04:08:51 batch-ca4-02 postfix/smtpd[7060]: disconnect from localhost[127.0.0.1]
Apr 8 04:08:51 batch-ca4-02 postfix/smtp[7064]: C74FF3E3003A: to=, relay=192.168.x.x[192.168.x.x]:25, delay=0.08, delays=0.04/0/0/0.03

As you can see, the messages get removed by postfix/qmgr. Then, RIGHT after that, the log shows:

Apr 8 04:08:53 batch-ca4-02 postfix/smtpd[7060]: connect from localhost[127.0.0.1]
Apr 8 04:10:55 batch-ca4-02 postfix/smtpd[7252]: connect from localhost[127.0.0.1]
Apr 8 04:16:58 batch-ca4-02 postfix/smtpd[7783]: connect from localhost[127.0.0.1]
Apr 8 04:12:56 batch-ca4-02 postfix/smtpd[7428]: connect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7428]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7783]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7783]: disconnect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7428]: disconnect from localhost[127.0.0.1]
Apr 8 04:21:08 batch-ca4-02 postfix/smtpd[8150]: connect from localhost[127.0.0.1]
Apr 8 04:14:57 batch-ca4-02 postfix/smtpd[7607]: connect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8150]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:23:09 batch-ca4-02 postfix/smtpd[8330]: connect from localhost[127.0.0.1]
Apr 8 04:27:29 batch-ca4-02 postfix/smtpd[7252]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7252]: disconnect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8150]: disconnect from localhost[127.0.0.1]
Apr 8 04:25:13 batch-ca4-02 postfix/smtpd[8517]: connect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8517]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8517]: disconnect from localhost[127.0.0.1]
Apr 8 04:28:00 batch-ca4-02 postfix/smtpd[8762]: connect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7607]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7607]: disconnect from localhost[127.0.0.1]
Apr 8 04:16:17 batch-ca4-02 postfix/smtpd[7060]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8330]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7060]: disconnect from localhost[127.0.0.1]
Apr 8 04:19:08 batch-ca4-02 postfix/smtpd[7969]: connect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8330]: disconnect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[8762]: disconnect from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7969]: lost connection after QUIT from localhost[127.0.0.1]
Apr 8 04:29:21 batch-ca4-02 postfix/smtpd[7969]: disconnect from localhost[127.0.0.1]
Apr 8 04:31:51 batch-ca4-02 postfix/smtpd[5486]: connect from localhost[127.0.0.1]
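To put numbers on how long each smtpd process held its localhost connection in an excerpt like this, one option is to pair every "connect from" with the next "disconnect from" for the same PID. The sketch below does that; the function name is hypothetical, and the year has to be supplied by the caller because syslog lines do not carry one.

```python
import re
from datetime import datetime

# Matches smtpd "connect from" / "disconnect from" lines only;
# "lost connection after ..." lines are deliberately ignored.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"postfix/smtpd\[(?P<pid>\d+)\]: (?P<event>connect|disconnect) from "
)


def session_durations(lines, year):
    """Return {pid: [seconds, ...]} for each connect/disconnect pair per smtpd PID."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
            events.append((stamp, match["pid"], match["event"]))

    # The pasted excerpt is not in chronological order, so sort by time first.
    # The sort is stable, so same-second events keep their log order.
    events.sort(key=lambda item: item[0])

    open_since = {}
    durations = {}
    for stamp, pid, event in events:
        if event == "connect":
            open_since[pid] = stamp
        elif pid in open_since:
            elapsed = stamp - open_since.pop(pid)
            durations.setdefault(pid, []).append(int(elapsed.total_seconds()))
    return durations
```

Taking year=2012 (an assumption based on the date of this post), the connect at 04:08:53 and disconnect at 04:29:21 for smtpd[7060] work out to a session of 1228 seconds, compared with well under a second for the earlier deliveries.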

Followed by:

Apr 8 04:31:51 batch-ca4-02 postfix/smtpd[5486]: 902ED3E3003A: client=localhost[127.0.0.1]
Apr 8 04:31:51 batch-ca4-02 postfix/cleanup[5495]: 902ED3E3003A: message-id=
Apr 8 04:31:51 batch-ca4-02 postfix/smtpd[5486]: disconnect from localhost[127.0.0.1]
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 3BB943E3003F: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 3BB943E3003F: message-id=
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 4A49E3E30040: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 4A49E3E30040: message-id=
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 5AE6B3E30041: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 5AE6B3E30041: message-id=
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 6A5013E30042: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 6A5013E30042: message-id=
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 7A7233E30043: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 7A7233E30043: message-id=
Apr 8 04:31:54 batch-ca4-02 postfix/pickup[5985]: 8AE433E30044: uid=500 from=
Apr 8 04:31:54 batch-ca4-02 postfix/cleanup[5495]: 8AE433E30044: message-id=

As you can see, these messages are not being removed.
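To check this across the whole log rather than by eye, a small script can list every queue ID that appears without a matching qmgr "removed" line. This is only a sketch: the function name is made up, and the regex assumes the short hexadecimal queue IDs seen in the lines above.

```python
import re

# "<daemon>[<pid>]: <QUEUEID>: <rest>", with hexadecimal queue IDs as in the log above.
QID_RE = re.compile(
    r"postfix/(?P<daemon>\w+)\[\d+\]: (?P<qid>[0-9A-F]{6,}): (?P<rest>.*)$"
)


def unremoved_queue_ids(lines):
    """Return queue IDs, in first-seen order, that have no qmgr 'removed' entry."""
    seen = []
    removed = set()
    for line in lines:
        match = QID_RE.search(line.rstrip("\n"))
        if not match:
            continue
        qid = match["qid"]
        if qid not in seen:
            seen.append(qid)
        if match["daemon"] == "qmgr" and match["rest"].strip() == "removed":
            removed.add(qid)
    return [qid for qid in seen if qid not in removed]
```

Messages that are still legitimately in flight at the end of the log will also show up, so the result is a list of candidates to compare against mailq, not proof of a stall.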
Any suggestions?
