0000581: INFO: task nginx:2334 blocked for more than 120 sec

Issues related to applications and software problems
Post Reply
Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

0000581: INFO: task nginx:2334 blocked for more than 120 sec

Post by Spacedust » 2015/07/19 13:51:29

I'm getting constant hangs (random programs) with latest official kernel 2.6.32-504.30.3 or kernel 3.10.84-1.el6.elrepo.x86_64 using CentOS 6.6 64-bit.

Sometimes even saving simple vim file takes about 2 minutes !

I've replaced all hardware, motherboard, CPU disks and nothing.

Filesystem check passed.

My CPU: Intel(R) Core(TM) i7 CPU X 980 @ 3.33GHz
Board: MSI X58 Pro-E (MS-7522)
Drives: 2x Intel 520 240 GB + 2x WD RE4 3 TB

Please help me solving this.

My config:

Code: Select all

### Hetzner Online AG installimage
# sysctl config
#net.ipv4.ip_forward=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
# ipv6 settings (no autoconfiguration)
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_dad=0
net.ipv6.conf.default.accept_ra=0
net.ipv6.conf.default.accept_ra_defrtr=0
net.ipv6.conf.default.accept_ra_rtr_pref=0
net.ipv6.conf.default.accept_ra_pinfo=0
net.ipv6.conf.default.accept_source_route=0
net.ipv6.conf.default.accept_redirects=0
net.ipv6.conf.default.forwarding=0
net.ipv6.conf.all.autoconf=0
net.ipv6.conf.all.accept_dad=0
net.ipv6.conf.all.accept_ra=0
net.ipv6.conf.all.accept_ra_defrtr=0
net.ipv6.conf.all.accept_ra_rtr_pref=0
net.ipv6.conf.all.accept_ra_pinfo=0
net.ipv6.conf.all.accept_source_route=0
net.ipv6.conf.all.accept_redirects=0
net.ipv6.conf.all.forwarding=0
net.netfilter.nf_conntrack_max=196608
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
kernel.panic = 10

fs.aio-max-nr = 1048576
fs.file-max = 1048576
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048
vm.swappiness = 10
#kernel.sem=250 32000 100 128

Code: Select all

Additional Information 	INFO: task rs:main Q:Reg:1592 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rs:main Q:Reg D ffffffff81810d00 0 1592 1 0x00000080
 ffff88061f009968 0000000000000086 ffff88061f009fd8 0000000000013080
 ffff88061f008010 0000000000013080 0000000000013080 0000000000013080
 ffff88061f009fd8 0000000000013080 ffff88061b762400 ffff880620a30fa0
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81182395>] ? kmem_cache_alloc+0x275/0x280
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffffa01dedc1>] ? ext4_dirty_inode+0x31/0x70 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01dedc1>] ext4_dirty_inode+0x31/0x70 [ext4]
 [<ffffffff811c4e63>] __mark_inode_dirty+0x1d3/0x2d0
 [<ffffffff811b51e1>] update_time+0x81/0xc0
 [<ffffffff815eec1e>] ? __schedule+0x3fe/0x720
 [<ffffffff811b52b8>] file_update_time+0x98/0xe0
 [<ffffffff81132058>] __generic_file_aio_write+0x188/0x3b0
 [<ffffffff810bd2c1>] ? futex_wait+0x1d1/0x2b0
 [<ffffffff811322e5>] generic_file_aio_write+0x65/0xd0
 [<ffffffffa01d75e4>] ext4_file_write+0x54/0xf0 [ext4]
 [<ffffffff8119b53f>] do_sync_write+0x7f/0xb0
 [<ffffffff8119b93f>] vfs_write+0xdf/0x190
 [<ffffffff8119be8f>] SyS_write+0x5f/0xa0
 [<ffffffff810e13b6>] ? __audit_syscall_exit+0x246/0x2f0
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2313 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D 0000000000000001 0 2313 2312 0x00000080
 ffff88061effd978 0000000000000082 ffff88061effdfd8 0000000000013080
 ffff88061effc010 0000000000013080 0000000000013080 0000000000013080
 ffff88061effdfd8 0000000000013080 ffff88061db92b20 ffff88056e9d4c20
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff8118142d>] ? cache_alloc_refill+0x9d/0x300
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b2099>] ? _d_rehash+0x39/0x40
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811a5ded>] ? lookup_real+0x1d/0x60
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2317 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D ffffffff81810d00 0 2317 2312 0x00000080
 ffff880616763978 0000000000000086 ffff880616763fd8 0000000000013080
 ffff880616762010 0000000000013080 0000000000013080 0000000000013080
 ffff880616763fd8 0000000000013080 ffff88061cde4a00 ffff880620a30fa0
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffff811a7b54>] ? lookup_fast+0x44/0x2d0
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b0d40>] ? __d_lookup+0x80/0x170
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2320 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D 0000000000000001 0 2320 2312 0x00000080
 ffff8806167f9978 0000000000000086 ffff8806167f9fd8 0000000000013080
 ffff8806167f8010 0000000000013080 0000000000013080 0000000000013080
 ffff8806167f9fd8 0000000000013080 ffff88061cde54e0 ffff88061d0febe0
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffff811a7b54>] ? lookup_fast+0x44/0x2d0
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b0d40>] ? __d_lookup+0x80/0x170
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2324 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D 0000000000000006 0 2324 2312 0x00000080
 ffff88061e9698f8 0000000000000086 ffff88061e969fd8 0000000000013080
 ffff88061e968010 0000000000013080 0000000000013080 0000000000013080
 ffff88061e969fd8 0000000000013080 ffff88061cdc7020 ffff88061d0fe100
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffffa01b5d0d>] jbd2_log_wait_commit+0xdd/0x1b0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01b295c>] __process_buffer+0x6c/0x1f0 [jbd2]
 [<ffffffffa01b2bf3>] jbd2_log_do_checkpoint+0x113/0x300 [jbd2]
 [<ffffffffa01b2e6d>] __jbd2_log_wait_for_space+0x8d/0x1a0 [jbd2]
 [<ffffffff81182395>] ? kmem_cache_alloc+0x275/0x280
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffff811a7b54>] ? lookup_fast+0x44/0x2d0
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b0d40>] ? __d_lookup+0x80/0x170
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2329 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D ffffffff81810d00 0 2329 2312 0x00000080
 ffff88061e2bdb18 0000000000000082 ffff88061e2bdfd8 0000000000013080
 ffff88061e2bc010 0000000000013080 0000000000013080 0000000000013080
 ffff88061e2bdfd8 0000000000013080 ffff88061cda8fe0 ffff880620a3e540
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffff8119df56>] ? __sb_start_write+0x76/0x120
 [<ffffffffa01e27d3>] ? ext4_evict_inode+0x233/0x480 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e27d3>] ext4_evict_inode+0x233/0x480 [ext4]
 [<ffffffff811b5b74>] evict+0xb4/0x1d0
 [<ffffffff811b5d73>] iput_final+0xe3/0x170
 [<ffffffff811b5e3e>] iput+0x3e/0x50
 [<ffffffff811b38a8>] dput+0x1d8/0x2c0
 [<ffffffff8119db38>] __fput+0x188/0x270
 [<ffffffff8119dcce>] ____fput+0xe/0x10
 [<ffffffff8107e68f>] task_work_run+0x8f/0xf0
 [<ffffffff81012bc5>] do_notify_resume+0xb5/0xc0
 [<ffffffff815f99f2>] int_signal+0x12/0x17
INFO: task nginx:2332 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D 0000000000000008 0 2332 2312 0x00000080
 ffff88061d42b978 0000000000000086 ffff88061d42bfd8 0000000000013080
 ffff88061d42a010 0000000000013080 0000000000013080 0000000000013080
 ffff88061d42bfd8 0000000000013080 ffff88061d428ba0 ffff8806209d8300
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff8118142d>] ? cache_alloc_refill+0x9d/0x300
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffff81182395>] ? kmem_cache_alloc+0x275/0x280
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffff811a7b54>] ? lookup_fast+0x44/0x2d0
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b0d40>] ? __d_lookup+0x80/0x170
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b
INFO: task nginx:2334 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nginx D ffffffff81810d00 0 2334 2312 0x00000080
 ffff88061d587978 0000000000000086 ffff88061d587fd8 0000000000013080
 ffff88061d586010 0000000000013080 0000000000013080 0000000000013080
 ffff88061d587fd8 0000000000013080 ffff88061d4280c0 ffff880620a06480
Call Trace:
 [<ffffffff815ef119>] schedule+0x29/0x70
 [<ffffffff815ef42e>] schedule_preempt_disabled+0xe/0x10
 [<ffffffff815edb84>] __mutex_lock_slowpath+0x194/0x240
 [<ffffffff815ed9cb>] mutex_lock+0x2b/0x50
 [<ffffffffa01b2e9c>] __jbd2_log_wait_for_space+0xbc/0x1a0 [jbd2]
 [<ffffffffa01ad008>] start_this_handle+0x378/0x6d0 [jbd2]
 [<ffffffff81082900>] ? wake_up_bit+0x40/0x40
 [<ffffffffa01ad5fc>] jbd2__journal_start+0xcc/0x1b0 [jbd2]
 [<ffffffffa01e9fe5>] ? ext4_rename+0x95/0x810 [ext4]
 [<ffffffffa02097ac>] __ext4_journal_start_sb+0x8c/0x150 [ext4]
 [<ffffffffa01e9fe5>] ext4_rename+0x95/0x810 [ext4]
 [<ffffffff811b2099>] ? _d_rehash+0x39/0x40
 [<ffffffff811a84cc>] vfs_rename_other+0xcc/0x120
 [<ffffffff811a8ccc>] vfs_rename+0x16c/0x260
 [<ffffffff811a5ded>] ? lookup_real+0x1d/0x60
 [<ffffffff811abbd8>] SYSC_renameat+0x3c8/0x410
 [<ffffffff811ba979>] ? mntput_no_expire+0x49/0x150
 [<ffffffff811baaa3>] ? mntput+0x23/0x40
 [<ffffffff8119dcce>] ? ____fput+0xe/0x10
 [<ffffffff8107e694>] ? task_work_run+0x94/0xf0
 [<ffffffff811abc2e>] SyS_renameat+0xe/0x10
 [<ffffffff811a686b>] SyS_rename+0x1b/0x20
 [<ffffffff815f9739>] system_call_fastpath+0x16/0x1b

gerald_clark
Posts: 10642
Joined: 2005/08/05 15:19:54
Location: Northern Illinois, USA

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by gerald_clark » 2015/07/19 23:35:43

You should ask your host. It is their hardware and softeware.

User avatar
TrevorH
Site Admin
Posts: 33220
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by TrevorH » 2015/07/20 09:40:41

Everything you posted points to a problem with your disks. Those "errors" you get are issued as normal if an I/O is stuck waiting to complete - if it waits for 120 seconds then the kernel gives it a kick to wake it up and issues that message to show that it's done it. Since the backtrace shows it waiting in ext4* functions you can be pretty sure that the problem is disk related.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by Spacedust » 2015/07/20 11:38:58

TrevorH wrote:Everything you posted points to a problem with your disks. Those "errors" you get are issued as normal if an I/O is stuck waiting to complete - if it waits for 120 seconds then the kernel gives it a kick to wake it up and issues that message to show that it's done it. Since the backtrace shows it waiting in ext4* functions you can be pretty sure that the problem is disk related.
Disks were replaced and it didn't helped.

See my SMART:

http://pastebin.com/6NJx10CQ
http://pastebin.com/MVz98hXZ
http://pastebin.com/h4zq1ek6
http://pastebin.com/ZBWRcUvL

Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by Spacedust » 2015/07/20 11:39:22

gerald_clark wrote:You should ask your host. It is their hardware and softeware.
They've replaced whole server and all drives and nothing. These MSI boards are a nightmare.

I'm 100% sure it's a faulty motherboard.

The funny thing is that everything works fine for the first 8 minutes after rebooting the server.

User avatar
TrevorH
Site Admin
Posts: 33220
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by TrevorH » 2015/07/20 12:20:58

I'm not sure I trust smartctl output where it shows that the power on hours figure is 900000+ hours (103 years) ;-)
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by Spacedust » 2015/07/20 12:56:44

TrevorH wrote:I'm not sure I trust smartctl output where it shows that the power on hours figure is 900000+ hours (103 years) ;-)
It's just their bug. See these:

Code: Select all

[root@onlinecity ~]# smartctl -l devstat /dev/sda
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-2.6.32-504.30.3.el6.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 2) ==
0x01  0x008  4              12  ---  Lifetime Power-On Resets
0x01  0x010  4           14095  ---  Power-on Hours
0x01  0x018  6     33229020244  ---  Logical Sectors Written
0x01  0x028  6     10653429457  ---  Logical Sectors Read
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4             281  ---  Resets Between Cmd Acceptance and Completion
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4             281  ---  Number of Hardware Resets
0x06  0x010  4             371  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1             255  ---  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

[root@onlinecity ~]# smartctl -l devstat /dev/sdb
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-2.6.32-504.30.3.el6.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 2) ==
0x01  0x008  4              18  ---  Lifetime Power-On Resets
0x01  0x010  4            7126  ---  Power-on Hours
0x01  0x018  6     25745056183  ---  Logical Sectors Written
0x01  0x028  6     48142340934  ---  Logical Sectors Read
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4             270  ---  Resets Between Cmd Acceptance and Completion
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4             270  ---  Number of Hardware Resets
0x06  0x010  4             190  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1             255  ---  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by Spacedust » 2015/07/20 20:32:42

Something new:

Code: Select all

nginx: page allocation failure. order:1, mode:0x20
Pid: 2292, comm: nginx Not tainted 2.6.32-504.30.3.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff811345aa>] ? __alloc_pages_nodemask+0x74a/0x8d0
 [<ffffffff811737a2>] ? kmem_getpages+0x62/0x170
 [<ffffffff811743ba>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff81173e0f>] ? cache_grow+0x2cf/0x320
 [<ffffffff81174139>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff81174f43>] ? kmem_cache_alloc+0x123/0x190
 [<ffffffff8144cdd8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8144e002>] ? sk_clone+0x22/0x2e0
 [<ffffffff814a2256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff814bbd83>] ? tcp_create_openreq_child+0x23/0x470
 [<ffffffff814b952d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
 [<ffffffff814bbb26>] ? tcp_check_req+0x226/0x460
 [<ffffffff814b8f6b>] ? tcp_v4_do_rcv+0x35b/0x490
 [<ffffffffa0207557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff814ba812>] ? tcp_v4_rcv+0x532/0x910
 [<ffffffff814974f0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff814975cd>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497858>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814972a5>] ? ip_rcv+0x275/0x350
 [<ffffffff8145ce48>] ? __netif_receive_skb+0x208/0x570
 [<ffffffff81460c08>] ? netif_receive_skb+0x58/0x60
 [<ffffffff81460d10>] ? napi_skb_finish+0x50/0x70
 [<ffffffff81462619>] ? napi_gro_receive+0x39/0x50
 [<ffffffffa0159a58>] ? e1000_receive_skb+0x58/0x90 [e1000]
 [<ffffffffa015e392>] ? e1000_clean_rx_irq+0x2b2/0x5c0 [e1000]
 [<ffffffffa015bc5f>] ? e1000_clean+0x1ef/0x990 [e1000]
 [<ffffffff81462733>] ? net_rx_action+0x103/0x2f0
 [<ffffffff8107d901>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
 [<ffffffff8107d7b5>] ? irq_exit+0x85/0x90
 [<ffffffff81533c95>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff81130a9f>] ? get_pageblock_flags_group+0x7f/0xa0
 [<ffffffff811348e8>] ? free_hot_cold_page+0xc8/0x220
 [<ffffffff81146e0e>] ? __dec_zone_page_state+0x2e/0x30
 [<ffffffff81134a82>] ? __pagevec_free+0x42/0x90
 [<ffffffff8113b0cc>] ? release_pages+0x21c/0x250
 [<ffffffff8113b484>] ? ____pagevec_lru_add+0x164/0x180
 [<ffffffff8113bbfb>] ? lru_add_drain_cpu+0x8b/0xc0
 [<ffffffff8113bc76>] ? __pagevec_release+0x26/0x40
 [<ffffffff8113c578>] ? truncate_inode_pages_range+0x358/0x500
 [<ffffffff811f2b61>] ? dqget+0x291/0x390
 [<ffffffff811f1eac>] ? dqput+0x5c/0x200
 [<ffffffff811f3173>] ? dquot_initialize+0xe3/0x1b0
 [<ffffffffa009de80>] ? ext4_delete_inode+0x0/0x2f0 [ext4]
 [<ffffffff8113c7b5>] ? truncate_inode_pages+0x15/0x20
 [<ffffffffa009dece>] ? ext4_delete_inode+0x4e/0x2f0 [ext4]
 [<ffffffff811ac45e>] ? generic_delete_inode+0xde/0x1d0
 [<ffffffff811ac5b5>] ? generic_drop_inode+0x65/0x80
 [<ffffffff811ab402>] ? iput+0x62/0x70
 [<ffffffff811a7ff0>] ? dentry_iput+0x90/0x100
 [<ffffffff811a8151>] ? d_kill+0x31/0x60
 [<ffffffff811a9d2c>] ? dput+0x7c/0x150
 [<ffffffff8118fcb9>] ? __fput+0x189/0x210
 [<ffffffff8118fd65>] ? fput+0x25/0x30
 [<ffffffff8118afbd>] ? filp_close+0x5d/0x90
 [<ffffffff8118b095>] ? sys_close+0xa5/0x100
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
nginx: page allocation failure. order:1, mode:0x20
Pid: 2314, comm: nginx Not tainted 2.6.32-504.30.3.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff811345aa>] ? __alloc_pages_nodemask+0x74a/0x8d0
 [<ffffffff811737a2>] ? kmem_getpages+0x62/0x170
 [<ffffffff811743ba>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff81173e0f>] ? cache_grow+0x2cf/0x320
 [<ffffffff81174139>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff81174f43>] ? kmem_cache_alloc+0x123/0x190
 [<ffffffff8144cdd8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8144e002>] ? sk_clone+0x22/0x2e0
 [<ffffffff814a2256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff814bbd83>] ? tcp_create_openreq_child+0x23/0x470
 [<ffffffff814b952d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
 [<ffffffff814bbb26>] ? tcp_check_req+0x226/0x460
 [<ffffffff814b8f6b>] ? tcp_v4_do_rcv+0x35b/0x490
 [<ffffffffa0207557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff814ba812>] ? tcp_v4_rcv+0x532/0x910
 [<ffffffff814974f0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff814975cd>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497858>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814972a5>] ? ip_rcv+0x275/0x350
 [<ffffffff8145ce48>] ? __netif_receive_skb+0x208/0x570
 [<ffffffff8145d24a>] ? process_backlog+0x9a/0x100
 [<ffffffff81462733>] ? net_rx_action+0x103/0x2f0
 [<ffffffff8107d901>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 <EOI>  [<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
 [<ffffffff8107e5fa>] ? local_bh_enable+0x9a/0xb0
 [<ffffffff81462344>] ? dev_queue_xmit+0x124/0x320
 [<ffffffff8149d6f8>] ? ip_finish_output+0x148/0x310
 [<ffffffff8149d978>] ? ip_output+0xb8/0xc0
 [<ffffffff8149cc1f>] ? __ip_local_out+0x9f/0xb0
 [<ffffffff8149cc55>] ? ip_local_out+0x25/0x30
 [<ffffffff8149d150>] ? ip_queue_xmit+0x190/0x420
 [<ffffffff814b2684>] ? tcp_transmit_skb+0x4b4/0x8b0
 [<ffffffff814b4bca>] ? tcp_write_xmit+0x1da/0xa90
 [<ffffffff814b57b0>] ? __tcp_push_pending_frames+0x30/0xe0
 [<ffffffff814a48be>] ? tcp_push+0x6e/0x90
 [<ffffffff814a58bc>] ? tcp_sendmsg+0x64c/0xa20
 [<ffffffff8144ad9b>] ? sock_aio_write+0x19b/0x1c0
 [<ffffffff8144ac00>] ? sock_aio_write+0x0/0x1c0
 [<ffffffff8118df6b>] ? do_sync_readv_writev+0xfb/0x140
 [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8107d42a>] ? local_bh_enable_ip+0x9a/0xb0
 [<ffffffff8152d18b>] ? _spin_unlock_bh+0x1b/0x20
 [<ffffffff8122da86>] ? security_file_permission+0x16/0x20
 [<ffffffff8118efc6>] ? do_readv_writev+0xd6/0x1f0
 [<ffffffff8118f126>] ? vfs_writev+0x46/0x60
 [<ffffffff8118f251>] ? sys_writev+0x51/0xb0
 [<ffffffff810e5c5e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
swapper: page allocation failure. order:1, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.32-504.30.3.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff811345aa>] ? __alloc_pages_nodemask+0x74a/0x8d0
 [<ffffffff812a2798>] ? swiotlb_dma_mapping_error+0x18/0x30
 [<ffffffff811737a2>] ? kmem_getpages+0x62/0x170
 [<ffffffff811743ba>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff81173e0f>] ? cache_grow+0x2cf/0x320
 [<ffffffff81174139>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff81174f43>] ? kmem_cache_alloc+0x123/0x190
 [<ffffffff8144cdd8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8144e002>] ? sk_clone+0x22/0x2e0
 [<ffffffff814a2256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff814bbd83>] ? tcp_create_openreq_child+0x23/0x470
 [<ffffffff814b952d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
 [<ffffffff814bbb26>] ? tcp_check_req+0x226/0x460
 [<ffffffff814b8f6b>] ? tcp_v4_do_rcv+0x35b/0x490
 [<ffffffffa0207557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff814ba812>] ? tcp_v4_rcv+0x532/0x910
 [<ffffffff814974f0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff814975cd>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497858>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814972a5>] ? ip_rcv+0x275/0x350
 [<ffffffff8145ce48>] ? __netif_receive_skb+0x208/0x570
 [<ffffffff81460c08>] ? netif_receive_skb+0x58/0x60
 [<ffffffff81460d10>] ? napi_skb_finish+0x50/0x70
 [<ffffffff81462619>] ? napi_gro_receive+0x39/0x50
 [<ffffffffa0159a58>] ? e1000_receive_skb+0x58/0x90 [e1000]
 [<ffffffffa015e392>] ? e1000_clean_rx_irq+0x2b2/0x5c0 [e1000]
 [<ffffffffa015bc5f>] ? e1000_clean+0x1ef/0x990 [e1000]
 [<ffffffff810f0071>] ? cpu_quiet_msk+0xc1/0x130
 [<ffffffff810f01f4>] ? rcu_process_dyntick+0x114/0x120
 [<ffffffff81462733>] ? net_rx_action+0x103/0x2f0
 [<ffffffff8107d901>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
 [<ffffffff8107d7b5>] ? irq_exit+0x85/0x90
 [<ffffffff81533c95>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff812eabee>] ? intel_idle+0xde/0x170
 [<ffffffff812eabd1>] ? intel_idle+0xc1/0x170
 [<ffffffff81426207>] ? cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
 [<ffffffff8151075a>] ? rest_init+0x7a/0x80
 [<ffffffff81c29f8f>] ? start_kernel+0x424/0x430
 [<ffffffff81c2933a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff81c29453>] ? x86_64_start_kernel+0x115/0x124
nginx: page allocation failure. order:1, mode:0x20
Pid: 2292, comm: nginx Not tainted 2.6.32-504.30.3.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff811345aa>] ? __alloc_pages_nodemask+0x74a/0x8d0
 [<ffffffff814629cf>] ? netif_rx+0xaf/0x160
 [<ffffffff811737a2>] ? kmem_getpages+0x62/0x170
 [<ffffffff811743ba>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff81173e0f>] ? cache_grow+0x2cf/0x320
 [<ffffffff81174139>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff81174f43>] ? kmem_cache_alloc+0x123/0x190
 [<ffffffff8144cdd8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8144e002>] ? sk_clone+0x22/0x2e0
 [<ffffffff814a2256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff814bbd83>] ? tcp_create_openreq_child+0x23/0x470
 [<ffffffff814b952d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
 [<ffffffff814bbb26>] ? tcp_check_req+0x226/0x460
 [<ffffffff814b8f6b>] ? tcp_v4_do_rcv+0x35b/0x490
 [<ffffffffa0207557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff814ba812>] ? tcp_v4_rcv+0x532/0x910
 [<ffffffff814974f0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff814975cd>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497858>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814972a5>] ? ip_rcv+0x275/0x350
 [<ffffffff8145ce48>] ? __netif_receive_skb+0x208/0x570
 [<ffffffff8145d24a>] ? process_backlog+0x9a/0x100
 [<ffffffff81462733>] ? net_rx_action+0x103/0x2f0
 [<ffffffff8107d901>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
 <EOI>  [<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
 [<ffffffff8107e5fa>] ? local_bh_enable+0x9a/0xb0
 [<ffffffff81462344>] ? dev_queue_xmit+0x124/0x320
 [<ffffffff8149d6f8>] ? ip_finish_output+0x148/0x310
 [<ffffffff8149d978>] ? ip_output+0xb8/0xc0
 [<ffffffff8149cc1f>] ? __ip_local_out+0x9f/0xb0
 [<ffffffff8149cc55>] ? ip_local_out+0x25/0x30
 [<ffffffff8149d150>] ? ip_queue_xmit+0x190/0x420
 [<ffffffff814b2684>] ? tcp_transmit_skb+0x4b4/0x8b0
 [<ffffffff814b4bca>] ? tcp_write_xmit+0x1da/0xa90
 [<ffffffff814b57b0>] ? __tcp_push_pending_frames+0x30/0xe0
 [<ffffffff814a48be>] ? tcp_push+0x6e/0x90
 [<ffffffff814a58bc>] ? tcp_sendmsg+0x64c/0xa20
 [<ffffffff8144ad9b>] ? sock_aio_write+0x19b/0x1c0
 [<ffffffff8144ac00>] ? sock_aio_write+0x0/0x1c0
 [<ffffffff8118df6b>] ? do_sync_readv_writev+0xfb/0x140
 [<ffffffff814474cd>] ? sock_poll+0x5d/0x140
 [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811d6d40>] ? ep_send_events_proc+0x0/0x110
 [<ffffffff8122da86>] ? security_file_permission+0x16/0x20
 [<ffffffff8118efc6>] ? do_readv_writev+0xd6/0x1f0
 [<ffffffff8118f126>] ? vfs_writev+0x46/0x60
 [<ffffffff8118f251>] ? sys_writev+0x51/0xb0
 [<ffffffff810e5c5e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b

aks
Posts: 3073
Joined: 2014/09/20 11:22:14

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by aks » 2015/07/21 18:32:50

Um, out of memory/swap?

Spacedust
Posts: 16
Joined: 2012/03/08 16:11:08

Re: 0000581: INFO: task nginx:2334 blocked for more than 120

Post by Spacedust » 2015/07/21 19:58:52

aks wrote:Um, out of memory/swap?
Still a lot of memory and SWAP.

Crappy Hetzner datacenter hardware. I've moved to a different machine everything suddenly works fine ;)

Post Reply