Page 1 of 1

High Load Average(~25) on CentOS7.3 after a Kernel calltrace

Posted: 2019/05/16 20:25:17
by arukum
Hi All,

I've a CentOS7.3 VM running in Azure having 4 vCPUs with 3.10.0-957.5.1.el7.x86_64 kernel. My load average is ~25 from past 3 days. I found an increase in the load average after seeing the below call trace in kern.log. There were no updates on the VM, have a similar issue in other VM and it was mitigated after a reboot. Need help in analyzing the below call trace.


May 13 17:09:40 centos73-22 kernel: INFO: task kworker/1:2:126844 blocked for more than 120 seconds.
May 13 17:09:40 centos73-22 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 17:09:40 centos73-22 kernel: kworker/1:2 D ffff8d5764499040 0 126844 2 0x00000080
May 13 17:09:40 centos73-22 kernel: Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: Call Trace:
May 13 17:09:40 centos73-22 kernel: [<ffffffff87b67c49>] schedule+0x29/0x70
May 13 17:09:40 centos73-22 kernel: [<ffffffff87b65721>] schedule_timeout+0x221/0x2d0
May 13 17:09:40 centos73-22 kernel: [<ffffffff8779507d>] ? list_del+0xd/0x30
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e30ee>] ? vmbus_establish_gpadl+0x23e/0x4a0 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffff87b67ffd>] wait_for_completion+0xfd/0x140
May 13 17:09:40 centos73-22 kernel: [<ffffffff874d67f0>] ? wake_up_state+0x20/0x20
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e377c>] vmbus_open+0x42c/0x4e0 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffffc01be7ad>] storvsc_probe+0x1fd/0xd48 [hv_storvsc]
May 13 17:09:40 centos73-22 kernel: [<ffffffff876ca136>] ? kernfs_add_one+0xf6/0x150
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e0a31>] vmbus_probe+0x41/0xa0 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a8905>] driver_probe_device+0xc5/0x3e0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a8c20>] ? driver_probe_device+0x3e0/0x3e0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a8c63>] __device_attach+0x43/0x50
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a6585>] bus_for_each_drv+0x75/0xc0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a8740>] device_attach+0x90/0xb0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a7968>] bus_probe_device+0x98/0xd0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a520f>] device_add+0x4ef/0x7b0
May 13 17:09:40 centos73-22 kernel: [<ffffffff878a54ea>] device_register+0x1a/0x20
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e1306>] vmbus_device_register+0x66/0x100 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e47eb>] vmbus_onoffer+0x39b/0x540 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e4f48>] vmbus_onmessage+0x48/0xf0 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffff87b6778f>] ? __schedule+0x3ff/0x890
May 13 17:09:40 centos73-22 kernel: [<ffffffffc00e0961>] vmbus_onmessage_work+0x21/0x30 [hv_vmbus]
May 13 17:09:40 centos73-22 kernel: [<ffffffff874b9d8f>] process_one_work+0x17f/0x440
May 13 17:09:40 centos73-22 kernel: [<ffffffff874bae26>] worker_thread+0x126/0x3c0
May 13 17:09:40 centos73-22 kernel: [<ffffffff874bad00>] ? manage_workers.isra.25+0x2a0/0x2a0
May 13 17:09:40 centos73-22 kernel: [<ffffffff874c1c71>] kthread+0xd1/0xe0
May 13 17:09:40 centos73-22 kernel: [<ffffffff874c1ba0>] ? insert_kthread_work+0x40/0x40
May 13 17:09:40 centos73-22 kernel: [<ffffffff87b74c37>] ret_from_fork_nospec_begin+0x21/0x21
May 13 17:09:40 centos73-22 kernel: [<ffffffff874c1ba0>] ? insert_kthread_work+0x40/0x40

Re: High Load Average(~25) on CentOS7.3 after a Kernel calltrace

Posted: 2019/05/16 21:50:03
by TrevorH
There's nothing to analyze, the message means your i/o didn't complete in 2 whole minutes so the kernel just kicked the task to wake it up. The problem is likely to actually be an overloaded host leading to 2 minute response times on disk i/o.

Also, this is not CentOS 7.3 - the kernel you have there is an outdated 7.6 one. If you only updated the kernel then you need to finish the job and run yum update to get the rest of the updates. Also, since the latest kernel from yesterday - 3.10.0-957.12.2 - is required to fix the latest Intel processor vulnerability.