Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

pablor
Posts: 2
Joined: 2014/12/01 18:42:05

Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by pablor » 2017/03/10 04:43:14

My system became unstable and logged several kernel messages like these:

[150939.843606] swapper/3: page allocation failure: order:2, mode:0x104020
[150939.843630] swapper/5: page allocation failure: order:2, mode:0x104020
[150939.843633] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G OE ------------ 3.10.0-514.6.2.el7.x86_64 #1
[150939.843634] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
[150939.843637] 0000000000104020 ee664babaff0608b ffff88042f7439d8 ffffffff816862ac
[150939.843638] ffff88042f743a68 ffffffff81186af0 ffff88083fffe4e8 ffff88042f743a28
[150939.843639] fffffffffffffffc 0010402000000000 ffff88043ffdb018 ee664babaff0608b
[150939.843640] Call Trace:
[150939.843647] <IRQ> [<ffffffff816862ac>] dump_stack+0x19/0x1b
[150939.843650] [<ffffffff81186af0>] warn_alloc_failed+0x110/0x180
...
...

Any thoughts?

Attached is the file with all received messages.

Pablo
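
For readers decoding the message itself: `order:2` means the kernel asked for 2^2 = 4 physically contiguous pages, which is 16 KiB assuming the usual 4 KiB page size on x86_64. The `mode` value is a bitmask of allocation (GFP) flags. The sketch below only parses a log line and computes the request size. The names `FAILURE_RE` and `parse_failure` are made up for illustration, and the flag bits are deliberately left undecoded because their assignments differ between kernel versions.

```python
import re

# Matches lines such as:
#   [150939.843606] swapper/3: page allocation failure: order:2, mode:0x104020
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure: "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)

PAGE_SIZE = 4096  # assumption: 4 KiB pages (the x86_64 default)


def parse_failure(line):
    """Return (comm, order, contiguous_bytes, mode) or None if the line does not match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    # An order-N request asks for 2**N physically contiguous pages.
    contiguous_bytes = (1 << order) * PAGE_SIZE
    return m.group("comm"), order, contiguous_bytes, int(m.group("mode"), 16)


if __name__ == "__main__":
    sample = "[150939.843606] swapper/3: page allocation failure: order:2, mode:0x104020"
    print(parse_failure(sample))
```

The same function also accepts the human-readable timestamps produced by `dmesg -T`, since it only looks at the part of the line after the process name.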

aks
Posts: 3073
Joined: 2014/09/20 11:22:14

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by aks » 2017/03/13 17:33:51

I can't see the attachment.

Is it Xen? If so, give more memory to dom0.

If not, it appears your memory is quite fragmented. Are you sure you have enough RAM/swap for whatever application(s) you are running? Some context would help.

pablor
Posts: 2
Joined: 2014/12/01 18:42:05

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by pablor » 2017/03/13 20:20:02

The error messages appear on a physical server running CentOS 7.3.1611 with 32G of RAM and 16G of swap. It runs Gluster 3.8.9.

The following link has all received messages.

http://pastebin.com/44SyQCcV

Pablo

aks
Posts: 3073
Joined: 2014/09/20 11:22:14

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by aks » 2017/03/14 17:39:06

It seems you have non-maskable interrupts taking too long (see kernel/events/core.c around line 485). It looks like those interrupts are involved in paging duties (virtual memory allocation).

This "feels" like a hardware problem to me. What I would do next is run some diagnostics on the hardware and then possibly involve HP.

Thalamus
Posts: 5
Joined: 2008/07/02 00:23:24

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by Thalamus » 2017/04/03 08:08:23

We are seeing the same message on 4 different machines, all PowerEdge R720xd, kernel: 3.10.0-514.10.2.el7.x86_64, gluster: glusterfs-server-3.8.10-1.el7.x86_64. The last reboot of these machines was on 27 March. It was quiet after that, but on 2 April we started seeing kernel messages on all of them.

On two of the machines it is the swapper:

[Sun Apr 2 01:13:56 2017] swapper/12: page allocation failure: order:2, mode:0x104020
[Sun Apr 2 01:13:56 2017] swapper/12: page allocation failure: order:2, mode:0x104020

On the other two it is glusterfsd:

[Sun Apr 2 07:51:37 2017] glusterfsd: page allocation failure: order:2, mode:0x104020

- memory shouldn't be an issue ...

              total        used        free      shared  buff/cache   available
Mem:         515941        3894        1356          24      510689      510544
Swap:          4095           0        4095
Total:       520037        3895        5452

aks
Posts: 3073
Joined: 2014/09/20 11:22:14

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by aks » 2017/04/03 19:00:36

Thalamus wrote: memory shouldn't be an issue ...

Unless it's a hardware problem.

Thalamus
Posts: 5
Joined: 2008/07/02 00:23:24

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by Thalamus » 2017/04/04 05:21:13

On 4 out of 4 machines?

Thalamus
Posts: 5
Joined: 2008/07/02 00:23:24

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by Thalamus » 2017/05/12 08:25:40

It has been quiet for over 3 weeks now. I ran swapoff, adjusted fstab so that swap is not mounted on the next boot, and adjusted some kernel settings, shown here:

cat /etc/sysctl.d/90-prevent-swapping.conf
# From http://lists.opennebula.org/pipermail/c ... 44834.html
vm/min_free_kbytes = 524288
vm/swappiness = 0
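
To confirm that settings like these are actually live (for example after `sysctl --system` or a reboot), the values can be compared against `/proc/sys`. This is a rough sketch under a few assumptions: the helper names and the `root` parameter are invented for illustration, and the key-to-path mapping is simplified (it treats `.` and `/` as equivalent separators, which is fine for the two `vm` keys above but not for keys that contain a literal dot). For reference, 524288 kB is 512 MiB.

```python
import os


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key, root="/proc/sys"):
    """Map 'vm/swappiness' or 'vm.swappiness' to <root>/vm/swappiness (simplified)."""
    return os.path.join(root, *key.replace(".", "/").split("/"))


def find_mismatches(settings, root="/proc/sys"):
    """Return {key: (wanted, live)} for every setting whose live value differs."""
    mismatches = {}
    for key, wanted in settings.items():
        try:
            with open(proc_path(key, root)) as f:
                live = " ".join(f.read().split())
        except OSError:
            live = None  # key missing or unreadable
        if live != " ".join(wanted.split()):
            mismatches[key] = (wanted, live)
    return mismatches


if __name__ == "__main__":
    conf = "/etc/sysctl.d/90-prevent-swapping.conf"  # file name taken from the post
    try:
        with open(conf) as f:
            wanted = parse_sysctl_conf(f.read())
    except OSError:
        wanted = {}
    print(find_mismatches(wanted) or "all settings match (or no config file found)")
```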

Thalamus
Posts: 5
Joined: 2008/07/02 00:23:24

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by Thalamus » 2017/06/06 07:57:57

I have to retract the last report: I still get these messages, but I haven't managed to find the root cause, nor have I seen any ill effects.

sirmonkey
Posts: 11
Joined: 2014/07/19 03:09:34

Re: Kernel Error - swapper/3: page allocation failure: order:2, mode:0x104020

Post by sirmonkey » 2017/06/22 03:21:33

I am receiving the same errors.
My problem was a dead fan, which caused everything to overheat: memory, chipset ...

My machine is consumer junk...
The common connection is GlusterFS.

I'm running 3.10.

I'm about to swap the memory.

I also think it's happening outside of Gluster load (if I connect directly to the host via iSCSI, on a different mount than Gluster), but at idle there are no errors (or maybe there are so few that I'm not seeing them).

So I just ran a test, writing to an iSCSI target hosted on the same box as Gluster. These call traces happen constantly and the transfer almost stalls:
[1721979.573148] [<ffffffff8104f09a>] start_secondary+0x1ba/0x230
[1721980.184564] swapper/0: page allocation failure: order:2, mode:0x104020

So maybe it's not Gluster.

A full trace:
[1721986.116400] swapper/3: page allocation failure: order:2, mode:0x104020
[1721986.116417] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.10.0-514.21.1.el7.x86_64 #1
[1721986.116422] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./F2A88XM-D3HP, BIOS F2 12/24/2015
[1721986.116427] 0000000000104020 9f8f43f3c22cb311 ffff88023ed839d8 ffffffff81686f13
[1721986.116434] ffff88023ed83a68 ffffffff81187090 ffff88023eff94e8 ffff88023ed83a28
[1721986.116440] fffffffffffffffc 0010402000000000 ffff88023efd5008 9f8f43f3c22cb311
[1721986.116447] Call Trace:
[1721986.116452] <IRQ> [<ffffffff81686f13>] dump_stack+0x19/0x1b
[1721986.116468] [<ffffffff81187090>] warn_alloc_failed+0x110/0x180
[1721986.116478] [<ffffffff81682aa7>] __alloc_pages_slowpath+0x6b7/0x725
[1721986.116489] [<ffffffff8118b645>] __alloc_pages_nodemask+0x405/0x420
[1721986.116498] [<ffffffff811cf7fa>] alloc_pages_current+0xaa/0x170
[1721986.116505] [<ffffffff81185f6e>] __get_free_pages+0xe/0x50
[1721986.116513] [<ffffffff811db09e>] kmalloc_order_trace+0x2e/0xa0
[1721986.116520] [<ffffffff811dd871>] __kmalloc+0x221/0x240
[1721986.116573] [<ffffffffa015b3fa>] bnx2x_frag_alloc.isra.62+0x2a/0x40 [bnx2x]
[1721986.116601] [<ffffffffa015c2f7>] bnx2x_rx_int+0x227/0x17b0 [bnx2x]
[1721986.116610] [<ffffffff813181d5>] ? cpumask_next_and+0x35/0x50
[1721986.116639] [<ffffffffa015f72d>] bnx2x_poll+0x1dd/0x260 [bnx2x]
[1721986.116649] [<ffffffff81570a10>] net_rx_action+0x170/0x380
[1721986.116658] [<ffffffff8108f63f>] __do_softirq+0xef/0x280
[1721986.116667] [<ffffffff8169905c>] call_softirq+0x1c/0x30
[1721986.116676] [<ffffffff8102d365>] do_softirq+0x65/0xa0
[1721986.116685] [<ffffffff8108f9d5>] irq_exit+0x115/0x120
[1721986.116692] [<ffffffff81699bf8>] do_IRQ+0x58/0xf0
[1721986.116699] [<ffffffff8168ecad>] common_interrupt+0x6d/0x6d
[1721986.116703] <EOI> [<ffffffff81514c2f>] ? cpuidle_enter_state+0x4f/0xc0
[1721986.116717] [<ffffffff81514d79>] cpuidle_idle_call+0xd9/0x210
[1721986.116725] [<ffffffff810350ee>] arch_cpu_idle+0xe/0x30
[1721986.116733] [<ffffffff810e82f5>] cpu_startup_entry+0x245/0x290
[1721986.116742] [<ffffffff8104f09a>] start_secondary+0x1ba/0x230
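
In the trace above, the allocation is requested from `bnx2x_frag_alloc` in the bnx2x network driver, called from `bnx2x_rx_int` during softirq processing (`net_rx_action`). When comparing many such traces, it can help to pull out the first frame that belongs to a loadable module. The sketch below does that with a regular expression; `FRAME_RE`, `parse_frames`, and `first_module_frame` are illustrative names, and the pattern is an assumption based on the frame format shown in this thread.

```python
import re

# Matches frames such as:
#   [<ffffffffa015b3fa>] bnx2x_frag_alloc.isra.62+0x2a/0x40 [bnx2x]
#   [<ffffffff813181d5>] ? cpumask_next_and+0x35/0x50
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return a list of (function, module_or_None, is_unreliable) tuples."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                (m.group("func"), m.group("module"), m.group("unreliable") is not None)
            )
    return frames


def first_module_frame(frames):
    """Return (function, module) for the first reliable frame inside a module, or None."""
    for func, module, unreliable in frames:
        if module is not None and not unreliable:
            return func, module
    return None


if __name__ == "__main__":
    import sys

    # Usage idea: dmesg | python3 this_script.py
    print(first_module_frame(parse_frames(sys.stdin.read())))
```

Frames marked with `?` are ones the kernel considers unreliable, so the sketch skips them when picking the module.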
