Repeated Kernel Panics

wolfypdx
Posts: 2
Joined: 2012/02/07 23:24:24

Repeated Kernel Panics

Post by wolfypdx » 2012/02/07 23:34:10

We have a CentOS 5.7 system running on Amazon's EC2 service with EBS volumes. It has been kernel panicking repeatedly for about 6 weeks now, at intervals ranging from 2 days to 2 weeks. A typical panic looks like this in our logs:

[code]
Feb 7 13:56:25 well kernel: [46169097.704426] Unable to handle kernel paging request at 0000000000100108 RIP:
Feb 7 13:56:25 well kernel: [46169097.704440] [<ffffffff80361954>] keyring_destroy+0x34/0xa0
Feb 7 13:56:25 well kernel: [46169097.704454] PGD 16d237067 PUD 196b5c067 PMD 0
Feb 7 13:56:25 well kernel: [46169097.704460] Oops: 0002 [1] SMP
Feb 7 13:56:25 well kernel: [46169097.704463] CPU 1
Feb 7 13:56:25 well kernel: [46169097.704466] Modules linked in: xfs nfs nfsd exportfs nfs_acl lockd sunrpc dm_mirror dm_multipath dm_mod sd_mod scsi_mod
Feb 7 13:56:25 well kernel: [46169097.704479] Pid: 21, comm: events/1 Not tainted 2.6.18-xenU-ec2-v1.2 #2
Feb 7 13:56:25 well kernel: [46169097.704483] RIP: e030:[<ffffffff80361954>] [<ffffffff80361954>] keyring_destroy+0x34/0xa0
Feb 7 13:56:25 well kernel: [46169097.704489] RSP: e02b:ffff8800007c9e00 EFLAGS: 00010217
Feb 7 13:56:25 well kernel: [46169097.704493] RAX: 0000000000200200 RBX: ffff880098c01300 RCX: ffff880098c01378
Feb 7 13:56:25 well kernel: [46169097.704497] RDX: 0000000000100100 RSI: 0000000000000001 RDI: ffffffff804aa560
Feb 7 13:56:25 well kernel: [46169097.704501] RBP: ffffffff804aa4a8 R08: 00000000ffffffff R09: 0000000000000000
Feb 7 13:56:25 well kernel: [46169097.704505] R10: 616c622061206562 R11: ffffffff80367160 R12: ffff88000ab2edc0
Feb 7 13:56:25 well kernel: [46169097.704509] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff803611d0
Feb 7 13:56:25 well kernel: [46169097.704514] FS: 00002adb33aa36e0(0000) GS:ffffffff804f3080(0000) knlGS:0000000000000000
Feb 7 13:56:25 well kernel: [46169097.704519] CS: e033 DS: 0000 ES: 0000
Feb 7 13:56:25 well kernel: [46169097.704522] Process events/1 (pid: 21, threadinfo ffff8800007c8000, task ffff880000816100)
Feb 7 13:56:25 well kernel: [46169097.704526] Stack: ffffffff804aa4a8 ffff880098c01300 ffffffff804aa4a8 ffffffff803612a6
Feb 7 13:56:25 well kernel: [46169097.704534] ffffffff804aa4a0 ffffffff80257833 0000000000000000 ffff88000ab2edd8
Feb 7 13:56:25 well kernel: [46169097.704541] ffff88000ab2ee00 ffff88000ab2edc0 ffff88000ab2edd8 ffff88000ab2ede8
Feb 7 13:56:25 well kernel: [46169097.704547] Call Trace:
Feb 7 13:56:25 well kernel: [46169097.704552] [<ffffffff803612a6>] key_cleanup+0xd6/0x100
Feb 7 13:56:25 well kernel: [46169097.704557] [<ffffffff80257833>] run_workqueue+0xb3/0x110
Feb 7 13:56:25 well kernel: [46169097.704561] [<ffffffff802539c0>] worker_thread+0x0/0x170
Feb 7 13:56:25 well kernel: [46169097.704566] [<ffffffff802a0b50>] keventd_create_kthread+0x0/0x80
Feb 7 13:56:25 well kernel: [46169097.704570] [<ffffffff80253ae9>] worker_thread+0x129/0x170
Feb 7 13:56:25 well kernel: [46169097.704575] [<ffffffff80289ad0>] default_wake_function+0x0/0x10
Feb 7 13:56:25 well kernel: [46169097.704580] [<ffffffff802539c0>] worker_thread+0x0/0x170
Feb 7 13:56:25 well kernel: [46169097.704585] [<ffffffff80239da9>] kthread+0xd9/0x120
Feb 7 13:56:25 well kernel: [46169097.704590] [<ffffffff80269c8c>] child_rip+0xa/0x12
Feb 7 13:56:25 well kernel: [46169097.704594] [<ffffffff802a0b50>] keventd_create_kthread+0x0/0x80
Feb 7 13:56:25 well kernel: [46169097.704599] [<ffffffff80239cd0>] kthread+0x0/0x120
Feb 7 13:56:25 well kernel: [46169097.704603] [<ffffffff80269c82>] child_rip+0x0/0x12
Feb 7 13:56:25 well kernel: [46169097.704606]
Feb 7 13:56:25 well kernel: [46169097.704608]
Feb 7 13:56:25 well kernel: [46169097.704608] Code: 48 89 42 08 48 89 10 48 c7 41 08 00 02 20 00 48 c7 43 78 00
Feb 7 13:56:25 well kernel: [46169097.704626] RIP [<ffffffff80361954>] keyring_destroy+0x34/0xa0
Feb 7 13:56:25 well kernel: [46169097.704631] RSP <ffff8800007c9e00>
Feb 7 13:56:25 well kernel: [46169097.704633] CR2: 0000000000100108
Feb 7 13:56:34 well kernel: [46169097.704638] <3>BUG: soft lockup detected on CPU#1!
Feb 7 13:56:34 well kernel: [46169106.930525]
Feb 7 13:56:34 well kernel: [46169106.930526] Call Trace:
Feb 7 13:56:34 well kernel: [46169106.930529] <IRQ> [<ffffffff802b58e6>] softlockup_tick+0xf6/0x120
Feb 7 13:56:34 well kernel: [46169106.930546] [<ffffffff80276466>] timer_interrupt+0x416/0x480
Feb 7 13:56:34 well kernel: [46169106.930551] [<ffffffff804045e0>] tcp_write_timer+0x0/0x710
Feb 7 13:56:34 well kernel: [46169106.930557] [<ffffffff80211ee1>] handle_IRQ_event+0x51/0xa0
Feb 7 13:56:34 well kernel: [46169106.930561] [<ffffffff802b5cbb>] __do_IRQ+0xcb/0x150
Feb 7 13:56:34 well kernel: [46169106.930566] [<ffffffff80269f04>] call_softirq+0x1c/0x28
Feb 7 13:56:34 well kernel: [46169106.930570] [<ffffffff8027499d>] do_IRQ+0x6d/0x90
Feb 7 13:56:34 well kernel: [46169106.930576] [<ffffffff803ae28f>] evtchn_do_upcall+0xef/0x160
Feb 7 13:56:34 well kernel: [46169106.930581] [<ffffffff80269a3a>] do_hypervisor_callback+0x1e/0x2c
Feb 7 13:56:34 well kernel: [46169106.930584] <EOI> [<ffffffff8026adfd>] __write_lock_failed+0x9/0x20
Feb 7 13:56:34 well kernel: [46169106.930594] [<ffffffff8026d3f2>] __down_write_nested+0x12/0x100
Feb 7 13:56:34 well kernel: [46169106.930600] [<ffffffff8026df13>] .text.lock.spinlock+0x11/0x8e
Feb 7 13:56:34 well kernel: [46169106.930608] [<ffffffff80361847>] keyring_publish_name+0x57/0xa0
Feb 7 13:56:34 well kernel: [46169106.930615] [<ffffffff803618a3>] keyring_instantiate+0x13/0x20
Feb 7 13:56:34 well kernel: [46169106.930621] [<ffffffff803608ea>] __key_instantiate_and_link+0x5a/0x100
Feb 7 13:56:34 well kernel: [46169106.930626] [<ffffffff803622d6>] keyring_alloc+0x46/0x70
Feb 7 13:56:34 well kernel: [46169106.930633] [<ffffffff80363e6d>] alloc_uid_keyring+0x4d/0xd0
Feb 7 13:56:34 well kernel: [46169106.930640] [<ffffffff802970c3>] alloc_uid+0x103/0x1e0
Feb 7 13:56:34 well kernel: [46169106.930644] [<ffffffff8029a60f>] set_user+0xf/0xb0
Feb 7 13:56:34 well kernel: [46169106.930649] [<ffffffff8029c0ea>] sys_setreuid+0x10a/0x230
Feb 7 13:56:34 well kernel: [46169106.930655] [<ffffffff80269252>] system_call+0x86/0x8b
Feb 7 13:56:34 well kernel: [46169106.930660] [<ffffffff802691cc>] system_call+0x0/0x8b
Feb 7 13:56:34 well kernel: [46169106.930665]
[/code]

At this point, I'm at a loss for what to do next to try to fix this.
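
A rough reading aid for a trace like the one above, not a diagnosis. In that trace CR2 is 0000000000100108, RDX is 0000000000100100 and RAX is 0000000000200200. The values 0x00100100 and 0x00200200 are what the kernel's list_del() writes into a list entry it has just removed (LIST_POISON1 and LIST_POISON2), so one plausible reading is that keyring_destroy tried to unlink an entry that had already been unlinked. The sketch below only pulls those numbers out of a pasted oops and flags them. The function names (parse_oops, describe_poison, poison_report) are made up for illustration, and the 16-byte window assumes the x86_64 size of struct list_head.

```python
import re

# Values list_del() leaves in a removed list_head (LIST_POISON1 / LIST_POISON2).
LIST_POISON1 = 0x00100100
LIST_POISON2 = 0x00200200

# A few lines copied from the oops in this thread (syslog prefix shortened).
SAMPLE = """\
kernel: [46169097.704483] RIP: e030:[<ffffffff80361954>] [<ffffffff80361954>] keyring_destroy+0x34/0xa0
kernel: [46169097.704493] RAX: 0000000000200200 RBX: ffff880098c01300 RCX: ffff880098c01378
kernel: [46169097.704497] RDX: 0000000000100100 RSI: 0000000000000001 RDI: ffffffff804aa560
kernel: [46169097.704633] CR2: 0000000000100108
"""


def parse_oops(text):
    """Extract CR2, the faulting symbol and the general registers from oops text."""
    info = {"cr2": None, "rip_symbol": None, "registers": {}}
    m = re.search(r"CR2:\s*([0-9a-fA-F]+)", text)
    if m:
        info["cr2"] = int(m.group(1), 16)
    # Symbol name that follows the bracketed address on a RIP line.
    m = re.search(r"RIP\b.*?\]\s+(\w+)\+0x", text)
    if m:
        info["rip_symbol"] = m.group(1)
    # Registers printed as "RAX: <16 hex digits>"; RSP/RIP use another layout.
    for name, value in re.findall(r"\b(R[A-Z0-9]{2}):\s*([0-9a-f]{16})\b", text):
        info["registers"][name] = int(value, 16)
    return info


def describe_poison(value):
    """Return e.g. 'LIST_POISON1+0x8' if value falls inside a poisoned list_head."""
    for name, base in (("LIST_POISON1", LIST_POISON1), ("LIST_POISON2", LIST_POISON2)):
        offset = value - base
        if 0 <= offset < 16:  # assumed sizeof(struct list_head) on x86_64
            return f"{name}+{offset:#x}" if offset else name
    return None


def poison_report(info):
    """Map register names (and CR2) to the poison value they match, if any."""
    candidates = dict(info["registers"])
    if info["cr2"] is not None:
        candidates["CR2"] = info["cr2"]
    hits = {}
    for name, value in candidates.items():
        label = describe_poison(value)
        if label:
            hits[name] = label
    return hits


if __name__ == "__main__":
    parsed = parse_oops(SAMPLE)
    print(parsed["rip_symbol"], poison_report(parsed))
```

If the same poison values turn up in every panic, that pattern is worth mentioning when reporting the problem to whoever supplies the kernel.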

Our getinfo looks like this:

Information for general problems.
[code]
== BEGIN uname -rmi ==
2.6.18-xenU-ec2-v1.2 x86_64 x86_64
== END uname -rmi ==

== BEGIN rpm -qa \*-release\* ==
centos-release-notes-5.7-0
centos-release-5-7.el5.centos
== END rpm -qa \*-release\* ==

== BEGIN cat /etc/redhat-release ==
CentOS release 5.7 (Final)
== END cat /etc/redhat-release ==

== BEGIN getenforce ==
Disabled
== END getenforce ==

== BEGIN free -m ==
total used free shared buffers cached
Mem: 7680 1577 6102 0 56 1060
-/+ buffers/cache: 459 7220
Swap: 8191 0 8191
== END free -m ==

== BEGIN rpm -qa yum\* rpm-\* python | sort ==
python-2.4.3-44.el5_7.1
rpm-libs-4.4.2.3-22.el5_7.2
rpm-python-4.4.2.3-22.el5_7.2
yum-3.2.22-37.el5.centos
yum-fastestmirror-1.1.16-16.el5.centos
yum-metadata-parser-1.1.2-3.el5.centos
yum-updatesd-0.9-2.el5
== END rpm -qa yum\* rpm-\* python | sort ==

== BEGIN ls /etc/yum.repos.d ==
CentOS-Base.repo
CentOS-Debuginfo.repo
CentOS-Media.repo
CentOS-Vault.repo
dag.repo
epel.repo
== END ls /etc/yum.repos.d ==

== BEGIN cat /etc/yum.conf ==
[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
bugtracker_url=http://bugs.centos.org/set_project.php?project_id=16&ref=http://bugs.centos.org/bug_report_page.php?category=yum

# Note: yum-RHN-plugin doesn't honor this.
metadata_expire=1h

installonly_limit = 5

# PUT YOUR REPOS HERE OR IN separate files named file.repo
# in /etc/yum.repos.d
== END cat /etc/yum.conf ==

== BEGIN yum repolist all ==
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.rackspace.com
* centosplus: mirror.symnds.com
* epel: mirror.cogentco.com
* extras: mirror.vcu.edu
* updates: centos.mirror.nac.net
repo id repo name status
C5.0-base CentOS-5.0 - Base disabled
C5.0-centosplus CentOS-5.0 - Plus disabled
C5.0-extras CentOS-5.0 - Extras disabled
C5.0-updates CentOS-5.0 - Updates disabled
C5.1-base CentOS-5.1 - Base disabled
C5.1-centosplus CentOS-5.1 - Plus disabled
C5.1-extras CentOS-5.1 - Extras disabled
C5.1-updates CentOS-5.1 - Updates disabled
C5.2-base CentOS-5.2 - Base disabled
C5.2-centosplus CentOS-5.2 - Plus disabled
C5.2-extras CentOS-5.2 - Extras disabled
C5.2-updates CentOS-5.2 - Updates disabled
C5.3-base CentOS-5.3 - Base disabled
C5.3-centosplus CentOS-5.3 - Plus disabled
C5.3-extras CentOS-5.3 - Extras disabled
C5.3-updates CentOS-5.3 - Updates disabled
C5.4-base CentOS-5.4 - Base disabled
C5.4-centosplus CentOS-5.4 - Plus disabled
C5.4-extras CentOS-5.4 - Extras disabled
C5.4-updates CentOS-5.4 - Updates disabled
C5.5-base CentOS-5.5 - Base disabled
C5.5-centosplus CentOS-5.5 - Plus disabled
C5.5-extras CentOS-5.5 - Extras disabled
C5.5-updates CentOS-5.5 - Updates disabled
C5.6-base CentOS-5.6 - Base disabled
C5.6-centosplus CentOS-5.6 - Plus disabled
C5.6-extras CentOS-5.6 - Extras disabled
C5.6-updates CentOS-5.6 - Updates disabled
base CentOS-5 - Base enabled: 3,566
c5-media CentOS-5 - Media disabled
centosplus CentOS-5 - Plus enabled: 58
contrib CentOS-5 - Contrib enabled: 0
dag Dag RPM Repository for Centos enabled: 11,010
debug CentOS-5 - Debuginfo disabled
epel Extra Packages for Enterprise Linux 5 - x86_64 enabled: 6,896
epel-debuginfo Extra Packages for Enterprise Linux 5 - x86_64 - disabled
epel-source Extra Packages for Enterprise Linux 5 - x86_64 - disabled
extras CentOS-5 - Extras enabled: 272
updates CentOS-5 - Updates enabled: 678
repolist: 22,480
== END yum repolist all ==

== BEGIN egrep 'include|exclude' /etc/yum.repos.d/*.repo ==
== END egrep 'include|exclude' /etc/yum.repos.d/*.repo ==

== BEGIN sed -n -e "/^\[/h; /priority *=/{ G; s/\n/ /; s/ity=/ity = /; p }" /etc/yum.repos.d/*.repo | sort -k3n ==
== END sed -n -e "/^\[/h; /priority *=/{ G; s/\n/ /; s/ity=/ity = /; p }" /etc/yum.repos.d/*.repo | sort -k3n ==

== BEGIN cat /etc/fstab ==
/dev/sda1 / ext3 defaults 1 1
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
www.well.com:/www /www nfs soft,intr,nosuid,posix,retry=1440 0 0
== END cat /etc/fstab ==

== BEGIN df -h ==
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 32G 17G 14G 56% /
none 3.8G 0 3.8G 0% /dev/shm
www.well.com:/www 30G 14G 17G 47% /www
www.well.com:/opt/rw 50G 48G 2.4G 96% /opt/rw
www.well.com:/opt/snapshots
75G 51G 25G 68% /snapshots
/dev/sdf 20G 7.8G 13G 39% /well
/dev/sdg 60G 27G 34G 45% /home
/dev/sdh 8.0G 1.4G 6.7G 18% /usr/local/public
== END df -h ==

== BEGIN fdisk -l ==
Disk /dev/sda1 doesn't contain a valid partition table
Disk /dev/sda2 doesn't contain a valid partition table
Disk /dev/sdf doesn't contain a valid partition table
Disk /dev/sdh doesn't contain a valid partition table
Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sda1: 34.3 GB, 34359738368 bytes
255 heads, 63 sectors/track, 4177 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes


Disk /dev/sda2: 450.9 GB, 450934865920 bytes
255 heads, 63 sectors/track, 54823 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes


Disk /dev/sdf: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes


Disk /dev/sdh: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes


Disk /dev/sdg: 64.4 GB, 64424509440 bytes
255 heads, 63 sectors/track, 7832 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

== END fdisk -l ==

== BEGIN blkid ==
/dev/sdh: UUID="430aa4ad-e029-4115-89ae-079454a08e61" TYPE="xfs"
/dev/sdf: UUID="97100e61-a4a4-4dac-97e2-32f4a301289d" TYPE="xfs"
/dev/sdg: UUID="19576d89-2bc9-47a4-a26a-795f31bba6a1" TYPE="xfs"
/swap: TYPE="swap"
/dev/sda1: UUID="78e09a34-2023-45be-a336-9a232aa536f0" TYPE="ext3"
/dev/sda2: UUID="2ce3417e-ac18-44e2-a399-c9d570231eca" SEC_TYPE="ext2" TYPE="ext3"
== END blkid ==

== BEGIN cat /proc/mdstat ==
Personalities :
unused devices: <none>
== END cat /proc/mdstat ==

== BEGIN pvs ==
== END pvs ==

== BEGIN vgs ==
No volume groups found
== END vgs ==

== BEGIN lvs ==
No volume groups found
== END lvs ==

== BEGIN rpm -qa kernel\* | sort ==
kernel-2.6.18-238.19.1.el5.centos.plus
kernel-2.6.18-274.12.1.el5.centos.plus
kernel-2.6.18-274.17.1.el5.centos.plus
kernel-2.6.18-274.3.1.el5.centos.plus
kernel-2.6.18-274.7.1.el5.centos.plus
kernel-headers-2.6.18-274.17.1.el5.centos.plus
== END rpm -qa kernel\* | sort ==

== BEGIN lspci -nn ==
pcilib: Cannot open /proc/bus/pci
lspci: Cannot find any working access method.
== END lspci -nn ==

== BEGIN lsusb ==
== END lsusb ==

== BEGIN rpm -qa kmod\* kmdl\* ==
== END rpm -qa kmod\* kmdl\* ==

== BEGIN ifconfig -a ==
eth0 Link encap:Ethernet HWaddr 12:31:35:00:14:B2
inet addr:10.255.27.64 Bcast:10.255.27.255 Mask:255.255.254.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:146313 errors:0 dropped:0 overruns:0 frame:0
TX packets:139035 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:32607156 (31.0 MiB) TX bytes:39021849 (37.2 MiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:204049 errors:0 dropped:0 overruns:0 frame:0
TX packets:204049 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:88728413 (84.6 MiB) TX bytes:88728413 (84.6 MiB)

== END ifconfig -a ==

== BEGIN brctl show ==
./getinfo: line 87: brctl: command not found
== END brctl show ==

== BEGIN route -n ==
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
211.232.57.38 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
183.109.124.46 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
210.51.173.26 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
79.120.178.20 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
115.236.16.116 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
116.228.3.8 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
194.78.96.169 127.0.0.1 255.255.255.255 UGH 0 0 0 lo
10.255.26.0 0.0.0.0 255.255.254.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
0.0.0.0 10.255.26.1 0.0.0.0 UG 0 0 0 eth0
== END route -n ==

== BEGIN cat /etc/resolv.conf ==
; generated by /sbin/dhclient-script
search compute-1.internal
nameserver 172.16.0.23
== END cat /etc/resolv.conf ==

== BEGIN grep net /etc/nsswitch.conf ==
#networks: nisplus [NOTFOUND=return] files
#netmasks: nisplus [NOTFOUND=return] files
netmasks: files
networks: files
netgroup: nisplus
== END grep net /etc/nsswitch.conf ==

== BEGIN chkconfig --list | grep -Ei 'network|wpa' ==
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
wpa_supplicant 0:off 1:off 2:off 3:off 4:off 5:off 6:off
== END chkconfig --list | grep -Ei 'network|wpa' ==

[/code]
============================================================== 15:31:47

abednegoyulo
Posts: 550
Joined: 2007/12/26 06:24:38
Location: 127.0.0.2 44013

Repeated Kernel Panics

Post by abednegoyulo » 2012/02/08 01:34:12

[quote]
wolfypdx wrote:
== BEGIN uname -rmi ==
2.6.18-xenU-ec2-v1.2 x86_64 x86_64
== END uname -rmi ==
[/quote]

This is your situation: you have kernel panics, and the running kernel is not from CentOS. A CentOS kernel looks something like this:

[code]
uname -rmi
2.6.18-274.17.1.el5xen x86_64 x86_64
[/code]

To further explain your case, you might want to read [url=http://wiki.centos.org/AdditionalResources/OtherSpins]when Centos is not CentOS[/url].
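
The check described above can be sketched in a few lines, as a heuristic only. It assumes that a CentOS 5 kernel release string carries an ".el5" tag (as in the example above) and that the lines from rpm -qa 'kernel*' have the form shown in the getinfo output. The function names and the flavor list are hypothetical, chosen for illustration.

```python
def looks_like_centos5_kernel(release):
    """Heuristic: CentOS 5 kernel releases contain '.el5', e.g. 2.6.18-274.17.1.el5xen."""
    return ".el5" in release


def running_kernel_is_packaged(release, rpm_names):
    """True if an installed kernel RPM corresponds to the running release.

    release   -- output of `uname -r`
    rpm_names -- lines from `rpm -qa 'kernel*'`
    """
    candidates = {f"kernel-{release}"}
    # Flavored kernels put the flavor at the end of `uname -r` but in the
    # middle of the package name (assumed flavor list, for illustration).
    for flavor in ("xen", "PAE", "debug"):
        if release.endswith(flavor):
            candidates.add(f"kernel-{flavor}-{release[:-len(flavor)]}")
    return any(name.strip() in candidates for name in rpm_names)


# Kernel packages listed in the getinfo output of the first post.
INSTALLED = [
    "kernel-2.6.18-238.19.1.el5.centos.plus",
    "kernel-2.6.18-274.12.1.el5.centos.plus",
    "kernel-2.6.18-274.17.1.el5.centos.plus",
    "kernel-2.6.18-274.3.1.el5.centos.plus",
    "kernel-2.6.18-274.7.1.el5.centos.plus",
    "kernel-headers-2.6.18-274.17.1.el5.centos.plus",
]

if __name__ == "__main__":
    running = "2.6.18-xenU-ec2-v1.2"  # from `uname -rmi` in the first post
    print(looks_like_centos5_kernel(running))
    print(running_kernel_is_packaged(running, INSTALLED))
```

With the values from the first post, neither check passes for the running kernel: several CentOS kernel packages are installed, but the instance is booting the EC2-supplied one.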

wolfypdx
Posts: 2
Joined: 2012/02/07 23:24:24

Re: Repeated Kernel Panics

Post by wolfypdx » 2012/02/08 23:52:50

Thanks.
