CPU issue on CentOS 7
Posted: 2018/06/14 15:08:56
The filesystem froze, and these logs were recorded:
Jun 11 15:47:30 laxasperacla1 kernel: Code: 48 63 35 26 45 a2 00 89 c2 39 f0 0f 8d 85 fe ff ff 48 98 49 8b 0f 48 03 0c c5 e0 fd b0 81 f6 41 20 01 74 cd 0f 1f 44 00 00 f3 90 <f6> 41 20 01 75 f8 48 63 35 f5 44 a2 00 eb b7 0f b6 4d cc 4c 89
Jun 11 15:47:39 laxasperacla1 kernel: audit_log_start: 62 callbacks suppressed
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_lost=50863 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_lost=50864 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_lost=50865 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:39 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit_log_start: 41 callbacks suppressed
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_lost=50880 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_lost=50881 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_lost=50882 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:50 laxasperacla1 kernel: audit: backlog limit exceeded
Jun 11 15:47:50 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:55 laxasperacla1 kernel: audit_log_start: 74 callbacks suppressed
Jun 11 15:47:55 laxasperacla1 kernel: audit: audit_backlog=8193 > audit_backlog_limit=8192
Jun 11 15:47:55 laxasperacla1 kernel: audit: audit_lost=50908 audit_rate_limit=0 audit_backlog_limit=8192
Jun 11 15:47:55 laxasperacla1 kernel: audit: backlog limit exceeded
Dell support says the hardware is fine. It could be an issue with the OS.
Jun 11 15:47:02 laxasperacla1 kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [chronyd:1343]
Jun 11 15:47:02 laxasperacla1 kernel: Modules linked in: cvfs(POE) bonding iTCO_wdt iTCO_vendor_support dcdbas skx_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr joydev ipmi_ssif sg mei_me mei lpc_ich i2c_i801 shpchp ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom mgag200 lpfc drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crc32c_intel drm igb i40e crc_t10dif crct10dif_generic tg3 crct10dif_pclmul ahci megaraid_sas scsi_transport_fc libahci dca libata scsi_tgt i2c_algo_bit crct10dif_common ptp i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod
Jun 11 15:47:02 laxasperacla1 kernel: CPU: 4 PID: 1343 Comm: chronyd Tainted: P OEL ------------ 3.10.0-693.el7.x86_64 #1
This came with a hardware error reported on CPU 4.
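To make logs like the ones above easier to compare across incidents, here is a minimal sketch that pulls out the soft lockup events and the range of audit_lost counters. It is only an illustration: the function name summarize and the SAMPLE_LINES list are made up for this example (the sample lines are copied from the logs in this post), and the regular expressions assume the message layout shown here.

```python
import re

# Patterns for the two kinds of kernel messages quoted in this post.
# They assume the exact message layout seen above; other kernels may differ.
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
AUDIT_LOST = re.compile(r"audit_lost=(?P<lost>\d+)")

# Hypothetical sample input: three lines copied from the logs in this post.
SAMPLE_LINES = [
    "Jun 11 15:47:02 laxasperacla1 kernel: NMI watchdog: BUG: soft lockup - "
    "CPU#4 stuck for 23s! [chronyd:1343]",
    "Jun 11 15:47:39 laxasperacla1 kernel: audit: audit_lost=50863 "
    "audit_rate_limit=0 audit_backlog_limit=8192",
    "Jun 11 15:47:55 laxasperacla1 kernel: audit: audit_lost=50908 "
    "audit_rate_limit=0 audit_backlog_limit=8192",
]


def summarize(lines):
    """Collect soft lockup events and the min/max audit_lost counter seen."""
    lockups = []
    lost = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            lockups.append({
                "cpu": int(m["cpu"]),
                "seconds": int(m["secs"]),
                "comm": m["comm"],
                "pid": int(m["pid"]),
            })
        m = AUDIT_LOST.search(line)
        if m:
            lost.append(int(m["lost"]))
    return {
        "lockups": lockups,
        "audit_lost_min": min(lost) if lost else None,
        "audit_lost_max": max(lost) if lost else None,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

In practice the lines would come from the system log file rather than a hard-coded list; the difference between the max and min audit_lost values gives a rough count of audit records dropped during the window.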