3.10.0-693.5.2 kernel breaks Centos 7 Hyper-V VM :(

paulmancan
Posts: 26
Joined: 2005/07/01 15:16:51

3.10.0-693.5.2 kernel breaks Centos 7 Hyper-V VM :(

Post by paulmancan » 2017/11/01 03:17:13

Any help appreciated.

I've actually seen this happen before with other kernel versions in the past.

Basically, after a while the drive becomes inaccessible.

I notice references to [sdd] on scsi 3:0:1:0 but there is actually no such device attached/part of the VM.
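
One way to confirm whether an address from the log really exists on the guest is to look for it under sysfs. The following is only a sketch: the function name `scsi_addr_present` and its optional sysfs-root argument are made up for illustration, and it assumes the usual `/sys/class/scsi_device/<host:channel:target:lun>` layout.

```shell
#!/bin/sh
# Hypothetical helper: report whether a SCSI address (host:channel:target:lun)
# is currently known to the kernel. The sysfs root is an optional second
# argument so the function can also be pointed at a test directory.
scsi_addr_present() {
    addr=$1
    sysfs_root=${2:-/sys}
    if [ -e "$sysfs_root/class/scsi_device/$addr" ]; then
        echo "present"
    else
        echo "absent"
    fi
}

# Example: the address that the kernel log attributes to [sdd].
scsi_addr_present "3:0:1:0"
```

If this prints "absent" while the log keeps mentioning the address, the device has presumably been removed again (or was only attached briefly) since the log lines were written.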

I also notice that my kernel and kernel-headers versions somehow got out of sync. That probably isn't normal?

I have reverted the kernel and kernel-headers 3.10.0-693.5.2 update for now.

Thanks.



Running on a MS Hyper-V 2012R2 Host
We have Hyper-V VSS creating replicas


# rpm -qa | grep hyper
microsoft-hyper-v-4.2.1-20170602.x86_64
kmod-microsoft-hyper-v-4.2.1-20170602.x86_64


yum history info 54 | grep kernel
Install kernel-3.10.0-693.5.2.el7.x86_64 @updates
Updated kernel-headers-3.10.0-693.2.2.el7.x86_64 @updates

* note kernel-headers version mismatch???
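
The apparent mismatch may be an artifact of the output format. As far as I know, `yum history info` prints the previous version on the `Updated` line and the new version on a separate `Update` line that does not repeat the package name, so `grep kernel` would hide it. Comparing the installed versions directly avoids the ambiguity. A minimal sketch, in which the helper names `strip_nvra` and `versions_match` are made up for illustration and an x86_64 guest is assumed:

```shell
#!/bin/sh
# Reduce a package or kernel string to its bare version-release, so that
# "kernel-headers-3.10.0-693.5.2.el7.x86_64" and "3.10.0-693.5.2.el7.x86_64"
# both become "3.10.0-693.5.2.el7". Assumes the x86_64 architecture suffix.
strip_nvra() {
    printf '%s\n' "$1" |
        sed -e 's/^kernel-headers-//' -e 's/^kernel-//' -e 's/\.x86_64$//'
}

# Print "match" or "mismatch" for two version strings.
versions_match() {
    if [ "$(strip_nvra "$1")" = "$(strip_nvra "$2")" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Live check, run only where rpm is available (for example on the guest):
# compares the running kernel with the installed kernel-headers package.
if command -v rpm >/dev/null 2>&1; then
    versions_match "$(uname -r)" "$(rpm -q kernel-headers)"
fi
```

Note that this compares the running kernel, so it reports a mismatch right after an update until the guest has been rebooted into the new kernel.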


Log excerpts from before the drive is "lost"; the last two lines repeat over and over

HV_FCOPY: open /dev/vmbus/hv_fcopy failed; error: 2 No such file or directory
systemd: Unit hv_fcopy_daemon.service entered failed state.
Oct 29 16:18:04 crs2 systemd:
systemd: Job sys-devices-virtual-misc-vmbus\x21hv_fcopy.device/start failed with result
systemd-udevd: inotify_add_watch(7, /dev/sdc, 10) failed: No such file or directory
kernel: sd 3:0:1:1: [sdc] Asking for cache data failed



Log excerpts (hours?) before crapping out

kernel: scsi 3:0:1:1: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 4
kernel: sd 3:0:1:1: Attached scsi generic sg3 type 0
kernel: sd 3:0:1:1: [sdc] 83886080 512-byte logical blocks: (42.9 GB/40.0 GiB)
kernel: sd 3:0:1:1: [sdc] 4096-byte physical blocks
kernel: sd 3:0:1:1: [sdc] Write Protect is off
kernel: sd 3:0:1:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO
kernel: sd 3:0:1:1: [sdc] Sector size 0 reported, assuming 512.
kernel: sd 3:0:1:1: [sdc] 1 512-byte logical blocks: (512 B/512 B)
kernel: sd 3:0:1:1: [sdc] 0-byte physical blocks
kernel: sd 3:0:1:1: [sdc] Attached SCSI disk
kernel: scsi 3:0:1:0: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 4
kernel: sd 3:0:1:0: Attached scsi generic sg4 type 0
kernel: sd 3:0:1:0: [sdd] Sector size 0 reported, assuming 512.
kernel: sd 3:0:1:0: [sdd] 1 512-byte logical blocks: (512 B/512 B)
kernel: sd 3:0:1:0: [sdd] 0-byte physical blocks
kernel: sd 3:0:1:0: [sdd] Write Protect is off
kernel: sd 3:0:1:0: [sdd] Asking for cache data failed
kernel: sd 3:0:1:0: [sdd] Assuming drive cache: write through
kernel: sd 3:0:1:0: [sdd] Sector size 0 reported, assuming 512.
kernel: sd 3:0:1:0: [sdd] Sector size 0 reported, assuming 512.
kernel: sd 3:0:1:0: [sdd] Attached SCSI disk
kernel: scsi 3:0:1:0: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 4
kernel: sd 3:0:1:0: Attached scsi generic sg4 type 0
kernel: sd 3:0:1:0: [sdd] Sector size 0 reported, assuming 512.
kernel: sd 3:0:1:0: [sdd] 1 512-byte logical blocks: (512 B/512 B)
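
To see at a glance which devices reported a zero sector size, the kernel log can be filtered and counted. This is only a sketch: the function name `zero_sector_devices` is made up, and it assumes the log lines have the same `sd H:C:T:L: [sdX]` shape as the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print, for each
# SCSI address and device name that reported "Sector size 0", one line of
# the form "<address> <device> <count>".
zero_sector_devices() {
    grep 'Sector size 0 reported' |
        sed -n 's/.*sd \([0-9:]*\): \[\([a-z]*\)].*/\1 \2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Possible usage on the guest (not executed here):
#   journalctl -k | zero_sector_devices
#   zero_sector_devices < /var/log/messages
```

A device that appears here shortly after being attached with its real size, as [sdc] does above, is a likely candidate for the disk that later becomes inaccessible.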


Re: 3.10.0-693.5.2 kernel breaks Centos 7 Hyper-V VM :(

Post by paulmancan » 2017/11/01 03:43:00

I do see there is a newer LIS build from MS.

I was an early adopter of LIS (well, actually of the very first builds, back when they were called LIC and before RH had built-in support), and I've always continued to install these since. Are most people doing this, or sticking with the built-in support? I think LIS may be required for VSS replication?

thx


chrisbrown
Posts: 1
Joined: 2018/01/16 10:49:21

Re: 3.10.0-693.5.2 kernel breaks Centos 7 Hyper-V VM :(

Post by chrisbrown » 2018/01/16 10:52:40

Hi paulmancan,

Do you still have the issue?

I have the same issue with 3.10.0-693.11.6.el7.x86_64 and LIS lis-rpms-4.2.3-4.tar.gz installed.

The only way I have found to avoid the issue with Hyper-V VSS is to use an old kernel, but that is not really a solution either.

Thanks for your answer


Re: 3.10.0-693.5.2 kernel breaks Centos 7 Hyper-V VM :(

Post by paulmancan » 2018/01/26 07:24:09

Hi. Yes, this is brutal, and I don't know how other people haven't been freaking out over this bug, which is easily and consistently reproducible and often either hangs the OS or, worse, corrupts the filesystem beyond (simple, anyway) repair. I guess many people must not be using VSS-aware backups on these guests.

I am using a testing build of the 4.15 kernel to work around this, and it has been solid.

For production it is probably best to wait for the next kernel in 7.5, which I understand has been patched for this bug.
