[SOLVED] CentOS 6.7 yum update; GPU access blocked by OS

GShi
Posts: 17
Joined: 2015/10/05 14:30:21

Post by GShi » 2015/10/05 14:46:56

Hi all,

I ran "yum update" on my CentOS 6.7 machine, and during the update I saw several conflict messages: one from xorg-x11-glamor, "Error: libX11 conflicts with libxcb-1.8.1-1.el6.x86_64", "Error: avahi-glib conflicts with avahi-0.6.25-15.el6.x86_64", and of course a "kmod-nvidia issues" notice, etc.

I am not sure what those messages mean, but soon afterwards I ran "nvidia-smi" and got this error: "Failed to initialize NVML: GPU access blocked by the operating system".

I have two Nvidia GeForce GTX 680 cards. Before the update they were accessible, and I ran a lot of parallel simulations on them. I dug around online: some people said xorg-x11-glamor needs to be removed, and others mentioned kmod-nvidia issues. All of this suggests that my problem is related to the NVIDIA driver, but since I don't know where the problem actually is, I am scared to remove/erase anything that might cause more trouble.

Right now, if I run "yum list", there are two versions of kmod-nvidia: 331.xx, which is apparently the old one, and 352.xx, which is the updated one. Could having two versions of kmod-nvidia installed cause a conflict?

================================================================================
 Package             Arch       Version                 Repository         Size
================================================================================
Removing:
 kmod-nvidia         x86_64     331.38-1.el6.elrepo     @elrepo            18 M
 kmod-nvidia         x86_64     352.41-1.el6.elrepo     installed          19 M
Removing for dependencies:
 nvidia-x11-drv      x86_64     331.38-1.el6.elrepo     @elrepo           118 M
 nvidia-x11-drv      x86_64     352.41-1.el6.elrepo     installed         185 M

Transaction Summary
================================================================================

Also, I am not very familiar with yum commands. Has anybody encountered the same problem before? Should I run some commands first to diagnose where the problem is and then try to solve it? Is it possible to remove only one version of kmod-nvidia?

Thanks a lot!!!!!

-GShi
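
On the question of spotting packages that are installed in more than one version: a minimal sketch, assuming the package names are fed in one per line. The function name find_multi_version is made up for illustration; on a live system the names could come from rpm -qa --qf '%{NAME}\n', and package-cleanup --dupes from yum-utils gives a similar report.

```shell
#!/bin/sh
# Print every package name that appears more than once on stdin
# (one name per line), i.e. packages installed in several versions.
# find_multi_version is a hypothetical helper, not a yum/rpm command.
find_multi_version() {
    sort | uniq -d
}

# On a live system (not run here) the input would come from the RPM database:
#   rpm -qa --qf '%{NAME}\n' | find_multi_version
# Demo using the package names from the transaction listing in this post:
printf '%s\n' kmod-nvidia kmod-nvidia nvidia-x11-drv nvidia-x11-drv nvidia-detect \
    | find_multi_version
```

Kernel packages normally show up in such a list, because several kernels are kept installed on purpose. A single version can usually be targeted by giving yum the full name-version-release (for example kmod-nvidia-331.38-1.el6.elrepo); it is worth reading the transaction summary before confirming.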
Last edited by GShi on 2015/10/05 21:51:24, edited 1 time in total.

toracat
Site Admin
Posts: 7518
Joined: 2006/09/03 16:37:24
Location: California, US

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by toracat » 2015/10/05 15:23:56

Can you show us the output returned by:

rpm -qa \*nvidia\*

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by GShi » 2015/10/05 15:29:58

toracat wrote: Can you show us the output returned by:

rpm -qa \*nvidia\*

Here is the output:

kmod-nvidia-352.41-1.el6.elrepo.x86_64
yum-plugin-nvidia-1.0.2-1.el6.elrepo.noarch
nvidia-detect-352.41-1.el6.elrepo.x86_64
nvidia-x11-drv-352.41-1.el6.elrepo.x86_64


Thanks for helping me. Here is what I did after I posted the problem. I found two kmod-nvidia packages, so I removed both of them and re-installed kmod-nvidia-352.41 (along with its dependencies, of course). I also uninstalled xorg-x11-glamor, as recommended in the "kmod-nvidia issues" notice. Then I ran nvidia-detect -v, and the output told me my GTX 680 needs the current version of kmod. I also ran nvidia-detect --xorg, and the ABI compatibility check passed.

The problem is that after running nvidia-smi, GPU access is still blocked by the OS. So I guess I have at least figured out that kmod-nvidia-352.41 is the right driver for my GPU cards, right?

Also, I haven't rebooted the system since I caused the trouble. I was worried that I wouldn't be able to get into the graphical interface.
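
One thing that could be checked in a situation like this, where the driver package was replaced without a reboot, is whether the kernel module that is still loaded has the same version as the installed package. The thread does not confirm this as the cause, so treat the following as a sketch only. The helper names are hypothetical, and the layout of the /proc/driver/nvidia/version line is assumed from typical driver output.

```shell
#!/bin/sh
# Hypothetical helpers; the sample "NVRM version" line below is an assumed format.

# Read a /proc/driver/nvidia/version style text on stdin and print the
# first dotted version number found on the "NVRM version" line.
loaded_driver_version() {
    grep 'NVRM version' | grep -E -o '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Read an rpm package name such as kmod-nvidia-352.41-1.el6.elrepo.x86_64
# on stdin and print only the driver version part.
package_driver_version() {
    sed -n 's/^kmod-nvidia-\([0-9][0-9.]*\)-.*/\1/p'
}

# Compare the two versions and say whether they agree.
check_versions() {
    if [ "$1" = "$2" ]; then
        echo "match: module and package are both $1"
    else
        echo "mismatch: module $1 is loaded, package $2 is installed"
    fi
}

# On a live system (not run here):
#   loaded=$(loaded_driver_version < /proc/driver/nvidia/version)
#   packaged=$(rpm -q kmod-nvidia | package_driver_version)
# Demo with an illustrative module line and the package name from this post:
loaded=$(echo 'NVRM version: NVIDIA UNIX x86_64 Kernel Module  331.38' | loaded_driver_version)
packaged=$(echo 'kmod-nvidia-352.41-1.el6.elrepo.x86_64' | package_driver_version)
check_versions "$loaded" "$packaged"
```

A mismatch would mean the old module is still in memory, which normally only clears after the module is reloaded or the machine is rebooted.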

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by toracat » 2015/10/05 15:45:30

Just wonder if your system has CUDA installed.

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by GShi » 2015/10/05 15:53:22

toracat wrote: Just wonder if your system has CUDA installed.

Okay, that's a good question. My answer is "I really don't know", and I strongly suspect it doesn't have CUDA. When I was digging online I also tried some CUDA commands, but apparently none of them work.
My workstation came with CentOS installed and certain software (like Amber) pre-installed. I have never tried to install drivers or toolkits myself. Even if there is no CUDA, my GPU cards seemed to work well before.

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by GShi » 2015/10/05 17:00:40

Hi! So when I re-installed kmod-nvidia-352.41, I got a bunch of warning messages:

WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceChannelDestroy
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceQueryCaps
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryAllocSys
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryCpuMap
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceKillChannel
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryCpuUnMap
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceCreateMirrored
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetGpuInfo
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceServiceDeviceInterruptsRM
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDeRegisterUvmOps
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryFree
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetUvmPrivRegion
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetAttachedUuids
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceSessionDestroy
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceCheckEccErrorSlowpath
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceCreate
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceCopyEngineAllocate
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceDestroy
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceRegisterUvmCallbacks
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceChannelAllocate
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDupAllocation
WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceSessionCreate
WARNING: Can't read module /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia.ko: No such file or directory
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceChannelDestroy
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceQueryCaps
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryAllocSys
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryCpuMap
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceKillChannel
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryCpuUnMap
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceCreateMirrored
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetGpuInfo
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceServiceDeviceInterruptsRM
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDeRegisterUvmOps
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceMemoryFree
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetUvmPrivRegion
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetAttachedUuids
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceSessionDestroy
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceCheckEccErrorSlowpath
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceCreate
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceCopyEngineAllocate
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceAddressSpaceDestroy
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceRegisterUvmCallbacks
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceChannelAllocate
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDupAllocation
WARNING: /lib/modules/2.6.32-279.14.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceSessionCreate
WARNING: /lib/modules/2.6.32-431.3.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetGpuInfo
WARNING: /lib/modules/2.6.32-431.3.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDupAllocation
grep: /lib/modules/2.6.32-279.14.1.el6.x86_64//weak-updates/nvidia/nvidia.ko: No such file or directory
Done.

Any idea how I can solve it?
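
A long warning list like the one above is easier to read when it is grouped by kernel. The following is a minimal sketch that counts the "needs unknown symbol" lines per /lib/modules/<kernel> directory; the function name summarize_warnings is made up for illustration, and it only assumes the line layout shown in the log.

```shell
#!/bin/sh
# Count "needs unknown symbol" warnings per kernel directory.
# summarize_warnings is a hypothetical helper that reads the log on stdin.
summarize_warnings() {
    awk '/needs unknown symbol/ {
             split($2, parts, "/")   # $2 is the module path
             count[parts[4]]++       # parts[4] is the kernel version directory
         }
         END { for (k in count) print k, count[k] }' | sort
}

# Demo with three lines copied from the log above:
printf '%s\n' \
  'WARNING: /lib/modules/2.6.32-220.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceQueryCaps' \
  'WARNING: /lib/modules/2.6.32-431.3.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceGetGpuInfo' \
  'WARNING: /lib/modules/2.6.32-431.3.1.el6.x86_64/weak-updates/nvidia/nvidia-uvm.ko needs unknown symbol nvUvmInterfaceDupAllocation' \
  | summarize_warnings
```

Counting the log by hand gives 22 such warnings for 2.6.32-220.el6.x86_64, 22 for 2.6.32-279.14.1.el6.x86_64, and 2 for 2.6.32-431.3.1.el6.x86_64; no kernel outside those three directories appears in it.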

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by toracat » 2015/10/05 17:08:17

Please show us the output from:

uname -mr

and

rpm -qa kernel\* | sort

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by GShi » 2015/10/05 17:52:19

toracat wrote: Please show us the output from:

uname -mr

and

rpm -qa kernel\* | sort

uname -mr

2.6.32-431.3.1.el6.x86_64 x86_64


rpm -qa kernel\* | sort

kernel-2.6.32-220.el6.x86_64
kernel-2.6.32-279.14.1.el6.x86_64
kernel-2.6.32-431.3.1.el6.x86_64
kernel-2.6.32-573.7.1.el6.x86_64
kernel-devel-2.6.32-279.14.1.el6.x86_64
kernel-devel-2.6.32-431.3.1.el6.x86_64
kernel-firmware-2.6.32-431.3.1.el6.noarch
kernel-firmware-2.6.32-573.7.1.el6.noarch
kernel-headers-2.6.32-431.3.1.el6.x86_64
kernel-headers-2.6.32-573.7.1.el6.x86_64



So is the problem an incompatibility between the kernel version and the NVIDIA driver version?

Thanks!
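
The two outputs above can be compared mechanically: the running kernel from uname -r against the newest kernel package in the rpm list. This is a minimal sketch, assuming GNU sort with the -V (version sort) option; the function name newest_installed_kernel is made up for illustration.

```shell
#!/bin/sh
# Read "rpm -qa kernel\*" style lines on stdin and print the version of
# the newest plain "kernel" package (kernel-devel, kernel-firmware and
# kernel-headers are skipped because a letter follows "kernel-").
# newest_installed_kernel is a hypothetical helper.
newest_installed_kernel() {
    sed -n 's/^kernel-\([0-9].*\)$/\1/p' | sort -V | tail -n 1
}

# On a live system (not run here):
#   running=$(uname -r)
#   newest=$(rpm -qa 'kernel*' | newest_installed_kernel)
# Demo with the values posted in this thread:
running='2.6.32-431.3.1.el6.x86_64'
newest=$(printf '%s\n' \
    kernel-2.6.32-220.el6.x86_64 \
    kernel-2.6.32-279.14.1.el6.x86_64 \
    kernel-2.6.32-431.3.1.el6.x86_64 \
    kernel-2.6.32-573.7.1.el6.x86_64 \
    kernel-devel-2.6.32-431.3.1.el6.x86_64 \
    kernel-headers-2.6.32-573.7.1.el6.x86_64 \
    | newest_installed_kernel)

if [ "$running" = "$newest" ]; then
    echo "running the newest installed kernel: $running"
else
    echo "running $running, but $newest is installed and not yet booted"
fi
```

With the posted values, the machine is still running 2.6.32-431.3.1 while 2.6.32-573.7.1 is the newest kernel installed, so the updated kernel has not been booted yet.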

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by toracat » 2015/10/05 18:11:03

You really have to reboot your system to get going.

Re: CentOS 6.7 yum update; GPU access blocked by OS

Post by GShi » 2015/10/05 19:26:03

toracat wrote: You really have to reboot your system to get going.

So I rebooted the system, and it got stuck at this screen: "kernel panic". Now I don't even know where to start.
Attachment: IMG_5195(1).JPG (75.59 KiB)
