Drivers in dracut.conf Not Loaded at Boot

Quantum`
Posts: 21
Joined: 2015/05/15 18:50:42


Post by Quantum` » 2018/05/12 22:09:59

I am setting up GPU passthrough in KVM to a VM using OVMF, but it's got a cray in its craw.

I've added to /etc/dracut.d/vfio.conf:

Code: Select all

add_drivers+=" vfio_pci vfio vfio_iommu_type1 vfio_virqfd "
... and to /etc/modprobe.d/vfio.conf:

Code: Select all

options vfio-pci ids=10de:1c02,10de:10f1
... then # dracut --force

... which should add those drivers and their options to initramfs. And examining initramfs they are indeed there:

Code: Select all

# lsinitrd |grep vfio
-rw-r--r--   1 root     root           41 May 12 14:23 etc/modprobe.d/carls-vfio.conf
drwxr-xr-x   3 root     root            0 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio
drwxr-xr-x   2 root     root            0 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/pci
-rwxr--r--   1 root     root        82336 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/pci/vfio-pci.ko
-rwxr--r--   1 root     root        30800 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/vfio_iommu_type1.ko
-rwxr--r--   1 root     root        55080 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/vfio.ko
-rwxr--r--   1 root     root        11072 May 12 14:53 usr/lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/vfio_virqfd.ko
But on boot the drivers are simply not loading. They do not appear in lsmod, and the normal default drivers are assigned to the video and sound cards.

Code: Select all

IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller (x16) [8086:1901] (rev 07)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
IOMMU Group 1 01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)

Code: Select all

lspci -k
01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85b9
        Kernel driver in use: nouveau
        Kernel modules: nouveau
01:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85b9
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
I see no mention of vfio in dmesg, journalctl, or messages. How is this getting overlooked?
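One way to double-check which driver currently owns a device, independent of lspci, is to read the `driver` symlink in sysfs. The following is only a sketch: the function name `pci_driver_in_use` and its optional second argument (an alternative sysfs root, handy for dry runs) are made up for illustration.

```shell
# Print the kernel driver bound to a PCI device, or "none" if it is unbound.
# $1: full PCI address, e.g. 0000:01:00.0
# $2: sysfs root (hypothetical override; defaults to /sys)
pci_driver_in_use() {
    addr=$1
    sysfs=${2:-/sys}
    link="$sysfs/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # "driver" is a symlink whose last path component is the driver name
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

On the host described above, `pci_driver_in_use 0000:01:00.0` and `pci_driver_in_use 0000:01:00.1` would be expected to name vfio-pci once the binding works; anything else means another driver claimed the device first.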

Quantum`
Posts: 21
Joined: 2015/05/15 18:50:42

Re: Drivers in dracut.conf Not Loaded at Boot

Post by Quantum` » 2018/05/14 15:13:48

Gee... am I doing new science?

TrevorH
Forum Moderator
Posts: 22799
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Drivers in dracut.conf Not Loaded at Boot

Post by TrevorH » 2018/05/14 16:23:26

Everyone who replies here is a volunteer and we all have our areas of expertise. Your problem is with something that I would guess most people do not use. To get any good replies, you'll need to wait for someone who knows this area. Hopefully there will be someone.

Also, you posted late on Saturday evening UK time. Most people will not have returned to work until today, and the USA probably isn't even properly awake yet...
CentOS 5 died in March 2017 - migrate NOW!
Full time Geek, part time moderator. Use the FAQ Luke

hunter86_bg
Posts: 1136
Joined: 2015/02/17 15:14:33
Location: Bulgaria

Re: Drivers in dracut.conf Not Loaded at Boot

Post by hunter86_bg » 2018/05/15 03:54:52

Sadly, my only IOMMU experience is with AMD GPUs, and I never had to add 'extra' drivers to the initramfs.
Let's start fresh by answering the following:
1. What type and manufacturer is your IOMMU device?
2. Have you enabled IOMMU in the BIOS/UEFI?
3. Have you checked the following guide?

Quantum`
Posts: 21
Joined: 2015/05/15 18:50:42

Re: Drivers in dracut.conf Not Loaded at Boot

Post by Quantum` » 2018/05/16 05:03:01

Thanks all. I've been without TV for a week, is it.

1. It's an nVidia GeForce GTX 1060 3GB PCIe GPU
2. Yes

Code: Select all

# dmesg | grep -e DMAR -e IOMMU
[    0.000000] ACPI: DMAR 0x00000000A4018530 0000A8 (v01 INTEL  SKL      00000001 INTL 00000001)
[    0.000000] DMAR: IOMMU enabled
[    0.001000] DMAR: Host address width 39
[    0.001000] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.001000] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 7e3ff0505e
[    0.001000] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.001000] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.001000] DMAR: RMRR base: 0x000000b1856000 end: 0x000000b1875fff
[    0.001000] DMAR: RMRR base: 0x000000b3800000 end: 0x000000b7ffffff
[    0.001000] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.001000] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.001000] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.002000] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.544647] DMAR: No ATSR found
[    0.544681] DMAR: dmar0: Using Queued invalidation
[    0.544685] DMAR: dmar1: Using Queued invalidation
[    0.544935] DMAR: Hardware identity mapping for device 0000:00:00.0
[    0.544938] DMAR: Hardware identity mapping for device 0000:00:01.0
[    0.544942] DMAR: Hardware identity mapping for device 0000:00:02.0
[    0.544945] DMAR: Hardware identity mapping for device 0000:00:14.0
[    0.544947] DMAR: Hardware identity mapping for device 0000:00:16.0
[    0.544949] DMAR: Hardware identity mapping for device 0000:00:17.0
[    0.544951] DMAR: Hardware identity mapping for device 0000:00:1b.0
[    0.544952] DMAR: Hardware identity mapping for device 0000:00:1b.4
[    0.544955] DMAR: Hardware identity mapping for device 0000:00:1c.0
[    0.544956] DMAR: Hardware identity mapping for device 0000:00:1d.0
[    0.544959] DMAR: Hardware identity mapping for device 0000:00:1f.0
[    0.544961] DMAR: Hardware identity mapping for device 0000:00:1f.2
[    0.544963] DMAR: Hardware identity mapping for device 0000:00:1f.3
[    0.544965] DMAR: Hardware identity mapping for device 0000:00:1f.4
[    0.544967] DMAR: Hardware identity mapping for device 0000:00:1f.6
[    0.544970] DMAR: Hardware identity mapping for device 0000:01:00.0
[    0.544972] DMAR: Hardware identity mapping for device 0000:01:00.1
[    0.544975] DMAR: Hardware identity mapping for device 0000:03:00.0
[    0.544977] DMAR: Hardware identity mapping for device 0000:03:00.1
[    0.544980] DMAR: Hardware identity mapping for device 0000:03:00.2
[    0.544982] DMAR: Hardware identity mapping for device 0000:03:00.3
[    0.544983] DMAR: Setting RMRR:
[    0.544985] DMAR: Ignoring identity map for HW passthrough device 0000:00:02.0 [0xb3800000 - 0xb7ffffff]
[    0.544987] DMAR: Ignoring identity map for HW passthrough device 0000:00:14.0 [0xb1856000 - 0xb1875fff]
[    0.544989] DMAR: Prepare 0-16MiB unity mapping for LPC
[    0.544991] DMAR: Ignoring identity map for HW passthrough device 0000:00:1f.0 [0x0 - 0xffffff]
[    0.545014] DMAR: Intel(R) Virtualization Technology for Directed I/O
[   40.411872] DMAR: 64bit 0000:03:11.1 uses identity mapping
[   40.587790] DMAR: 64bit 0000:03:12.1 uses identity mapping
[   40.976340] DMAR: 64bit 0000:03:10.0 uses identity mapping
[   41.000373] DMAR: 64bit 0000:03:10.1 uses identity mapping
[   41.188460] DMAR: 64bit 0000:03:10.4 uses identity mapping
[   41.378421] DMAR: 64bit 0000:03:12.5 uses identity mapping
[   41.576544] DMAR: 64bit 0000:03:11.5 uses identity mapping
3. Actually I'm working from the newer Arch guide. Sure, it's not CentOS, but the CentOS guide doesn't know from OVMF. It's for CentOS v5.

Groups:

Code: Select all

# iommuTOpci
...
IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller (x16) [8086:1901] (rev 07)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
IOMMU Group 1 01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
...

Code: Select all

find /sys/kernel/iommu_groups/ -type l
...
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
...
- Most motherboards have PCIe slots provided by both the CPU and the PCH. Depending on the CPU, it is possible that the processor-based PCIe slot does not support isolation properly, in which case the PCI slot itself will appear to be grouped with the device that is connected to it. This is fine so long as only your guest GPU is included in here, but additional devices within the same group must be passed through too.
- The device and all those sharing the same IOMMU group must have their driver replaced by a stub driver or a VFIO driver in order to prevent the host machine from interacting with them.
- Due to their size and complexity, GPU drivers do not tend to support dynamic rebinding very well, so bind those placeholder drivers manually before starting the VM.
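The group membership these notes rely on can be read straight from sysfs. Below is a minimal sketch of such a listing; the function name `list_iommu_groups` and its optional root argument are illustrative only, and it prints bare PCI addresses, not the full lspci description shown earlier.

```shell
# Print one "IOMMU Group <n> <pci-address>" line per device.
# $1: directory holding the groups (hypothetical override;
#     defaults to /sys/kernel/iommu_groups)
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # the glob matched nothing
        group=${dev#"$root"/}          # now "<n>/devices/<addr>"
        group=${group%%/*}             # keep only the group number
        echo "IOMMU Group $group ${dev##*/}"
    done
}
```

On a real host each address can then be passed to `lspci -nns <address>` to get the device name and the `[vendor:device]` pair.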

Code: Select all

# modinfo vfio-pci
filename:       /lib/modules/4.16.7-1.el7.elrepo.x86_64/kernel/drivers/vfio/pci/vfio-pci.ko
description:    VFIO PCI - User Level meta-driver
author:         Alex Williamson <alex.williamson@redhat.com>
license:        GPL v2
version:        0.2
srcversion:     285B406AFDCA2E25E1FFCE6
depends:        vfio,irqbypass,vfio_virqfd
retpoline:      Y
intree:         Y
name:           vfio_pci
vermagic:       4.16.7-1.el7.elrepo.x86_64 SMP mod_unload modversions 
parm:           ids:Initial PCI IDs to add to the vfio driver, format is "vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]" and multiple comma separated entries can be specified (string)
parm:           nointxmask:Disable support for PCI 2.3 style INTx masking.  If this resolves problems for specific devices, report lspci -vvvxxx to linux-pci@vger.kernel.org so the device can be fixed automatically via the broken_intx_masking flag. (bool)
parm:           disable_idle_d3:Disable using the PCI D3 low power state for idle, unused devices (bool)
/etc/modprobe.d/vfio.conf

Code: Select all

options vfio-pci ids=10de:1c02,10de:10f1
(This is taken up by dracut)
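For reference, the `ids=` value can be derived from the group listing; typing it by hand is not required. This is a rough sketch that assumes `lspci -nn`-style lines on stdin; the function name `vfio_ids_line` is invented here. It skips PCI bridges (class 0604), since the bridge has to stay with the host.

```shell
# Read "lspci -nn"-style lines on stdin and print a vfio-pci options line
# listing every device except PCI bridges (class [0604]).
vfio_ids_line() {
    ids=$(grep -v '\[0604\]' \
        | sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' \
        | paste -sd, -)
    if [ -n "$ids" ]; then
        echo "options vfio-pci ids=$ids"
    fi
}
```

Fed the three IOMMU Group 1 lines shown earlier, it should print the same options line as the one in /etc/modprobe.d/vfio.conf.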

- If a PCI root port (bridge) is part of your IOMMU group, you must not pass its ID to vfio-pci, as it needs to remain attached to the host to function properly. Any other device within that group, however, should be left for vfio-pci to bind with.
- This does not guarantee that vfio-pci will be loaded before other graphics drivers, though.
To ensure that it is, list these modules in this order, ahead of any video drivers loaded the same way:
/etc/dracut.conf.d/vfio.conf

Code: Select all

force_drivers+=" vfio_pci vfio vfio_iommu_type1 vfio_virqfd "
# dracut -f --regenerate-all
And reboot.
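Before rebooting, it may be worth confirming that the rebuilt image really contains the modules. The sketch below reads an lsinitrd listing on stdin and prints whichever requested modules have no .ko file in it. The helper name `missing_modules` is made up. It treats hyphens and underscores as equivalent, because the module vfio_pci ships as vfio-pci.ko and a plain `grep vfio_pci` would miss it.

```shell
# Read an lsinitrd listing on stdin; print each wanted module that has no
# matching .ko (or compressed .ko.*) file in the listing.
missing_modules() {
    listing=$(tr '-' '_')              # normalise hyphens in the whole listing
    for mod in "$@"; do
        want=$(printf '%s' "$mod" | tr '-' '_')
        printf '%s\n' "$listing" | grep -q "/$want\.ko" || echo "$mod"
    done
}
```

Typical use would be `lsinitrd | missing_modules vfio_pci vfio vfio_iommu_type1 vfio_virqfd`; no output means all four are present.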

Is vfio-pci loaded properly and bound to the right devices?

Code: Select all

# dmesg | grep -i vfio
#

Code: Select all

# lspci -nnk -d 10de:1c02
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] [10de:1c02] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device [1043:85b9]
        Kernel driver in use: nouveau
        Kernel modules: nouveau

# lspci -nnk -d 10de:10f1
01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device [1043:85b9]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
- Ensure that the following are installed: qemu-system-x86.x86_64, libvirt.x86_64, OVMF, and virt-manager
- Add the path to the OVMF firmware image and runtime variables template to libvirt config so virt-install or virt-manager can find it:
/etc/libvirt/qemu.conf

Code: Select all

nvram = [
        "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.fd"
]
(File confirmed)
# systemctl status libvirtd
active (running)
# systemctl status virtlogd
active (running)

Code: Select all

# lsinitrd
...
-rw-r--r--   1 root     root           41 May 12 14:23 etc/modprobe.d/carls-vfio.conf
...
# nano dracut.conf.d/carls-vfio.conf 
# lsinitrd |grep vfio_pci
# lsinitrd |grep vfio
-rw-r--r--   1 root     root           41 May 12 14:23 etc/modprobe.d/vfio.conf
# lsinitrd |grep vfio_iommu_type1
# lsinitrd |grep vfio_virqfd
Well for the love of Saint Peter, dracut isn't picking up my drivers!

Is this another ridiculously common case where I properly put a modification in the *.d directory and it gets ignored?
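On the question of whether a drop-in is being ignored: as far as the dracut.conf(5) man page goes, dracut reads /etc/dracut.conf plus files matching /etc/dracut.conf.d/*.conf, so a file in a differently named directory or without the .conf suffix would be skipped. A small sketch that prints the driver settings found in those two places follows; the function name `dracut_driver_settings` and its root argument are illustrative, and it does not look at the distribution defaults under /usr/lib.

```shell
# Print every add_drivers / force_drivers assignment found in the locations
# dracut reads under /etc, prefixed with the file it came from.
# $1: configuration root (hypothetical override; defaults to /etc)
dracut_driver_settings() {
    etc=${1:-/etc}
    for f in "$etc/dracut.conf" "$etc"/dracut.conf.d/*.conf; do
        [ -f "$f" ] || continue
        grep -H -E '^[[:space:]]*(add|force)_drivers\+?=' "$f" || true
    done
}
```

If the vfio line does not show up in this output, the file holding it is probably not somewhere dracut looks.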

Quantum`
Posts: 21
Joined: 2015/05/15 18:50:42

Re: Drivers in dracut.conf Not Loaded at Boot

Post by Quantum` » 2018/05/21 17:21:50

Ok I've followed the RHEL guide and the nVidia video and sound cards are getting passed through now.

The guest boots but it can't be using the nVidia card, as I've plugged HDMI into the mobo, not the nVidia card. And there's no video on the nVidia card. It must be using the Spice display. lspci -k shows the nvidia card is using the nouveau driver, yet there's no HDMI coming out of it.

I tried removing the Spice display but then it wouldn't boot complaining that there's no video. Well yes there is, in the PCI device.

Any ideas?

Also, does the nouveau driver work like this or should I install nVidia's proprietary driver?

Quantum`
Posts: 21
Joined: 2015/05/15 18:50:42

Re: Drivers in dracut.conf Not Loaded at Boot

Post by Quantum` » 2018/05/21 21:08:37

Now I've blacklisted nouveau and installed the nvidia driver using these instructions.

Host boots fine, guest boots fine to Spice in virt-manager. But lspci -k shows:

Code: Select all

...
08:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85b9
        Kernel modules: nouveau
09:00.0 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85b9
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
The sound card looks fine, but there's no driver installed for the nvidia. So I ran nvidia-xconfig. It duly created a complete /etc/X11/xorg.conf which set:

Code: Select all

Driver         "nvidia"
... so I thought everything will be fine now. Rebooted and nah, still no driver used for the nVidia. On the card's HDMI port is just a non-blinking underline cursor.
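Since xorg.conf only selects the X driver, a separate check is whether the nvidia kernel module is loaded at all on the machine where lspci was run. A minimal sketch, assuming the usual /proc/modules layout where each line starts with the module name followed by a space; the function name `module_loaded` is made up.

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
# $1: module name
# $2: listing file (hypothetical override; defaults to /proc/modules)
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}
```

For example, `module_loaded nvidia || echo "nvidia kernel module is not loaded"`; running the same check for nouveau shows whether the blacklist took effect.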

Here's /var/log/Xorg.0.log:

Code: Select all

[   989.535] (--) Log file renamed from "/var/log/Xorg.pid-39205.log" to "/var/log/Xorg.0.log"
[   989.535]
X.Org X Server 1.19.6
Release Date: 2017-12-20
[   989.535] X Protocol Version 11, Revision 0
[   989.535] Build Operating System:  4.15.3-300.fc27.x86_64
[   989.535] Current Operating System: Linux cygnus.darkmtter.org 4.16.9-300.fc28.x86_64 #1 SMP Thu May 1$
[   989.535] Kernel command line: BOOT_IMAGE=/vmlinuz-4.16.9-300.fc28.x86_64 root=/dev/mapper/fedora_cygn$
[   989.535] Build Date: 23 April 2018  06:16:50PM
[   989.535] Build ID: xorg-x11-server 1.19.6-8.fc28
[   989.535] Current version of pixman: 0.34.0
[   989.535]    Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
[   989.535] Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   989.535] (==) Log file: "/var/log/Xorg.0.log", Time: Mon May 21 14:10:18 2018
[   989.535] (==) Using config file: "/etc/X11/xorg.conf"
[   989.535] (==) Using config directory: "/etc/X11/xorg.conf.d"
[   989.535] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   989.535] (==) ServerLayout "Layout0"
[   989.535] (**) |-->Screen "Screen0" (0)
[   989.535] (**) |   |-->Monitor "Monitor0"
[   989.536] (**) |   |-->Device "Device0"
[   989.536] (**) |-->Input Device "Keyboard0"
[   989.536] (**) |-->Input Device "Mouse0"
[   989.536] (==) Automatically adding devices
[   989.536] (==) Automatically enabling devices
[   989.536] (==) Automatically adding GPU devices
[   989.536] (==) Automatically binding GPU devices
[   989.536] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   989.536] (==) FontPath set to:
        catalogue:/etc/X11/fontpath.d,
        built-ins
[   989.536] (==) ModulePath set to "/usr/lib64/xorg/modules"
[   989.536] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[   989.536] (WW) Disabling Keyboard0
[   989.536] (WW) Disabling Mouse0
[   989.536] (II) Loader magic: 0x825e00
[   989.536] (II) Module ABI versions:
[   989.536]    X.Org ANSI C Emulation: 0.4
[   989.536]    X.Org Video Driver: 23.0
[   989.536]    X.Org XInput driver : 24.1
[   989.536]    X.Org Server Extension : 10.0
[   989.537] (++) using VT number 1

[   989.537] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, dis$
[   989.537] (II) xfree86: Adding drm device (/dev/dri/card0)
[   989.545] (--) PCI:*(0:0:1:0) 1013:00b8:1af4:1100 rev 0, Mem @ 0x90000000/33554432, 0x9400a000/4096, B$
[   989.545] (--) PCI: (0:8:0:0) 10de:1c02:1043:85b9 rev 161, Mem @ 0x92000000/16777216, 0x800000000/2684$
[   989.546] (II) LoadModule: "glx"
[   989.546] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
[   989.547] (II) Module glx: vendor="X.Org Foundation"
[   989.547]    compiled for 1.19.6, module version = 1.0.0
[   989.547]    ABI class: X.Org Server Extension, version 10.0
[   989.547] (II) LoadModule: "nvidia"
[   989.547] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[   989.547] (II) Module nvidia: vendor="NVIDIA Corporation"
[   989.547]    compiled for 4.0.2, module version = 1.0.0
[   989.547]    Module class: X.Org Video Driver
[   989.547] (II) NVIDIA dlloader X Driver  390.48  Wed Mar 21 23:18:15 PDT 2018
[   989.547] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   989.547] (EE) No devices detected.
[   989.547] (EE)
Fatal server error:
[   989.547] (EE) no screens found(EE)
[   989.547] (EE)
Please consult the Fedora Project support
         at http://wiki.x.org
 for help.
[   989.548] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   989.548] (EE)
"No screens found"? Well it's right here in /etc/X11/xorg.conf:

Code: Select all

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 390.48  (mockbuild@buildvm-02.online.rpmfusion.net)  Thu Mar 29 09:52:35 CEST $

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
Has anyone been able to PCI passthrough a video card using KVM?
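For anyone reading a long Xorg.0.log like the one above, the error lines can be pulled out on their own. This sketch keeps only timestamped lines tagged (EE) that carry a message, which skips the marker legend near the top of the log and the empty (EE) lines; the function name `xorg_errors` is invented for illustration.

```shell
# Print timestamped (EE) lines that carry a message.
# $1: log file (defaults to /var/log/Xorg.0.log)
xorg_errors() {
    grep -E '^\[[ 0-9.]+\] \(EE\) .' "${1:-/var/log/Xorg.0.log}"
}
```

Against the log posted here, that should leave the "No devices detected." and "no screens found" lines, plus the pointer back to the log file.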
