Configure PCI-Passthrough Interface to NVIDIA GPU in a VM

This section provides instructions for configuring PCI-Passthrough interface to NVIDIA GPU Operator in a VM.

Procedure

  1. Source the platform environment.

    $ source /etc/platform/openrc
    ~(keystone_admin)$
    
  2. Lock controller-0 to enable the NVIDIA GPU device.

    ~(keystone_admin)$ system host-lock controller-0
    
    1. Verify that the NVIDIA GPU device is available in the table.

      ~(keystone_admin)$ system host-device-list controller-0 -a
      
    2. Enable the NVIDIA GPU device.

      ~(keystone_admin)$ system host-device-modify controller-0 <gpu_pci_address> --enable=True
      
  3. Unlock controller-0 to enable the NVIDIA GPU device.

    ~(keystone_admin)$ system host-unlock controller-0
    
  4. Add the NVIDIA GPU device information to the Nova overrides.

    1. Check the Nova Helm overrides.

      ~(keystone_admin)$ system helm-override-show stx-openstack nova openstack
    2. Check the conf.nova.pci.alias.values override. If your graphics card alias exists, for example, “name”: “nvidia-tesla-t4-vf”, check if the values in step, 2.b., are correct and proceed to step 5. If the values do not exist or are incorrect, update the Nova Helm overrides.

      1. Create a file containing the current conf.nova.pci.alias.values overrides and add the following additional entry in the values array for the NVIDIA device based on the values from step 2.b., for example:

        '{"vendor_id": *<vendor_id>*, "product_id": *<product_id>*, "device_type": "type-PF", "name": "nvidia-tesla-t4-pf"}'
        

        Where

        <vendor_id> is the ID of the vendor

        <product_id> is the ID of the product

      2. Save the .yaml file.

        conf:
          nova:
            pci:
              alias:
                values: ['{"vendor_id": "8086", "product_id": "0435", "name": "qat-dh895xcc-pf"}', '{"vendor_id": "8086", "product_id": "0443", "name": "qat-dh895xcc-vf"}', '{"vendor_id": "8086", "product_id": "37c8", "name": "qat-c62x-pf"}', '{"vendor_id": "8086", "product_id": "37c9", "name": "qat-c62x-vf"}', '{"name": "gpu"}', '{"vendor_id": "102b", "product_id": "0522", "name": "matrox-g200e"}', '{"vendor_id": "10de", "product_id": "13f2", "name": "nvidia-tesla-m60"}', '{"vendor_id": "10de", "product_id": "1b38", "name": "nvidia-tesla-p40"}', '{"vendor_id": "10de", "product_id": "1eb8", "device_type": "type-PF", "name": "nvidia-tesla-t4-pf"}']
        
      3. Upload the .yaml file to the platform and apply it.

        ~(keystone_admin)$ system helm-override-update stx-openstack nova openstack --reuse-values --values=your-override-file.yaml
      4. Apply the changes.

        ~(keystone_admin)$ system application-apply stx-openstack
  5. In OpenStack, add a new flavor for the NVIDIA GPU device, for example.

    # setup admin credentials for the containerized openstack application
    $ source /etc/platform/openrc
    ~(keystone_admin)$ OS_AUTH_URL=http://keystone.openstack.svc.cluster.local/v3
    # create new flavor with pci_passthrough:alias for nvidia device
    ~(keystone_admin)$ openstack flavor create --ram 8192 --vcpus 4 nvidiaT4gpu_8GB_v3 --property "pci_passthrough:alias"="nvidia-tesla-t4-pf:1" --property "hw:mem_page_size"="large"
    

    Note

    8 GB RAM, 4 VCPUs, and large memory page size are example values for GPU drivers’ system requirements. For valid system requirements for GPU drivers, see, https://www.nvidia.com/en-us/geforce/drivers/.

  6. In OpenStack, create a VM and test access to the NVIDIA GPU device.

    1. Create a new VM, using the newly created flavor in step 5.

    2. In the VM, install and test the CUDA drivers. See, https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html.