vGPU Support

This guide shows how to set up Yaook with vGPU support.

Requirements

Before following this guide, make sure that the following prerequisites are met:

Intel or AMD CPU
NVIDIA GPU that supports vGPU

A complete list of supported GPUs can be found here.

Architecture Graph

1. If you want to deploy your Kubernetes cluster via yaook/k8s, you need to enable vGPU support there. To do so, set the following variables in config.toml. The template for this file is located under yaook/k8s/templates/config.template.toml. Note: yaook/k8s is not a strict requirement for yaook/operator.

# vGPU Support
[nvidia.vgpu]
driver_blob_url = ""   # set to the vGPU Manager location
manager_filename = ""  # set to the vGPU Manager file

In OpenStack, the Nova compute configuration must be customized. This requires the name of the folder that holds the selected vGPU configuration and in which the UUID (universally unique identifier) for the vGPU is created. How you find this folder depends on whether the GPU supports SR-IOV, so two cases must be distinguished.

GPU without SR-IOV support
1. Physical GPUs that support virtual GPUs expose mediated device types (mdev). You first need to determine the PCI address of the slot that the GPU is plugged into. A list with all vGPU types can be found here.
  $ lspci | grep NVIDIA
  82:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
2. Change to the mdev_supported_types directory of the GPU on which you want to create the vGPU and find the subdirectory that contains your chosen vGPU configuration. Replace "vgpu-type" with your chosen vGPU configuration.
  $ grep -l "vgpu-type" nvidia-*/name
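
The lookup in step 2 can also be wrapped in a small shell function. This is only a minimal sketch under stated assumptions: the function name find_vgpu_type_dir is hypothetical and not part of Yaook, and it relies on the name and available_instances files that the kernel's mdev framework exposes for each supported type.

```shell
#!/bin/sh
# Hypothetical helper (not part of Yaook): print every mdev type directory
# whose name file contains the given vGPU type, together with the number of
# instances that can still be created for it.
# $1: path to an mdev_supported_types directory
# $2: vGPU type name to search for (matched as a fixed string, not a regex)
find_vgpu_type_dir() {
    types_dir=$1
    vgpu_type=$2
    for name_file in "$types_dir"/nvidia-*/name; do
        # Skip the unexpanded glob when no nvidia-* directory exists.
        [ -f "$name_file" ] || continue
        if grep -qF -- "$vgpu_type" "$name_file"; then
            type_dir=$(dirname "$name_file")
            printf '%s available_instances=%s\n' \
                "$(basename "$type_dir")" \
                "$(cat "$type_dir/available_instances" 2>/dev/null)"
        fi
    done
}
```

For the GPU from step 1, a call could look like find_vgpu_type_dir /sys/class/mdev_bus/0000:82:00.0/mdev_supported_types "vgpu-type". Because the match is a substring match, a short type name may print more than one folder.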

GPU with SR-IOV support (Ampere architecture and newer)

Obtain the domain, bus, slot and function of the available virtual functions on the GPU:
$ ls -l /sys/bus/pci/devices/domain\:bus\:slot.function/ | grep virtfn

This example shows the output of this command for a physical GPU with domain 0000, bus 82, slot 00 and function 0.

$ ls -l /sys/bus/pci/devices/0000:82:00.0/ | grep virtfn
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn0 -> ../0000:82:00.4
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn1 -> ../0000:82:00.5
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn10 -> ../0000:82:01.6
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn11 -> ../0000:82:01.7
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn12 -> ../0000:82:02.0
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn13 -> ../0000:82:02.1
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn14 -> ../0000:82:02.2
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn15 -> ../0000:82:02.3
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn2 -> ../0000:82:00.6
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn3 -> ../0000:82:00.7
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn4 -> ../0000:82:01.0
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn5 -> ../0000:82:01.1
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn6 -> ../0000:82:01.2
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn7 -> ../0000:82:01.3
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn8 -> ../0000:82:01.4
lrwxrwxrwx 1 root root           0 Jul 25 07:57 virtfn9 -> ../0000:82:01.5

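Choose the virtual function on which you want to create the vGPU, change to its mdev_supported_types directory and find the subdirectory that contains your chosen vGPU configuration. Replace "vgpu-type" with your chosen vGPU configuration. A list with all vGPU types can be found here.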

$ cd /sys/class/mdev_bus/0000\:82\:00.4/mdev_supported_types/
$ grep -l "vgpu-type" nvidia-*/name
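
Since ls sorts the virtfn entries lexically (virtfn10 appears before virtfn2), it can help to list the virtual functions in numeric order before picking one. The following is a minimal sketch, not an official tool; the function name list_virtual_functions is hypothetical, and it only assumes the virtfnN symlinks shown in the example output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Yaook): list the virtual functions of a
# physical GPU in numeric order, one "virtfnN <PCI address>" pair per line.
# $1: sysfs directory of the physical function,
#     e.g. /sys/bus/pci/devices/0000:82:00.0
list_virtual_functions() {
    pf_dir=$1
    for link in "$pf_dir"/virtfn*; do
        # Skip the unexpanded glob when the device has no virtual functions.
        [ -L "$link" ] || continue
        # Each virtfnN symlink points at ../<domain:bus:slot.function>.
        printf '%s %s\n' "$(basename "$link")" "$(basename "$(readlink "$link")")"
    done | sort -k 1.7,1n   # sort by the number that follows "virtfn"
}
```

Each printed address can then be used in place of 0000:82:00.4 in the cd command above.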

The most important variable to set is enabled_vgpu_types in nova.yaml. The file is located under /docs/examples/nova.yaml. Here you select the vGPU configuration that your acquired license permits. Replace nvidia-233 with the name of the folder that contains your chosen vGPU configuration.
compute:
  configTemplates:
    - nodeSelectors:
        - matchLabels: {}
      novaComputeConfig:
        DEFAULT:
          debug: True
        devices:
          enabled_vgpu_types:
            - nvidia-233

The last step is to configure a flavor to request one virtual GPU.

$ openstack flavor set <FLAVOR_VGPU> --property "resources:VGPU=1"
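
If no suitable flavor exists yet, creating it and setting the property can be combined. This is a sketch under assumptions: the function name create_vgpu_flavor is hypothetical, and the flavor name and sizes in the usage note are illustrative values, not recommendations from the Yaook project.

```shell
#!/bin/sh
# Hypothetical helper (not part of Yaook): create a flavor and make it
# request one virtual GPU.
# $1: flavor name, $2: vCPUs, $3: RAM in MB, $4: disk in GB
create_vgpu_flavor() {
    flavor=$1
    # Only set the property if the flavor was created successfully.
    openstack flavor create --vcpus "$2" --ram "$3" --disk "$4" "$flavor" &&
        openstack flavor set "$flavor" --property "resources:VGPU=1"
}
```

A call such as create_vgpu_flavor vgpu_1 4 8192 40 would create a flavor named vgpu_1 with 4 vCPUs, 8192 MB of RAM and a 40 GB disk that requests one vGPU.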