Message ID: 20240922124951.1946072-1-zhiw@nvidia.com (mailing list archive)
Series: Introduce NVIDIA GPU Virtualization (vGPU) Support
On Sun, 22 Sep 2024 05:49:22 -0700 Zhi Wang <zhiw@nvidia.com> wrote:

+Ben.

Forget to add you. My bad.

> 1. Background
> =============
>
> NVIDIA vGPU[1] software enables powerful GPU performance for workloads
> ranging from graphics-rich virtual workstations to data science and
> AI, enabling IT to leverage the management and security benefits of
> virtualization as well as the performance of NVIDIA GPUs required for
> modern workloads. Installed on a physical GPU in a cloud or enterprise
> data center server, NVIDIA vGPU software creates virtual GPUs that can
> be shared across multiple virtual machines.
>
> The vGPU architecture[2] can be illustrated as follows:
>
> +--------------------+  +--------------------+  +--------------------+  +--------------------+
> | Hypervisor         |  | Guest VM           |  | Guest VM           |  | Guest VM           |
> |                    |  | +----------------+ |  | +----------------+ |  | +----------------+ |
> | +----------------+ |  | |Applications... | |  | |Applications... | |  | |Applications... | |
> | |     NVIDIA     | |  | +----------------+ |  | +----------------+ |  | +----------------+ |
> | |  Virtual GPU   | |  | +----------------+ |  | +----------------+ |  | +----------------+ |
> | |    Manager     | |  | |  Guest Driver  | |  | |  Guest Driver  | |  | |  Guest Driver  | |
> | +------^---------+ |  | +----------------+ |  | +----------------+ |  | +----------------+ |
> |        |           |  +---------^----------+  +----------^---------+  +----------^---------+
> |        |           |            |                        |                       |
> |        |           +------------+------------------------+-----------------------+---------+
> |        |                        |                        |                       |         |
> +--------+------------------------+------------------------+-----------------------+---------+
> +--------v------------------------+------------------------+-----------------------+----------+
> | NVIDIA                +---------v----------+  +----------v---------+  +----------v---------+|
> | Physical GPU          |    Virtual GPU     |  |    Virtual GPU     |  |    Virtual GPU     ||
> |                       +--------------------+  +--------------------+  +--------------------+|
> +----------------------------------------------------------------------------------------------+
>
> Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed
> amount of GPU framebuffer, and one or more virtual display outputs or
> "heads". The vGPU's framebuffer is allocated out of the physical
> GPU's framebuffer at the time the vGPU is created, and the vGPU
> retains exclusive use of that framebuffer until it is destroyed.
>
> The number of physical GPUs that a board has depends on the board.
> Each physical GPU can support several different types of virtual GPU
> (vGPU). vGPU types have a fixed amount of frame buffer, a number of
> supported display heads, and maximum resolutions. They are grouped
> into different series according to the different classes of workload
> for which they are optimized. Each series is identified by the last
> letter of the vGPU type name.
>
> NVIDIA vGPU supports Windows and Linux guest VM operating systems. The
> supported vGPU types depend on the guest VM OS.
>
> 2. Proposal for upstream
> ========================
>
> 2.1 Architecture
> ----------------
>
> Moving to upstream, the proposed architecture can be illustrated as
> follows:
>
> +--------------------+  +--------------------+  +--------------------+
> | Linux VM           |  | Windows VM         |  | Guest VM           |
> | +----------------+ |  | +----------------+ |  | +----------------+ |
> | |Applications... | |  | |Applications... | |  | |Applications... | |
> | +----------------+ |  | +----------------+ |  | +----------------+ |  ...
> | +----------------+ |  | +----------------+ |  | +----------------+ |
> | |  Guest Driver  | |  | |  Guest Driver  | |  | |  Guest Driver  | |
> | +----------------+ |  | +----------------+ |  | +----------------+ |
> +---------^----------+  +----------^---------+  +----------^---------+
>           |                        |                       |
> +--------------------------------------------------------------------+
> |+--------------------+ +--------------------+ +--------------------+|
> || QEMU               | | QEMU               | | QEMU               ||
> ||                    | |                    | |                    ||
> |+--------------------+ +--------------------+ +--------------------+|
> +--------------------------------------------------------------------+
>           |                        |                       |
> +-----------------------------------------------------------------------------------------------+
> |                            +----------------------------------------------------------------+ |
> |                            |                              VFIO                              | |
> |                            |                                                                | |
> | +-----------------------+  |  +------------------------+ +---------------------------------+| |
> | |   Core Driver vGPU    |  |  |                        | |                                 || |
> | |   Support          <--------->  NVIDIA vGPU Manager  <-->  NVIDIA vGPU VFIO Variant      || |
> | |                       |  |  |                        | |             Driver              || |
> | +-----------------------+  |  +------------------------+ +---------------------------------+| |
> | |    NVIDIA GPU Core    |  |                                                                | |
> | |        Driver         |  |                                                                | |
> | +--------^--------------+  +----------------------------------------------------------------+ |
> |          |                        |                        |                       |          |
> +-----------------------------------------------------------------------------------------------+
>            |                        |                        |                       |
> +----------|------------------------|------------------------|-----------------------|----------+
> |          v             +----------v---------+   +----------v---------+   +---------v----------+|
> | NVIDIA                 |       PCI VF       |   |       PCI VF       |   |       PCI VF       ||
> | Physical GPU           |                    |   |                    |   |                    ||
> |                        |   (Virtual GPU)    |   |   (Virtual GPU)    |   |   (Virtual GPU)    ||
> |                        +--------------------+   +--------------------+   +--------------------+|
> +-------------------------------------------------------------------------------------------------+
>
> The initially supported GPU generation is Ada, which comes with the
> supported GPU architecture. Each vGPU is backed by a PCI virtual
> function.
>
> The NVIDIA vGPU VFIO module, together with VFIO, sits on the VFs and
> provides extended management and features, e.g. selecting the vGPU
> type, live migration, and driver warm update.
>
> As with other devices that VFIO supports, VFIO provides the standard
> userspace APIs for device lifecycle management and advanced feature
> support.
>
> The NVIDIA vGPU manager provides the necessary support to the NVIDIA
> vGPU VFIO variant driver to create/destroy vGPUs, query available vGPU
> types, select the vGPU type, etc.
>
> On the other side, the NVIDIA vGPU manager talks to the NVIDIA GPU
> core driver, which provides the support necessary to reach the HW
> functions.
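To make "sits on the VFs" concrete in VFIO terms: a variant driver built on
the vfio-pci core binds to the VF PCI IDs and layers device-specific
behavior over the generic vfio-pci ops. Below is a minimal, hypothetical
skeleton of that pattern; the nvidia_vgpu_* names are invented for
illustration (the RFC's actual vfio_main.c differs), while the vfio_* and
vfio_pci_core_* calls are the standard in-kernel API:

    /* Hypothetical skeleton of a VFIO PCI variant driver bound to the
     * vGPU VFs. The nvidia_vgpu_* names are illustrative only. */
    #include <linux/pci.h>
    #include <linux/vfio_pci_core.h>

    struct nvidia_vgpu_device {
    	struct vfio_pci_core_device core;	/* must be the first member */
    	/* per-vGPU state: selected type, vGPU manager handle, ... */
    };

    static int nvidia_vgpu_open_device(struct vfio_device *core_vdev)
    {
    	struct vfio_pci_core_device *vdev =
    		container_of(core_vdev, struct vfio_pci_core_device, vdev);
    	int ret;

    	ret = vfio_pci_core_enable(vdev);
    	if (ret)
    		return ret;

    	/* here the driver would ask the vGPU manager to boot the vGPU */

    	vfio_pci_core_finish_enable(vdev);
    	return 0;
    }

    static const struct vfio_device_ops nvidia_vgpu_vfio_ops = {
    	.name		= "nvidia-vgpu-vfio",
    	.init		= vfio_pci_core_init_dev,
    	.release	= vfio_pci_core_release_dev,
    	.open_device	= nvidia_vgpu_open_device,
    	.close_device	= vfio_pci_core_close_device,
    	.ioctl		= vfio_pci_core_ioctl,
    	.read		= vfio_pci_core_read,
    	.write		= vfio_pci_core_write,
    	.mmap		= vfio_pci_core_mmap,
    	.request	= vfio_pci_core_request,
    	.match		= vfio_pci_core_match,
    };

    static int nvidia_vgpu_probe(struct pci_dev *pdev,
    				 const struct pci_device_id *id)
    {
    	struct nvidia_vgpu_device *nvdev;
    	int ret;

    	nvdev = vfio_alloc_device(nvidia_vgpu_device, core.vdev,
    				  &pdev->dev, &nvidia_vgpu_vfio_ops);
    	if (IS_ERR(nvdev))
    		return PTR_ERR(nvdev);

    	dev_set_drvdata(&pdev->dev, &nvdev->core);
    	ret = vfio_pci_core_register_device(&nvdev->core);
    	if (ret)
    		vfio_put_device(&nvdev->core.vdev);
    	return ret;
    }

Teardown (vfio_pci_core_unregister_device() followed by vfio_put_device())
and the pci_driver/pci_device_id boilerplate are omitted for brevity.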
>
> 2.2 Requirements to the NVIDIA GPU core driver
> ----------------------------------------------
>
> The primary use case for CSPs and enterprises is a standalone, minimal
> set of drivers: the vGPU manager plus the other necessary components.
>
> The NVIDIA vGPU manager talks to the NVIDIA GPU core driver, which
> provides the necessary support to:
>
> - Load the GSP firmware, boot the GSP, and provide a communication
>   channel.
> - Manage the shared/partitioned HW resources, e.g. reserving FB
>   memory and channels for the vGPU manager to create vGPUs.
> - Handle exceptions, e.g. delivering GSP events to the vGPU manager.
> - Dispatch host events, e.g. suspend/resume.
> - Enumerate the HW configuration.
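The list above implies a fairly narrow interface between the vGPU manager
and the core driver. As a rough sketch only -- every name below is
hypothetical, and the RFC's real interface is the one exported through
include/drm/nvkm_vgpu_mgr_vfio.h -- it could be shaped like this:

    /* Hypothetical ops a GPU core driver could expose to the vGPU
     * manager; all names are invented for illustration. */
    #include <linux/types.h>

    struct nvidia_vgpu_gsp_client;	/* opaque GSP RM client handle */
    struct nvidia_vgpu_hw_config;	/* enumerated HW configuration */

    struct nvidia_gpu_core_ops {
    	/* GSP firmware: RM client lifecycle and a command channel */
    	int  (*alloc_gsp_client)(void *core,
    				 struct nvidia_vgpu_gsp_client **client);
    	void (*free_gsp_client)(struct nvidia_vgpu_gsp_client *client);
    	int  (*gsp_rm_ctrl)(struct nvidia_vgpu_gsp_client *client,
    			    u32 cmd, void *params, size_t size);

    	/* shared/partitioned HW resources */
    	int  (*reserve_fb)(void *core, u64 size, u64 *offset);
    	int  (*alloc_channels)(void *core, unsigned int count,
    			       unsigned int *base_chid);

    	/* exception handling: GSP events routed to the vGPU manager */
    	int  (*register_gsp_event)(void *core, u32 event,
    				   void (*handler)(void *data), void *data);

    	/* host event dispatch, e.g. suspend/resume */
    	int  (*notify_pm_event)(void *core, bool suspend);

    	/* enumeration of the HW configuration */
    	int  (*query_hw_config)(void *core,
    				struct nvidia_vgpu_hw_config *cfg);
    };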
>
> The NVIDIA GPU core driver, which sits on the PCI device interface of
> the NVIDIA GPU, provides support to both the DRM driver and the vGPU
> manager.
>
> In this RFC, the split nouveau GPU driver[3] is used as an example to
> demonstrate the requirements the vGPU manager places on the core
> driver. The nouveau driver is split into nouveau (the DRM driver) and
> nvkm (the core driver).
>
> 3 Try the RFC patches
> -----------------------
>
> The RFC supports creating one VM to test a simple GPU workload.
>
> - Host kernel:
>   https://github.com/zhiwang-nvidia/linux/tree/zhi/vgpu-mgr-rfc
> - Guest driver package: NVIDIA-Linux-x86_64-535.154.05.run [4]
>
> Install the guest driver:
>
>   # export GRID_BUILD=1
>   # ./NVIDIA-Linux-x86_64-535.154.05.run
>
> - Tested platforms: L40.
> - Tested guest OS: Ubuntu 24.04 LTS.
> - Supported experience: Linux rich desktop experience with a simple 3D
>   workload, e.g. glmark2.
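Since each vGPU is backed by a PCI VF and the series wires up
pci_driver.sriov_configure() in nvkm, creating the VFs on the host
presumably goes through the standard SR-IOV sysfs attribute before the
VFIO variant driver is bound to them. The BDF below is an example value,
not one taken from the RFC:

  # echo 1 > /sys/bus/pci/devices/0000:65:00.0/sriov_numvfs

The resulting VF can then be handed to QEMU through VFIO like any other
vfio-pci device.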
>
> 4 Demo
> ------
>
> A demo video can be found at: https://youtu.be/YwgIvvk-V94
>
> [1] https://www.nvidia.com/en-us/data-center/virtual-solutions/
> [2] https://docs.nvidia.com/vgpu/17.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
> [3] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/
> [4] https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
>
> Zhi Wang (29):
>   nvkm/vgpu: introduce NVIDIA vGPU support prelude
>   nvkm/vgpu: attach to nvkm as a nvkm client
>   nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
>   nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
>   nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
>   nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
>   nvkm/gsp: add a notify handler for GSP event
>     GPUACCT_PERFMON_UTIL_SAMPLES
>   nvkm/vgpu: get the size VMMU segment from GSP firmware
>   nvkm/vgpu: introduce the reserved channel allocator
>   nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
>   nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
>   nvkm/vgpu: introduce GSP RM control interface for vGPU
>   nvkm: move chid.h to nvkm/engine.
>   nvkm/vgpu: introduce channel allocation for vGPU
>   nvkm/vgpu: introduce FB memory allocation for vGPU
>   nvkm/vgpu: introduce BAR1 map routines for vGPUs
>   nvkm/vgpu: introduce engine bitmap for vGPU
>   nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
>   vfio/vgpu_mgr: introdcue vGPU lifecycle management prelude
>   vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
>   vfio/vgpu_mgr: introduce vGPU type uploading
>   vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
>   vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
>   vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
>   vfio/vgpu_mgr: map mgmt heap when creating a vGPU
>   vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
>   vfio/vgpu_mgr: bootload the new vGPU
>   vfio/vgpu_mgr: introduce vGPU host RPC channel
>   vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver
>
>  .../drm/nouveau/include/nvkm/core/device.h    |   3 +
>  .../drm/nouveau/include/nvkm/engine/chid.h    |  29 +
>  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   1 +
>  .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  45 ++
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  12 +
>  drivers/gpu/drm/nouveau/nvkm/Kbuild           |   1 +
>  drivers/gpu/drm/nouveau/nvkm/device/pci.c     |  33 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   |  49 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  26 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |   3 +
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  14 +-
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |   3 +
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 302 +++++++++++
>  .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 234 ++++++++
>  drivers/vfio/pci/Kconfig                      |   2 +
>  drivers/vfio/pci/Makefile                     |   2 +
>  drivers/vfio/pci/nvidia-vgpu/Kconfig          |  13 +
>  drivers/vfio/pci/nvidia-vgpu/Makefile         |   8 +
>  drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 +
>  .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
>  .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 152 ++++++
>  .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
>  .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 ++++++++
>  .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++
>  .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 +
>  .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
>  drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  94 ++++
>  drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 +++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio.h           |  43 ++
>  drivers/vfio/pci/nvidia-vgpu/vfio_access.c    | 297 ++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio_main.c      | 511 ++++++++++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu.c           | 352 ++++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       | 144 +++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  89 +++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++
>  include/drm/nvkm_vgpu_mgr_vfio.h              |  61 +++
>  37 files changed, 3702 insertions(+), 33 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
>  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/Makefile
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/debug.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/rpc.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_access.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_main.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c
>  create mode 100644 include/drm/nvkm_vgpu_mgr_vfio.h
>
> --
> 2.34.1
From: Zhi Wang <zhiw@nvidia.com>
Sent: Sunday, September 22, 2024 8:49 PM

[...]

> The NVIDIA vGPU manager provides the necessary support to the NVIDIA
> vGPU VFIO variant driver to create/destroy vGPUs, query available vGPU
> types, select the vGPU type, etc.
>
> On the other side, the NVIDIA vGPU manager talks to the NVIDIA GPU
> core driver, which provides the support necessary to reach the HW
> functions.

I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
It's very NVIDIA specific and naturally fits in the PF driver. The VFIO
side should focus on what's necessary for managing userspace access to
the VF hw, i.e. patch 29.
On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote:
> On Sun, 22 Sep 2024 05:49:22 -0700 Zhi Wang <zhiw@nvidia.com> wrote:
>
> +Ben.
>
> Forget to add you. My bad.

Please also add the driver maintainers!

I had to fetch the patchset from the KVM list, since they did not hit
the nouveau list (I'm trying to get @nvidia.com addresses whitelisted).

- Danilo

[...]
Hi Zhi,

Thanks for the very detailed cover letter.

On Sun, Sep 22, 2024 at 05:49:22AM -0700, Zhi Wang wrote:

[...]

> 2. Proposal for upstream
> ========================

What is the strategy in the mid / long term with this?

As you know, we're trying to move to Nova; the blockers with the
device / driver infrastructure have been resolved and we're able to
move forward. Besides that, Dave made great progress on the firmware
abstraction side of things.

Is this more of a proof of concept? Do you plan to work on Nova in
general and vGPU support for Nova?

[...]
On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > 2. Proposal for upstream > > ======================== > > What is the strategy in the mid / long term with this? > > As you know, we're trying to move to Nova and the blockers with the device / > driver infrastructure have been resolved and we're able to move forward. Besides > that, Dave made great progress on the firmware abstraction side of things. > > Is this more of a proof of concept? Do you plan to work on Nova in general and > vGPU support for Nova? This is intended to be a real product that customers would use, it is not a proof of concept. There is alot of demand for this kind of simplified virtualization infrastructure in the host side. The series here is the first attempt at making thin host infrastructure and Zhi/etc are doing it with an upstream-first approach. From the VFIO side I would like to see something like this merged in nearish future as it would bring a previously out of tree approach to be fully intree using our modern infrastructure. This is a big win for the VFIO world. As a commercial product this will be backported extensively to many old kernels and that is harder/impossible if it isn't exclusively in C. So, I think nova needs to co-exist in some way. Jason
On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:

[...]

> I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
> It's very NVIDIA specific and naturally fits in the PF driver.

drm isn't a particularly logical place for that either :|

Jason
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > Is this more of a proof of concept? Do you plan to work on Nova in
> > general and vGPU support for Nova?
>
> This is intended to be a real product that customers would use; it is
> not a proof of concept. There is a lot of demand for this kind of
> simplified virtualization infrastructure on the host side.

I see...

> The series here is the first attempt at making thin host
> infrastructure, and Zhi/etc are doing it with an upstream-first
> approach.

This is great!

> From the VFIO side I would like to see something like this merged in
> the nearish future, as it would bring a previously out-of-tree
> approach fully in-tree using our modern infrastructure. This is a
> big win for the VFIO world.
>
> As a commercial product this will be backported extensively to many
> old kernels, and that is harder/impossible if it isn't exclusively
> in C. So, I think Nova needs to co-exist in some way.

We'll surely not support two drivers for the same thing in the long
term; it neither makes sense, nor is it sustainable.

We have a lot of good reasons why we decided to move forward with Nova
as a successor of Nouveau for GSP-based GPUs in the long term -- I also
just gave a talk about this at LPC.

For the short/mid term I think it may be reasonable to start with
Nouveau, but this must be based on some agreements, for instance:

  - take responsibility, e.g. a commitment to help maintain some of
    NVKM / the NVIDIA GPU core (or whatever we want to call it) within
    Nouveau
  - a commitment to help with Nova in general and, once applicable,
    move the vGPU parts over to Nova

But I think the very last one naturally happens if we stop further
support for new HW in Nouveau at some point.
On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> > From the VFIO side I would like to see something like this merged in
> > the nearish future, as it would bring a previously out-of-tree
> > approach fully in-tree using our modern infrastructure. This is a
> > big win for the VFIO world.
> >
> > As a commercial product this will be backported extensively to many
> > old kernels, and that is harder/impossible if it isn't exclusively
> > in C. So, I think Nova needs to co-exist in some way.
>
> We'll surely not support two drivers for the same thing in the long
> term; it neither makes sense, nor is it sustainable.

What is being done here is the normal, correct kernel thing to do:
refactor the shared core code into a module and stick higher-level
stuff on top of it. Ideally Nova/Nouveau would exist as peers
implementing the DRM subsystem on this shared core infrastructure.

We've done this sort of thing before in other places in the kernel. It
has been proven to work well. So, I'm not sure why you think there
should be two drivers in the long term? Do you have some technical
reason why Nova can't fit into this modular architecture?

Regardless, assuming Nova will eventually propose merging duplicated
bootup code, then I suggest it should be able to fully replace the C
code with a Kconfig switch and provide C-compatible interfaces for
VFIO. When Rust is sufficiently mature we can consider a deprecation
schedule for the C version. I agree duplication doesn't make sense,
but if it is essential to make everyone happy then we should do it to
accommodate the ongoing Rust experiment.

> We have a lot of good reasons why we decided to move forward with
> Nova as a successor of Nouveau for GSP-based GPUs in the long term --
> I also just gave a talk about this at LPC.

I know, but this series is adding a VFIO driver to the kernel, and a
complete Nova driver doesn't even exist yet. It is fine to think about
future plans, but let's not get too far ahead of ourselves here..

> For the short/mid term I think it may be reasonable to start with
> Nouveau, but this must be based on some agreements, for instance:
>
>   - take responsibility, e.g. a commitment to help maintain some of
>     NVKM / the NVIDIA GPU core (or whatever we want to call it)
>     within Nouveau

I fully expect NVIDIA teams to own this core driver and the VFIO
parts. I see there are no changes to the MAINTAINERS file in this RFC;
that will need to be corrected.

>   - a commitment to help with Nova in general and, once applicable,
>     move the vGPU parts over to Nova

I think you will get help with Nova based on its own merit, but I
don't like where you are going with this. Linus has had negative
things to say about this sort of cross-linking and I agree with him.
We should not be trying to extract unrelated promises on Nova as a
condition for progressing a VFIO series. :\

> But I think the very last one naturally happens if we stop further
> support for new HW in Nouveau at some point.

I expect the core code would continue to support new HW going forward
to support the VFIO driver, even if nouveau doesn't use it, until Rust
reaches full ecosystem readiness for the server space. There are going
to be a lot of users of this code, so let's not rush to harm them,
please.

Fortunately there is no use case for DRM and VFIO to coexist in a
hypervisor, so this does not turn into such a technical problem as in
most other dual-driver situations.

Jason
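For reference, the kind of MAINTAINERS addition being asked for might look
like the entry below. The section name, list, and status are illustrative
guesses; only the file paths come from the series' diffstat:

  NVIDIA VGPU VFIO VARIANT DRIVER
  M:	Zhi Wang <zhiw@nvidia.com>
  L:	kvm@vger.kernel.org
  S:	Supported
  F:	drivers/vfio/pci/nvidia-vgpu/
  F:	include/drm/nvkm_vgpu_mgr_vfio.h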
On 23/09/2024 11.38, Danilo Krummrich wrote: > External email: Use caution opening links or attachments > > > On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote: >> On Sun, 22 Sep 2024 05:49:22 -0700 >> Zhi Wang <zhiw@nvidia.com> wrote: >> >> +Ben. >> >> Forget to add you. My bad. > > Please also add the driver maintainers! > > I had to fetch the patchset from the KVM list, since they did not hit the > nouveau list (I'm trying to get @nvidia.com addresses whitelisted). > > - Danilo > My bad. Will do in the next iteration. Weird...never thought this cound happen since alll my previous emails landed in the mailing list. Did you see any email discussion in the thread in the nouveau list? Feel free to let me know if I should send them again to nouveau list. Maybe it is also easier that you can pull the patches from my tree. Note that I will be on vacation until Oct 11th. Email reply might be slow. But I will read the emails in the mailing list. Thanks, Zhi. >> >> >>> 1. Background >>> ============= >>> >>> NVIDIA vGPU[1] software enables powerful GPU performance for workloads >>> ranging from graphics-rich virtual workstations to data science and >>> AI, enabling IT to leverage the management and security benefits of >>> virtualization as well as the performance of NVIDIA GPUs required for >>> modern workloads. Installed on a physical GPU in a cloud or enterprise >>> data center server, NVIDIA vGPU software creates virtual GPUs that can >>> be shared across multiple virtual machines. >>> >>> The vGPU architecture[2] can be illustrated as follow: >>> >>> +--------------------+ +--------------------+ >>> +--------------------+ +--------------------+ | Hypervisor | >>> | Guest VM | | Guest VM | | Guest VM >>> | | | | +----------------+ | | >>> +----------------+ | | +----------------+ | | +----------------+ | >>> | |Applications... | | | |Applications... | | | |Applications... | | >>> | | NVIDIA | | | +----------------+ | | +----------------+ >>> | | +----------------+ | | | Virtual GPU | | | >>> +----------------+ | | +----------------+ | | +----------------+ | | >>> | Manager | | | | Guest Driver | | | | Guest Driver | | >>> | | Guest Driver | | | +------^---------+ | | +----------------+ >>> | | +----------------+ | | +----------------+ | | | >>> | +---------^----------+ +----------^---------+ >>> +----------^---------+ | | | | >>> | | | | >>> +--------------+-----------------------+----------------------+---------+ >>> | | | | >>> | | | | | >>> | | | >>> +--------+--------------------------+-----------------------+----------------------+---------+ >>> +---------v--------------------------+-----------------------+----------------------+----------+ >>> | NVIDIA +----------v---------+ >>> +-----------v--------+ +-----------v--------+ | | Physical GPU >>> | Virtual GPU | | Virtual GPU | | Virtual GPU >>> | | | +--------------------+ >>> +--------------------+ +--------------------+ | >>> +----------------------------------------------------------------------------------------------+ >>> >>> Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed >>> amount of GPU framebuffer, and one or more virtual display outputs or >>> "heads". The vGPU’s framebuffer is allocated out of the physical >>> GPU’s framebuffer at the time the vGPU is created, and the vGPU >>> retains exclusive use of that framebuffer until it is destroyed. >>> >>> The number of physical GPUs that a board has depends on the board. >>> Each physical GPU can support several different types of virtual GPU >>> (vGPU). 
On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> > > From the VFIO side I would like to see something like this merged in
> > > nearish future as it would bring a previously out of tree approach to
> > > be fully intree using our modern infrastructure. This is a big win for
> > > the VFIO world.
> > >
> > > As a commercial product this will be backported extensively to many
> > > old kernels and that is harder/impossible if it isn't exclusively in
> > > C. So, I think nova needs to co-exist in some way.
> >
> > We'll surely not support two drivers for the same thing in the long term,
> > neither does it make sense, nor is it sustainable.
>
> What is being done here is the normal correct kernel thing to
> do. Refactor the shared core code into a module and stick higher level
> stuff on top of it. Ideally Nova/Nouveau would exist as peers
> implementing the DRM subsystem on this shared core infrastructure. We've
> done this sort of thing before in other places in the kernel. It has
> been proven to work well.

So, that's where you have the wrong understanding of what we're working on: You
seem to think that Nova is just another DRM subsystem layer on top of the NVKM
parts (what you call the core driver) of Nouveau.

But the whole point of Nova is to replace the NVKM parts of Nouveau, since
that's where the problems we want to solve reside.

> So, I'm not sure why you think there should be two drivers in the long
> term? Do you have some technical reason why Nova can't fit into this
> modular architecture?

Like I said above, the whole point of Nova is to be the core driver; the DRM
parts on top are more like "the icing on the cake".

> Regardless, assuming Nova will eventually propose merging duplicated
> bootup code then I suggest it should be able to fully replace the C
> code with a kconfig switch and provide C compatible interfaces for
> VFIO. When Rust is sufficiently mature we can consider a deprecation
> schedule for the C version.
>
> I agree duplication doesn't make sense, but if it is essential to make
> everyone happy then we should do it to accommodate the ongoing Rust
> experiment.
>
> > We have a lot of good reasons why we decided to move forward with Nova as a
> > successor of Nouveau for GSP-based GPUs in the long term -- I also just held a
> > talk about this at LPC.
>
> I know, but this series is adding a VFIO driver to the kernel, and a

I have no concerns regarding the VFIO driver, this is about the new features
that you intend to add to Nouveau.

> complete Nova driver doesn't even exist yet. It is fine to think about
> future plans, but let's not get too far ahead of ourselves here.

Well, that's true, but we can't just add new features to something that has
been agreed to be replaced without having a strategy for the successor.

> > For the short/mid term I think it may be reasonable to start with
> > Nouveau, but this must be based on some agreements, for instance:
> >
> > - take responsibility, e.g. commitment to help with maintenance of some of
> >   NVKM / NVIDIA GPU core (or whatever we want to call it) within Nouveau
>
> I fully expect NVIDIA teams to own this core driver and VFIO parts. I
> see there are no changes to the MAINTAINERS file in this RFC, that
> will need to be corrected.

Well, I did not say to just take over the biggest part of Nouveau.
Currently - and please correct me if I'm wrong - you make it sound to me as if
you're not willing to respect the decisions that have been taken by Nouveau and
DRM maintainers.

> > - commitment to help with Nova in general and, once applicable, move the vGPU
> >   parts over to Nova
>
> I think you will get help with Nova based on its own merit, but I
> don't like where you are going with this. Linus has had negative
> things to say about this sort of cross-linking and I agree with
> him. We should not be trying to extract unrelated promises on Nova as
> a condition for progressing a VFIO series. :\

No cross-linking, no unrelated promises. Again, we're working on a successor of
Nouveau and if we keep adding features to Nouveau in the meantime, we have to
have a strategy for the transition, otherwise we're effectively just ignoring
this decision.

So, I really need you to respect the fact that there has been a decision for a
successor and that this *is* in fact relevant for all major changes to Nouveau
as well. Once you do this, we get the chance to work things out for the
short/mid term and for the long term and make everyone benefit.

I'm encouraged that NVIDIA wants to move things upstream and I'm absolutely
willing to collaborate and help with the use-cases and goals NVIDIA has. But it
really has to be a collaboration and this starts with acknowledging *each
other's* goals.

> > But I think the very last one naturally happens if we stop further support for
> > new HW in Nouveau at some point.
>
> I expect the core code would continue to support new HW going forward
> to support the VFIO driver, even if nouveau doesn't use it, until Rust
> reaches some full ecosystem readiness for the server space.

From an upstream perspective the kernel doesn't need to consider OOT drivers,
i.e. the guest driver. This doesn't mean that we can't work something out for a
seamless transition though. But again, this can only really work if we
acknowledge each other's goals.

> There are going to be a lot of users of this code, let's not rush to
> harm them please.

Please abstain from such unconstructive insinuations; it's ridiculous to imply
that upstream kernel developers and maintainers would harm the users of NVIDIA
GPUs.

> Fortunately there is no use case for DRM and VFIO to coexist in a
> hypervisor, so this does not turn into such a technical problem like
> most other dual-driver situations.
>
> Jason
On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> > [snip]
> >
> > What is being done here is the normal correct kernel thing to
> > do. Refactor the shared core code into a module and stick higher level
> > stuff on top of it. Ideally Nova/Nouveau would exist as peers
> > implementing the DRM subsystem on this shared core infrastructure. We've
> > done this sort of thing before in other places in the kernel. It has
> > been proven to work well.
>
> So, that's where you have the wrong understanding of what we're working on: You
> seem to think that Nova is just another DRM subsystem layer on top of the NVKM
> parts (what you call the core driver) of Nouveau.
>
> But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> that's where the problems we want to solve reside.

Just to re-emphasise for Jason who might not be as across this stuff,

NVKM replacement with rust is the main reason for the nova project;
100% the driving force for nova is the unstable NVIDIA firmware API, and
the ability to use rust proc-macros to hide the NVIDIA instability
instead of trying to do it in C by either generators or abusing C
macros (which I don't think are sufficient).

The lower level nvkm driver needs to start being in rust before we can
add support for newer stuff.

Now there is possibly some scope for evolving the rust pieces in it,
e.g. rust wrapped in C APIs to make things easier for backports or to
avoid some pitfalls, but that is a discussion that we need to have here.

I think the idea of a nova drm and nova core driver architecture is
acceptable to most of us, but long term trying to maintain a nouveau-based
nvkm is definitely not acceptable due to the unstable firmware APIs.

Dave.
On Wed, Sep 25, 2024 at 08:52:32AM +1000, Dave Airlie wrote:
> On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich <dakr@kernel.org> wrote:
> > [snip]
> >
> > So, that's where you have the wrong understanding of what we're
> > working on: You seem to think that Nova is just another DRM
> > subsystem layer on top of the NVKM parts (what you call the core
> > driver) of Nouveau.

Well, no, what I am calling a core driver is the very minimal parts that
are actually shared between vfio and drm. It should definitely not
include key parts you want to work on in rust, like the command
marshaling.

I expect there is more work to do in order to make this kind of split,
but this is what I'm thinking/expecting.

> > But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> > that's where the problems we want to solve reside.
>
> Just to re-emphasise for Jason who might not be as across this stuff,
>
> NVKM replacement with rust is the main reason for the nova project;
> 100% the driving force for nova is the unstable NVIDIA firmware API, and
> the ability to use rust proc-macros to hide the NVIDIA instability
> instead of trying to do it in C by either generators or abusing C
> macros (which I don't think are sufficient).

I would not include any of this in the very core most driver. My
thinking is informed by what we've done in RDMA, particularly mlx5,
which has a pretty thin PCI driver, and each of the drivers stacked on
top forms its own command buffers directly. The PCI driver primarily
just does some device bootup, command execution and interrupts, because
those are all shared by the subsystem drivers.

We have a lot of experience now building these kinds of
multi-subsystem structures and this pattern works very well.

So, broadly, build your rust proc macros on the DRM Nova driver and
call a core function to submit a command buffer to the device and get
back a response.

VFIO will make its command buffers with C and call the same core
function.

> I think the idea of a nova drm and nova core driver architecture is
> acceptable to most of us, but long term trying to maintain a nouveau-based
> nvkm is definitely not acceptable due to the unstable firmware APIs.

? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
seem acceptable? We need to keep rust isolated to DRM for the
foreseeable future.
Just need to find a separation that can do that. Jason
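For concreteness, the mlx5-style split described above would reduce to a very
small C surface that both subsystem drivers link against. A minimal sketch of
what such a shared core interface could look like; all of the names here
(nvcore_*) are hypothetical and not taken from the posted series:

```c
/*
 * Hypothetical shared core interface, modeled on the mlx5-style split:
 * the core module owns PCI probe, firmware boot and interrupts, and
 * exposes a single command-execution primitive. Subsystem drivers
 * (DRM, VFIO) marshal their own command buffers and call into it.
 */
#include <linux/device.h>
#include <linux/types.h>

struct nvcore_device;			/* opaque to subsystem drivers */

struct nvcore_cmd {
	const void *in;		/* marshalled request, built by the caller */
	size_t in_len;
	void *out;		/* response buffer, decoded by the caller */
	size_t out_len;
};

/*
 * Submit one command to the device firmware and wait for the reply.
 * The core only transports the buffer; it knows nothing about the
 * firmware-version-specific layout of the command itself.
 */
int nvcore_cmd_exec(struct nvcore_device *ndev, struct nvcore_cmd *cmd);

/* Resolve the core handle from the child device a subsystem driver binds to. */
struct nvcore_device *nvcore_from_dev(struct device *dev);
```

In this model all of the firmware-version-specific marshalling lives in the
callers: Rust proc-macros in a Nova DRM driver, plain C in the VFIO variant
driver, which is the property the mail above is arguing for.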
> Well, no, what I am calling a core driver is the very minimal parts that
> are actually shared between vfio and drm. It should definitely not
> include key parts you want to work on in rust, like the command
> marshaling.

Unfortunately not, the fw ABI is the unsolved problem; rust is our best
solution.

> I expect there is more work to do in order to make this kind of split,
> but this is what I'm thinking/expecting.
>
> > > But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> > > that's where the problems we want to solve reside.
> >
> > Just to re-emphasise for Jason who might not be as across this stuff,
> >
> > NVKM replacement with rust is the main reason for the nova project;
> > 100% the driving force for nova is the unstable NVIDIA firmware API, and
> > the ability to use rust proc-macros to hide the NVIDIA instability
> > instead of trying to do it in C by either generators or abusing C
> > macros (which I don't think are sufficient).
>
> I would not include any of this in the very core most driver. My
> thinking is informed by what we've done in RDMA, particularly mlx5,
> which has a pretty thin PCI driver, and each of the drivers stacked on
> top forms its own command buffers directly. The PCI driver primarily
> just does some device bootup, command execution and interrupts, because
> those are all shared by the subsystem drivers.
>
> We have a lot of experience now building these kinds of
> multi-subsystem structures and this pattern works very well.
>
> So, broadly, build your rust proc macros on the DRM Nova driver and
> call a core function to submit a command buffer to the device and get
> back a response.
>
> VFIO will make its command buffers with C and call the same core
> function.
>
> > I think the idea of a nova drm and nova core driver architecture is
> > acceptable to most of us, but long term trying to maintain a nouveau-based
> > nvkm is definitely not acceptable due to the unstable firmware APIs.
>
> ? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
> seem acceptable? We need to keep rust isolated to DRM for the
> foreseeable future. Just need to find a separation that can do that.

That isn't going to happen; if we start with that as the default
positioning it won't get us very far.

The core has to be rust, because NVIDIA has an unstable firmware API.
The unstable firmware API isn't some command marshalling, it's deep
down into the depths of it, like memory sizing requirements, base
message queue layout and encoding, firmware init procedures. These are
all changeable at any time with no regard for upstream development, so
upstream development needs to be insulated from these as much as
possible. Using rust provides that insulation layer.

The unstable ABI isn't a solvable problem in the short term; using rust
is the maintainable answer.

Now there are maybe some on/off ramps we can use here that might
provide some solutions to bridge the gap. Using rust in the kernel has
various levels, which we currently tie into one place, but if we
consider different longer term progressions it might be possible to
start with some rust that is easier to backport than other rust might
be, etc. Strategies for moving nvkm core from C to rust in steps, or
along a sliding scale of supported firmware versions, could be open for
discussion.
The end result, though, is to have nova core and nova drm in rust; that is the
decision upstream made 6-12 months ago, and I don't see that any of the initial
reasons for using rust have been invalidated or removed in a way that warrants
revisiting that decision.

Dave.
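To illustrate what "deep down into the depths" means here: when even init-time
parameters such as heap sizing or message-queue layout can move between
firmware releases, C code ends up dispatching on the firmware version at every
touch point. A contrived sketch of that churn; the version numbers, struct
layouts and sizing rules below are invented for illustration and are not the
real GSP ABI:

```c
/*
 * Contrived illustration of per-firmware-release churn; nothing here
 * is the real GSP ABI. Each release may change layouts, sizes and
 * init ordering, and plain C has to branch on the version everywhere
 * the firmware is touched.
 */
#include <linux/sizes.h>
#include <linux/types.h>

struct msgq_hdr_v535 {			/* invented layout */
	u32 write_ptr;
	u32 read_ptr;
	u32 entry_size;
};

struct msgq_hdr_v550 {			/* same concept, shuffled layout */
	u64 seq;
	u32 read_ptr;
	u32 write_ptr;
};

static size_t gsp_heap_size(u32 fw_version, bool vgpu_enabled)
{
	/* Sizing rules are firmware-defined and move between releases. */
	switch (fw_version) {
	case 535:
		return vgpu_enabled ? SZ_256M : SZ_64M;
	case 550:
		return SZ_128M;		/* new release, new rule */
	default:
		return 0;		/* unknown firmware: refuse to guess */
	}
}
```

The proc-macro approach argued for above generates such per-version accessors
from a description of each firmware drop, instead of hand-maintaining branches
like these across the whole driver.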
On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
> Currently - and please correct me if I'm wrong - you make it sound to me as if
> you're not willing to respect the decisions that have been taken by Nouveau and
> DRM maintainers.

I've never said anything about your work, go do Nova, have fun.

I'm just not agreeing to being forced into taking Rust dependencies in
VFIO because Nova is participating in the Rust Experiment.

I think the reasonable answer is to accept some code duplication, or
try to consolidate around a small C core. I understand this is
different than you may have planned so far for Nova, but all projects
are subject to community feedback, especially when faced with new
requirements.

I think this discussion is getting a little overheated, there is lots
of space here for everyone to do their things. Let's not get too
excited.

> I'm encouraged that NVIDIA wants to move things upstream and I'm absolutely
> willing to collaborate and help with the use-cases and goals NVIDIA has. But it
> really has to be a collaboration and this starts with acknowledging *each
> other's* goals.

I've always acknowledged Nova's goal - it is fine.

It is just quite incompatible with the VFIO side requirement of no
Rust in our stack until the ecosystem can consume it.

I believe there is no reason we can't find an agreeable compromise.

> > I expect the core code would continue to support new HW going forward
> > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > reaches some full ecosystem readiness for the server space.
>
> From an upstream perspective the kernel doesn't need to consider OOT drivers,
> i.e. the guest driver.

?? VFIO already took the decision that it is agnostic to what is
running in the VM. Run Windows-only VMs for all we care, it is still
supposed to be virtualized correctly.

> > There are going to be a lot of users of this code, let's not rush to
> > harm them please.
>
> Please abstain from such unconstructive insinuations; it's ridiculous to
> imply that upstream kernel developers and maintainers would harm the users of
> NVIDIA GPUs.

You literally just said you'd want to effectively block usable VFIO
support for new GPU HW when "we stop further support for new HW in
Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

I don't agree to that, it harms VFIO users, and is not acknowledging
that conflicting goals exist.

VFIO will decide when it starts to depend on rust, Nova should not
force that decision on VFIO. They are very different ecosystems with
different needs.

Jason
On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
>
> I've never said anything about your work, go do Nova, have fun.
>
> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
>
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.
>
> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.

How do you intend to solve the stable ABI problem caused by the GSP firmware?

If you haven't got an answer to that (which is reasonable), then you can talk
about VFIO and DRM and who is in charge all you like, but it doesn't matter.
Fundamentally the problem is that the unstable API exposure isn't something
you can build a castle on top of; the nova idea is to use rust to solve a
fundamental problem that the NVIDIA driver design process forces on us (vfio
included). I'm not seeing anything constructive as to why doing that in C
would be worth the investment.

Nothing has changed from when we designed nova; this idea was on the table
then, and it has all sorts of problems leaking the unstable ABI that have to
be solved. When I see a solution for that in C that is maintainable and
doesn't leak like a sieve I might be interested. But please stop thinking we
are using rust so we can have fun; we are using it to solve maintainability
problems caused by an internal NVIDIA design decision over which we have zero
influence.

Dave.
On Wed, Sep 25, 2024 at 10:18:44AM +1000, Dave Airlie wrote:
> > ? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
> > seem acceptable? We need to keep rust isolated to DRM for the
> > foreseeable future. Just need to find a separation that can do that.
>
> That isn't going to happen; if we start with that as the default
> positioning it won't get us very far.

What do you want me to say to that? We can't have rust in VFIO right
now, we don't have that luxury. This is just a fact, I can't change it.

If you say upstream has to be rust then there just won't be upstream
and this will all go OOT and stay as C code. That isn't a good outcome.

Having rust usage actively harm participation in the kernel seems like
the exact opposite of the consensus of the maintainer summit.

> The core has to be rust, because NVIDIA has an unstable firmware API.
> The unstable firmware API isn't some command marshalling, it's deep
> down into the depths of it, like memory sizing requirements, base
> message queue layout and encoding, firmware init procedures.

I get the feeling the vast majority of the work, and the primary rust
benefit, lies in the command marshalling. If the init *procedures*
change, for instance, you are going to have to write branches no matter
what language you use.

I don't know, it is just a suggestion to consider.

> Now there are maybe some on/off ramps we can use here that might
> provide some solutions to bridge the gap. Using rust in the kernel has
> various levels, which we currently tie into one place, but if we
> consider different longer term progressions it might be possible to
> start with some rust that is easier to backport than other rust might
> be, etc.

That seems to be entirely unexplored territory. Certainly if the
backporting can be shown to be solved then I have much less objection
to having VFIO depend on rust.

This is part of why I suggested that a rust core driver could expose
the C APIs that VFIO needs with a kconfig switch. Then people can
experiment and give feedback on what backporting this rust stuff is
actually like. That would be valuable for everyone I think. Especially
if the feedback is that backporting is no problem.

Yes we have duplication while that is ongoing, but I think that is
inevitable, and at least everyone could agree to the duplication and I
expect NVIDIA would sign up to maintain the C VFIO stack top to bottom.

> The end result, though, is to have nova core and nova drm in rust; that
> is the decision upstream made 6-12 months ago, and I don't see that any
> of the initial reasons for using rust have been invalidated or removed
> in a way that warrants revisiting that decision.

Never said they did, but your decision to use Rust in Nova does not
automatically mean a decision to use Rust in VFIO, and now we have a
new requirement to couple the two together. It still must be resolved
satisfactorily.

Jason
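The kconfig suggestion above amounts to freezing a small C header that VFIO
codes against and letting either core satisfy it. A sketch of the idea; the
symbol names and the CONFIG option are made up for illustration:

```c
/*
 * Hypothetical C-visible core API that a VFIO variant driver codes
 * against. Either the C core (nvkm) or a Rust core built under a
 * made-up CONFIG_NVGPU_CORE_RUST=y would export these symbols with C
 * linkage; the VFIO side never sees which implementation is built.
 */
#include <linux/pci.h>
#include <linux/types.h>

struct nvgpu_core;		/* opaque, owned by whichever core is built */

struct nvgpu_core *nvgpu_core_get(struct pci_dev *pdev);
void nvgpu_core_put(struct nvgpu_core *core);

/* vGPU lifecycle, mirroring what the RFC's vGPU manager needs. */
int nvgpu_core_vgpu_create(struct nvgpu_core *core, u32 vf_id, u32 type_id);
void nvgpu_core_vgpu_destroy(struct nvgpu_core *core, u32 vf_id);
```

Backporting feedback could then be gathered on the Rust implementation while
the C one keeps shipping, which is exactly the experiment proposed above.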
On Tue, Sep 24, 2024 at 09:53:19PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
>
> I've never said anything about your work, go do Nova, have fun.

See, that's the attitude that doesn't get us anywhere. You act as if we'd just
be toying around to have fun, position yourself as the one who wants to do the
"real deal" and just claim that our decisions would harm users. And at the same
time you prove that you did not get up to speed on the reasons for moving in
this direction and the problems we try to solve.

This just won't lead to a constructive discussion that addresses your concerns.

Try not to go at this like a bull at a gate. Instead, start with asking
questions to understand why we chose this direction and then feel free to raise
concerns. I assure you, we will hear and recognize them! And I'm also sure that
we'll find solutions and compromises.

> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
>
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.

Fully agree, and I'm absolutely open to considering feedback and new
requirements. But again, consider what I said above -- you're creating
counterproposals out of thin air, without considering what we have planned so
far at all. So, I wonder what kind of reaction you expect when approaching
things this way?

> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.
>
> > I'm encouraged that NVIDIA wants to move things upstream and I'm absolutely
> > willing to collaborate and help with the use-cases and goals NVIDIA has. But
> > it really has to be a collaboration and this starts with acknowledging *each
> > other's* goals.
>
> I've always acknowledged Nova's goal - it is fine.
>
> It is just quite incompatible with the VFIO side requirement of no
> Rust in our stack until the ecosystem can consume it.
>
> I believe there is no reason we can't find an agreeable compromise.

I'm pretty sure we indeed can find an agreeable compromise. But again, please
understand that the way of approaching this you've chosen so far won't get us
there.

> > > I expect the core code would continue to support new HW going forward
> > > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > > reaches some full ecosystem readiness for the server space.
> >
> > From an upstream perspective the kernel doesn't need to consider OOT drivers,
> > i.e. the guest driver.
>
> ?? VFIO already took the decision that it is agnostic to what is
> running in the VM. Run Windows-only VMs for all we care, it is still
> supposed to be virtualized correctly.
>
> > > There are going to be a lot of users of this code, let's not rush to
> > > harm them please.
> >
> > Please abstain from such unconstructive insinuations; it's ridiculous to
> > imply that upstream kernel developers and maintainers would harm the users of
> > NVIDIA GPUs.
> You literally just said you'd want to effectively block usable VFIO
> support for new GPU HW when "we stop further support for new HW in
> Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

Well, working on a successor means that once it's in place the support for the
replaced thing has to end at some point. This doesn't mean that we can't work
out ways to address your concerns.

You just make it a binary thing and claim that if we don't choose option 1 we
harm users. This effectively forecloses looking for solutions to your concerns
in the first place. And again, this won't get us anywhere. It just creates the
impression that you're not interested in solutions, but in pushing through your
agenda.

> I don't agree to that, it harms VFIO users, and is not acknowledging
> that conflicting goals exist.
>
> VFIO will decide when it starts to depend on rust, Nova should not
> force that decision on VFIO. They are very different ecosystems with
> different needs.
>
> Jason
On Wed, Sep 25, 2024 at 11:08:40AM +1000, Dave Airlie wrote:
> On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > [snip]
> >
> > I think this discussion is getting a little overheated, there is lots
> > of space here for everyone to do their things. Let's not get too
> > excited.
>
> How do you intend to solve the stable ABI problem caused by the GSP firmware?
>
> If you haven't got an answer to that (which is reasonable), then you can talk
> about VFIO and DRM and who is in charge all you like, but it doesn't matter.

I suggest the same answer everyone else shipping HW support in the
kernel operates under. You get to update your driver with your new HW
once per generation. Not once per FW release, once per generation.

That is a similar level of burden to maintain as most drivers. It is
not as good as the excellence Mellanox does (no SW change for a new HW
generation), but it is still good.

I would apply this logic to Nova as well, no reason to be supporting
random ABI changes coming out every month(s).

> Fundamentally the problem is that the unstable API exposure isn't something
> you can build a castle on top of; the nova idea is to use rust to
> solve a fundamental problem that the NVIDIA driver design process
> forces on us (vfio included),

I firmly believe you can't solve a stable ABI problem with language
features in an OS. The ABI is totally unstable, it will change
semantically, the order and nature of functions you need will change.
New HW will need new behaviors and semantics. Language support can
certainly handle the mindless churn that ideally shouldn't even be
happening in the first place.

The way you solve this is at the root, in the FW. Don't churn
everything. I'm a big believer and supporter of the Mellanox
super-stable approach that has really proven how valuable this concept
is to everyone.

So I agree with you, the extreme unstableness is not OK in upstream, it
needs to slow down a lot to be acceptable. I don't necessarily agree
that a Mellanox-like gold standard is the bar, but it certainly must be
way better than it is now.

FWIW when I discussed the VFIO patches I was given the impression there
would not be high levels of ABI churn on the VFIO side, and that there
was awareness and understanding of this issue on Zhi's side.

Jason
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, September 23, 2024 11:02 PM
>
> On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:
> > > From: Zhi Wang <zhiw@nvidia.com>
> > > Sent: Sunday, September 22, 2024 8:49 PM
> > >
> > [...]
> > >
> > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > > extended management and features, e.g. selecting the vGPU types, support
> > > live migration and driver warm update.
> > >
> > > Like other devices that VFIO supports, VFIO provides the standard
> > > userspace APIs for device lifecycle management and advance feature
> > > support.
> > >
> > > The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU
> > > VFIO variant driver to create/destroy vGPUs, query available vGPU types,
> > > select the vGPU type, etc.
> > >
> > > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core
> > > driver, which provides necessary support to reach the HW functions.
> > >
> >
> > I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
> > It's very NVIDIA specific and naturally fits in the PF driver.
>
> drm isn't a particularly logical place for that either :|

This RFC doesn't expose any new uAPI in the vGPU manager, e.g. with the vGPU
type hard-coded to L40-24Q. In this way the boundary between code in VFIO and
code in the PF driver is probably more a vendor-specific choice.

However, according to the cover letter it's a reasonable future extension to
implement new uAPI for the admin to select the vGPU type and potentially do
more manual configuration before the target VF can be used.

Then there comes an open question whether VFIO is the right place to host such
a vendor-specific provisioning interface. The existing mdev-type-based
provisioning mechanism was considered a bad fit already. IIRC the previous
discussion came to suggest putting the provisioning interface in the PF
driver. There may be a chance to generalize it and move it to VFIO, but there
is no telling what that will look like until multiple drivers have
demonstrated their own implementations as the base for discussion.

But now it seems you prefer vendors putting their own provisioning interfaces
in VFIO directly?

Thanks
Kevin
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > 2. Proposal for upstream
> > > ========================
> >
> > What is the strategy in the mid / long term with this?
> >
> > As you know, we're trying to move to Nova and the blockers with the device /
> > driver infrastructure have been resolved and we're able to move forward. Besides
> > that, Dave made great progress on the firmware abstraction side of things.
> >
> > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > vGPU support for Nova?
>
> This is intended to be a real product that customers would use, it is
> not a proof of concept. There is a lot of demand for this kind of
> simplified virtualization infrastructure in the host side. The series
> here is the first attempt at making thin host infrastructure and
> Zhi/etc are doing it with an upstream-first approach.
>
> From the VFIO side I would like to see something like this merged in
> nearish future as it would bring a previously out of tree approach to
> be fully intree using our modern infrastructure. This is a big win for
> the VFIO world.
>
> As a commercial product this will be backported extensively to many
> old kernels and that is harder/impossible if it isn't exclusively in
> C. So, I think nova needs to co-exist in some way.

Please never make design decisions based on ancient commercial kernels; they
have no relevance to upstream kernel development today.

If you care about those kernels, work with the companies that get paid
to support such things. Otherwise development upstream would just
completely stall and never go forward, as you well know.

As it seems that future support for this hardware is going to be in
rust, just use those apis going forward and backport the small number of
missing infrastructure patches to the relevant ancient kernels as well;
it's not like that would even be noticed in the overall number of
patches they take for normal subsystem improvements :)

thanks,

greg k-h
On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote:
> On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> > [snip]
> >
> > As a commercial product this will be backported extensively to many
> > old kernels and that is harder/impossible if it isn't exclusively in
> > C. So, I think nova needs to co-exist in some way.
>
> Please never make design decisions based on ancient commercial kernels; they
> have no relevance to upstream kernel development today.

Greg, you are being too extreme. Those "ancient commercial kernels"
have a huge relevance to a lot of our community because they are the
users that actually run the code we are building and pay for it to be
created. Yes we usually (but not always!) push back on accommodations
upstream, but taking hard dependencies on rust is currently a very
different thing.

> If you care about those kernels, work with the companies that get paid
> to support such things. Otherwise development upstream would just
> completely stall and never go forward, as you well know.

They seem to be engaged, but upstream rust isn't even done yet. So what
exactly do you expect them to do? Throw out whole architectures from
their products?

I know how things work, I just don't think we are ready to elevate Rust
to the category of decisions where upstream can ignore the downstream
side readiness. In my view the community needs to agree to remove the
experimental status from Rust first.

> As it seems that future support for this hardware is going to be in
> rust, just use those apis going forward and backport the small number of

"those apis" don't even exist yet! There is a big multi-year gap
between when pure upstream would even be ready to put something like
VFIO on top of Nova and Rust and where we are now with this series.

This argument is *way too early*. I'm deeply hoping we never have to
actually have it, that by the time Nova gets merged Rust will be 100%
ready upstream and there will be no issue. Please? Can that happen?

Otherwise, let's slow down here. Nova is still years away from being
finished. Nouveau is the in-tree driver for this HW. This series
improves on Nouveau. We are definitely not at the point of refusing
new code because it is not written in Rust, RIGHT?

Jason
On Thu, Sep 26, 2024 at 09:42:39AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote:
> > [snip]
> >
> > Please never make design decisions based on ancient commercial kernels; they
> > have no relevance to upstream kernel development today.
>
> Greg, you are being too extreme. Those "ancient commercial kernels"
> have a huge relevance to a lot of our community because they are the
> users that actually run the code we are building and pay for it to be
> created. Yes we usually (but not always!) push back on accommodations
> upstream, but taking hard dependencies on rust is currently a very
> different thing.

That's fine, but again, do NOT make design decisions based on what you
feel you can, and can not, slide by one of these companies to get it
into their old kernels. That's what I take objection to here.

Also, always remember please that the % of overall Linux kernel
installs, even counting out Android and embedded, is VERY tiny for these
companies. The huge % overall is doing the "right thing" by using
upstream kernels. And with the laws in place now that % is only going
to grow, and those older kernels will rightfully fall away into an even
smaller %.

I know those companies pay for many developers. I'm not saying that
their contributions are any less or more important than others; they all
are equal. You wouldn't want design decisions for a patch series to be
dictated by some really old Yocto kernel restrictions that are only in
autos, right? We are a large community, that's what I'm saying.

> Otherwise, let's slow down here. Nova is still years away from being
> finished. Nouveau is the in-tree driver for this HW. This series
> improves on Nouveau. We are definitely not at the point of refusing
> new code because it is not written in Rust, RIGHT?

No, I do object to "we are ignoring the driver being proposed by the
developers involved for this hardware by adding to the old one instead"
which it seems like is happening here.
Anyway, let's focus on the code. There are already real issues with this
patch series, as pointed out by me and others, that need to be addressed
before it can go anywhere.

thanks,

greg k-h
On Thu, Sep 26, 2024 at 06:43:44AM +0000, Tian, Kevin wrote:
> Then there comes an open question whether VFIO is the right place to host
> such a vendor-specific provisioning interface. The existing mdev-type-based
> provisioning mechanism was considered a bad fit already.
> IIRC the previous discussion came to suggest putting the provisioning
> interface in the PF driver. There may be a chance to generalize it and
> move it to VFIO, but there is no telling what that will look like until
> multiple drivers have demonstrated their own implementations as the base
> for discussion.

I am looking at fwctl to do a lot of this in the SRIOV world. You'd provision
the VF prior to opening VFIO using the fwctl interface, and then VFIO
would perceive a VF that has exactly the required properties.

At least for SRIOV where the VM is talking directly to device FW,
mdev/paravirtualization would be different.

> But now it seems you prefer vendors putting their own provisioning
> interfaces in VFIO directly?

Maybe not, just that drm isn't the right place either. If we do the fwctl
stuff then the VF provisioning would be done through a fwctl driver.

I'm not entirely sure yet what this whole 'mgr' component is actually
doing though.

Jason
On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> [snip]
>
> > Otherwise, let's slow down here. Nova is still years away from being
> > finished. Nouveau is the in-tree driver for this HW. This series
> > improves on Nouveau. We are definitely not at the point of refusing
> > new code because it is not written in Rust, RIGHT?
Just a reminder on what I did and did not say. I never said we can't support
this in Nouveau for the short and mid term. But we can't add new features and
support new use-cases in Nouveau *without* considering the way forward to the
new driver.

> No, I do object to "we are ignoring the driver being proposed by the
> developers involved for this hardware by adding to the old one instead"
> which it seems like is happening here.
>
> Anyway, let's focus on the code. There are already real issues with this
> patch series, as pointed out by me and others, that need to be addressed
> before it can go anywhere.
>
> thanks,
>
> greg k-h
On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> That's fine, but again, do NOT make design decisions based on what you
> feel you can, and can not, slide by one of these companies to get it
> into their old kernels. That's what I take objection to here.

It is not slide by. It is a recognition that participating in the
community gives everyone value. If you excessively deny value from one
side they will have no reason to participate.

In this case the value is that, with light enough work, the
kernel-fork community can deploy this code to their users. This has
been the accepted bargain for a long time now.

There is a great big question mark over Rust regarding what impact it
actually has on this dynamic. It is definitely not just backporting a few
hundred upstream patches. There is clearly new upstream development
work needed still - arch support being a very obvious one.

> Also, always remember please that the % of overall Linux kernel
> installs, even counting out Android and embedded, is VERY tiny for these
> companies. The huge % overall is doing the "right thing" by using
> upstream kernels. And with the laws in place now that % is only going
> to grow, and those older kernels will rightfully fall away into an even
> smaller %.

Who is "doing the right thing"? That is not what I see, we sell
server HW to *everyone*. There are a couple sites that are "near"
upstream, but that is not too common. Everyone is running some kind of
kernel fork.

I dislike this generalization you do with % of users. Almost 100% of
NVIDIA server HW is running forks. I would estimate around 10% is
above a 6.0 baseline. It is not tiny either, NVIDIA sold like $60B of
server HW running Linux last year with this kind of demographic. So
did Intel, AMD, etc.

I would not describe this as "VERY tiny". Maybe you mean RHEL-alike
specifically, and yes, they are a diminishing install share. However,
the hyperscale companies more than make up for that with their
internal secret proprietary forks :(

> > Otherwise, let's slow down here. Nova is still years away from being
> > finished. Nouveau is the in-tree driver for this HW. This series
> > improves on Nouveau. We are definitely not at the point of refusing
> > new code because it is not written in Rust, RIGHT?
>
> No, I do object to "we are ignoring the driver being proposed by the
> developers involved for this hardware by adding to the old one instead"
> which it seems like is happening here.

That is too harsh. We've consistently taken a community position that
OOT stuff doesn't matter, and yes that includes OOT stuff that people
we trust and respect are working on. Until it is ready for submission,
and ideally merged, it is an unknown quantity. Good, well-meaning
people routinely drop their projects, good projects run into
unexpected roadblocks, and life happens.

Nova is not being ignored, there is dialog, and yes some disagreement.

Again, nobody here is talking about disrupting Nova. We just want to
keep going as-is until we can all agree together it is ready to make a
change.

Jason
I hope and expect the nova and vgpu_mgr efforts to ultimately converge.

First, for the fw ABI debacle: yes, it is unfortunate that we still don't have a stable ABI from GSP. We /are/ working on it, though there isn't anything to show yet. FWIW, I expect the end result will be a much simpler interface than what is there today, and a stable interface that NVIDIA can guarantee.

But, for now, we have a timing problem like Jason described:

- We have customers eager for upstream vfio support in the near term,
  and that seems like something NVIDIA can develop/contribute/maintain in
  the near term, as an incremental step forward.

- Nova is still early in its development, relative to nouveau/nvkm.

- From NVIDIA's perspective, we're nervous about the backportability of
  Rust-based components to enterprise kernels in the near term.

- The stable GSP ABI is not going to be ready in the near term.

I agree with what Dave said in one of the forks of this thread, in the context of NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS:

> The GSP firmware interfaces are not guaranteed stable. Exposing these
> interfaces outside the nvkm core is unacceptable, as otherwise we
> would have to adapt the whole kernel depending on the loaded firmware.
>
> You cannot use any nvidia sdk headers, these all have to be abstracted
> behind things that have no bearing on the API.

Agreed. Though not infinitely scalable, and not as clean as in Rust, it seems possible to abstract NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS behind a C-implemented abstraction layer in nvkm, at least for the short term.

Is there a potential compromise where vgpu_mgr starts its life with a dependency on nvkm, and as things mature we migrate it to instead depend on nova?
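To make the shape of that compromise concrete, here is a minimal sketch of such a C abstraction layer. Every type and function name below is a hypothetical illustration, not an existing nvkm symbol; the only point is that the NVIDIA SDK header stays private to a single nvkm translation unit:

/*
 * Hypothetical sketch only.  The stable, SDK-free view that nvkm
 * would expose to vgpu_mgr:
 */
struct nvkm_vgpu_bootload_args {
	u32 gfid;	/* GSP function id of the VF */
	u64 fb_offset;	/* framebuffer carve-out for this vGPU */
	u64 fb_size;
};

/*
 * Only this file would include the NVIDIA SDK header, so the layout of
 * NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS
 * never leaks outside nvkm.  If the loaded firmware changes that
 * layout, only this translation function has to follow.
 */
int nvkm_vgpu_bootload_plugin_task(struct nvkm_gsp *gsp,
				   const struct nvkm_vgpu_bootload_args *args)
{
	NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS params = {};

	/*
	 * Translate the stable fields of *args into whatever field
	 * layout the loaded firmware's header defines for 'params'.
	 */

	/* Issue the control call through nvkm's existing GSP RPC path. */
	return nvkm_vgpu_mgr_rpc(gsp, &params, sizeof(params)); /* hypothetical */
}

Under this sketch, vgpu_mgr only ever sees struct nvkm_vgpu_bootload_args, which is also what would make the migration Andy asks about - from a dependency on nvkm to a dependency on nova - plausible without touching the callers.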
On Thu, Sep 26, 2024 at 11:07:56AM -0700, Andy Ritger wrote:
> I hope and expect the nova and vgpu_mgr efforts to ultimately converge.
>
> First, for the fw ABI debacle: yes, it is unfortunate that we still don't
> have a stable ABI from GSP. We /are/ working on it, though there isn't
> anything to show yet. FWIW, I expect the end result will be a much
> simpler interface than what is there today, and a stable interface that
> NVIDIA can guarantee.
>
> But, for now, we have a timing problem like Jason described:
>
> - We have customers eager for upstream vfio support in the near term,
>   and that seems like something NVIDIA can develop/contribute/maintain in
>   the near term, as an incremental step forward.
>
> - Nova is still early in its development, relative to nouveau/nvkm.
>
> - From NVIDIA's perspective, we're nervous about the backportability of
>   Rust-based components to enterprise kernels in the near term.
>
> - The stable GSP ABI is not going to be ready in the near term.
>
> I agree with what Dave said in one of the forks of this thread, in the context of
> NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS:
>
> > The GSP firmware interfaces are not guaranteed stable. Exposing these
> > interfaces outside the nvkm core is unacceptable, as otherwise we
> > would have to adapt the whole kernel depending on the loaded firmware.
> >
> > You cannot use any nvidia sdk headers, these all have to be abstracted
> > behind things that have no bearing on the API.
>
> Agreed. Though not infinitely scalable, and not
> as clean as in Rust, it seems possible to abstract
> NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS behind
> a C-implemented abstraction layer in nvkm, at least for the short term.
>
> Is there a potential compromise where vgpu_mgr starts its life with a
> dependency on nvkm, and as things mature we migrate it to instead depend
> on nova?

Of course, I've always said that it's perfectly fine to go with Nouveau as long as Nova is not yet ready. But, and this is central, the condition must be that we agree on the long term goal and work towards this goal *together*. Having two competing upstream strategies is not acceptable.

The baseline for the long term goal that we have set so far is Nova, and this must also be the baseline for the discussion. Raising concerns about that is perfectly valid; we can discuss them and look for solutions.
On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> >
> > No, I do object to "we are ignoring the driver being proposed by the
> > developers involved for this hardware by adding to the old one instead"
> > which it seems like is happening here.
>
> That is too harsh. We've consistently taken a community position that
> OOT stuff doesn't matter, and yes, that includes OOT stuff that people
> we trust and respect are working on. Until it is ready for submission,
> and ideally merged, it is an unknown quantity. Good, well-meaning
> people routinely drop their projects, good projects run into
> unexpected roadblocks, and life happens.

That's not the point -- at least it never was my point.

Upstream has set a strategy, and it's totally fine to raise concerns, discuss them, look for solutions, draw conclusions and make adjustments where needed. But we have to agree on a long term strategy and work towards the corresponding goals *together*. I don't want to end up in a situation where everyone just does their own thing.

So, when you say things like "go do Nova, have fun", it really sounds as if you just want to do your own thing and ignore the existing upstream strategy instead of collaborating on it and shaping it.
On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote:
> I'm not entirely sure yet what this whole 'mgr' component is actually
> doing though.

Looking more closely, I think some of it is certainly appropriate to be in vfio. Like when something opens the VFIO device: it should allocate the PF device resources from FW, set up kernel structures and so on to allow the about-to-be-opened VF to work. Those are good VFIO topics. IOW, if you don't open any VFIO devices, there would be minimal overhead.

But that stuff shouldn't be shunted into some weird "mgr"; it should just be inside the struct vfio_device subclass inside the variant driver.

How to get the provisioning into the kernel prior to VFIO open, and what kind of control object should exist for the hypervisor side of the VF, I'm not sure. In mlx5 we used devlink and a netdev/rdma "representor" for a lot of this complex control stuff.

Jason
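As a rough illustration of the pattern Jason describes - the names here are hypothetical, but the structure follows existing VFIO PCI variant drivers such as mlx5 - the per-VF state lives in a vfio_pci_core_device subclass, and the PF/firmware resources are claimed only in open_device():

/* Hypothetical variant-driver subclass wrapping the VFIO PCI core device. */
struct nvidia_vgpu_pci_device {
	struct vfio_pci_core_device core_device;
	struct nvidia_vgpu *vgpu;	/* hypothetical per-VF state */
};

static int nvidia_vgpu_open_device(struct vfio_device *core_vdev)
{
	struct nvidia_vgpu_pci_device *nvdev =
		container_of(core_vdev, struct nvidia_vgpu_pci_device,
			     core_device.vdev);
	int ret;

	/*
	 * Claim PF/firmware resources only at open time, so an unopened
	 * VF costs (almost) nothing.
	 */
	ret = nvidia_vgpu_setup_fw_resources(nvdev->vgpu); /* hypothetical */
	if (ret)
		return ret;

	ret = vfio_pci_core_enable(&nvdev->core_device);
	if (ret) {
		nvidia_vgpu_teardown_fw_resources(nvdev->vgpu); /* hypothetical */
		return ret;
	}

	vfio_pci_core_finish_enable(&nvdev->core_device);
	return 0;
}

The matching close_device() would release the firmware resources again, so nothing "mgr"-shaped needs to exist outside the variant driver for this part.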
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 27, 2024 6:57 AM
>
> On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote:
>
> > I'm not entirely sure yet what this whole 'mgr' component is actually
> > doing though.
>
> Looking more closely, I think some of it is certainly appropriate to be
> in vfio. Like when something opens the VFIO device: it should allocate
> the PF device resources from FW, set up kernel structures and so on to
> allow the about-to-be-opened VF to work. Those are good VFIO topics. IOW,
> if you don't open any VFIO devices, there would be minimal overhead.
>
> But that stuff shouldn't be shunted into some weird "mgr"; it should
> just be inside the struct vfio_device subclass inside the variant
> driver.

Yes. That's why I said earlier that the current way looks fine as long as it won't expand to carry a vendor-specific provisioning interface.

The majority of the series is about allocating backend resources when the device is opened; that's perfectly a VFIO topic. Just the point of hardcoding a vGPU type now, while stating the mgr will support selecting a vGPU type later, implies something not clearly designed.

> How to get the provisioning into the kernel prior to VFIO open, and
> what kind of control object should exist for the hypervisor side of
> the VF, I'm not sure. In mlx5 we used devlink and a netdev/rdma
> "representor" for a lot of this complex control stuff.

The mlx5 approach is what I envisioned. Or the fwctl option would also be fine after it's merged.
On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote:
> On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> > >
> > > No, I do object to "we are ignoring the driver being proposed by the
> > > developers involved for this hardware by adding to the old one instead"
> > > which it seems like is happening here.
> >
> > That is too harsh. We've consistently taken a community position that
> > OOT stuff doesn't matter, and yes, that includes OOT stuff that people
> > we trust and respect are working on. Until it is ready for submission,
> > and ideally merged, it is an unknown quantity. Good, well-meaning
> > people routinely drop their projects, good projects run into
> > unexpected roadblocks, and life happens.
>
> That's not the point -- at least it never was my point.
>
> Upstream has set a strategy, and it's totally fine to raise concerns, discuss
> them, look for solutions, draw conclusions and make adjustments where needed.

We don't really do strategy in the kernel. This language is a bit off-putting. Linux runs on community consensus, and if any strategy exists it is reflected by the code actually merged.

When you say things like this it comes across as though you are implying there are two tiers to the community, i.e. those that set the strategy and those that don't.

> But we have to agree on a long term strategy and work towards the corresponding
> goals *together*.

I think we went over all the options already. IMHO the right one is for Nova and VFIO to share some kind of core driver. The choice of Rust for Nova complicates planning this, but it doesn't mean anyone is saying no to it.

My main point is that *when* this switches from VFIO on nouveau to VFIO on Nova is something that needs to be a mutual decision with the VFIO side and user community as well.

> So, when you say things like "go do Nova, have fun", it really sounds
> as if you just want to do your own thing and ignore the existing upstream
> strategy instead of collaborating on it and shaping it.

I am saying I have no interest in interfering with your project. Really, I read your responses as though you feel Nova is under attack, and I'm trying hard to say that is not at all my intention.

Jason
On Fri, Sep 27, 2024 at 09:51:15AM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote:
> > On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> > > >
> > > > No, I do object to "we are ignoring the driver being proposed by the
> > > > developers involved for this hardware by adding to the old one instead"
> > > > which it seems like is happening here.
> > >
> > > That is too harsh. We've consistently taken a community position that
> > > OOT stuff doesn't matter, and yes, that includes OOT stuff that people
> > > we trust and respect are working on. Until it is ready for submission,
> > > and ideally merged, it is an unknown quantity. Good, well-meaning
> > > people routinely drop their projects, good projects run into
> > > unexpected roadblocks, and life happens.
> >
> > That's not the point -- at least it never was my point.
> >
> > Upstream has set a strategy, and it's totally fine to raise concerns, discuss
> > them, look for solutions, draw conclusions and make adjustments where needed.
>
> We don't really do strategy in the kernel. This language is a bit
> off-putting. Linux runs on community consensus, and if any strategy
> exists it is reflected by the code actually merged.

We can also just call it "goals", but either way, of course maintainers set goals for the components they maintain and hence have some sort of "strategy" for how they want to evolve their components, to solve existing or foreseeable problems. However, I agree that those things may be reevaluated based on community feedback and consensus. And I'm happy to do that.

See, you're twisting my words and implying that we wouldn't look for community consensus, while I'm *explicitly* asking you to let us do exactly that. I want to find consensus on the long term goals that we all work on *together*, because I don't want to end up with competing projects. And I think it's reasonable to first consider the goals that have been set already.

Again, feel free to raise concerns and we'll discuss them and look for solutions, but please don't just ignore the existing goals.

> When you say things like this it comes across as though you are
> implying there are two tiers to the community, i.e. those that set the
> strategy and those that don't.

This isn't true; I just ask you to consider the goals that have been set already, because we have been working on this already. *We can discuss them*, but I indeed ask you to accept the current direction as a baseline for discussion. I don't think this is unreasonable, is it?

> > But we have to agree on a long term strategy and work towards the corresponding
> > goals *together*.
>
> I think we went over all the options already. IMHO the right one is
> for Nova and VFIO to share some kind of core driver. The choice of
> Rust for Nova complicates planning this, but it doesn't mean anyone is
> saying no to it.

This is the problem: you're many steps ahead. You should start with understanding why we want the core driver to be in Rust. You then can raise your concerns about it, and then we can discuss them and see if we can find solutions / consensus. But you're not even considering it, and instead start with a counter-proposal. This isn't acceptable to me.

> My main point is that *when* this switches from VFIO on nouveau to VFIO on
> Nova is something that needs to be a mutual decision with the VFIO
> side and user community as well.
To me it's important that we agree on the goals and work towards them together. If we seriously do that, then the "when" should be trivial to agree on.

> > So, when you say things like "go do Nova, have fun", it really sounds
> > as if you just want to do your own thing and ignore the existing upstream
> > strategy instead of collaborating on it and shaping it.
>
> I am saying I have no interest in interfering with your
> project. Really, I read your responses as though you feel Nova is
> under attack, and I'm trying hard to say that is not at all my
> intention.

I don't read this as Nova "being under attack" at all. I read it as "I don't care about the goal to have the core driver in Rust, nor do I care about the reasons you have for it."

> Jason
On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote:
> > When you say things like this it comes across as though you are
> > implying there are two tiers to the community, i.e. those that set the
> > strategy and those that don't.
>
> This isn't true; I just ask you to consider the goals that have been set
> already, because we have been working on this already.

Why do you keep saying I haven't? I have no intention of becoming involved in your project or nouveau. My only interest here is to get an agreement that we can get a VFIO driver (to improve the VFIO subsystem and community!) in the near term on top of in-tree nouveau.

> > > But we have to agree on a long term strategy and work towards the corresponding
> > > goals *together*.
> >
> > I think we went over all the options already. IMHO the right one is
> > for Nova and VFIO to share some kind of core driver. The choice of
> > Rust for Nova complicates planning this, but it doesn't mean anyone is
> > saying no to it.
>
> This is the problem: you're many steps ahead.
>
> You should start with understanding why we want the core driver to be in Rust.
> You then can raise your concerns about it, and then we can discuss them and see
> if we can find solutions / consensus.

I don't want to debate with you about Nova. It is too far in the future, and it doesn't intersect with anything I am doing.

> But you're not even considering it, and instead start with a counter-proposal.
> This isn't acceptable to me.

I'm even agreeing to a transition to a core driver in Rust, someday, when the full community can agree it is the right time.

What more do you want from me?

Jason
On Fri, Sep 27, 2024 at 12:27:24PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote:
> > > When you say things like this it comes across as though you are
> > > implying there are two tiers to the community, i.e. those that set the
> > > strategy and those that don't.
> >
> > This isn't true; I just ask you to consider the goals that have been set
> > already, because we have been working on this already.
>
> Why do you keep saying I haven't?

Because I haven't seen you acknowledge that the direction we're moving in is away from Nouveau, starting over with a new GSP-only solution. Instead, you propose a huge architectural rework of Nouveau: extract a core driver from Nouveau and make this the long term solution.

> I have no intention of becoming involved in your project or
> nouveau. My only interest here is to get an agreement that we can get
> a VFIO driver (to improve the VFIO subsystem and community!) in the
> near term on top of in-tree nouveau.

Two aspects about this. First, Nova isn't a different project in this sense; it's the continuation of Nouveau, meant to overcome several problems we have with Nouveau. Second, of course you have the intention of becoming involved in the Nouveau / Nova project: you ask for huge architectural changes to Nouveau, including new interfaces for a VFIO driver on top. If that's not becoming involved, what else would it be?

> > > > But we have to agree on a long term strategy and work towards the corresponding
> > > > goals *together*.
> > >
> > > I think we went over all the options already. IMHO the right one is
> > > for Nova and VFIO to share some kind of core driver. The choice of
> > > Rust for Nova complicates planning this, but it doesn't mean anyone is
> > > saying no to it.
> >
> > This is the problem: you're many steps ahead.
> >
> > You should start with understanding why we want the core driver to be in Rust.
> > You then can raise your concerns about it, and then we can discuss them and see
> > if we can find solutions / consensus.
>
> I don't want to debate with you about Nova. It is too far in the
> future, and it doesn't intersect with anything I am doing.

Sure it does. Again, Nova is intended to be the continuation of Nouveau. So, if you want to do a major rework of Nouveau (and hence become involved in the project), we have to make sure that we progress things in the same direction.

How do you expect the project to be successful in the long term if the involved parties are not willing to agree on a direction and common goals for the project? Or is it that you are simply not interested in the long term? Do you have reasons to think that the problems we have with Nouveau will just go away in the long term? Do you plan to solve them within Nouveau? If so, how do you plan to do that?

> > But you're not even considering it, and instead start with a counter-proposal.
> > This isn't acceptable to me.
>
> I'm even agreeing to a transition to a core driver in Rust, someday,
> when the full community can agree it is the right time.
>
> What more do you want from me?

I want the people involved in the project to seriously discuss and align on the direction and goals for the project in the long term, and to work towards them together.