Message ID | 20240922124951.1946072-3-zhiw@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Introduce NVIDIA GPU Virtualization (vGPU) Support | expand |
On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote: > nvkm is a HW abstraction layer(HAL) that initializes the HW and > allows its clients to manipulate the GPU functions regardless of the > generations of GPU HW. On the top layer, it provides generic APIs for a > client to connect to NVKM, enumerate the GPU functions, and manipulate > the GPU HW. > > To reach nvkm, the client needs to connect to NVKM layer by layer: driver > layer, client layer, and eventually, the device layer, which provides all > the access routines to GPU functions. After a client attaches to NVKM, > it initializes the HW and is able to serve the clients. > > Attach to nvkm as a nvkm client. > > Cc: Neo Jia <cjia@nvidia.com> > Signed-off-by: Zhi Wang <zhiw@nvidia.com> > --- > .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h | 8 ++++ > .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c | 48 ++++++++++++++++++- > 2 files changed, 55 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > index 3163fff1085b..9e10e18306b0 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > @@ -7,6 +7,14 @@ > struct nvkm_vgpu_mgr { > bool enabled; > struct nvkm_device *nvkm_dev; > + > + const struct nvif_driver *driver; Meta-comment, why is this attempting to act like a "driver" and yet not tieing into the driver model code at all? Please fix that up, it's not ok to add more layers on top of a broken one like this. We have infrastructure for this type of thing, please don't route around it. thanks, greg k-h
On 26/09/2024 12.21, Greg KH wrote: > External email: Use caution opening links or attachments > > > On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote: >> nvkm is a HW abstraction layer(HAL) that initializes the HW and >> allows its clients to manipulate the GPU functions regardless of the >> generations of GPU HW. On the top layer, it provides generic APIs for a >> client to connect to NVKM, enumerate the GPU functions, and manipulate >> the GPU HW. >> >> To reach nvkm, the client needs to connect to NVKM layer by layer: driver >> layer, client layer, and eventually, the device layer, which provides all >> the access routines to GPU functions. After a client attaches to NVKM, >> it initializes the HW and is able to serve the clients. >> >> Attach to nvkm as a nvkm client. >> >> Cc: Neo Jia <cjia@nvidia.com> >> Signed-off-by: Zhi Wang <zhiw@nvidia.com> >> --- >> .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h | 8 ++++ >> .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c | 48 ++++++++++++++++++- >> 2 files changed, 55 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h >> index 3163fff1085b..9e10e18306b0 100644 >> --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h >> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h >> @@ -7,6 +7,14 @@ >> struct nvkm_vgpu_mgr { >> bool enabled; >> struct nvkm_device *nvkm_dev; >> + >> + const struct nvif_driver *driver; > > Meta-comment, why is this attempting to act like a "driver" and yet not > tieing into the driver model code at all? Please fix that up, it's not > ok to add more layers on top of a broken one like this. We have > infrastructure for this type of thing, please don't route around it. > Thanks for the guidelines. Will try to work with folks and figure out a solution. Ben is doing quite some clean-ups of nouveau driver[1], they had been reviewed and merged by Danilo. Also, the split driver patchset he is working on seems a meaningful pre-step to fix this, as it also includes the re-factor of the interface between the nvkm and the nvif stuff. [1] https://lore.kernel.org/nouveau/CAPM=9tyW=YuDQrRwrYK_ayuvEnp+9irTuze=MP-zkowm3CFJ9A@mail.gmail.com/T/ [2] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/ > thanks, > > greg k-h
On Mon, Oct 14, 2024 at 10:16:21AM +0000, Zhi Wang wrote: > On 26/09/2024 12.21, Greg KH wrote: > > External email: Use caution opening links or attachments > > > > > > On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote: > >> nvkm is a HW abstraction layer(HAL) that initializes the HW and > >> allows its clients to manipulate the GPU functions regardless of the > >> generations of GPU HW. On the top layer, it provides generic APIs for a > >> client to connect to NVKM, enumerate the GPU functions, and manipulate > >> the GPU HW. > >> > >> To reach nvkm, the client needs to connect to NVKM layer by layer: driver > >> layer, client layer, and eventually, the device layer, which provides all > >> the access routines to GPU functions. After a client attaches to NVKM, > >> it initializes the HW and is able to serve the clients. > >> > >> Attach to nvkm as a nvkm client. > >> > >> Cc: Neo Jia <cjia@nvidia.com> > >> Signed-off-by: Zhi Wang <zhiw@nvidia.com> > >> --- > >> .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h | 8 ++++ > >> .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c | 48 ++++++++++++++++++- > >> 2 files changed, 55 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > >> index 3163fff1085b..9e10e18306b0 100644 > >> --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > >> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h > >> @@ -7,6 +7,14 @@ > >> struct nvkm_vgpu_mgr { > >> bool enabled; > >> struct nvkm_device *nvkm_dev; > >> + > >> + const struct nvif_driver *driver; > > > > Meta-comment, why is this attempting to act like a "driver" and yet not > > tieing into the driver model code at all? Please fix that up, it's not > > ok to add more layers on top of a broken one like this. We have > > infrastructure for this type of thing, please don't route around it. > > > > Thanks for the guidelines. Will try to work with folks and figure out a > solution. > > Ben is doing quite some clean-ups of nouveau driver[1], they had been > reviewed and merged by Danilo. Also, the split driver patchset he is > working on seems a meaningful pre-step to fix this, as it also includes > the re-factor of the interface between the nvkm and the nvif stuff. What we need is the auxbus code changes that are pointed to somewhere in those long threads, that needs to land first and this series rebased before it can be reviewed properly as that is going to change your device lifetime rules a lot. Please do that and then move this "nvif_driver" to be the proper driver core type to tie into the kernel correctly. thanks, greg k-h
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h index 3163fff1085b..9e10e18306b0 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h @@ -7,6 +7,14 @@ struct nvkm_vgpu_mgr { bool enabled; struct nvkm_device *nvkm_dev; + + const struct nvif_driver *driver; + + const struct nvif_client_impl *cli_impl; + struct nvif_client_priv *cli_priv; + + const struct nvif_device_impl *dev_impl; + struct nvif_device_priv *dev_priv; }; bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device); diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c index a506414e5ba2..0639596f8a96 100644 --- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c +++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c @@ -1,5 +1,7 @@ /* SPDX-License-Identifier: MIT */ #include <core/device.h> +#include <core/driver.h> +#include <nvif/driverif.h> #include <core/pci.h> #include <vgpu_mgr/vgpu_mgr.h> @@ -42,6 +44,44 @@ bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device) return device->vgpu_mgr.enabled; } +static void detach_nvkm(struct nvkm_vgpu_mgr *vgpu_mgr) +{ + if (vgpu_mgr->dev_impl) { + vgpu_mgr->dev_impl->del(vgpu_mgr->dev_priv); + vgpu_mgr->dev_impl = NULL; + } + + if (vgpu_mgr->cli_impl) { + vgpu_mgr->cli_impl->del(vgpu_mgr->cli_priv); + vgpu_mgr->cli_impl = NULL; + } +} + +static int attach_nvkm(struct nvkm_vgpu_mgr *vgpu_mgr) +{ + struct nvkm_device *device = vgpu_mgr->nvkm_dev; + int ret; + + ret = nvkm_driver_ctor(device, &vgpu_mgr->driver, + &vgpu_mgr->cli_impl, &vgpu_mgr->cli_priv); + if (ret) + return ret; + + ret = vgpu_mgr->cli_impl->device.new(vgpu_mgr->cli_priv, + &vgpu_mgr->dev_impl, + &vgpu_mgr->dev_priv); + if (ret) + goto fail_device_new; + + return 0; + +fail_device_new: + vgpu_mgr->cli_impl->del(vgpu_mgr->cli_priv); + vgpu_mgr->cli_impl = NULL; + + return ret; +} + /** * nvkm_vgpu_mgr_init - Initialize the vGPU manager support * @device: the nvkm_device pointer @@ -51,13 +91,18 @@ bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device) int nvkm_vgpu_mgr_init(struct nvkm_device *device) { struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr; + int ret; if (!nvkm_vgpu_mgr_is_supported(device)) return -ENODEV; vgpu_mgr->nvkm_dev = device; - vgpu_mgr->enabled = true; + ret = attach_nvkm(vgpu_mgr); + if (ret) + return ret; + + vgpu_mgr->enabled = true; pci_info(nvkm_to_pdev(device), "NVIDIA vGPU mananger support is enabled.\n"); @@ -72,5 +117,6 @@ void nvkm_vgpu_mgr_fini(struct nvkm_device *device) { struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr; + detach_nvkm(vgpu_mgr); vgpu_mgr->enabled = false; }
nvkm is a HW abstraction layer(HAL) that initializes the HW and allows its clients to manipulate the GPU functions regardless of the generations of GPU HW. On the top layer, it provides generic APIs for a client to connect to NVKM, enumerate the GPU functions, and manipulate the GPU HW. To reach nvkm, the client needs to connect to NVKM layer by layer: driver layer, client layer, and eventually, the device layer, which provides all the access routines to GPU functions. After a client attaches to NVKM, it initializes the HW and is able to serve the clients. Attach to nvkm as a nvkm client. Cc: Neo Jia <cjia@nvidia.com> Signed-off-by: Zhi Wang <zhiw@nvidia.com> --- .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h | 8 ++++ .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c | 48 ++++++++++++++++++- 2 files changed, 55 insertions(+), 1 deletion(-)