[RFC,06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled

Message ID 20240922124951.1946072-7-zhiw@nvidia.com (mailing list archive)
State New, archived
Series Introduce NVIDIA GPU Virtualization (vGPU) Support

Commit Message

Zhi Wang Sept. 22, 2024, 12:49 p.m. UTC
The registry object "RMSetSriovMode" is required to be set when vGPU is
enabled.

Set "RMSetSriovMode" to 1 when nvkm loads the GSP firmware and
initializes the GSP registry objects, if vGPU is enabled.

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jason Gunthorpe Sept. 26, 2024, 10:53 p.m. UTC | #1
On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
> The registry object "RMSetSriovMode" is required to be set when vGPU is
> enabled.
> 
> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
> initialize the GSP registry objects, if vGPU is enabled.

Also really weird, this sounds like what the PCI sriov enable is for.

Jason
Zhi Wang Oct. 14, 2024, 7:38 a.m. UTC | #2
On 27/09/2024 1.53, Jason Gunthorpe wrote:
> On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
>> The registry object "RMSetSriovMode" is required to be set when vGPU is
>> enabled.
>>
>> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
>> initialize the GSP registry objects, if vGPU is enabled.
> 
> Also really weird, this sounds like what the PCI sriov enable is for.
> 

As explained in the reply to PATCH 4, vGPU and VF are not identical
concepts. A PCI SRIOV VF is the HW interface for reaching a vGPU, and
there were generations in which the HW didn't have SRIOV VFs and a vGPU
was reached via other means.

"RMSetSriovMode" here is not equal to PCI SRIOV enable, which activates
the VFs and lets them appear on the PCI bus. It tells the GSP FW to
enable the mode in which vGPUs are reached via VFs.

> Jason
Christoph Hellwig Oct. 15, 2024, 3:49 a.m. UTC | #3
On Mon, Oct 14, 2024 at 07:38:03AM +0000, Zhi Wang wrote:
> As what has been explained in PATCH 4's reply, the concept of vGPU and 
> VF are not identically equal. PCI SRIOV VF is the HW interface of 
> reaching a vGPU and there were generations in which HW didn't have SRIOV 
> VFs and a vGPU is reached via other means.

What does "were" mean?  Are they supported by this driver?  If so, how?
If not, that's entirely irrelevant.
Jason Gunthorpe Oct. 15, 2024, 12:23 p.m. UTC | #4
On Mon, Oct 14, 2024 at 07:38:03AM +0000, Zhi Wang wrote:
> On 27/09/2024 1.53, Jason Gunthorpe wrote:
> > On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
> >> The registry object "RMSetSriovMode" is required to be set when vGPU is
> >> enabled.
> >>
> >> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
> >> initialize the GSP registry objects, if vGPU is enabled.
> > 
> > Also really weird, this sounds like what the PCI sriov enable is for.
> > 
> 
> As what has been explained in PATCH 4's reply, the concept of vGPU and 
> VF are not identically equal. PCI SRIOV VF is the HW interface of 
> reaching a vGPU and there were generations in which HW didn't have SRIOV 
> VFs and a vGPU is reached via other means.
> 
> The "RMSetSriovMode" here is not equal to PCI SRIOV enable, which 
> activates the VFs and let them present on PCI bus. It is to tell the GSP 
> FW to enable the mode of "vGPUs are reached by VFs".

Which is useless if you don't enable SRIOV, so again, this seems like
it should be dynamic, and whatever this activates should be shifted to
sriov enable time and not fw load time.

There is a fundamental issue in Linux with trying to configure drivers
statically when they are probed. We want to avoid that as much as
possible.

If it can't be properly dynamic then the driver needs to take its
configuration from device flash, or you need to build a whole system
to allow configuring and rebooting the device - this is pretty hard.

Jason
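
The two approaches debated in this thread can be contrasted with a small
userspace sketch. It is only an illustration under stated assumptions:
every mock_* name below is hypothetical and none of it is nvkm code. The
static variant mirrors the patch in spirit (set "RMSetSriovMode" at
firmware load whenever vGPU is supported). The dynamic variant mirrors the
shape of the sriov_configure callback in struct pci_driver, flipping the
mode when VFs are enabled or disabled. Whether the GSP firmware can accept
this setting after it has booted is not established anywhere in the
thread, so the dynamic variant is an assumption, not a description of what
the hardware allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the firmware registry; not the nvkm structure. */
#define MOCK_REG_MAX 8

struct mock_reg_entry {
	const char *key;
	unsigned int val;
};

struct mock_dev {
	bool vgpu_supported;	/* fixed capability, known at probe time */
	int num_vfs;		/* VFs currently enabled on the bus */
	struct mock_reg_entry reg[MOCK_REG_MAX];
	size_t reg_count;
};

/* Insert or update a numeric registry entry; returns 0, or -1 if full. */
static int mock_reg_set(struct mock_dev *dev, const char *key, unsigned int val)
{
	size_t i;

	assert(dev != NULL && key != NULL);
	for (i = 0; i < dev->reg_count; i++) {
		if (strcmp(dev->reg[i].key, key) == 0) {
			dev->reg[i].val = val;
			return 0;
		}
	}
	if (dev->reg_count == MOCK_REG_MAX)
		return -1;
	dev->reg[dev->reg_count].key = key;
	dev->reg[dev->reg_count].val = val;
	dev->reg_count++;
	return 0;
}

/* Look up an entry; returns fallback when the key is absent. */
static unsigned int mock_reg_get(const struct mock_dev *dev, const char *key,
				 unsigned int fallback)
{
	size_t i;

	for (i = 0; i < dev->reg_count; i++)
		if (strcmp(dev->reg[i].key, key) == 0)
			return dev->reg[i].val;
	return fallback;
}

/* Static approach: the mode is decided once, at firmware load time. */
static int mock_fw_load_static(struct mock_dev *dev)
{
	if (dev->vgpu_supported)
		return mock_reg_set(dev, "RMSetSriovMode", 1);
	return 0;
}

/*
 * Dynamic approach (assumed to be possible): the mode follows the number
 * of enabled VFs. Returns the number of VFs enabled, or -1 on error.
 */
static int mock_sriov_configure(struct mock_dev *dev, int num_vfs)
{
	if (num_vfs < 0 || !dev->vgpu_supported)
		return -1;
	if (mock_reg_set(dev, "RMSetSriovMode", num_vfs > 0 ? 1 : 0))
		return -1;
	dev->num_vfs = num_vfs;
	return num_vfs;
}
```

With the static variant, the registry entry is 1 even while num_vfs is
still 0, which is the situation called useless above. With the dynamic
variant, the entry only becomes 1 once mock_sriov_configure() is called
with a non-zero VF count, and it returns to 0 when the VFs are disabled.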
Patch

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 49552d7df88f..a7db2a7880dd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1500,6 +1500,9 @@  r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
 		kfree(p);
 	}
 
+	if (nvkm_vgpu_mgr_is_supported(gsp->subdev.device))
+		add_registry_num(gsp, "RMSetSriovMode", 1);
+
 	rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY, gsp->registry_rpc_size);
 	if (IS_ERR(rpc)) {
 		ret = PTR_ERR(rpc);