diff mbox series

[3/3] nouveau/gsp: add some basic registry entries.

Message ID 20231031051943.1957328-4-airlied@gmail.com (mailing list archive)
State New, archived
Headers show
Series [1/3] nouveau/gsp: move to 535.113.01 | expand

Commit Message

Dave Airlie Oct. 31, 2023, 5:18 a.m. UTC
From: Dave Airlie <airlied@redhat.com>

The nvidia driver sets these two basic registry entries always,
so copy it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 45 ++++++++++++++-----
 1 file changed, 35 insertions(+), 10 deletions(-)

Comments

Timur Tabi Oct. 31, 2023, 3:52 p.m. UTC | #1
On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote:
> +#define NV_GSP_REG_NUM_ENTRIES 2
> +
> +static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = {
> +       { "RMSecBusResetEnable", 1 },
> +       { "RMForcePcieConfigSave", 1 },
> +};

How about :

static const struct nv_gsp_registry_entries r535_registry_entries[] = {
       { "RMSecBusResetEnable", 1 },
       { "RMForcePcieConfigSave", 1 },
};

#define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries)
Dave Airlie Oct. 31, 2023, 7:29 p.m. UTC | #2
On Wed, 1 Nov 2023 at 01:53, Timur Tabi <timur@kernel.org> wrote:
>
> On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote:
> > +#define NV_GSP_REG_NUM_ENTRIES 2
> > +
> > +static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = {
> > +       { "RMSecBusResetEnable", 1 },
> > +       { "RMForcePcieConfigSave", 1 },
> > +};
>
> How about :
>
> static const struct nv_gsp_registry_entries r535_registry_entries[] = {
>        { "RMSecBusResetEnable", 1 },
>        { "RMForcePcieConfigSave", 1 },
> };
>
> #define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries)

Good plan. I'll change that now.

Dave.
Timur Tabi Oct. 31, 2023, 8:02 p.m. UTC | #3
On Wed, 2023-11-01 at 05:29 +1000, Dave Airlie wrote:
> > 
> > static const struct nv_gsp_registry_entries r535_registry_entries[] = {
> >         { "RMSecBusResetEnable", 1 },
> >         { "RMForcePcieConfigSave", 1 },
> > };
> > 
> > #define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries)
> 
> Good plan. I'll change that now.

BTW, I looked into these two options.  Basically, these options enable
"secondary bus reset", although I'm not exactly sure what that means.  It
does enable saving of PCI config registers, so perhaps it preserves those
registers for Nouveau when GSP-RM does a PCI reset.

Also, like I mentioned on IRC earlier, I've got a bunch of patches that
provide mroe robust registry control, including setting options from the
command-line.  I'll post those upstream once the dust has settled.
Timur Tabi Nov. 7, 2023, 6:51 p.m. UTC | #4
On Tue, 2023-10-31 at 15:18 +1000, Dave Airlie wrote:
+       strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];

I get a UBSAN index-out-of-bounds error on boot at this line.

[ 17.765746] nouveau 0000:65:00.0: gsp: cmdq: wptr 1
[ 17.765748] ================================================================================
[ 17.774170] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1065:33
[ 17.783449] index 2 is out of range for type 'PACKED_REGISTRY_ENTRY [*]'
[ 17.790132] CPU: 0 PID: 234 Comm: kworker/0:4 Not tainted 6.6.0-rc5+ #1
[ 17.790135] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 2002 09/25/2019
[ 17.790136] Workqueue: events work_for_cpu_fn
[ 17.790143] Call Trace:
[ 17.790145] <TASK>
[ 17.790148] dump_stack_lvl+0x48/0x70
[ 17.790155] dump_stack+0x10/0x20
[ 17.790156] __ubsan_handle_out_of_bounds+0xc6/0x110
[ 17.790160] r535_gsp_oneinit+0xf81/0x1530 [nouveau]
[ 17.790292] ? __dev_printk+0x39/0xa0
[ 17.790295] ? _dev_info+0x75/0xa0
[ 17.790298] tu102_gsp_oneinit+0x9b/0xd0 [nouveau]

I'm not sure what the fix is.  Do we need __attribute__((no_sanitize("array-bounds"))) on PACKED_REGISTRY_TABLE?
Dave Airlie Nov. 7, 2023, 6:54 p.m. UTC | #5
On Wed, 8 Nov 2023 at 04:51, Timur Tabi <ttabi@nvidia.com> wrote:
>
> On Tue, 2023-10-31 at 15:18 +1000, Dave Airlie wrote:
>
> +       strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
>
>
> I get a UBSAN index-out-of-bounds error on boot at this line.
>
> [ 17.765746] nouveau 0000:65:00.0: gsp: cmdq: wptr 1
> [ 17.765748] ================================================================================
> [ 17.774170] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1065:33
> [ 17.783449] index 2 is out of range for type 'PACKED_REGISTRY_ENTRY [*]'
> [ 17.790132] CPU: 0 PID: 234 Comm: kworker/0:4 Not tainted 6.6.0-rc5+ #1
> [ 17.790135] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 2002 09/25/2019
> [ 17.790136] Workqueue: events work_for_cpu_fn
> [ 17.790143] Call Trace:
> [ 17.790145] <TASK>
> [ 17.790148] dump_stack_lvl+0x48/0x70
> [ 17.790155] dump_stack+0x10/0x20
> [ 17.790156] __ubsan_handle_out_of_bounds+0xc6/0x110
> [ 17.790160] r535_gsp_oneinit+0xf81/0x1530 [nouveau]
> [ 17.790292] ? __dev_printk+0x39/0xa0
> [ 17.790295] ? _dev_info+0x75/0xa0
> [ 17.790298] tu102_gsp_oneinit+0x9b/0xd0 [nouveau]
>
> I'm not sure what the fix is.  Do we need __attribute__((no_sanitize("array-bounds"))) on PACKED_REGISTRY_TABLE?

yes that is probably the right answer for this, if we want to reuse
the structs that we get from the nvidia driver.

Dave.
Timur Tabi Nov. 7, 2023, 7:02 p.m. UTC | #6
On Wed, 2023-11-08 at 04:54 +1000, Dave Airlie wrote:
yes that is probably the right answer for this, if we want to reuse
the structs that we get from the nvidia driver.

ok, I'll submit a patch.
Timur Tabi Nov. 27, 2023, 8:47 p.m. UTC | #7
On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote:
>         rpc->size = sizeof(*rpc);
> -       rpc->numEntries = 1;
> -       rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]);
> -       rpc->entries[0].type = 1;
> -       rpc->entries[0].data = 0;
> -       rpc->entries[0].length = 4;
> -
> -       strings = (char *)&rpc->entries[1];
> -       strings[0] = '\0';
> +       rpc->numEntries = NV_GSP_REG_NUM_ENTRIES;
> +
> +       str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]);
> +       strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
> +       for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) {
> +               int name_len = strlen(r535_registry_entries[i].name) + 1;
> +               rpc->entries[i].nameOffset = str_offset;
> +               rpc->entries[i].type = 1;
> +               rpc->entries[i].data = r535_registry_entries[i].value;
> +               rpc->entries[i].length = 4;
> +               memcpy(strings, r535_registry_entries[i].name, name_len);
> +               strings += name_len;
> +               str_offset += name_len;
> +       }

I'm working on a patch that replaces this code with a
dynamically-generated registry so that we can set registry keys via
the driver's command-line (like the Nvidia driver).  However, you have
a bug here.  rpc->size must be the total size of the RPC, including
all the PACKED_REGISTRY_ENTRY structs and the strings that follow
them.  You can see this by looking at RmPackageRegistry() and
regCountEntriesAndSize() in OpenRM.
Dave Airlie Nov. 27, 2023, 9:32 p.m. UTC | #8
On Tue, 28 Nov 2023 at 06:48, Timur Tabi <timur@kernel.org> wrote:
>
> On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote:
> >         rpc->size = sizeof(*rpc);
> > -       rpc->numEntries = 1;
> > -       rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]);
> > -       rpc->entries[0].type = 1;
> > -       rpc->entries[0].data = 0;
> > -       rpc->entries[0].length = 4;
> > -
> > -       strings = (char *)&rpc->entries[1];
> > -       strings[0] = '\0';
> > +       rpc->numEntries = NV_GSP_REG_NUM_ENTRIES;
> > +
> > +       str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]);
> > +       strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
> > +       for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) {
> > +               int name_len = strlen(r535_registry_entries[i].name) + 1;
> > +               rpc->entries[i].nameOffset = str_offset;
> > +               rpc->entries[i].type = 1;
> > +               rpc->entries[i].data = r535_registry_entries[i].value;
> > +               rpc->entries[i].length = 4;
> > +               memcpy(strings, r535_registry_entries[i].name, name_len);
> > +               strings += name_len;
> > +               str_offset += name_len;
> > +       }
>
> I'm working on a patch that replaces this code with a
> dynamically-generated registry so that we can set registry keys via
> the driver's command-line (like the Nvidia driver).

I'm not sure we'd want that, except maybe as a debugging aid, I'd
really like to have nouveau just pick the correct set of registry
entries, but I suppose there might be some cases where setting the
from the command line would be good for testing.

> a bug here.  rpc->size must be the total size of the RPC, including
> all the PACKED_REGISTRY_ENTRY structs and the strings that follow
> them.  You can see this by looking at RmPackageRegistry() and
> regCountEntriesAndSize() in OpenRM.

Oh interesting, I'll take a look today.

Dave.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index b6f6b5e747d4..5bd38b1de226 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1029,26 +1029,51 @@  r535_gsp_rpc_unloading_guest_driver(struct nvkm_gsp *gsp, bool suspend)
 	return nvkm_gsp_rpc_wr(gsp, rpc, true);
 }
 
+/* dword only */
+struct nv_gsp_registry_entries {
+	const char *name;
+	uint32_t value;
+};
+
+#define NV_GSP_REG_NUM_ENTRIES 2
+
+static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = {
+	{ "RMSecBusResetEnable", 1 },
+	{ "RMForcePcieConfigSave", 1 },
+};
+
 static int
 r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
 {
 	PACKED_REGISTRY_TABLE *rpc;
 	char *strings;
+	int str_offset;
+	int i;
+	size_t rpc_size = sizeof(*rpc) + sizeof(rpc->entries[0]) * NV_GSP_REG_NUM_ENTRIES;
 
-	rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY,
-			       sizeof(*rpc) + sizeof(rpc->entries[0]) + 1);
+	/* add strings + null terminator */
+	for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++)
+		rpc_size += strlen(r535_registry_entries[i].name) + 1;
+
+	rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY, rpc_size);
 	if (IS_ERR(rpc))
 		return PTR_ERR(rpc);
 
 	rpc->size = sizeof(*rpc);
-	rpc->numEntries = 1;
-	rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]);
-	rpc->entries[0].type = 1;
-	rpc->entries[0].data = 0;
-	rpc->entries[0].length = 4;
-
-	strings = (char *)&rpc->entries[1];
-	strings[0] = '\0';
+	rpc->numEntries = NV_GSP_REG_NUM_ENTRIES;
+
+	str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]);
+	strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
+	for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) {
+		int name_len = strlen(r535_registry_entries[i].name) + 1;
+		rpc->entries[i].nameOffset = str_offset;
+		rpc->entries[i].type = 1;
+		rpc->entries[i].data = r535_registry_entries[i].value;
+		rpc->entries[i].length = 4;
+		memcpy(strings, r535_registry_entries[i].name, name_len);
+		strings += name_len;
+		str_offset += name_len;
+	}
 
 	return nvkm_gsp_rpc_wr(gsp, rpc, false);
 }