Message ID | 20231031051943.1957328-4-airlied@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/3] nouveau/gsp: move to 535.113.01 | expand |
On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote: > +#define NV_GSP_REG_NUM_ENTRIES 2 > + > +static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = { > + { "RMSecBusResetEnable", 1 }, > + { "RMForcePcieConfigSave", 1 }, > +}; How about : static const struct nv_gsp_registry_entries r535_registry_entries[] = { { "RMSecBusResetEnable", 1 }, { "RMForcePcieConfigSave", 1 }, }; #define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries)
On Wed, 1 Nov 2023 at 01:53, Timur Tabi <timur@kernel.org> wrote: > > On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote: > > +#define NV_GSP_REG_NUM_ENTRIES 2 > > + > > +static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = { > > + { "RMSecBusResetEnable", 1 }, > > + { "RMForcePcieConfigSave", 1 }, > > +}; > > How about : > > static const struct nv_gsp_registry_entries r535_registry_entries[] = { > { "RMSecBusResetEnable", 1 }, > { "RMForcePcieConfigSave", 1 }, > }; > > #define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries) Good plan. I'll change that now. Dave.
On Wed, 2023-11-01 at 05:29 +1000, Dave Airlie wrote: > > > > static const struct nv_gsp_registry_entries r535_registry_entries[] = { > > { "RMSecBusResetEnable", 1 }, > > { "RMForcePcieConfigSave", 1 }, > > }; > > > > #define NV_GSP_REG_NUM_ENTRIES ARRAY_SIZE(r535_registry_entries) > > Good plan. I'll change that now. BTW, I looked into these two options. Basically, these options enable "secondary bus reset", although I'm not exactly sure what that means. It does enable saving of PCI config registers, so perhaps it preserves those registers for Nouveau when GSP-RM does a PCI reset. Also, like I mentioned on IRC earlier, I've got a bunch of patches that provide mroe robust registry control, including setting options from the command-line. I'll post those upstream once the dust has settled.
On Tue, 2023-10-31 at 15:18 +1000, Dave Airlie wrote: + strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES]; I get a UBSAN index-out-of-bounds error on boot at this line. [ 17.765746] nouveau 0000:65:00.0: gsp: cmdq: wptr 1 [ 17.765748] ================================================================================ [ 17.774170] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1065:33 [ 17.783449] index 2 is out of range for type 'PACKED_REGISTRY_ENTRY [*]' [ 17.790132] CPU: 0 PID: 234 Comm: kworker/0:4 Not tainted 6.6.0-rc5+ #1 [ 17.790135] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 2002 09/25/2019 [ 17.790136] Workqueue: events work_for_cpu_fn [ 17.790143] Call Trace: [ 17.790145] <TASK> [ 17.790148] dump_stack_lvl+0x48/0x70 [ 17.790155] dump_stack+0x10/0x20 [ 17.790156] __ubsan_handle_out_of_bounds+0xc6/0x110 [ 17.790160] r535_gsp_oneinit+0xf81/0x1530 [nouveau] [ 17.790292] ? __dev_printk+0x39/0xa0 [ 17.790295] ? _dev_info+0x75/0xa0 [ 17.790298] tu102_gsp_oneinit+0x9b/0xd0 [nouveau] I'm not sure what the fix is. Do we need __attribute__((no_sanitize("array-bounds"))) on PACKED_REGISTRY_TABLE?
On Wed, 8 Nov 2023 at 04:51, Timur Tabi <ttabi@nvidia.com> wrote: > > On Tue, 2023-10-31 at 15:18 +1000, Dave Airlie wrote: > > + strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES]; > > > I get a UBSAN index-out-of-bounds error on boot at this line. > > [ 17.765746] nouveau 0000:65:00.0: gsp: cmdq: wptr 1 > [ 17.765748] ================================================================================ > [ 17.774170] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1065:33 > [ 17.783449] index 2 is out of range for type 'PACKED_REGISTRY_ENTRY [*]' > [ 17.790132] CPU: 0 PID: 234 Comm: kworker/0:4 Not tainted 6.6.0-rc5+ #1 > [ 17.790135] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 2002 09/25/2019 > [ 17.790136] Workqueue: events work_for_cpu_fn > [ 17.790143] Call Trace: > [ 17.790145] <TASK> > [ 17.790148] dump_stack_lvl+0x48/0x70 > [ 17.790155] dump_stack+0x10/0x20 > [ 17.790156] __ubsan_handle_out_of_bounds+0xc6/0x110 > [ 17.790160] r535_gsp_oneinit+0xf81/0x1530 [nouveau] > [ 17.790292] ? __dev_printk+0x39/0xa0 > [ 17.790295] ? _dev_info+0x75/0xa0 > [ 17.790298] tu102_gsp_oneinit+0x9b/0xd0 [nouveau] > > I'm not sure what the fix is. Do we need __attribute__((no_sanitize("array-bounds"))) on PACKED_REGISTRY_TABLE? yes that is probably the right answer for this, if we want to reuse the structs that we get from the nvidia driver. Dave.
On Wed, 2023-11-08 at 04:54 +1000, Dave Airlie wrote: yes that is probably the right answer for this, if we want to reuse the structs that we get from the nvidia driver. ok, I'll submit a patch.
On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote: > rpc->size = sizeof(*rpc); > - rpc->numEntries = 1; > - rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]); > - rpc->entries[0].type = 1; > - rpc->entries[0].data = 0; > - rpc->entries[0].length = 4; > - > - strings = (char *)&rpc->entries[1]; > - strings[0] = '\0'; > + rpc->numEntries = NV_GSP_REG_NUM_ENTRIES; > + > + str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]); > + strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES]; > + for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) { > + int name_len = strlen(r535_registry_entries[i].name) + 1; > + rpc->entries[i].nameOffset = str_offset; > + rpc->entries[i].type = 1; > + rpc->entries[i].data = r535_registry_entries[i].value; > + rpc->entries[i].length = 4; > + memcpy(strings, r535_registry_entries[i].name, name_len); > + strings += name_len; > + str_offset += name_len; > + } I'm working on a patch that replaces this code with a dynamically-generated registry so that we can set registry keys via the driver's command-line (like the Nvidia driver). However, you have a bug here. rpc->size must be the total size of the RPC, including all the PACKED_REGISTRY_ENTRY structs and the strings that follow them. You can see this by looking at RmPackageRegistry() and regCountEntriesAndSize() in OpenRM.
On Tue, 28 Nov 2023 at 06:48, Timur Tabi <timur@kernel.org> wrote: > > On Tue, Oct 31, 2023 at 12:20 AM Dave Airlie <airlied@gmail.com> wrote: > > rpc->size = sizeof(*rpc); > > - rpc->numEntries = 1; > > - rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]); > > - rpc->entries[0].type = 1; > > - rpc->entries[0].data = 0; > > - rpc->entries[0].length = 4; > > - > > - strings = (char *)&rpc->entries[1]; > > - strings[0] = '\0'; > > + rpc->numEntries = NV_GSP_REG_NUM_ENTRIES; > > + > > + str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]); > > + strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES]; > > + for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) { > > + int name_len = strlen(r535_registry_entries[i].name) + 1; > > + rpc->entries[i].nameOffset = str_offset; > > + rpc->entries[i].type = 1; > > + rpc->entries[i].data = r535_registry_entries[i].value; > > + rpc->entries[i].length = 4; > > + memcpy(strings, r535_registry_entries[i].name, name_len); > > + strings += name_len; > > + str_offset += name_len; > > + } > > I'm working on a patch that replaces this code with a > dynamically-generated registry so that we can set registry keys via > the driver's command-line (like the Nvidia driver). I'm not sure we'd want that, except maybe as a debugging aid, I'd really like to have nouveau just pick the correct set of registry entries, but I suppose there might be some cases where setting the from the command line would be good for testing. > a bug here. rpc->size must be the total size of the RPC, including > all the PACKED_REGISTRY_ENTRY structs and the strings that follow > them. You can see this by looking at RmPackageRegistry() and > regCountEntriesAndSize() in OpenRM. Oh interesting, I'll take a look today. Dave.
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index b6f6b5e747d4..5bd38b1de226 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -1029,26 +1029,51 @@ r535_gsp_rpc_unloading_guest_driver(struct nvkm_gsp *gsp, bool suspend) return nvkm_gsp_rpc_wr(gsp, rpc, true); } +/* dword only */ +struct nv_gsp_registry_entries { + const char *name; + uint32_t value; +}; + +#define NV_GSP_REG_NUM_ENTRIES 2 + +static const struct nv_gsp_registry_entries r535_registry_entries[NV_GSP_REG_NUM_ENTRIES] = { + { "RMSecBusResetEnable", 1 }, + { "RMForcePcieConfigSave", 1 }, +}; + static int r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp) { PACKED_REGISTRY_TABLE *rpc; char *strings; + int str_offset; + int i; + size_t rpc_size = sizeof(*rpc) + sizeof(rpc->entries[0]) * NV_GSP_REG_NUM_ENTRIES; - rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY, - sizeof(*rpc) + sizeof(rpc->entries[0]) + 1); + /* add strings + null terminator */ + for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) + rpc_size += strlen(r535_registry_entries[i].name) + 1; + + rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY, rpc_size); if (IS_ERR(rpc)) return PTR_ERR(rpc); rpc->size = sizeof(*rpc); - rpc->numEntries = 1; - rpc->entries[0].nameOffset = offsetof(typeof(*rpc), entries[1]); - rpc->entries[0].type = 1; - rpc->entries[0].data = 0; - rpc->entries[0].length = 4; - - strings = (char *)&rpc->entries[1]; - strings[0] = '\0'; + rpc->numEntries = NV_GSP_REG_NUM_ENTRIES; + + str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]); + strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES]; + for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) { + int name_len = strlen(r535_registry_entries[i].name) + 1; + rpc->entries[i].nameOffset = str_offset; + rpc->entries[i].type = 1; + rpc->entries[i].data = r535_registry_entries[i].value; + rpc->entries[i].length = 4; + memcpy(strings, r535_registry_entries[i].name, name_len); + strings += name_len; + str_offset += name_len; + } return nvkm_gsp_rpc_wr(gsp, rpc, false); }