Message ID | 20220215011123.734572-1-daniele.ceraolospurio@intel.com (mailing list archive)
---|---
State | New, archived
Series | drm/i915/guc: Initialize GuC submission locks and queues early
On 15/02/2022 01:11, Daniele Ceraolo Spurio wrote:
> Move initialization of submission-related spinlock, lists and workers to
> init_early. This fixes an issue where if the GuC init fails we might
> still try to get the lock in the context cleanup code. Note that it is

What's the worst case impact on non-debug builds aka is Fixes: required?

Regards,

Tvrtko

> safe to call the GuC context cleanup code even if the init failed
> because all contexts are initialized with an invalid GuC ID, which will
> cause the GuC side of the cleanup to be skipped, so it is easier to just
> make sure the variables are initialized than to special case the cleanup
> to handle the case when they're not.
>
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/4932
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
>
> [snip]
On 2/15/2022 1:09 AM, Tvrtko Ursulin wrote:
>
> On 15/02/2022 01:11, Daniele Ceraolo Spurio wrote:
>> Move initialization of submission-related spinlock, lists and workers to
>> init_early. This fixes an issue where if the GuC init fails we might
>> still try to get the lock in the context cleanup code. Note that it is
>
> What's the worst case impact on non-debug builds aka is Fixes: required?

There is no lock contention in this scenario and nothing is done within
the locked section (because submission is not initialized and all
contexts are marked as invalid), so no problems from the fact that the
lock doesn't work. Is that enough to avoid a fixes tag?

Daniele

>
> Regards,
>
> Tvrtko
>
>> safe to call the GuC context cleanup code even if the init failed
>> because all contexts are initialized with an invalid GuC ID, which will
>> cause the GuC side of the cleanup to be skipped, so it is easier to just
>> make sure the variables are initialized than to special case the cleanup
>> to handle the case when they're not.
>>
>> [snip]
On 15/02/2022 16:39, Ceraolo Spurio, Daniele wrote:
> On 2/15/2022 1:09 AM, Tvrtko Ursulin wrote:
>>
>> On 15/02/2022 01:11, Daniele Ceraolo Spurio wrote:
>>> Move initialization of submission-related spinlock, lists and workers to
>>> init_early. This fixes an issue where if the GuC init fails we might
>>> still try to get the lock in the context cleanup code. Note that it is
>>
>> What's the worst case impact on non-debug builds aka is Fixes: required?
>
> There is no lock contention in this scenario and nothing is done within
> the locked section (because submission is not initialized and all
> contexts are marked as invalid), so no problems from the fact that the
> lock doesn't work. Is that enough to avoid a fixes tag?

If the "lock doesn't work" is benign, due to there being no chance of
touching uninitialised memory and then deadlocking, then it's good. I
just couldn't read that part from the commit message and did not have
time to go dig into the code.

Regards,

Tvrtko

> Daniele
>
>>
>> Regards,
>>
>> Tvrtko
>>
>>> safe to call the GuC context cleanup code even if the init failed
>>> because all contexts are initialized with an invalid GuC ID, which will
>>> cause the GuC side of the cleanup to be skipped, so it is easier to just
>>> make sure the variables are initialized than to special case the cleanup
>>> to handle the case when they're not.
>>>
>>> [snip]
On 2/14/2022 17:11, Daniele Ceraolo Spurio wrote:
> Move initialization of submission-related spinlock, lists and workers to
> init_early. This fixes an issue where if the GuC init fails we might
> still try to get the lock in the context cleanup code. Note that it is
> safe to call the GuC context cleanup code even if the init failed
> because all contexts are initialized with an invalid GuC ID, which will
> cause the GuC side of the cleanup to be skipped, so it is easier to just
> make sure the variables are initialized than to special case the cleanup
> to handle the case when they're not.
>
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/4932
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> [snip]
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0da..2160da2c83cbf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1818,24 +1818,11 @@ int intel_guc_submission_init(struct intel_guc *guc)
 	 */
 	GEM_BUG_ON(!guc->lrc_desc_pool);
 
-	xa_init_flags(&guc->context_lookup, XA_FLAGS_LOCK_IRQ);
-
-	spin_lock_init(&guc->submission_state.lock);
-	INIT_LIST_HEAD(&guc->submission_state.guc_id_list);
-	ida_init(&guc->submission_state.guc_ids);
-	INIT_LIST_HEAD(&guc->submission_state.destroyed_contexts);
-	INIT_WORK(&guc->submission_state.destroyed_worker,
-		  destroyed_worker_func);
-	INIT_WORK(&guc->submission_state.reset_fail_worker,
-		  reset_fail_worker_func);
-
 	guc->submission_state.guc_ids_bitmap =
 		bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID(guc), GFP_KERNEL);
 	if (!guc->submission_state.guc_ids_bitmap)
 		return -ENOMEM;
 
-	spin_lock_init(&guc->timestamp.lock);
-	INIT_DELAYED_WORK(&guc->timestamp.work, guc_timestamp_ping);
 	guc->timestamp.ping_delay = (POLL_TIME_CLKS / gt->clock_frequency + 1) * HZ;
 	guc->timestamp.shift = gpm_timestamp_shift(gt);
 
@@ -3831,6 +3818,20 @@ static bool __guc_submission_selected(struct intel_guc *guc)
 
 void intel_guc_submission_init_early(struct intel_guc *guc)
 {
+	xa_init_flags(&guc->context_lookup, XA_FLAGS_LOCK_IRQ);
+
+	spin_lock_init(&guc->submission_state.lock);
+	INIT_LIST_HEAD(&guc->submission_state.guc_id_list);
+	ida_init(&guc->submission_state.guc_ids);
+	INIT_LIST_HEAD(&guc->submission_state.destroyed_contexts);
+	INIT_WORK(&guc->submission_state.destroyed_worker,
+		  destroyed_worker_func);
+	INIT_WORK(&guc->submission_state.reset_fail_worker,
+		  reset_fail_worker_func);
+
+	spin_lock_init(&guc->timestamp.lock);
+	INIT_DELAYED_WORK(&guc->timestamp.work, guc_timestamp_ping);
+
 	guc->submission_state.num_guc_ids = GUC_MAX_LRC_DESCRIPTORS;
 	guc->submission_supported = __guc_submission_supported(guc);
 	guc->submission_selected = __guc_submission_selected(guc);
Move initialization of submission-related spinlock, lists and workers to
init_early. This fixes an issue where if the GuC init fails we might
still try to get the lock in the context cleanup code. Note that it is
safe to call the GuC context cleanup code even if the init failed
because all contexts are initialized with an invalid GuC ID, which will
cause the GuC side of the cleanup to be skipped, so it is easier to just
make sure the variables are initialized than to special case the cleanup
to handle the case when they're not.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/4932
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 27 ++++++++++---------
 1 file changed, 14 insertions(+), 13 deletions(-)
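The pattern the patch relies on — do all lock/list/worker setup in an early, infallible init phase so that the cleanup path is always safe to run, even when the later fallible init step bails out — can be sketched in userspace C. This is a hedged analogue, not the i915 code: `guc_state`, `guc_init_early`, `guc_init` and `guc_cleanup` are illustrative names, and a pthread mutex plus a plain singly-linked list stand in for the kernel spinlock and `destroyed_contexts` list.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative stand-in for the driver state: a lock plus a list that
 * the cleanup path always walks, whether or not full init succeeded. */
struct node { struct node *next; };

struct guc_state {
	pthread_mutex_t lock;   /* analogue of submission_state.lock */
	struct node *destroyed; /* analogue of destroyed_contexts */
	bool full_init_done;
};

/* Early init: cannot fail, only sets up locks and list heads.
 * Mirrors moving spin_lock_init()/INIT_LIST_HEAD() into init_early. */
static void guc_init_early(struct guc_state *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->destroyed = NULL;
	s->full_init_done = false;
}

/* Fallible init step, like the bitmap_zalloc() that can return -ENOMEM. */
static int guc_init(struct guc_state *s, bool simulate_failure)
{
	if (simulate_failure)
		return -1;
	s->full_init_done = true;
	return 0;
}

/* Cleanup may run even after a failed init. Taking the lock is safe
 * because guc_init_early() already initialized it, and since nothing
 * was ever queued the locked section does no work -- the same reasoning
 * as "all contexts have an invalid GuC ID" in the patch. Returns the
 * number of entries freed. */
static int guc_cleanup(struct guc_state *s)
{
	int freed = 0;

	pthread_mutex_lock(&s->lock);
	while (s->destroyed) {
		struct node *n = s->destroyed;

		s->destroyed = n->next;
		free(n);
		freed++;
	}
	pthread_mutex_unlock(&s->lock);
	return freed;
}
```

A caller can then unconditionally pair `guc_init_early()` with `guc_cleanup()` and only gate the submission path itself on `guc_init()` succeeding; without the early-init split, running `guc_cleanup()` after a failed `guc_init()` would lock an uninitialized mutex, which is exactly the undefined behavior class the patch removes.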