Message ID | d7b953c7a4ba747c8196a164e2f8c5aef468d048.1657289332.git.karolina.drobnik@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Apply waitboosting before fence wait |
On Fri, Jul 08, 2022 at 04:20:13PM +0200, Karolina Drobnik wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
> dma_resv workaround") is that it stores many, many more fences. Whereas
> adding an exclusive fence used to remove the shared fence list, that
> list is now preserved and the write fences included into the list. Not
> just a single write fence, but now a write/read fence per context. That
> causes us to have to track more fences than before (albeit half of those
> are redundant), and we trigger more interrupts for multi-engine
> workloads.
>
> As part of reducing the impact from handling more signaling, we observe
> we only need to kick the signal worker after adding a fence iff we have

s/iff/if

> good cause to believe that there is work to be done in processing the
> fence i.e. we either need to enable the interrupt or the request is
> already complete but we don't know if we saw the interrupt and so need
> to check signaling.
>
> References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index 9dc9dccf7b09..ecc990ec1b95 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
>           * the request as it may have completed and raised the interrupt as
>           * we were attaching it into the lists.
>           */
> -        irq_work_queue(&b->irq_work);
> +        if (!b->irq_armed || __i915_request_is_complete(rq))

would we need the READ_ONCE(irq_armed) ?
would we need to use the irq_lock?

> +                irq_work_queue(&b->irq_work);
>  }
>
>  bool i915_request_enable_breadcrumb(struct i915_request *rq)
> --
> 2.25.1
>
On Fri, Jul 08, 2022 at 10:40:24AM -0400, Rodrigo Vivi wrote:
> On Fri, Jul 08, 2022 at 04:20:13PM +0200, Karolina Drobnik wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
> > dma_resv workaround") is that it stores many, many more fences. Whereas
> > adding an exclusive fence used to remove the shared fence list, that
> > list is now preserved and the write fences included into the list. Not
> > just a single write fence, but now a write/read fence per context. That
> > causes us to have to track more fences than before (albeit half of those
> > are redundant), and we trigger more interrupts for multi-engine
> > workloads.
> >
> > As part of reducing the impact from handling more signaling, we observe
> > we only need to kick the signal worker after adding a fence iff we have
>
> s/iff/if
>
> > good cause to believe that there is work to be done in processing the
> > fence i.e. we either need to enable the interrupt or the request is
> > already complete but we don't know if we saw the interrupt and so need
> > to check signaling.
> >
> > References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > index 9dc9dccf7b09..ecc990ec1b95 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > @@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
> >           * the request as it may have completed and raised the interrupt as
> >           * we were attaching it into the lists.
> >           */
> > -        irq_work_queue(&b->irq_work);
> > +        if (!b->irq_armed || __i915_request_is_complete(rq))
>
> would we need the READ_ONCE(irq_armed) ?
> would we need to use the irq_lock?

gentle ping on these questions here so maybe we can get this ready
for 5.20 still...

Thanks,
Rodrigo.

> > +                irq_work_queue(&b->irq_work);
> >  }
> >
> >  bool i915_request_enable_breadcrumb(struct i915_request *rq)
> > --
> > 2.25.1
> >
Hi Rodrigo,

Many thanks for taking another look at the patches.

On 08.07.2022 16:40, Rodrigo Vivi wrote:
> On Fri, Jul 08, 2022 at 04:20:13PM +0200, Karolina Drobnik wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
>> dma_resv workaround") is that it stores many, many more fences. Whereas
>> adding an exclusive fence used to remove the shared fence list, that
>> list is now preserved and the write fences included into the list. Not
>> just a single write fence, but now a write/read fence per context. That
>> causes us to have to track more fences than before (albeit half of those
>> are redundant), and we trigger more interrupts for multi-engine
>> workloads.
>>
>> As part of reducing the impact from handling more signaling, we observe
>> we only need to kick the signal worker after adding a fence iff we have
>
> s/iff/if

This is fine, it means "if, and only if"

>> good cause to believe that there is work to be done in processing the
>> fence i.e. we either need to enable the interrupt or the request is
>> already complete but we don't know if we saw the interrupt and so need
>> to check signaling.
>>
>> References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>> index 9dc9dccf7b09..ecc990ec1b95 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>> @@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
>>           * the request as it may have completed and raised the interrupt as
>>           * we were attaching it into the lists.
>>           */
>> -        irq_work_queue(&b->irq_work);
>> +        if (!b->irq_armed || __i915_request_is_complete(rq))
>
> would we need the READ_ONCE(irq_armed) ?
> would we need to use the irq_lock?

I'll rephrase Chris' answer here:

No, it doesn't need either; the workqueuing is unrelated to the irq_lock.
The worker enables the interrupt if there are any breadcrumbs at the end of
its task. When queuing the work, we have to consider the race conditions:

 - If the worker is running and b->irq_armed at this point, we know the
   irq will remain armed
 - If the worker is running and !b->irq_armed at this point, we will
   kick the worker again -- it doesn't make any difference then if the
   worker is in the process of trying to arm the irq
 - If the worker is not running, b->irq_armed is constant, no race

Ergo, the only race condition is where the worker is trying to arm the irq,
and we end up running the worker a second time.

The only danger to consider is _not_ running the worker when we need to.
Once we put the breadcrumb on the signal, it has to be removed at some
point. Normally this is only performed by the worker, so we have to be
confident that the worker will be run. We know that if the irq is armed
(after we have attached this breadcrumb) there must be another run of the
worker.

The other condition, then, is when the irq is armed but the breadcrumb is
already completed. We may not see an interrupt from the GPU, as the
breadcrumb may have completed as we attached it: the worker is kept alive
but does not notice the completed breadcrumb. In that case, we have to
simulate the interrupt ourselves and give the worker a kick.

The irq_lock is immaterial in both cases.

>> +                irq_work_queue(&b->irq_work);
>>  }
>>
>>  bool i915_request_enable_breadcrumb(struct i915_request *rq)
>> --
>> 2.25.1
>>
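The cases in the reply above can be condensed into a small truth table. What follows is a minimal userspace sketch, not the driver code: should_kick_worker() and its two boolean parameters are hypothetical stand-ins for b->irq_armed and __i915_request_is_complete(rq) as sampled right after the breadcrumb has been attached. It models only the decision the patch adds, not the concurrency between the caller and the worker.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the condition added to insert_breadcrumb().
 * irq_armed stands in for b->irq_armed, request_complete for
 * __i915_request_is_complete(rq); neither is read from real hardware.
 */
bool should_kick_worker(bool irq_armed, bool request_complete)
{
	/*
	 * Not armed: the worker has to run so that it can enable the
	 * interrupt for the newly attached breadcrumb.
	 */
	if (!irq_armed)
		return true;

	/*
	 * Armed but already complete: the interrupt may have fired while
	 * the breadcrumb was being attached, so simulate it with a kick.
	 */
	if (request_complete)
		return true;

	/* Armed and still pending: rely on the interrupt to run the worker. */
	return false;
}
```

Only the armed-and-still-pending case skips the kick; per the explanation above, that is the case in which another run of the worker is expected anyway.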
Hi Karolina,

> One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
> dma_resv workaround") is that it stores many, many more fences. Whereas
> adding an exclusive fence used to remove the shared fence list, that
> list is now preserved and the write fences included into the list. Not
> just a single write fence, but now a write/read fence per context. That
> causes us to have to track more fences than before (albeit half of those
> are redundant), and we trigger more interrupts for multi-engine
> workloads.
>
> As part of reducing the impact from handling more signaling, we observe
> we only need to kick the signal worker after adding a fence iff we have
> good cause to believe that there is work to be done in processing the
> fence i.e. we either need to enable the interrupt or the request is
> already complete but we don't know if we saw the interrupt and so need
> to check signaling.
>
> References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>

sorry, I missed this patch.

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>

Thanks,
Andi
On Tue, Jul 12, 2022 at 08:29:32AM +0200, Karolina Drobnik wrote:
> Hi Rodrigo,
>
> Many thanks for taking another look at the patches.
>
> On 08.07.2022 16:40, Rodrigo Vivi wrote:
> > On Fri, Jul 08, 2022 at 04:20:13PM +0200, Karolina Drobnik wrote:
> > > From: Chris Wilson <chris@chris-wilson.co.uk>
> > >
> > > One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove
> > > dma_resv workaround") is that it stores many, many more fences. Whereas
> > > adding an exclusive fence used to remove the shared fence list, that
> > > list is now preserved and the write fences included into the list. Not
> > > just a single write fence, but now a write/read fence per context. That
> > > causes us to have to track more fences than before (albeit half of those
> > > are redundant), and we trigger more interrupts for multi-engine
> > > workloads.
> > >
> > > As part of reducing the impact from handling more signaling, we observe
> > > we only need to kick the signal worker after adding a fence iff we have
> >
> > s/iff/if
>
> This is fine, it means "if, and only if"
>
> > > good cause to believe that there is work to be done in processing the
> > > fence i.e. we either need to enable the interrupt or the request is
> > > already complete but we don't know if we saw the interrupt and so need
> > > to check signaling.
> > >
> > > References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > > index 9dc9dccf7b09..ecc990ec1b95 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > > @@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
> > >           * the request as it may have completed and raised the interrupt as
> > >           * we were attaching it into the lists.
> > >           */
> > > -        irq_work_queue(&b->irq_work);
> > > +        if (!b->irq_armed || __i915_request_is_complete(rq))
> >
> > would we need the READ_ONCE(irq_armed) ?
> > would we need to use the irq_lock?
>
> I'll rephrase Chris' answer here:
>
> No, it doesn't need either; the workqueuing is unrelated to the irq_lock.
> The worker enables the interrupt if there are any breadcrumbs at the end of
> its task. When queuing the work, we have to consider the race conditions:
>
>  - If the worker is running and b->irq_armed at this point, we know the
>    irq will remain armed
>  - If the worker is running and !b->irq_armed at this point, we will
>    kick the worker again -- it doesn't make any difference then if the
>    worker is in the process of trying to arm the irq
>  - If the worker is not running, b->irq_armed is constant, no race
>
> Ergo, the only race condition is where the worker is trying to arm the irq,
> and we end up running the worker a second time.
>
> The only danger to consider is _not_ running the worker when we need to.
> Once we put the breadcrumb on the signal, it has to be removed at some
> point. Normally this is only performed by the worker, so we have to be
> confident that the worker will be run. We know that if the irq is armed
> (after we have attached this breadcrumb) there must be another run of the
> worker.
>
> The other condition, then, is when the irq is armed but the breadcrumb is
> already completed. We may not see an interrupt from the GPU, as the
> breadcrumb may have completed as we attached it: the worker is kept alive
> but does not notice the completed breadcrumb. In that case, we have to
> simulate the interrupt ourselves and give the worker a kick.
>
> The irq_lock is immaterial in both cases.
>

I just pushed the patch, relying more on the multiple reviews and on the
tests that unblock our users than on this explanation.

If a lock exists to protect some access, we need to use it. It should be
as simple as that. Magic cases where locks don't apply just help this
castle of cards fall apart later.

> > > +                irq_work_queue(&b->irq_work);
> > >  }
> > >  bool i915_request_enable_breadcrumb(struct i915_request *rq)
> > > --
> > > 2.25.1
> > >
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 9dc9dccf7b09..ecc990ec1b95 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
          * the request as it may have completed and raised the interrupt as
          * we were attaching it into the lists.
          */
-        irq_work_queue(&b->irq_work);
+        if (!b->irq_armed || __i915_request_is_complete(rq))
+                irq_work_queue(&b->irq_work);
 }
 
 bool i915_request_enable_breadcrumb(struct i915_request *rq)