[18/66] drm/i915: Always defer fenced work to the worker


Message ID

20200715115147.11866-18-chris@chris-wilson.co.uk (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AA76A20658
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Date: Wed, 15 Jul 2020 12:50:59 +0100
Message-Id: <20200715115147.11866-18-chris@chris-wilson.co.uk>
In-Reply-To: <20200715115147.11866-1-chris@chris-wilson.co.uk>
References: <20200715115147.11866-1-chris@chris-wilson.co.uk>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 18/66] drm/i915: Always defer fenced work to the
 worker
Precedence: list
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>


Commit Message

Chris Wilson July 15, 2020, 11:50 a.m. UTC

Currently, if an error is raised we always call the cleanup locally
[and skip the main work callback]. However, some future users may need
to take a mutex to clean up, and so we cannot execute the cleanup
immediately as we may still be in interrupt context. Instead, always
pass the fence to the worker and have the worker skip the main work
callback if an error has already been set.

With the execute-immediate flag (DMA_FENCE_WORK_IMM), for most cases
this should still result in immediate cleanup of an error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_sw_fence_work.c | 25 +++++++++++------------
 1 file changed, 12 insertions(+), 13 deletions(-)
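
As a rough illustration of the control flow the commit message describes, here is a small userspace model. It is only a sketch: `model_work`, `model_ops`, `model_fence_notify`, `model_fence_work` and `model_run_worker` are hypothetical stand-ins, not the i915 structures or functions, and the assumption that completion invokes an optional release callback is the model's own. The point it tries to show is that an error no longer short-circuits in the notify path; the worker is always entered, skips the main callback if an error is set, and performs the cleanup itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types are the real i915 structures. */
struct model_work;

struct model_ops {
	int (*work)(struct model_work *w);     /* main callback, may fail */
	void (*release)(struct model_work *w); /* cleanup, may need to sleep */
};

struct model_work {
	const struct model_ops *ops;
	int error;         /* stands in for f->dma.error */
	bool immediate;    /* stands in for the DMA_FENCE_WORK_IMM flag */
	bool queued;       /* stands in for queue_work() having been called */
	int work_calls;    /* bookkeeping for the example only */
	int release_calls; /* bookkeeping for the example only */
};

/* Worker body: skip the main callback on error, but always clean up here. */
static void model_fence_work(struct model_work *w)
{
	assert(w->ops);

	if (!w->error) {
		int err = w->ops->work(w);

		if (err)
			w->error = err;
	}

	if (w->ops->release)
		w->ops->release(w);
}

/* Notify path: record the error, then always hand over to the worker. */
static void model_fence_notify(struct model_work *w, int fence_error)
{
	if (fence_error)
		w->error = fence_error;

	if (w->immediate)
		model_fence_work(w); /* runs in the caller's context */
	else
		w->queued = true;    /* deferred until the worker runs */
}

/* Stand-in for the workqueue draining one queued item. */
static void model_run_worker(struct model_work *w)
{
	if (w->queued) {
		w->queued = false;
		model_fence_work(w);
	}
}

static int count_work(struct model_work *w)
{
	w->work_calls++;
	return 0;
}

static void count_release(struct model_work *w)
{
	w->release_calls++;
}

static const struct model_ops counting_ops = {
	.work = count_work,
	.release = count_release,
};
```

In this model, notifying a non-immediate item with an error leaves the cleanup pending until `model_run_worker()` is called, whereas an item with `immediate` set is cleaned up inside `model_fence_notify()` itself.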

Comments

Thomas Hellström (Intel) July 31, 2020, 9:03 a.m. UTC | #1

On 7/15/20 1:50 PM, Chris Wilson wrote:
> Currently, if an error is raised we always call the cleanup locally
> [and skip the main work callback]. However, some future users
Could you add an example of those future users?
> may need
> to take a mutex to cleanup and so we cannot immediately execute the
> cleanup as we may still be in interrupt context.
>
> With the execute-immediate flag, for most cases this should result in
> immediate cleanup of an error.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Otherwise Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>

Chris Wilson July 31, 2020, 1:28 p.m. UTC | #2

Quoting Thomas Hellström (Intel) (2020-07-31 10:03:59)
> 
> On 7/15/20 1:50 PM, Chris Wilson wrote:
> > Currently, if an error is raised we always call the cleanup locally
> > [and skip the main work callback]. However, some future users
> Could you add an example of those future users?

In the next patch (or two), the code needs to do the error cleanup from
process context. Since the error paths should be relatively infrequent,
and more often than not raised synchronously, I didn't see a reason to
build in a flag to say whether or not the release-on-error could be
performed immediately from the interrupt context.

The example in this series is that even if an error is thrown, we have
committed changes to the ppGTT layout (in particular marking PTEs to be
evicted) and so we must complete unbinding the old pages from the ppGTT,
otherwise they may remain accessible.
-Chris
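
The ppGTT example above can be sketched with a toy model. Everything here is hypothetical (`model_vm`, `model_vm_lock`, `model_mark_for_eviction`, `model_release` are illustration names, and the real driver's locking and page-table handling are far more involved); the sketch assumes only what the reply states: entries have already been marked for eviction when the error arrives, and finishing the unbind needs a lock that cannot be taken from interrupt context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_NUM_PTES = 4 };

enum model_pte_state { PTE_BOUND, PTE_MARKED_FOR_EVICTION, PTE_CLEAR };

/* Toy address space: not the real i915 ppGTT representation. */
struct model_vm {
	bool in_atomic; /* true while modelling interrupt context */
	bool locked;    /* stands in for a sleeping mutex */
	enum model_pte_state pte[MODEL_NUM_PTES];
};

/* A sleeping lock must not be taken from atomic context; refuse if so. */
static bool model_vm_lock(struct model_vm *vm)
{
	if (vm->in_atomic || vm->locked)
		return false;

	vm->locked = true;
	return true;
}

static void model_vm_unlock(struct model_vm *vm)
{
	assert(vm->locked);
	vm->locked = false;
}

/* The step that has already been committed before any error is seen. */
static void model_mark_for_eviction(struct model_vm *vm,
				    size_t first, size_t count)
{
	for (size_t i = first; i < first + count && i < MODEL_NUM_PTES; i++)
		vm->pte[i] = PTE_MARKED_FOR_EVICTION;
}

/* Cleanup that must run even on error; returns false if it could not run. */
static bool model_release(struct model_vm *vm)
{
	if (!model_vm_lock(vm))
		return false;

	for (size_t i = 0; i < MODEL_NUM_PTES; i++)
		if (vm->pte[i] == PTE_MARKED_FOR_EVICTION)
			vm->pte[i] = PTE_CLEAR;

	model_vm_unlock(vm);
	return true;
}

/* Number of entries still marked for eviction, i.e. still reachable. */
static size_t model_count_stale(const struct model_vm *vm)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_NUM_PTES; i++)
		if (vm->pte[i] == PTE_MARKED_FOR_EVICTION)
			n++;

	return n;
}
```

In the model, calling `model_release()` while `in_atomic` is set leaves the marked entries in place, which corresponds to the old pages remaining accessible; once the same call runs with `in_atomic` cleared (the worker, in process context), the marked entries are cleared.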

Thomas Hellström (Intel) July 31, 2020, 1:31 p.m. UTC | #3

On 7/31/20 3:28 PM, Chris Wilson wrote:
> Quoting Thomas Hellström (Intel) (2020-07-31 10:03:59)
>> On 7/15/20 1:50 PM, Chris Wilson wrote:
>>> Currently, if an error is raised we always call the cleanup locally
>>> [and skip the main work callback]. However, some future users
>> Could you add an example of those future users?
> In the next (or two) patch, the code needs to do the error cleanup from
> process context. Since the error paths should be relatively infrequent,
> and more often than not raised synchronously, I didn't see a reason to
> build in a flag to say whether or not the release-on-error could be
> performed immediately from the interrupt context.
>
> The example in this series is that even if an error is thrown, we have
> committed changes to the ppGTT layout (in particular marking PTE to be
> evicted) and so we must complete unbinding the old pages from the ppGTT,
> otherwise they may remain accessible.
> -Chris

Thanks. I was mostly thinking that this, or something similar, could be
added to the commit message to aid in understanding why the change is
needed.

/Thomas

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c
index a3a81bb8f2c3..29f63ebc24e8 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -16,11 +16,14 @@  static void fence_complete(struct dma_fence_work *f)
 static void fence_work(struct work_struct *work)
 {
 	struct dma_fence_work *f = container_of(work, typeof(*f), work);
-	int err;
 
-	err = f->ops->work(f);
-	if (err)
-		dma_fence_set_error(&f->dma, err);
+	if (!f->dma.error) {
+		int err;
+
+		err = f->ops->work(f);
+		if (err)
+			dma_fence_set_error(&f->dma, err);
+	}
 
 	fence_complete(f);
 	dma_fence_put(&f->dma);
@@ -36,15 +39,11 @@  fence_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 		if (fence->error)
 			dma_fence_set_error(&f->dma, fence->error);
 
-		if (!f->dma.error) {
-			dma_fence_get(&f->dma);
-			if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-				fence_work(&f->work);
-			else
-				queue_work(system_unbound_wq, &f->work);
-		} else {
-			fence_complete(f);
-		}
+		dma_fence_get(&f->dma);
+		if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+			fence_work(&f->work);
+		else
+			queue_work(system_unbound_wq, &f->work);
 		break;
 
 	case FENCE_FREE:
