
[02/37] drm/i915/gt: Protect context lifetime with RCU

Message ID 20200805122231.23313-3-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Series Replace obj->mm.lock with reservation_ww_class

Commit Message

Chris Wilson Aug. 5, 2020, 12:21 p.m. UTC
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 330 +++++++++++++-----------
 drivers/gpu/drm/i915/i915_active.c      |  10 +
 drivers/gpu/drm/i915/i915_active.h      |   2 +
 drivers/gpu/drm/i915/i915_utils.h       |   7 +
 4 files changed, 202 insertions(+), 147 deletions(-)
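
For context, the commit message leans on a well-known slab idiom, sketched
below with hypothetical names (struct obj, obj_cachep) that are not taken
from the patch. With SLAB_TYPESAFE_BY_RCU, the cache constructor runs once
when a slab page is populated, not on every kmem_cache_alloc(); an object
recycled from the freelist skips it. One-time state (here a mutex) therefore
belongs in the ctor, while per-allocation state is reset by the caller,
which is the same split the patch makes between __intel_context_ctor() and
__intel_context_init().

#include <linux/kref.h>
#include <linux/mutex.h>
#include <linux/slab.h>

/*
 * Sketch only: the ctor is not re-run when an object is recycled from
 * the freelist, so nothing a concurrent RCU reader (or the next owner)
 * depends on may be cleared on free.
 */
struct obj {
	struct kref ref;	/* re-armed on every allocation */
	struct mutex lock;	/* initialised once, reused as-is */
};

static struct kmem_cache *obj_cachep;

static void obj_ctor(void *arg)
{
	struct obj *o = arg;

	mutex_init(&o->lock);
}

static int __init obj_cache_init(void)
{
	obj_cachep = kmem_cache_create("obj", sizeof(struct obj),
				       __alignof__(struct obj),
				       SLAB_HWCACHE_ALIGN |
				       SLAB_TYPESAFE_BY_RCU,
				       obj_ctor);
	return obj_cachep ? 0 : -ENOMEM;
}

static struct obj *obj_alloc(void)
{
	/* kmem_cache_alloc, not zalloc: zeroing would wipe the ctor state */
	struct obj *o = kmem_cache_alloc(obj_cachep, GFP_KERNEL);

	if (o)
		kref_init(&o->ref);
	return o;
}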

Comments

Tvrtko Ursulin Aug. 5, 2020, 3:03 p.m. UTC | #1
On 05/08/2020 13:21, Chris Wilson wrote:
> Allow a brief period for continued access to a dead intel_context by
> deferring the release of the struct until after an RCU grace period.
> As we are using a dedicated slab cache for the contexts, we can defer
> the release of the slab pages via RCU, with the caveat that individual
> structs may be reused from the freelist within an RCU grace period. To
> handle that, we have to avoid clearing members of the zombie struct.

Is this related to the debugfs race, to optimising driver latencies, or 
both? The need to hack up mutex_reinit bothers me, on top of the general 
desire to avoid even more RCU complexity.

Regards,

Tvrtko

Chris Wilson Aug. 6, 2020, 10:14 a.m. UTC | #2
Quoting Tvrtko Ursulin (2020-08-05 16:03:58)
> 
> On 05/08/2020 13:21, Chris Wilson wrote:
> > Allow a brief period for continued access to a dead intel_context by
> > deferring the release of the struct until after an RCU grace period.
> > As we are using a dedicated slab cache for the contexts, we can defer
> > the release of the slab pages via RCU, with the caveat that individual
> > structs may be reused from the freelist within an RCU grace period. To
> > handle that, we have to avoid clearing members of the zombie struct.
> 
> Is this related to the debugfs race, to optimising driver latencies, or 
> both? The need to hack up mutex_reinit bothers me, on top of the general 
> desire to avoid even more RCU complexity.

There's an issue in the next patch, exposed by virtual engine destruction,
that needs a fix handily provided by the RCU grace period; the grace period
is also used in the following patch to reduce irq-off lock contention.
-Chris
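
One consequence worth noting for readers of the unpin change below: under
SLAB_TYPESAFE_BY_RCU, rcu_read_lock() keeps the memory valid but not the
object's identity, since a struct can be freed and handed to a new owner
within the same grace period. Any unlocked lookup over such a cache must
therefore take a reference only while the refcount is live, then revalidate.
A hedged sketch of that canonical pattern, extending the hypothetical
struct obj above with an id field and an obj_put() helper (none of these
names are from the series):

#include <linux/rcupdate.h>
#include <linux/xarray.h>

static struct obj *obj_lookup(struct xarray *xa, unsigned long id)
{
	struct obj *o;

	rcu_read_lock();
again:
	o = xa_load(xa, id);
	if (o) {
		if (!kref_get_unless_zero(&o->ref)) {
			/* lost the race with the final put */
			o = NULL;
		} else if (o->id != id) {
			/* object was recycled for another user; retry */
			obj_put(o);
			goto again;
		}
	}
	rcu_read_unlock();
	return o;
}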

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 52db2bde44a3..4e7924640ffa 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -22,7 +22,7 @@  static struct i915_global_context {
 
 static struct intel_context *intel_context_alloc(void)
 {
-	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+	return kmem_cache_alloc(global.slab_ce, GFP_KERNEL);
 }
 
 void intel_context_free(struct intel_context *ce)
@@ -30,6 +30,177 @@  void intel_context_free(struct intel_context *ce)
 	kmem_cache_free(global.slab_ce, ce);
 }
 
+static int __context_pin_state(struct i915_vma *vma)
+{
+	unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
+	int err;
+
+	err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
+	if (err)
+		return err;
+
+	err = i915_active_acquire(&vma->active);
+	if (err)
+		goto err_unpin;
+
+	/*
+	 * And mark it as a globally pinned object to let the shrinker know
+	 * it cannot reclaim the object until we release it.
+	 */
+	i915_vma_make_unshrinkable(vma);
+	vma->obj->mm.dirty = true;
+
+	return 0;
+
+err_unpin:
+	i915_vma_unpin(vma);
+	return err;
+}
+
+static void __context_unpin_state(struct i915_vma *vma)
+{
+	i915_vma_make_shrinkable(vma);
+	i915_active_release(&vma->active);
+	__i915_vma_unpin(vma);
+}
+
+static int __ring_active(struct intel_ring *ring)
+{
+	int err;
+
+	err = intel_ring_pin(ring);
+	if (err)
+		return err;
+
+	err = i915_active_acquire(&ring->vma->active);
+	if (err)
+		goto err_pin;
+
+	return 0;
+
+err_pin:
+	intel_ring_unpin(ring);
+	return err;
+}
+
+static void __ring_retire(struct intel_ring *ring)
+{
+	i915_active_release(&ring->vma->active);
+	intel_ring_unpin(ring);
+}
+
+__i915_active_call
+static void __intel_context_retire(struct i915_active *active)
+{
+	struct intel_context *ce = container_of(active, typeof(*ce), active);
+
+	CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n",
+		 intel_context_get_total_runtime_ns(ce),
+		 intel_context_get_avg_runtime_ns(ce));
+
+	set_bit(CONTEXT_VALID_BIT, &ce->flags);
+	if (ce->state)
+		__context_unpin_state(ce->state);
+
+	intel_timeline_unpin(ce->timeline);
+	__ring_retire(ce->ring);
+
+	intel_context_put(ce);
+}
+
+static int __intel_context_active(struct i915_active *active)
+{
+	struct intel_context *ce = container_of(active, typeof(*ce), active);
+	int err;
+
+	CE_TRACE(ce, "active\n");
+
+	intel_context_get(ce);
+
+	err = __ring_active(ce->ring);
+	if (err)
+		goto err_put;
+
+	err = intel_timeline_pin(ce->timeline);
+	if (err)
+		goto err_ring;
+
+	if (!ce->state)
+		return 0;
+
+	err = __context_pin_state(ce->state);
+	if (err)
+		goto err_timeline;
+
+	return 0;
+
+err_timeline:
+	intel_timeline_unpin(ce->timeline);
+err_ring:
+	__ring_retire(ce->ring);
+err_put:
+	intel_context_put(ce);
+	return err;
+}
+
+static void __intel_context_ctor(void *arg)
+{
+	struct intel_context *ce = arg;
+
+	INIT_LIST_HEAD(&ce->signal_link);
+	INIT_LIST_HEAD(&ce->signals);
+
+	atomic_set(&ce->pin_count, 0);
+	mutex_init(&ce->pin_mutex);
+
+	ce->active_count = 0;
+	i915_active_init(&ce->active,
+			 __intel_context_active, __intel_context_retire);
+
+	ce->inflight = NULL;
+	ce->lrc_reg_state = NULL;
+	ce->lrc.desc = 0;
+}
+
+static void
+__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
+{
+	GEM_BUG_ON(!engine->cops);
+	GEM_BUG_ON(!engine->gt->vm);
+
+	kref_init(&ce->ref);
+	i915_active_reinit(&ce->active);
+	mutex_reinit(&ce->pin_mutex);
+
+	ce->engine = engine;
+	ce->ops = engine->cops;
+	ce->sseu = engine->sseu;
+
+	ce->wa_bb_page = 0;
+	ce->flags = 0;
+	ce->tag = 0;
+
+	memset(&ce->runtime, 0, sizeof(ce->runtime));
+
+	ce->vm = i915_vm_get(engine->gt->vm);
+	ce->gem_context = NULL;
+
+	ce->ring = __intel_context_ring_size(SZ_4K);
+	ce->timeline = NULL;
+	ce->state = NULL;
+
+	GEM_BUG_ON(atomic_read(&ce->pin_count));
+	GEM_BUG_ON(ce->active_count);
+	GEM_BUG_ON(ce->inflight);
+}
+
+void
+intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
+{
+	__intel_context_ctor(ce);
+	__intel_context_init(ce, engine);
+}
+
 struct intel_context *
 intel_context_create(struct intel_engine_cs *engine)
 {
@@ -39,7 +210,7 @@  intel_context_create(struct intel_engine_cs *engine)
 	if (!ce)
 		return ERR_PTR(-ENOMEM);
 
-	intel_context_init(ce, engine);
+	__intel_context_init(ce, engine);
 	return ce;
 }
 
@@ -158,154 +329,13 @@  void intel_context_unpin(struct intel_context *ce)
 	/*
 	 * Once released, we may asynchronously drop the active reference.
 	 * As that may be the only reference keeping the context alive,
-	 * take an extra now so that it is not freed before we finish
+	 * hold onto RCU so that it is not freed before we finish
 	 * dereferencing it.
 	 */
-	intel_context_get(ce);
+	rcu_read_lock();
 	intel_context_active_release(ce);
-	intel_context_put(ce);
-}
-
-static int __context_pin_state(struct i915_vma *vma)
-{
-	unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
-	int err;
-
-	err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
-	if (err)
-		return err;
-
-	err = i915_active_acquire(&vma->active);
-	if (err)
-		goto err_unpin;
-
-	/*
-	 * And mark it as a globally pinned object to let the shrinker know
-	 * it cannot reclaim the object until we release it.
-	 */
-	i915_vma_make_unshrinkable(vma);
-	vma->obj->mm.dirty = true;
-
-	return 0;
-
-err_unpin:
-	i915_vma_unpin(vma);
-	return err;
-}
-
-static void __context_unpin_state(struct i915_vma *vma)
-{
-	i915_vma_make_shrinkable(vma);
-	i915_active_release(&vma->active);
-	__i915_vma_unpin(vma);
-}
-
-static int __ring_active(struct intel_ring *ring)
-{
-	int err;
-
-	err = intel_ring_pin(ring);
-	if (err)
-		return err;
-
-	err = i915_active_acquire(&ring->vma->active);
-	if (err)
-		goto err_pin;
-
-	return 0;
-
-err_pin:
-	intel_ring_unpin(ring);
-	return err;
-}
-
-static void __ring_retire(struct intel_ring *ring)
-{
-	i915_active_release(&ring->vma->active);
-	intel_ring_unpin(ring);
+	rcu_read_unlock();
 }
-
-__i915_active_call
-static void __intel_context_retire(struct i915_active *active)
-{
-	struct intel_context *ce = container_of(active, typeof(*ce), active);
-
-	CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n",
-		 intel_context_get_total_runtime_ns(ce),
-		 intel_context_get_avg_runtime_ns(ce));
-
-	set_bit(CONTEXT_VALID_BIT, &ce->flags);
-	if (ce->state)
-		__context_unpin_state(ce->state);
-
-	intel_timeline_unpin(ce->timeline);
-	__ring_retire(ce->ring);
-
-	intel_context_put(ce);
-}
-
-static int __intel_context_active(struct i915_active *active)
-{
-	struct intel_context *ce = container_of(active, typeof(*ce), active);
-	int err;
-
-	CE_TRACE(ce, "active\n");
-
-	intel_context_get(ce);
-
-	err = __ring_active(ce->ring);
-	if (err)
-		goto err_put;
-
-	err = intel_timeline_pin(ce->timeline);
-	if (err)
-		goto err_ring;
-
-	if (!ce->state)
-		return 0;
-
-	err = __context_pin_state(ce->state);
-	if (err)
-		goto err_timeline;
-
-	return 0;
-
-err_timeline:
-	intel_timeline_unpin(ce->timeline);
-err_ring:
-	__ring_retire(ce->ring);
-err_put:
-	intel_context_put(ce);
-	return err;
-}
-
-void
-intel_context_init(struct intel_context *ce,
-		   struct intel_engine_cs *engine)
-{
-	GEM_BUG_ON(!engine->cops);
-	GEM_BUG_ON(!engine->gt->vm);
-
-	kref_init(&ce->ref);
-
-	ce->engine = engine;
-	ce->ops = engine->cops;
-	ce->sseu = engine->sseu;
-	ce->ring = __intel_context_ring_size(SZ_4K);
-
-	ewma_runtime_init(&ce->runtime.avg);
-
-	ce->vm = i915_vm_get(engine->gt->vm);
-
-	INIT_LIST_HEAD(&ce->signal_link);
-	INIT_LIST_HEAD(&ce->signals);
-
-	mutex_init(&ce->pin_mutex);
-
-	i915_active_init(&ce->active,
-			 __intel_context_active, __intel_context_retire);
-}
-
 void intel_context_fini(struct intel_context *ce)
 {
 	if (ce->timeline)
@@ -333,7 +363,13 @@  static struct i915_global_context global = { {
 
 int __init i915_global_context_init(void)
 {
-	global.slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN);
+	global.slab_ce =
+		kmem_cache_create("intel_context",
+				  sizeof(struct intel_context),
+				  __alignof__(struct intel_context),
+				  SLAB_HWCACHE_ALIGN |
+				  SLAB_TYPESAFE_BY_RCU,
+				  __intel_context_ctor);
 	if (!global.slab_ce)
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index b0a6522be3d1..6a3bb10e3bb2 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -795,6 +795,16 @@  void i915_active_fini(struct i915_active *ref)
 		kmem_cache_free(global.slab_cache, ref->cache);
 }
 
+void i915_active_reinit(struct i915_active *ref)
+{
+	GEM_BUG_ON(!i915_active_is_idle(ref));
+	debug_active_init(ref);
+	mutex_reinit(&ref->mutex);
+
+	ref->cache = NULL;
+	ref->tree = RB_ROOT;
+}
+
 static inline bool is_idle_barrier(struct active_node *node, u64 idx)
 {
 	return node->timeline == idx && !i915_active_fence_isset(&node->base);
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index fb165d3f01cf..6df7e721616d 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -219,6 +219,8 @@  i915_active_is_idle(const struct i915_active *ref)
 
 void i915_active_fini(struct i915_active *ref);
 
+void i915_active_reinit(struct i915_active *ref);
+
 int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
 					    struct intel_engine_cs *engine);
 void i915_active_acquire_barrier(struct i915_active *ref);
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 54773371e6bd..ef8db3aa75c7 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -443,6 +443,13 @@  static inline bool timer_expired(const struct timer_list *t)
 	return READ_ONCE(t->expires) && !timer_pending(t);
 }
 
+static inline void mutex_reinit(struct mutex *lock)
+{
+#if IS_ENABLED(CONFIG_DEBUG_MUTEXES)
+	lock->magic = lock;
+#endif
+}
+
 /*
  * This is a lookalike for IS_ENABLED() that takes a kconfig value,
  * e.g. CONFIG_DRM_I915_SPIN_REQUEST, and evaluates whether it is non-zero