diff mbox series

[01/12] drm/i915/guc: Avoid reclaim locks during reset

Message ID 20190701100502.15639-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [01/12] drm/i915/guc: Avoid reclaim locks during reset | expand

Commit Message

Chris Wilson July 1, 2019, 10:04 a.m. UTC
During reset, we must be very selective in which locks we take as most
are tainted by being held across a wait or reclaim (kmalloc) which
implicitly waits. Inside the guc reset path, we reset the ADS to sane
defaults, but must keep it pinned from initialisation to avoid having to
pin it during reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.h     |  4 ++++
 drivers/gpu/drm/i915/intel_guc_ads.c | 26 +++++++++++++-------------
 2 files changed, 17 insertions(+), 13 deletions(-)

Comments

Michal Wajdeczko July 1, 2019, 12:36 p.m. UTC | #1
On Mon, 01 Jul 2019 12:04:51 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> During reset, we must be very selective in which locks we take as most
> are tainted by being held across a wait or reclaim (kmalloc) which
> implicitly waits. Inside the guc reset path, we reset the ADS to sane
> defaults, but must keep it pinned from initialisation to avoid having to
> pin it during reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

but I'm little worried about leaving stale guc->ads_blob below:

> @@ -183,7 +183,7 @@ int intel_guc_ads_create(struct intel_guc *guc)
> void intel_guc_ads_destroy(struct intel_guc *guc)
>  {
> -	i915_vma_unpin_and_release(&guc->ads_vma, 0);
> +	i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
>  }

maybe there is a way to get ptr right from the pinned/mapped vma
without introducing extra separate field that might go out of sync ?

Michal
Chris Wilson July 1, 2019, 1:48 p.m. UTC | #2
Quoting Michal Wajdeczko (2019-07-01 13:36:28)
> On Mon, 01 Jul 2019 12:04:51 +0200, Chris Wilson  
> <chris@chris-wilson.co.uk> wrote:
> 
> > During reset, we must be very selective in which locks we take as most
> > are tainted by being held across a wait or reclaim (kmalloc) which
> > implicitly waits. Inside the guc reset path, we reset the ADS to sane
> > defaults, but must keep it pinned from initialisation to avoid having to
> > pin it during reset.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> 
> but I'm little worried about leaving stale guc->ads_blob below:
> 
> > @@ -183,7 +183,7 @@ int intel_guc_ads_create(struct intel_guc *guc)
> > void intel_guc_ads_destroy(struct intel_guc *guc)
> >  {
> > -     i915_vma_unpin_and_release(&guc->ads_vma, 0);
> > +     i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
> >  }
> 
> maybe there is a way to get ptr right from the pinned/mapped vma
> without introducing extra separate field that might go out of sync ?

You mean the vaddr? I look at it as your token of ownership: this is the
address I pinned. While you own that pin, it is not allowed to change.

I expect, If we ever start wanting separate concurrent views of the
object, the return from pin_map will be its own little refcount -- or
simply not be cached. So to remind myself, the cache is because vmap is
slow and we use it frequently for cmdparsing.

So we could just transfer ownership of the map entirely to the caller
and leave it to utilities like the buffer cache to retain the map. I
don't think we actually have concurrent users of the maps, but I
wouldn't bet on it.
-Chris
Daniele Ceraolo Spurio July 1, 2019, 6:12 p.m. UTC | #3
On 7/1/19 3:04 AM, Chris Wilson wrote:
> During reset, we must be very selective in which locks we take as most
> are tainted by being held across a wait or reclaim (kmalloc) which
> implicitly waits. Inside the guc reset path, we reset the ADS to sane
> defaults, but must keep it pinned from initialisation to avoid having to
> pin it during reset.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

I'm wondering if we should add an assert when the locks are taken inside 
the reset path to catch similar issues in the future, because they could 
slip through review.

Daniele

> ---
>   drivers/gpu/drm/i915/intel_guc.h     |  4 ++++
>   drivers/gpu/drm/i915/intel_guc_ads.c | 26 +++++++++++++-------------
>   2 files changed, 17 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index d6a75bc3d7f4..d91c96679dbb 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -35,6 +35,8 @@
>   #include "i915_utils.h"
>   #include "i915_vma.h"
>   
> +struct __guc_ads_blob;
> +
>   struct guc_preempt_work {
>   	struct work_struct work;
>   	struct intel_engine_cs *engine;
> @@ -65,6 +67,8 @@ struct intel_guc {
>   	} interrupts;
>   
>   	struct i915_vma *ads_vma;
> +	struct __guc_ads_blob *ads_blob;
> +
>   	struct i915_vma *stage_desc_pool;
>   	void *stage_desc_pool_vaddr;
>   	struct ida stage_ids;
> diff --git a/drivers/gpu/drm/i915/intel_guc_ads.c b/drivers/gpu/drm/i915/intel_guc_ads.c
> index ecb69fc94218..69859d1e047f 100644
> --- a/drivers/gpu/drm/i915/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/intel_guc_ads.c
> @@ -83,18 +83,14 @@ struct __guc_ads_blob {
>   	u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE];
>   } __packed;
>   
> -static int __guc_ads_init(struct intel_guc *guc)
> +static void __guc_ads_init(struct intel_guc *guc)
>   {
>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	struct __guc_ads_blob *blob;
> +	struct __guc_ads_blob *blob = guc->ads_blob;
>   	const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
>   	u32 base;
>   	u8 engine_class;
>   
> -	blob = i915_gem_object_pin_map(guc->ads_vma->obj, I915_MAP_WB);
> -	if (IS_ERR(blob))
> -		return PTR_ERR(blob);
> -
>   	/* GuC scheduling policies */
>   	guc_policies_init(&blob->policies);
>   
> @@ -144,9 +140,7 @@ static int __guc_ads_init(struct intel_guc *guc)
>   	blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
>   	blob->ads.clients_info = base + ptr_offset(blob, clients_info);
>   
> -	i915_gem_object_unpin_map(guc->ads_vma->obj);
> -
> -	return 0;
> +	i915_gem_object_flush_map(guc->ads_vma->obj);
>   }
>   
>   /**
> @@ -160,6 +154,7 @@ int intel_guc_ads_create(struct intel_guc *guc)
>   {
>   	const u32 size = PAGE_ALIGN(sizeof(struct __guc_ads_blob));
>   	struct i915_vma *vma;
> +	void *blob;
>   	int ret;
>   
>   	GEM_BUG_ON(guc->ads_vma);
> @@ -168,11 +163,16 @@ int intel_guc_ads_create(struct intel_guc *guc)
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> +	blob = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
> +	if (IS_ERR(blob)) {
> +		ret = PTR_ERR(blob);
> +		goto err_vma;
> +	}
> +
>   	guc->ads_vma = vma;
> +	guc->ads_blob = blob;
>   
> -	ret = __guc_ads_init(guc);
> -	if (ret)
> -		goto err_vma;
> +	__guc_ads_init(guc);
>   
>   	return 0;
>   
> @@ -183,7 +183,7 @@ int intel_guc_ads_create(struct intel_guc *guc)
>   
>   void intel_guc_ads_destroy(struct intel_guc *guc)
>   {
> -	i915_vma_unpin_and_release(&guc->ads_vma, 0);
> +	i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
>   }
>   
>   /**
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index d6a75bc3d7f4..d91c96679dbb 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -35,6 +35,8 @@ 
 #include "i915_utils.h"
 #include "i915_vma.h"
 
+struct __guc_ads_blob;
+
 struct guc_preempt_work {
 	struct work_struct work;
 	struct intel_engine_cs *engine;
@@ -65,6 +67,8 @@  struct intel_guc {
 	} interrupts;
 
 	struct i915_vma *ads_vma;
+	struct __guc_ads_blob *ads_blob;
+
 	struct i915_vma *stage_desc_pool;
 	void *stage_desc_pool_vaddr;
 	struct ida stage_ids;
diff --git a/drivers/gpu/drm/i915/intel_guc_ads.c b/drivers/gpu/drm/i915/intel_guc_ads.c
index ecb69fc94218..69859d1e047f 100644
--- a/drivers/gpu/drm/i915/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/intel_guc_ads.c
@@ -83,18 +83,14 @@  struct __guc_ads_blob {
 	u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE];
 } __packed;
 
-static int __guc_ads_init(struct intel_guc *guc)
+static void __guc_ads_init(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct __guc_ads_blob *blob;
+	struct __guc_ads_blob *blob = guc->ads_blob;
 	const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
 	u32 base;
 	u8 engine_class;
 
-	blob = i915_gem_object_pin_map(guc->ads_vma->obj, I915_MAP_WB);
-	if (IS_ERR(blob))
-		return PTR_ERR(blob);
-
 	/* GuC scheduling policies */
 	guc_policies_init(&blob->policies);
 
@@ -144,9 +140,7 @@  static int __guc_ads_init(struct intel_guc *guc)
 	blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
 	blob->ads.clients_info = base + ptr_offset(blob, clients_info);
 
-	i915_gem_object_unpin_map(guc->ads_vma->obj);
-
-	return 0;
+	i915_gem_object_flush_map(guc->ads_vma->obj);
 }
 
 /**
@@ -160,6 +154,7 @@  int intel_guc_ads_create(struct intel_guc *guc)
 {
 	const u32 size = PAGE_ALIGN(sizeof(struct __guc_ads_blob));
 	struct i915_vma *vma;
+	void *blob;
 	int ret;
 
 	GEM_BUG_ON(guc->ads_vma);
@@ -168,11 +163,16 @@  int intel_guc_ads_create(struct intel_guc *guc)
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
+	blob = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+	if (IS_ERR(blob)) {
+		ret = PTR_ERR(blob);
+		goto err_vma;
+	}
+
 	guc->ads_vma = vma;
+	guc->ads_blob = blob;
 
-	ret = __guc_ads_init(guc);
-	if (ret)
-		goto err_vma;
+	__guc_ads_init(guc);
 
 	return 0;
 
@@ -183,7 +183,7 @@  int intel_guc_ads_create(struct intel_guc *guc)
 
 void intel_guc_ads_destroy(struct intel_guc *guc)
 {
-	i915_vma_unpin_and_release(&guc->ads_vma, 0);
+	i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
 }
 
 /**