drm/i915/ggtt: do not set bits 1-11 in gen12 ptes

Message ID	20200226185657.26445-1-daniele.ceraolospurio@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=hW0+=4O=lists.freedesktop.org=intel-gfx-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1A0A52467A From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> To: intel-gfx@lists.freedesktop.org Date: Wed, 26 Feb 2020 10:56:57 -0800 Message-Id: <20200226185657.26445-1-daniele.ceraolospurio@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH] drm/i915/ggtt: do not set bits 1-11 in gen12 ptes Precedence: list Cc: "Sodhi, Vunny" <vunny.sodhi@intel.com>, Matthew Auld <matthew.auld@intel.com> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	drm/i915/ggtt: do not set bits 1-11 in gen12 ptes \| expand drm/i915/ggtt: do not set bits 1-11 in gen12 ptes

Message ID

20200226185657.26445-1-daniele.ceraolospurio@intel.com (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1A0A52467A
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Wed, 26 Feb 2020 10:56:57 -0800
Message-Id: <20200226185657.26445-1-daniele.ceraolospurio@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH] drm/i915/ggtt: do not set bits 1-11 in gen12
 ptes
Precedence: list
Cc: "Sodhi, Vunny" <vunny.sodhi@intel.com>,
 Matthew Auld <matthew.auld@intel.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Series

drm/i915/ggtt: do not set bits 1-11 in gen12 ptes | expand

Commit Message

Daniele Ceraolo Spurio Feb. 26, 2020, 6:56 p.m. UTC

On TGL, bits 2-4 in the GGTT PTE are not ignored anymore and are
instead used for some extra VT-d capabilities. We don't (yet?) have
support for those capabilities, but, given that we shared the pte_encode
function betweed GGTT and PPGTT, we still set those bits to the PPGTT
PPAT values. The DMA engine gets very confused when those bits are
set while the iommu is enabled, leading to errors. E.g. when loading
the GuC we get:

[    9.796218] DMAR: DRHD: handling fault status reg 2
[    9.796235] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[    9.899215] [drm:intel_guc_fw_upload [i915]] *ERROR* GuC firmware signature verification failed

To fix this, just have dedicated gen8_pte_encode function per type of
gtt. Also, explicitly set vm->pte_encode for gen8_ppgtt, even if we
don't use it, to make sure we don't accidentally assign it to the GGTT
one, like we do for gen6_ppgtt, in case we need it in the future.

Reported-by: "Sodhi, Vunny" <vunny.sodhi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 13 ++++++++++---
 drivers/gpu/drm/i915/gt/intel_gtt.c  | 24 ------------------------
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  4 ----
 4 files changed, 36 insertions(+), 31 deletions(-)

Comments

Mika Kuoppala Feb. 27, 2020, 10:51 a.m. UTC | #1

Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:

> On TGL, bits 2-4 in the GGTT PTE are not ignored anymore and are
> instead used for some extra VT-d capabilities. We don't (yet?) have
> support for those capabilities, but, given that we shared the pte_encode
> function betweed GGTT and PPGTT, we still set those bits to the PPGTT
> PPAT values. The DMA engine gets very confused when those bits are
> set while the iommu is enabled, leading to errors. E.g. when loading
> the GuC we get:
>
> [    9.796218] DMAR: DRHD: handling fault status reg 2
> [    9.796235] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
> [    9.899215] [drm:intel_guc_fw_upload [i915]] *ERROR* GuC firmware signature verification failed
>
> To fix this, just have dedicated gen8_pte_encode function per type of
> gtt. Also, explicitly set vm->pte_encode for gen8_ppgtt, even if we
> don't use it, to make sure we don't accidentally assign it to the GGTT
> one, like we do for gen6_ppgtt, in case we need it in the future.

Nice find. Tho I feel that the subject and commit message needs
a bit massaging for future archeologists.

We are changing the gen8+ ggtt encoding and not only gen12 and
it should be noted.

>
> Reported-by: "Sodhi, Vunny" <vunny.sodhi@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 26 ++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_ggtt.c | 13 ++++++++++---
>  drivers/gpu/drm/i915/gt/intel_gtt.c  | 24 ------------------------
>  drivers/gpu/drm/i915/gt/intel_gtt.h  |  4 ----
>  4 files changed, 36 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 4d1de2d97d5c..9aabc5815d38 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -25,6 +25,30 @@ static u64 gen8_pde_encode(const dma_addr_t addr,
>  	return pde;
>  }
>  
> +static u64 gen8_pte_encode(dma_addr_t addr,
> +			   enum i915_cache_level level,
> +			   u32 flags)
> +{
> +	gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
> +
> +	if (unlikely(flags & PTE_READ_ONLY))
> +		pte &= ~_PAGE_RW;
> +
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		pte |= PPAT_UNCACHED;
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= PPAT_DISPLAY_ELLC;
> +		break;
> +	default:
> +		pte |= PPAT_CACHED;
> +		break;
> +	}
> +
> +	return pte;
> +}
> +
>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>  {
>  	struct drm_i915_private *i915 = ppgtt->vm.i915;
> @@ -706,6 +730,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
>  	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
>  	ppgtt->vm.clear_range = gen8_ppgtt_clear;
>  
> +	ppgtt->vm.pte_encode = gen8_pte_encode;
> +
>  	if (intel_vgpu_active(gt->i915))
>  		gen8_ppgtt_notify_vgt(ppgtt, true);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 41a00281f364..e45eca985b14 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -157,6 +157,13 @@ static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
>  	intel_gtt_chipset_flush();
>  }
>  
> +static u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> +				enum i915_cache_level level,
> +				u32 flags)
> +{
> +	return addr | _PAGE_PRESENT;

The bspec is bit ambivalent in here. First it says R/W is ignored but
further ahead it states that bit0 and bit1 defines the validity
and rest are ignored for aperture/display.

This said, I am leaning to give a minimal set a try first :)

> +}
> +
>  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>  {
>  	writeq(pte, addr);
> @@ -172,7 +179,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>  	gen8_pte_t __iomem *pte =
>  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>  
> -	gen8_set_pte(pte, gen8_pte_encode(addr, level, 0));
> +	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, 0));

Make me ponder why we don't use the vm->pte_encode all the way
as we have it :P

Nevertheless, with s/gen12/gen8 ggtt encoding in subject/
commit msg and this is,

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  
>  	ggtt->invalidate(ggtt);
>  }
> @@ -185,7 +192,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>  	struct sgt_iter sgt_iter;
>  	gen8_pte_t __iomem *gtt_entries;
> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, level, 0);
> +	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, 0);
>  	dma_addr_t addr;
>  
>  	/*
> @@ -857,7 +864,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>  	ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
>  	ggtt->vm.vma_ops.clear_pages = clear_pages;
>  
> -	ggtt->vm.pte_encode = gen8_pte_encode;
> +	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>  
>  	setup_private_pat(ggtt->vm.gt->uncore);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index bb9a6e638175..c8db4f95c1b7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -484,30 +484,6 @@ void gtt_write_workarounds(struct intel_gt *gt)
>  	}
>  }
>  
> -u64 gen8_pte_encode(dma_addr_t addr,
> -		    enum i915_cache_level level,
> -		    u32 flags)
> -{
> -	gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
> -
> -	if (unlikely(flags & PTE_READ_ONLY))
> -		pte &= ~_PAGE_RW;
> -
> -	switch (level) {
> -	case I915_CACHE_NONE:
> -		pte |= PPAT_UNCACHED;
> -		break;
> -	case I915_CACHE_WT:
> -		pte |= PPAT_DISPLAY_ELLC;
> -		break;
> -	default:
> -		pte |= PPAT_CACHED;
> -		break;
> -	}
> -
> -	return pte;
> -}
> -
>  static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>  {
>  	/* TGL doesn't support LLC or AGE settings */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 23004445806a..9a185f4537e1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -515,10 +515,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
>  void i915_ggtt_suspend(struct i915_ggtt *gtt);
>  void i915_ggtt_resume(struct i915_ggtt *ggtt);
>  
> -u64 gen8_pte_encode(dma_addr_t addr,
> -		    enum i915_cache_level level,
> -		    u32 flags);
> -
>  int setup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);
>  void cleanup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);
>  
> -- 
> 2.24.1

Chris Wilson Feb. 27, 2020, 10:58 a.m. UTC | #2

Quoting Mika Kuoppala (2020-02-27 10:51:46)
> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> >  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
> >  {
> >       writeq(pte, addr);
> > @@ -172,7 +179,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
> >       gen8_pte_t __iomem *pte =
> >               (gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
> >  
> > -     gen8_set_pte(pte, gen8_pte_encode(addr, level, 0));
> > +     gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, 0));
> 
> Make me ponder why we don't use the vm->pte_encode all the way
> as we have it :P

Don't use vfuncs if the target routine is known.

Apparently it's even cheaper to do a bsearch on a page of target
routines than it is to use retpolines.

Anyway, the era of vfuncs is over.
-Chris

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4d1de2d97d5c..9aabc5815d38 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -25,6 +25,30 @@  static u64 gen8_pde_encode(const dma_addr_t addr,
 	return pde;
 }
 
+static u64 gen8_pte_encode(dma_addr_t addr,
+			   enum i915_cache_level level,
+			   u32 flags)
+{
+	gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
+
+	if (unlikely(flags & PTE_READ_ONLY))
+		pte &= ~_PAGE_RW;
+
+	switch (level) {
+	case I915_CACHE_NONE:
+		pte |= PPAT_UNCACHED;
+		break;
+	case I915_CACHE_WT:
+		pte |= PPAT_DISPLAY_ELLC;
+		break;
+	default:
+		pte |= PPAT_CACHED;
+		break;
+	}
+
+	return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
 	struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -706,6 +730,8 @@  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
 
+	ppgtt->vm.pte_encode = gen8_pte_encode;
+
 	if (intel_vgpu_active(gt->i915))
 		gen8_ppgtt_notify_vgt(ppgtt, true);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 41a00281f364..e45eca985b14 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -157,6 +157,13 @@  static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
 	intel_gtt_chipset_flush();
 }
 
+static u64 gen8_ggtt_pte_encode(dma_addr_t addr,
+				enum i915_cache_level level,
+				u32 flags)
+{
+	return addr | _PAGE_PRESENT;
+}
+
 static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 {
 	writeq(pte, addr);
@@ -172,7 +179,7 @@  static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 	gen8_pte_t __iomem *pte =
 		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	gen8_set_pte(pte, gen8_pte_encode(addr, level, 0));
+	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, 0));
 
 	ggtt->invalidate(ggtt);
 }
@@ -185,7 +192,7 @@  static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	struct sgt_iter sgt_iter;
 	gen8_pte_t __iomem *gtt_entries;
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, level, 0);
+	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, 0);
 	dma_addr_t addr;
 
 	/*
@@ -857,7 +864,7 @@  static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
 	ggtt->vm.vma_ops.clear_pages = clear_pages;
 
-	ggtt->vm.pte_encode = gen8_pte_encode;
+	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
 	setup_private_pat(ggtt->vm.gt->uncore);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index bb9a6e638175..c8db4f95c1b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -484,30 +484,6 @@  void gtt_write_workarounds(struct intel_gt *gt)
 	}
 }
 
-u64 gen8_pte_encode(dma_addr_t addr,
-		    enum i915_cache_level level,
-		    u32 flags)
-{
-	gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
-
-	if (unlikely(flags & PTE_READ_ONLY))
-		pte &= ~_PAGE_RW;
-
-	switch (level) {
-	case I915_CACHE_NONE:
-		pte |= PPAT_UNCACHED;
-		break;
-	case I915_CACHE_WT:
-		pte |= PPAT_DISPLAY_ELLC;
-		break;
-	default:
-		pte |= PPAT_CACHED;
-		break;
-	}
-
-	return pte;
-}
-
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
 	/* TGL doesn't support LLC or AGE settings */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 23004445806a..9a185f4537e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -515,10 +515,6 @@  struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-u64 gen8_pte_encode(dma_addr_t addr,
-		    enum i915_cache_level level,
-		    u32 flags);
-
 int setup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);
 void cleanup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);

drm/i915/ggtt: do not set bits 1-11 in gen12 ptes

Commit Message

Comments

Patch