diff mbox series

drm/i915/gt: Temporarily disable CPU caching into DMA for MTL

Message ID 20231102175831.872763-1-jonathan.cavitt@intel.com (mailing list archive)
State New, archived
Series drm/i915/gt: Temporarily disable CPU caching into DMA for MTL

Commit Message

Cavitt, Jonathan Nov. 2, 2023, 5:58 p.m. UTC
FIXME: It is suspected that some Address Translation Service (ATS)
issue on IOMMU is causing CAT errors to occur on some MTL workloads.
Applying a write barrier to the ppgtt set entry functions appeared
to have no effect, so we must temporarily use I915_MAP_WC in the
map_pt_dma class of functions on MTL until a proper ATS solution is
found.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Chris Wilson <chris.p.wilson@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Sripada, Radhakrishna Nov. 3, 2023, 8:01 p.m. UTC | #1
Hi Jonathan,

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Jonathan
> Cavitt
> Sent: Thursday, November 2, 2023 10:59 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Cavitt, Jonathan
> <jonathan.cavitt@intel.com>; chris.p.wilson@linux.intel.com
> Subject: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching into
> DMA for MTL
> 
> FIXME: It is suspected that some Address Translation Service (ATS)
> issue on IOMMU is causing CAT errors to occur on some MTL workloads.
> Applying a write barrier to the ppgtt set entry functions appeared
> to have no effect, so we must temporarily use I915_MAP_WC in the
> map_pt_dma class of functions on MTL until a proper ATS solution is
> found.
> 
What is the performance impact here? Are we disabling caching only
for the pte changes/scratch pages or does it extend beyond?

Regards,
Radhakrishna(RK) Sripada 
Cavitt, Jonathan Nov. 3, 2023, 8:15 p.m. UTC | #2
-----Original Message-----
From: Sripada, Radhakrishna <radhakrishna.sripada@intel.com> 
Sent: Friday, November 3, 2023 1:02 PM
To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org
Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Cavitt, Jonathan <jonathan.cavitt@intel.com>; chris.p.wilson@linux.intel.com
Subject: RE: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching into DMA for MTL
> 
> Hi Jonathan,
> 
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Jonathan
> > Cavitt
> > Sent: Thursday, November 2, 2023 10:59 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Cavitt, Jonathan
> > <jonathan.cavitt@intel.com>; chris.p.wilson@linux.intel.com
> > Subject: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching into
> > DMA for MTL
> > 
> > FIXME: It is suspected that some Address Translation Service (ATS)
> > issue on IOMMU is causing CAT errors to occur on some MTL workloads.
> > Applying a write barrier to the ppgtt set entry functions appeared
> > to have no effect, so we must temporarily use I915_MAP_WC in the
> > map_pt_dma class of functions on MTL until a proper ATS solution is
> > found.
> > 
> What is the performance impact here? Are we disabling caching only
> for the pte changes/scratch pages or does it extend beyond?


I don't actually know what map_pt_dma is used for, but if the name is
indicative of its purpose, it should only impact mappings into the dma
page table.
As for the performance impact, I don't imagine it'll be much.  Maybe
a single-digit percentage slowdown?  It might actually improve
performance if we're avoiding enough cache misses, but the true
performance impact would have to be measured empirically.
-Jonathan Cavitt


Sripada, Radhakrishna Nov. 3, 2023, 9:06 p.m. UTC | #3
Hi Jonathan,

> -----Original Message-----
> From: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> Sent: Friday, November 3, 2023 1:15 PM
> To: Sripada, Radhakrishna <radhakrishna.sripada@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>;
> chris.p.wilson@linux.intel.com
> Subject: RE: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching
> into DMA for MTL
> 
> -----Original Message-----
> From: Sripada, Radhakrishna <radhakrishna.sripada@intel.com>
> Sent: Friday, November 3, 2023 1:02 PM
> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Cavitt, Jonathan
> <jonathan.cavitt@intel.com>; chris.p.wilson@linux.intel.com
> Subject: RE: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching
> into DMA for MTL
> >
> > Hi Jonathan,
> >
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Jonathan
> > > Cavitt
> > > Sent: Thursday, November 2, 2023 10:59 AM
> > > To: intel-gfx@lists.freedesktop.org
> > > Cc: Gupta, saurabhg <saurabhg.gupta@intel.com>; Cavitt, Jonathan
> > > <jonathan.cavitt@intel.com>; chris.p.wilson@linux.intel.com
> > > Subject: [Intel-gfx] [PATCH] drm/i915/gt: Temporarily disable CPU caching into
> > > DMA for MTL
> > >
> > > FIXME: It is suspected that some Address Translation Service (ATS)
> > > issue on IOMMU is causing CAT errors to occur on some MTL workloads.
> > > Applying a write barrier to the ppgtt set entry functions appeared
> > > to have no effect, so we must temporarily use I915_MAP_WC in the
> > > map_pt_dma class of functions on MTL until a proper ATS solution is
> > > found.
> > >
> > What is the performance impact here? Are we disabling caching only
> > for the pte changes/scratch pages or does it extend beyond?
> 
> 
> I don't actually know what map_pt_dma is used for, but if the name is
> indicative of its purpose, it should only impact mappings into the dma
> page table.
> As for the performance impact, I don't imagine it'll be much.  Maybe
> a single-digit percentage slowdown?  It might actually improve
> performance if we're avoiding enough cache misses, but the true
> performance impact would have to be measured empirically.
> -Jonathan Cavitt
> 
I also expect the performance impact to be low, since only the PTE
mappings would be uncached, and hence
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>


Andi Shyti Nov. 6, 2023, 4:37 p.m. UTC | #4
Hi Jonathan,

On Thu, Nov 02, 2023 at 10:58:31AM -0700, Jonathan Cavitt wrote:
> FIXME: It is suspected that some Address Translation Service (ATS)
> issue on IOMMU is causing CAT errors to occur on some MTL workloads.
> Applying a write barrier to the ppgtt set entry functions appeared
> to have no effect, so we must temporarily use I915_MAP_WC in the
> map_pt_dma class of functions on MTL until a proper ATS solution is
> found.
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> CC: Chris Wilson <chris.p.wilson@linux.intel.com>

acked and pushed!

Thanks!
Andi

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 4fbed27ef0ecc..21719563a602a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -95,6 +95,16 @@  int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 	void *vaddr;
 
 	type = intel_gt_coherent_map_type(vm->gt, obj, true);
+	/*
+	 * FIXME: It is suspected that some Address Translation Service (ATS)
+	 * issue on IOMMU is causing CAT errors to occur on some MTL workloads.
+	 * Applying a write barrier to the ppgtt set entry functions appeared
+	 * to have no effect, so we must temporarily use I915_MAP_WC here on
+	 * MTL until a proper ATS solution is found.
+	 */
+	if (IS_METEORLAKE(vm->i915))
+		type = I915_MAP_WC;
+
 	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
@@ -109,6 +119,16 @@  int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object
 	void *vaddr;
 
 	type = intel_gt_coherent_map_type(vm->gt, obj, true);
+	/*
+	 * FIXME: It is suspected that some Address Translation Service (ATS)
+	 * issue on IOMMU is causing CAT errors to occur on some MTL workloads.
+	 * Applying a write barrier to the ppgtt set entry functions appeared
+	 * to have no effect, so we must temporarily use I915_MAP_WC here on
+	 * MTL until a proper ATS solution is found.
+	 */
+	if (IS_METEORLAKE(vm->i915))
+		type = I915_MAP_WC;
+
 	vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
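
The two hunks above apply the same override in map_pt_dma() and
map_pt_dma_locked(). As a rough illustration only, the decision can be
modelled in user space as a single helper. Everything named below
(enum map_type, struct platform, pt_map_type) is a hypothetical stand-in
and not part of i915; the real code calls intel_gt_coherent_map_type()
and tests IS_METEORLAKE(vm->i915).

```c
#include <stdbool.h>

/* Simplified stand-ins for I915_MAP_WB / I915_MAP_WC (not the kernel enum). */
enum map_type { MAP_WB, MAP_WC };

/* Hypothetical platform descriptor; the driver uses IS_METEORLAKE(i915). */
struct platform {
	bool is_meteorlake;
};

/*
 * Pick the CPU mapping type for a page-table object.
 *
 * default_type models whatever the normal coherency logic would return.
 * On the affected platform the workaround forces write-combining,
 * regardless of that default; elsewhere the default is kept.
 */
static enum map_type pt_map_type(const struct platform *p,
				 enum map_type default_type)
{
	enum map_type type = default_type;

	/* Temporary workaround: always use WC on the affected platform. */
	if (p->is_meteorlake)
		type = MAP_WC;

	return type;
}
```

In this sketch both call sites would pass the result of their normal
coherency check as default_type, so the FIXME comment and the platform
test live in one place. Whether such a helper fits the driver is a
design question for the maintainers; the patch as posted keeps the
override inline in both functions.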