drm/i915/ehl: unconditionally flush the pages on acquire

Message ID	20210709151933.1994078-1-matthew.auld@intel.com (mailing list archive)
State	New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E0A04613B2
From: Matthew Auld <matthew.auld@intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Fri,  9 Jul 2021 16:19:33 +0100
Message-Id: <20210709151933.1994078-1-matthew.auld@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH] drm/i915/ehl: unconditionally flush the pages
 on acquire
Precedence: list
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
 dri-devel@lists.freedesktop.org,
 Chris Wilson <chris.p.wilson@intel.com>,
 Francisco Jerez <francisco.jerez.plata@intel.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Series

drm/i915/ehl: unconditionally flush the pages on acquire

Commit Message

Matthew Auld July 9, 2021, 3:19 p.m. UTC

EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
possible for userspace to bypass the GTT caching bits set by the kernel,
as per the given object cache_level. This is troublesome since the heavy
flush we apply when first acquiring the pages is skipped if the kernel
thinks the object is coherent with the GPU. As a result it might be
possible to bypass the cache and read the contents of the page directly,
which could be stale data. If it's just a case of userspace shooting
themselves in the foot then so be it, but since i915 takes the stance of
always zeroing memory before handing it to userspace, we need to prevent
this.

BSpec: 34007
References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 29 +++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
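
To make the reasoning in the commit message concrete, here is a minimal toy model in plain userspace C. It is only a sketch under stated assumptions: the struct, the function names and the "flush when not coherent or when flagged dirty" rule are hypothetical stand-ins derived from the wording above, not the driver's actual code path. It shows how a forced dirty flag makes the acquire-time flush unconditional even for an object the kernel considers coherent.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names are real i915 symbols. */
struct toy_obj {
	bool coherent;        /* kernel believes GPU access is cache coherent */
	bool cache_dirty;     /* when set, forces a flush on the next acquire */
	unsigned int flushes; /* number of flushes performed so far */
};

/*
 * Model of object creation: arm the flag only when userspace could
 * bypass the LLC (platform has the MOCS entry and the object is a
 * userspace allocation).
 */
static void toy_obj_init(struct toy_obj *obj, bool coherent,
			 bool platform_has_bypass_llc, bool user_alloc)
{
	obj->coherent = coherent;
	obj->cache_dirty = platform_has_bypass_llc && user_alloc;
	obj->flushes = 0;
}

/* Model of acquiring the pages; returns true if a flush was performed. */
static bool toy_acquire_pages(struct toy_obj *obj)
{
	bool flush = !obj->coherent || obj->cache_dirty;

	if (flush) {
		obj->flushes++;
		/* Assumption of this model: flushing clears the flag. */
		obj->cache_dirty = false;
	}
	return flush;
}
```

In this model a coherent object on a platform without the bypass entry skips the flush, while the same object on a platform with it gets flushed once on its first acquire.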

Comments

Daniel Vetter July 9, 2021, 4:13 p.m. UTC | #1

On Fri, Jul 9, 2021 at 5:19 PM Matthew Auld <matthew.auld@intel.com> wrote:
>
> EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> possible for userspace to bypass the GTT caching bits set by the kernel,
> as per the given object cache_level. This is troublesome since the heavy
> flush we apply when first acquiring the pages is skipped if the kernel
> thinks the object is coherent with the GPU. As a result it might be
> possible to bypass the cache and read the contents of the page directly,
> which could be stale data. If it's just a case of userspace shooting
> themselves in the foot then so be it, but since i915 takes the stance of
> always zeroing memory before handing it to userspace, we need to prevent
> this.
>
> BSpec: 34007
> References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: Chris Wilson <chris.p.wilson@intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 29 +++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 6a04cce188fc..7e9ec68cce9e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -298,11 +298,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
>
>  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
>  {
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
>         struct sgt_iter sgt_iter;
>         struct pagevec pvec;
>         struct page *page;
>
> -       GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
> +       GEM_WARN_ON(IS_DGFX(i915));
>         __i915_gem_object_release_shmem(obj, pages, true);
>
>         i915_gem_gtt_finish_pages(obj, pages);
> @@ -325,7 +326,12 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
>         }
>         if (pagevec_count(&pvec))
>                 check_release_pagevec(&pvec);
> -       obj->mm.dirty = false;
> +
> +       /* See the comment in shmem_object_init() for why we need this */
> +       if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> +               obj->mm.dirty = true;
> +       else
> +               obj->mm.dirty = false;
>
>         sg_free_table(pages);
>         kfree(pages);
> @@ -539,6 +545,25 @@ static int shmem_object_init(struct intel_memory_region *mem,
>
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>
> +       /*
> +        * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> +        * possible for userspace to bypass the GTT caching bits set by the
> +        * kernel, as per the given object cache_level. This is troublesome
> +        * since the heavy flush we apply when first gathering the pages is
> +        * skipped if the kernel thinks the object is coherent with the GPU. As
> +        * a result it might be possible to bypass the cache and read the
> +        * contents of the page directly, which could be stale data. If it's
> +        * just a case of userspace shooting themselves in the foot then so be
> +        * it, but since i915 takes the stance of always zeroing memory before
> +        * handing it to userspace, we need to prevent this.
> +        *
> +        * By setting cache_dirty here we make the clflush when first acquiring
> +        * the pages unconditional on such platforms. We also set this again in
> +        * put_pages().
> +        */
> +       if (IS_JSL_EHL(i915) && flags & I915_BO_ALLOC_USER)
> +               obj->cache_dirty = true;

I don't think this is enough, because every time we drop our pages
shmem could move them around or swap them out, and we get fresh ones.
So we need to re-force this every time we grab new pages.

Also there's already a pile of other cases (well not WB coherency
mode) where userspace can be clever and bypass the coherency if we
don't clflush first. I think it'd be really good to have all that in
one place as much as possible.

Finally this is extremely tricky code, and obj->cache_dirty and
related stuff isn't really documented. kerneldoc for all that would be
really good.
-Daniel

> +
>         i915_gem_object_init_memory_region(obj, mem);
>
>         return 0;
> --
> 2.26.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Matthew Auld July 9, 2021, 4:34 p.m. UTC | #2

On Fri, 9 Jul 2021 at 17:13, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Jul 9, 2021 at 5:19 PM Matthew Auld <matthew.auld@intel.com> wrote:
> >
> > EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> > possible for userspace to bypass the GTT caching bits set by the kernel,
> > as per the given object cache_level. This is troublesome since the heavy
> > flush we apply when first acquiring the pages is skipped if the kernel
> > thinks the object is coherent with the GPU. As a result it might be
> > possible to bypass the cache and read the contents of the page directly,
> > which could be stale data. If it's just a case of userspace shooting
> > themselves in the foot then so be it, but since i915 takes the stance of
> > always zeroing memory before handing it to userspace, we need to prevent
> > this.
> >
> > BSpec: 34007
> > References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> > Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > Cc: Chris Wilson <chris.p.wilson@intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 29 +++++++++++++++++++++--
> >  1 file changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index 6a04cce188fc..7e9ec68cce9e 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -298,11 +298,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
> >
> >  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
> >  {
> > +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> >         struct sgt_iter sgt_iter;
> >         struct pagevec pvec;
> >         struct page *page;
> >
> > -       GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
> > +       GEM_WARN_ON(IS_DGFX(i915));
> >         __i915_gem_object_release_shmem(obj, pages, true);
> >
> >         i915_gem_gtt_finish_pages(obj, pages);
> > @@ -325,7 +326,12 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
> >         }
> >         if (pagevec_count(&pvec))
> >                 check_release_pagevec(&pvec);
> > -       obj->mm.dirty = false;
> > +
> > +       /* See the comment in shmem_object_init() for why we need this */
> > +       if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> > +               obj->mm.dirty = true;
> > +       else
> > +               obj->mm.dirty = false;
> >
> >         sg_free_table(pages);
> >         kfree(pages);
> > @@ -539,6 +545,25 @@ static int shmem_object_init(struct intel_memory_region *mem,
> >
> >         i915_gem_object_set_cache_coherency(obj, cache_level);
> >
> > +       /*
> > +        * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> > +        * possible for userspace to bypass the GTT caching bits set by the
> > +        * kernel, as per the given object cache_level. This is troublesome
> > +        * since the heavy flush we apply when first gathering the pages is
> > +        * skipped if the kernel thinks the object is coherent with the GPU. As
> > +        * a result it might be possible to bypass the cache and read the
> > +        * contents of the page directly, which could be stale data. If it's
> > +        * just a case of userspace shooting themselves in the foot then so be
> > +        * it, but since i915 takes the stance of always zeroing memory before
> > +        * handing it to userspace, we need to prevent this.
> > +        *
> > +        * By setting cache_dirty here we make the clflush when first acquiring
> > +        * the pages unconditional on such platforms. We also set this again in
> > +        * put_pages().
> > +        */
> > +       if (IS_JSL_EHL(i915) && flags & I915_BO_ALLOC_USER)
> > +               obj->cache_dirty = true;
>
> I don't think this is enough, because every time we drop our pages
> shmem could move them around or swap them out, and we get fresh ones.
> So we need to re-force this every time we grab new pages.

We also rearm this in put_pages(), or at least we do in v2, so if the
pages are swapped out or whatever it should then flush them again when
we re-acquire the pages.

>
> Also there's already a pile of other cases (well not WB coherency
> mode) where userspace can be clever and bypass the coherency if we
> don't clflush first. I think it'd be really good to have all that in
> one place as much as possible.
>
> Finally this is extremely tricky code, and obj->cache_dirty and
> related stuff isn't really documented. kerneldoc for all that would be
> really good.

Ok, I'll take a look.

> -Daniel
>
> > +
> >         i915_gem_object_init_memory_region(obj, mem);
> >
> >         return 0;
> > --
> > 2.26.3
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
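
The exchange above (fresh pages after every drop versus re-arming in put_pages()) can be illustrated with a small userspace model. This is a hedged sketch, not driver code: all names are hypothetical, the object is assumed to be one the kernel treats as coherent (so only the dirty flag can trigger a flush), and the flag is assumed to be cleared once a flush has happened.

```c
#include <stdbool.h>

/* Hypothetical model of one object going through get/put cycles. */
struct cycle_obj {
	bool can_bypass_llc;  /* stand-in for "affected platform and user object" */
	bool rearm_on_put;    /* re-arm the flag when the pages are dropped? */
	bool cache_dirty;     /* forces a flush on the next get_pages */
	unsigned int flushes; /* flushes performed so far */
};

static void cycle_init(struct cycle_obj *obj, bool can_bypass_llc,
		       bool rearm_on_put)
{
	obj->can_bypass_llc = can_bypass_llc;
	obj->rearm_on_put = rearm_on_put;
	obj->cache_dirty = can_bypass_llc; /* armed once at init */
	obj->flushes = 0;
}

/* Acquire pages: flush only if the flag is armed, then clear it. */
static void cycle_get_pages(struct cycle_obj *obj)
{
	if (obj->cache_dirty) {
		obj->flushes++;
		obj->cache_dirty = false;
	}
}

/* Drop pages: optionally re-arm so the next acquire flushes again. */
static void cycle_put_pages(struct cycle_obj *obj)
{
	if (obj->rearm_on_put && obj->can_bypass_llc)
		obj->cache_dirty = true;
}

/* Run a number of get/put rounds and report how many flushes happened. */
static unsigned int cycle_count_flushes(bool rearm_on_put, unsigned int rounds)
{
	struct cycle_obj obj;
	unsigned int i;

	cycle_init(&obj, true, rearm_on_put);
	for (i = 0; i < rounds; i++) {
		cycle_get_pages(&obj);
		cycle_put_pages(&obj);
	}
	return obj.flushes;
}
```

With the flag armed only at init, three get/put rounds flush once, so the second and third sets of fresh pages go unflushed. With re-arming on put, every round flushes.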

Daniel Vetter July 9, 2021, 4:57 p.m. UTC | #3

On Fri, Jul 9, 2021 at 6:35 PM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Fri, 9 Jul 2021 at 17:13, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Jul 9, 2021 at 5:19 PM Matthew Auld <matthew.auld@intel.com> wrote:
> > >
> > > EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> > > possible for userspace to bypass the GTT caching bits set by the kernel,
> > > as per the given object cache_level. This is troublesome since the heavy
> > > flush we apply when first acquiring the pages is skipped if the kernel
> > > thinks the object is coherent with the GPU. As a result it might be
> > > possible to bypass the cache and read the contents of the page directly,
> > > which could be stale data. If it's just a case of userspace shooting
> > > themselves in the foot then so be it, but since i915 takes the stance of
> > > always zeroing memory before handing it to userspace, we need to prevent
> > > this.
> > >
> > > BSpec: 34007
> > > References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> > > Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
> > > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > > Cc: Chris Wilson <chris.p.wilson@intel.com>
> > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 29 +++++++++++++++++++++--
> > >  1 file changed, 27 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > index 6a04cce188fc..7e9ec68cce9e 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > > @@ -298,11 +298,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
> > >
> > >  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
> > >  {
> > > +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> > >         struct sgt_iter sgt_iter;
> > >         struct pagevec pvec;
> > >         struct page *page;
> > >
> > > -       GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
> > > +       GEM_WARN_ON(IS_DGFX(i915));
> > >         __i915_gem_object_release_shmem(obj, pages, true);
> > >
> > >         i915_gem_gtt_finish_pages(obj, pages);
> > > @@ -325,7 +326,12 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
> > >         }
> > >         if (pagevec_count(&pvec))
> > >                 check_release_pagevec(&pvec);
> > > -       obj->mm.dirty = false;
> > > +
> > > +       /* See the comment in shmem_object_init() for why we need this */
> > > +       if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> > > +               obj->mm.dirty = true;
> > > +       else
> > > +               obj->mm.dirty = false;
> > >
> > >         sg_free_table(pages);
> > >         kfree(pages);
> > > @@ -539,6 +545,25 @@ static int shmem_object_init(struct intel_memory_region *mem,
> > >
> > >         i915_gem_object_set_cache_coherency(obj, cache_level);
> > >
> > > +       /*
> > > +        * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> > > +        * possible for userspace to bypass the GTT caching bits set by the
> > > +        * kernel, as per the given object cache_level. This is troublesome
> > > +        * since the heavy flush we apply when first gathering the pages is
> > > +        * skipped if the kernel thinks the object is coherent with the GPU. As
> > > +        * a result it might be possible to bypass the cache and read the
> > > +        * contents of the page directly, which could be stale data. If it's
> > > +        * just a case of userspace shooting themselves in the foot then so be
> > > +        * it, but since i915 takes the stance of always zeroing memory before
> > > +        * handing it to userspace, we need to prevent this.
> > > +        *
> > > +        * By setting cache_dirty here we make the clflush when first acquiring
> > > +        * the pages unconditional on such platforms. We also set this again in
> > > +        * put_pages().
> > > +        */
> > > +       if (IS_JSL_EHL(i915) && flags & I915_BO_ALLOC_USER)
> > > +               obj->cache_dirty = true;
> >
> > I don't think this is enough, because every time we drop our pages
> > shmem could move them around or swap them out, and we get fresh ones.
> > So we need to re-force this every time we grab new pages.
>
> We also rearm this in put_pages(), or at least we do in v2, so if the
> pages are swapped out or whatever it should then flush them again when
> we re-acquire the pages.

Yeah v2 looks better, that put_pages on obj->mm.dirty made no sense.

Conceptually I think it's cleaner though if we set this in get_pages,
since that's the action that requires the cleaning. But maybe that
doesn't work from a sequencing pov? I'd have thought that by the time we
get to check whether we need to clflush, the pages would already exist
...

Maybe it would be even cleaner if get_pages would issue the clflush
directly, long-term at least, when we have the infrastructure for
pipeline clear/move in place and make sure we never ignore such a
fence. That's perhaps conceptually the cleanest version.
-Daniel

> > Also there's already a pile of other cases (well not WB coherency
> > mode) where userspace can be clever and bypass the coherency if we
> > don't clflush first. I think it'd be really good to have all that in
> > one place as much as possible.
> >
> > Finally this is extremely tricky code, and obj->cache_dirty and
> > related stuff isn't really documented. kerneldoc for all that would be
> > really good.
>
> Ok, I'll take a look.
>
> > -Daniel
> >
> > > +
> > >         i915_gem_object_init_memory_region(obj, mem);
> > >
> > >         return 0;
> > > --
> > > 2.26.3
> > >
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
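
The "get_pages issues the clflush directly" idea from this reply can be sketched with a one-byte cache simulation. Everything here is an assumption made for illustration: the names are hypothetical, a page is reduced to a single byte, and the zeroing of a fresh page is modelled as landing only in the CPU cache. It is meant to show why a cache-bypassing read can observe stale data unless the flush happens at acquire time, not how the driver implements it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-byte "page" with a separate CPU cache view. */
struct sim_page {
	uint8_t memory;     /* what a cache-bypassing GPU read sees */
	uint8_t cache_line; /* what the CPU cache currently holds */
	bool line_dirty;    /* cache holds data not yet written back */
};

/* Write the cached value back to memory, like a cache-line flush. */
static void sim_clflush(struct sim_page *p)
{
	if (p->line_dirty) {
		p->memory = p->cache_line;
		p->line_dirty = false;
	}
}

/*
 * Hand out a fresh page: memory still holds the previous owner's
 * data, and the zeroing sits in the CPU cache. Optionally flush
 * right here, which is the variant suggested in the reply above.
 */
static void sim_get_pages(struct sim_page *p, uint8_t stale,
			  bool flush_in_get_pages)
{
	p->memory = stale;
	p->cache_line = 0;
	p->line_dirty = true;

	if (flush_in_get_pages)
		sim_clflush(p);
}

/* A GPU read that bypasses the cache and goes straight to memory. */
static uint8_t sim_gpu_read_bypass_llc(const struct sim_page *p)
{
	return p->memory;
}
```

In this model the flush no longer depends on any per-object flag being armed at the right moment: acquiring the pages and cleaning them are a single step.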

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 6a04cce188fc..7e9ec68cce9e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -298,11 +298,12 @@  __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 
 void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
 	struct page *page;
 
-	GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
+	GEM_WARN_ON(IS_DGFX(i915));
 	__i915_gem_object_release_shmem(obj, pages, true);
 
 	i915_gem_gtt_finish_pages(obj, pages);
@@ -325,7 +326,12 @@  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
 	}
 	if (pagevec_count(&pvec))
 		check_release_pagevec(&pvec);
-	obj->mm.dirty = false;
+
+	/* See the comment in shmem_object_init() for why we need this */
+	if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
+		obj->mm.dirty = true;
+	else
+		obj->mm.dirty = false;
 
 	sg_free_table(pages);
 	kfree(pages);
@@ -539,6 +545,25 @@  static int shmem_object_init(struct intel_memory_region *mem,
 
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 
+	/*
+	 * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
+	 * possible for userspace to bypass the GTT caching bits set by the
+	 * kernel, as per the given object cache_level. This is troublesome
+	 * since the heavy flush we apply when first gathering the pages is
+	 * skipped if the kernel thinks the object is coherent with the GPU. As
+	 * a result it might be possible to bypass the cache and read the
+	 * contents of the page directly, which could be stale data. If it's
+	 * just a case of userspace shooting themselves in the foot then so be
+	 * it, but since i915 takes the stance of always zeroing memory before
+	 * handing it to userspace, we need to prevent this.
+	 *
+	 * By setting cache_dirty here we make the clflush when first acquiring
+	 * the pages unconditional on such platforms. We also set this again in
+	 * put_pages().
+	 */
+	if (IS_JSL_EHL(i915) && flags & I915_BO_ALLOC_USER)
+		obj->cache_dirty = true;
+
 	i915_gem_object_init_memory_region(obj, mem);
 
 	return 0;
