diff mbox

drm/i915: Skip clearing the GGTT on full-ppgtt systems

Message ID 1463051981-24852-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson May 12, 2016, 11:19 a.m. UTC
Under full-ppgtt, access to the global GTT is carefully regulated
through hardware functions (i.e. userspace cannot read and write to
arbitrary locations in the GGTT via the GPU). With this restriction in
place, we can forgo clearing stale entries from the GGTT as they will
not be accessed.

For aliasing-ppgtt, we could almost do the same except that we do allow
userspace access to the global-GTT via execbuf in order to workraound
some quirks of certain instructions. (This execbuf path is filtered out
with EINVAL on full-ppgtt.)

The most dramatic effect this will have will be during resume, as with
full-ppgtt the GGTT is only used sparingly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

Comments

Joonas Lahtinen May 13, 2016, 6:22 a.m. UTC | #1
On to, 2016-05-12 at 12:19 +0100, Chris Wilson wrote:
> Under full-ppgtt, access to the global GTT is carefully regulated
> through hardware functions (i.e. userspace cannot read and write to
> arbitrary locations in the GGTT via the GPU). With this restriction in
> place, we can forgo clearing stale entries from the GGTT as they will
> not be accessed.
> 
> For aliasing-ppgtt, we could almost do the same except that we do allow
> userspace access to the global-GTT via execbuf in order to workraound
> some quirks of certain instructions. (This execbuf path is filtered out
> with EINVAL on full-ppgtt.)
> 
> The most dramatic effect this will have will be during resume, as with
> full-ppgtt the GGTT is only used sparingly.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: David Weinehall <david.weinehall@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 319f3b459b3e..7eab619a3eb2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2477,6 +2477,13 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  	assert_rpm_atomic_end(dev_priv, rpm_atomic_seq);
>  }
>  
> +static void nop_clear_range(struct i915_address_space *vm,
> +			    uint64_t start,
> +			    uint64_t length,
> +			    bool use_scratch)
> +{
> +}
> +
>  static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>  				  uint64_t start,
>  				  uint64_t length,
> @@ -3072,14 +3079,17 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>  
>  	ret = ggtt_probe_common(dev, ggtt->size);
>  
> -	ggtt->base.clear_range = gen8_ggtt_clear_range;
> -	if (IS_CHERRYVIEW(dev_priv))
> -		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
> -	else
> -		ggtt->base.insert_entries = gen8_ggtt_insert_entries;

This form looks better to my eyes, especially once you start adding a
couple of other if ()'s for same callback.

BAT seems to have gone crazy with this patch, too.

>  	ggtt->base.bind_vma = ggtt_bind_vma;
>  	ggtt->base.unbind_vma = ggtt_unbind_vma;
>  
> +	ggtt->base.clear_range = nop_clear_range;
> +	if (!USES_FULL_PPGTT(dev_priv))
> +		ggtt->base.clear_range = gen8_ggtt_clear_range;
> +
> +	ggtt->base.insert_entries = gen8_ggtt_insert_entries;
> +	if (IS_CHERRYVIEW(dev_priv))
> +		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
> +
>  	return ret;
>  }
>
Chris Wilson May 13, 2016, 7:10 a.m. UTC | #2
On Fri, May 13, 2016 at 09:22:03AM +0300, Joonas Lahtinen wrote:
> On to, 2016-05-12 at 12:19 +0100, Chris Wilson wrote:
> > Under full-ppgtt, access to the global GTT is carefully regulated
> > through hardware functions (i.e. userspace cannot read and write to
> > arbitrary locations in the GGTT via the GPU). With this restriction in
> > place, we can forgo clearing stale entries from the GGTT as they will
> > not be accessed.
> > 
> > For aliasing-ppgtt, we could almost do the same except that we do allow
> > userspace access to the global-GTT via execbuf in order to workraound
> > some quirks of certain instructions. (This execbuf path is filtered out
> > with EINVAL on full-ppgtt.)
> > 
> > The most dramatic effect this will have will be during resume, as with
> > full-ppgtt the GGTT is only used sparingly.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: David Weinehall <david.weinehall@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 20 +++++++++++++++-----
> >  1 file changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 319f3b459b3e..7eab619a3eb2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -2477,6 +2477,13 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> >  	assert_rpm_atomic_end(dev_priv, rpm_atomic_seq);
> >  }
> >  
> > +static void nop_clear_range(struct i915_address_space *vm,
> > +			    uint64_t start,
> > +			    uint64_t length,
> > +			    bool use_scratch)
> > +{
> > +}
> > +
> >  static void gen8_ggtt_clear_range(struct i915_address_space *vm,
> >  				  uint64_t start,
> >  				  uint64_t length,
> > @@ -3072,14 +3079,17 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
> >  
> >  	ret = ggtt_probe_common(dev, ggtt->size);
> >  
> > -	ggtt->base.clear_range = gen8_ggtt_clear_range;
> > -	if (IS_CHERRYVIEW(dev_priv))
> > -		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
> > -	else
> > -		ggtt->base.insert_entries = gen8_ggtt_insert_entries;
> 
> This form looks better to my eyes, especially once you start adding a
> couple of other if ()'s for same callback.

This form is actually unusual, the idiom is
	func = default;
	if (unusual)
		func = bar;
which is meant to highlight the normal vs exceptional paths.

> BAT seems to have gone crazy with this patch, too.

There's no reason for this patch to make anything go crazy. The danger
is of course that we don't clear PTEs and so if you access unassigned
ranges, you write into pages owned by the system and not us. That would
be a major bug in our handling of the GGTT - atm the only known bugs
relate to partial vma (afaik).
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 319f3b459b3e..7eab619a3eb2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2477,6 +2477,13 @@  static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 	assert_rpm_atomic_end(dev_priv, rpm_atomic_seq);
 }
 
+static void nop_clear_range(struct i915_address_space *vm,
+			    uint64_t start,
+			    uint64_t length,
+			    bool use_scratch)
+{
+}
+
 static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 				  uint64_t start,
 				  uint64_t length,
@@ -3072,14 +3079,17 @@  static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ret = ggtt_probe_common(dev, ggtt->size);
 
-	ggtt->base.clear_range = gen8_ggtt_clear_range;
-	if (IS_CHERRYVIEW(dev_priv))
-		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
-	else
-		ggtt->base.insert_entries = gen8_ggtt_insert_entries;
 	ggtt->base.bind_vma = ggtt_bind_vma;
 	ggtt->base.unbind_vma = ggtt_unbind_vma;
 
+	ggtt->base.clear_range = nop_clear_range;
+	if (!USES_FULL_PPGTT(dev_priv))
+		ggtt->base.clear_range = gen8_ggtt_clear_range;
+
+	ggtt->base.insert_entries = gen8_ggtt_insert_entries;
+	if (IS_CHERRYVIEW(dev_priv))
+		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
+
 	return ret;
 }