diff mbox

[6/8] drm/i915: Add bind/unbind object functions to VM

Message ID 1377906241-8463-7-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ben Widawsky Aug. 30, 2013, 11:43 p.m. UTC
From: Ben Widawsky <ben@bwidawsk.net>

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels
Use VMA for bind/unbind (Daniel, Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+), 30 deletions(-)

Comments

Chris Wilson Aug. 31, 2013, 12:12 a.m. UTC | #1
On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
> 
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
> 
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
> 
> v2:
> Make ggtt handle the quirky aliasing ppgtt
> Add flags to bind object to support above
> Don't ever call bind/unbind directly for PPGTT until we have real, full
> PPGTT (use NULLs to assert this)
> Make sure we rebind the ggtt if there already is a ggtt binding. This
> happens on set cache levels
> Use VMA for bind/unbind (Daniel, Ben)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

You like pokkadot paint for its inconsistency? Other than interesting
alternation of styles, I see nothing wrong with the logic.
-Chris
Ben Widawsky Aug. 31, 2013, 3:40 a.m. UTC | #2
On Sat, Aug 31, 2013 at 01:12:55AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> > 
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> > 
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
> > 
> > v2:
> > Make ggtt handle the quirky aliasing ppgtt
> > Add flags to bind object to support above
> > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > PPGTT (use NULLs to assert this)
> > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > happens on set cache levels
> > Use VMA for bind/unbind (Daniel, Ben)
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> You like pokkadot paint for its inconsistency? Other than interesting
> alternation of styles, I see nothing wrong with the logic.
> -Chris
> 

To what are you referring? I'm probably more than willing to change
whatever displeases you.
Ville Syrjala Sept. 2, 2013, 12:46 p.m. UTC | #3
On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben@bwidawsk.net>
> 
> As we plumb the code with more VM information, it has become more
> obvious that the easiest way to deal with bind and unbind is to simply
> put the function pointers in the vm, and let those choose the correct
> way to handle the page table updates. This change allows many places in
> the code to simply be vm->bind, and not have to worry about
> distinguishing PPGTT vs GGTT.
> 
> Notice that this patch has no impact on functionality. I've decided to
> save the actual change until the next patch because I think it's easier
> to review that way. I'm happy to squash the two, or let Daniel do it on
> merge.
> 
> v2:
> Make ggtt handle the quirky aliasing ppgtt
> Add flags to bind object to support above
> Don't ever call bind/unbind directly for PPGTT until we have real, full
> PPGTT (use NULLs to assert this)
> Make sure we rebind the ggtt if there already is a ggtt binding. This
> happens on set cache levels
> Use VMA for bind/unbind (Daniel, Ben)
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++-----------
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 140 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c9ed77a..377ca63 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -461,6 +461,36 @@ enum i915_cache_level {
>  
>  typedef uint32_t gen6_gtt_pte_t;
>  
> +/**
> + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> + * object into/from the address space.
> + *
> + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> + * will always be <= an objects lifetime. So object refcounting should cover us.
> + */
> +struct i915_vma {
> +	struct drm_mm_node node;
> +	struct drm_i915_gem_object *obj;
> +	struct i915_address_space *vm;
> +
> +	/** This object's place on the active/inactive lists */
> +	struct list_head mm_list;
> +
> +	struct list_head vma_link; /* Link in the object's VMA list */
> +
> +	/** This vma's place in the batchbuffer or on the eviction list */
> +	struct list_head exec_list;
> +
> +	/**
> +	 * Used for performing relocations during execbuffer insertion.
> +	 */
> +	struct hlist_node exec_node;
> +	unsigned long exec_handle;
> +	struct drm_i915_gem_exec_object2 *exec_entry;
> +
> +};
> +
>  struct i915_address_space {
>  	struct drm_mm mm;
>  	struct drm_device *dev;
> @@ -499,9 +529,18 @@ struct i915_address_space {
>  	/* FIXME: Need a more generic return type */
>  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
>  				     enum i915_cache_level level);
> +
> +	/** Unmap an object from an address space. This usually consists of
> +	 * setting the valid PTE entries to a reserved scratch page. */
> +	void (*unbind_vma)(struct i915_vma *vma);
>  	void (*clear_range)(struct i915_address_space *vm,
>  			    unsigned int first_entry,
>  			    unsigned int num_entries);
> +	/* Map an object into an address space with the given cache flags. */
> +#define GLOBAL_BIND (1<<0)
> +	void (*bind_vma)(struct i915_vma *vma,
> +			 enum i915_cache_level cache_level,
> +			 u32 flags);
>  	void (*insert_entries)(struct i915_address_space *vm,
>  			       struct sg_table *st,
>  			       unsigned int first_entry,
> @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
>  	int (*enable)(struct drm_device *dev);
>  };
>  
> -/**
> - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> - * object into/from the address space.
> - *
> - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> - * will always be <= an objects lifetime. So object refcounting should cover us.
> - */
> -struct i915_vma {
> -	struct drm_mm_node node;
> -	struct drm_i915_gem_object *obj;
> -	struct i915_address_space *vm;
> -
> -	/** This object's place on the active/inactive lists */
> -	struct list_head mm_list;
> -
> -	struct list_head vma_link; /* Link in the object's VMA list */
> -
> -	/** This vma's place in the batchbuffer or on the eviction list */
> -	struct list_head exec_list;
> -
> -	/**
> -	 * Used for performing relocations during execbuffer insertion.
> -	 */
> -	struct hlist_node exec_node;
> -	unsigned long exec_handle;
> -	struct drm_i915_gem_exec_object2 *exec_entry;
> -
> -};
> -
>  struct i915_ctx_hang_stats {
>  	/* This context had batch pending when hang was declared */
>  	unsigned batch_pending;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 212f6d8..d5a4580 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -57,6 +57,11 @@
>  #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
>  #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
>  
> +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> +				enum i915_cache_level cache_level,
> +				u32 flags);
> +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> +
>  static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
>  				     enum i915_cache_level level)
>  {
> @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
>  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
>  	ppgtt->enable = gen6_ppgtt_enable;
> +	ppgtt->base.unbind_vma = NULL;
>  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> +	ppgtt->base.bind_vma = NULL;
>  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
>  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
>  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  				   cache_level);
>  }
>  
> +static void __always_unused
> +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> +		    enum i915_cache_level cache_level,
> +		    u32 flags)
> +{
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	WARN_ON(flags);
> +
> +	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> +}
> +
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
> @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  				obj->base.size >> PAGE_SHIFT);
>  }
>  
> +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> +{
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	gen6_ppgtt_clear_range(vma->vm, entry,
> +			       vma->obj->base.size >> PAGE_SHIFT);
> +}
> +
>  extern int intel_iommu_gfx_mapped;
>  /* Certain Gen5 chipsets require require idling the GPU before
>   * unmapping anything from the GTT when VT-d is enabled.
> @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
>  
>  }
>  
> +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> +			       enum i915_cache_level cache_level,
> +			       u32 unused)
> +{
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> +		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> +
> +	BUG_ON(!i915_is_ggtt(vma->vm));
> +	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> +	vma->obj->has_global_gtt_mapping = 1;
> +}
> +
>  static void i915_ggtt_clear_range(struct i915_address_space *vm,
>  				  unsigned int first_entry,
>  				  unsigned int num_entries)
> @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
>  	intel_gtt_clear_range(first_entry, num_entries);
>  }
>  
> +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> +{
> +	const unsigned int first = vma->node.start >> PAGE_SHIFT;
> +	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> +
> +	BUG_ON(!i915_is_ggtt(vma->vm));
> +	vma->obj->has_global_gtt_mapping = 0;
> +	intel_gtt_clear_range(first, size);
> +}
> +
> +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> +			       enum i915_cache_level cache_level,
> +			       u32 flags)
> +{
> +	struct drm_device *dev = vma->vm->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *obj = vma->obj;
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> +	 * the global, just use aliasing */
> +	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> +	    !obj->has_global_gtt_mapping) {
> +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> +					  vma->obj->pages, entry, cache_level);
> +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> +		return;
> +	}
> +
> +	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> +	obj->has_global_gtt_mapping = 1;
> +
> +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> +	 * aliasing, but the user requested global. */
> +	if (dev_priv->mm.aliasing_ppgtt) {
> +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> +					  vma->obj->pages, entry, cache_level);
> +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> +	}
> +}

There's a bit of duplication here. How about this:
{
	if (!aliasing_ppgtt ||
	    flags & GLOBAL_BIND ||
	    has_ggtt_mapping {
		gen6_ppgtt_insert_entries()
		has_global_gtt_mapping = true;
	}

	if (aliasing_ppgtt) {
		ppgtt_insert
		has_aliasing_ppgtt_mapping = true;
	}
}

Not sure if this was the same thing that bothered Chris.

>  
>  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  			      enum i915_cache_level cache_level)
> @@ -627,6 +707,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
>  	obj->has_global_gtt_mapping = 0;
>  }
>  
> +static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
> +{
> +	struct drm_device *dev = vma->vm->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	gen6_ggtt_clear_range(vma->vm, entry,
> +			      vma->obj->base.size >> PAGE_SHIFT);
> +	vma->obj->has_global_gtt_mapping = 0;
> +	if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
> +		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
> +				       entry,
> +				       vma->obj->base.size >> PAGE_SHIFT);
> +		vma->obj->has_aliasing_ppgtt_mapping = 0;
> +	}
> +}

gen6_ggtt_bind_vma() might not have added the ggtt mapping, so I
suppose you should check has_global_gtt_mapping to avoid useless
work. Also the dev_priv->mm.aliasing_ppgtt check is a bit
pointless as there's no way has_aliasing_ppgtt_mapping would be
set then if there's no aliasing ppgtt.

> +
>  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_device *dev = obj->base.dev;
> @@ -860,7 +957,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
>  		DRM_ERROR("Scratch setup failed\n");
>  
>  	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> +	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
>  	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> +	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
>  
>  	return ret;
>  }
> @@ -892,7 +991,9 @@ static int i915_gmch_probe(struct drm_device *dev,
>  
>  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
>  	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> +	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
>  	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> +	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
>  
>  	return 0;
>  }
> -- 
> 1.8.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson Sept. 3, 2013, 2:48 p.m. UTC | #4
On Fri, Aug 30, 2013 at 08:40:36PM -0700, Ben Widawsky wrote:
> On Sat, Aug 31, 2013 at 01:12:55AM +0100, Chris Wilson wrote:
> > On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > > From: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > > 
> > > Notice that this patch has no impact on functionality. I've decided to
> > > save the actual change until the next patch because I think it's easier
> > > to review that way. I'm happy to squash the two, or let Daniel do it on
> > > merge.
> > > 
> > > v2:
> > > Make ggtt handle the quirky aliasing ppgtt
> > > Add flags to bind object to support above
> > > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > > PPGTT (use NULLs to assert this)
> > > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > > happens on set cache levels
> > > Use VMA for bind/unbind (Daniel, Ben)
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > You like pokkadot paint for its inconsistency? Other than interesting
> > alternation of styles, I see nothing wrong with the logic.
> > -Chris
> > 
> 
> To what are you referring? I'm probably more than willing to change
> whatever displeases you.

You were alternating between using temporary variables like

  const unsigned long entry = vma->node.start >> PAGE_SHIFT;

and doing the computation inline, with no consistency as far as I could
see. It was so very minor, but it looks like cut'n'paste code from
multiple sources.
-Chris
Ben Widawsky Sept. 4, 2013, 12:20 a.m. UTC | #5
On Mon, Sep 02, 2013 at 03:46:52PM +0300, Ville Syrjälä wrote:
> On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben@bwidawsk.net>
> > 
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> > 
> > Notice that this patch has no impact on functionality. I've decided to
> > save the actual change until the next patch because I think it's easier
> > to review that way. I'm happy to squash the two, or let Daniel do it on
> > merge.
> > 
> > v2:
> > Make ggtt handle the quirky aliasing ppgtt
> > Add flags to bind object to support above
> > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > PPGTT (use NULLs to assert this)
> > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > happens on set cache levels
> > Use VMA for bind/unbind (Daniel, Ben)
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++-----------
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
> >  2 files changed, 140 insertions(+), 30 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index c9ed77a..377ca63 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -461,6 +461,36 @@ enum i915_cache_level {
> >  
> >  typedef uint32_t gen6_gtt_pte_t;
> >  
> > +/**
> > + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > + * object into/from the address space.
> > + *
> > + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > + * will always be <= an objects lifetime. So object refcounting should cover us.
> > + */
> > +struct i915_vma {
> > +	struct drm_mm_node node;
> > +	struct drm_i915_gem_object *obj;
> > +	struct i915_address_space *vm;
> > +
> > +	/** This object's place on the active/inactive lists */
> > +	struct list_head mm_list;
> > +
> > +	struct list_head vma_link; /* Link in the object's VMA list */
> > +
> > +	/** This vma's place in the batchbuffer or on the eviction list */
> > +	struct list_head exec_list;
> > +
> > +	/**
> > +	 * Used for performing relocations during execbuffer insertion.
> > +	 */
> > +	struct hlist_node exec_node;
> > +	unsigned long exec_handle;
> > +	struct drm_i915_gem_exec_object2 *exec_entry;
> > +
> > +};
> > +
> >  struct i915_address_space {
> >  	struct drm_mm mm;
> >  	struct drm_device *dev;
> > @@ -499,9 +529,18 @@ struct i915_address_space {
> >  	/* FIXME: Need a more generic return type */
> >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> >  				     enum i915_cache_level level);
> > +
> > +	/** Unmap an object from an address space. This usually consists of
> > +	 * setting the valid PTE entries to a reserved scratch page. */
> > +	void (*unbind_vma)(struct i915_vma *vma);
> >  	void (*clear_range)(struct i915_address_space *vm,
> >  			    unsigned int first_entry,
> >  			    unsigned int num_entries);
> > +	/* Map an object into an address space with the given cache flags. */
> > +#define GLOBAL_BIND (1<<0)
> > +	void (*bind_vma)(struct i915_vma *vma,
> > +			 enum i915_cache_level cache_level,
> > +			 u32 flags);
> >  	void (*insert_entries)(struct i915_address_space *vm,
> >  			       struct sg_table *st,
> >  			       unsigned int first_entry,
> > @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
> >  	int (*enable)(struct drm_device *dev);
> >  };
> >  
> > -/**
> > - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > - * object into/from the address space.
> > - *
> > - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > - * will always be <= an objects lifetime. So object refcounting should cover us.
> > - */
> > -struct i915_vma {
> > -	struct drm_mm_node node;
> > -	struct drm_i915_gem_object *obj;
> > -	struct i915_address_space *vm;
> > -
> > -	/** This object's place on the active/inactive lists */
> > -	struct list_head mm_list;
> > -
> > -	struct list_head vma_link; /* Link in the object's VMA list */
> > -
> > -	/** This vma's place in the batchbuffer or on the eviction list */
> > -	struct list_head exec_list;
> > -
> > -	/**
> > -	 * Used for performing relocations during execbuffer insertion.
> > -	 */
> > -	struct hlist_node exec_node;
> > -	unsigned long exec_handle;
> > -	struct drm_i915_gem_exec_object2 *exec_entry;
> > -
> > -};
> > -
> >  struct i915_ctx_hang_stats {
> >  	/* This context had batch pending when hang was declared */
> >  	unsigned batch_pending;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 212f6d8..d5a4580 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -57,6 +57,11 @@
> >  #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
> >  #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
> >  
> > +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > +				enum i915_cache_level cache_level,
> > +				u32 flags);
> > +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> > +
> >  static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
> >  				     enum i915_cache_level level)
> >  {
> > @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
> >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> >  	ppgtt->enable = gen6_ppgtt_enable;
> > +	ppgtt->base.unbind_vma = NULL;
> >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > +	ppgtt->base.bind_vma = NULL;
> >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  				   cache_level);
> >  }
> >  
> > +static void __always_unused
> > +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > +		    enum i915_cache_level cache_level,
> > +		    u32 flags)
> > +{
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	WARN_ON(flags);
> > +
> > +	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> > +}
> > +
> >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  			      struct drm_i915_gem_object *obj)
> >  {
> > @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  				obj->base.size >> PAGE_SHIFT);
> >  }
> >  
> > +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> > +{
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	gen6_ppgtt_clear_range(vma->vm, entry,
> > +			       vma->obj->base.size >> PAGE_SHIFT);
> > +}
> > +
> >  extern int intel_iommu_gfx_mapped;
> >  /* Certain Gen5 chipsets require require idling the GPU before
> >   * unmapping anything from the GTT when VT-d is enabled.
> > @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> >  
> >  }
> >  
> > +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> > +			       enum i915_cache_level cache_level,
> > +			       u32 unused)
> > +{
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> > +		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> > +
> > +	BUG_ON(!i915_is_ggtt(vma->vm));
> > +	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> > +	vma->obj->has_global_gtt_mapping = 1;
> > +}
> > +
> >  static void i915_ggtt_clear_range(struct i915_address_space *vm,
> >  				  unsigned int first_entry,
> >  				  unsigned int num_entries)
> > @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
> >  	intel_gtt_clear_range(first_entry, num_entries);
> >  }
> >  
> > +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> > +{
> > +	const unsigned int first = vma->node.start >> PAGE_SHIFT;
> > +	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> > +
> > +	BUG_ON(!i915_is_ggtt(vma->vm));
> > +	vma->obj->has_global_gtt_mapping = 0;
> > +	intel_gtt_clear_range(first, size);
> > +}
> > +
> > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > +			       enum i915_cache_level cache_level,
> > +			       u32 flags)
> > +{
> > +	struct drm_device *dev = vma->vm->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_gem_object *obj = vma->obj;
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > +	 * the global, just use aliasing */
> > +	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> > +	    !obj->has_global_gtt_mapping) {
> > +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > +					  vma->obj->pages, entry, cache_level);
> > +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> > +		return;
> > +	}
> > +
> > +	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> > +	obj->has_global_gtt_mapping = 1;
> > +
> > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > +	 * aliasing, but the user requested global. */
> > +	if (dev_priv->mm.aliasing_ppgtt) {
> > +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > +					  vma->obj->pages, entry, cache_level);
> > +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> > +	}
> > +}
> 
> There's a bit of duplication here. How about this:
> {
> 	if (!aliasing_ppgtt ||
> 	    flags & GLOBAL_BIND ||
> 	    has_ggtt_mapping {
> 		gen6_ppgtt_insert_entries()
                ^ I'll assume you meant ggtt
> 		has_global_gtt_mapping = true;
> 	}
> 
> 	if (aliasing_ppgtt) {
> 		ppgtt_insert
> 		has_aliasing_ppgtt_mapping = true;
> 	}
> }

This looks like an improvement to me. Consider it done.

> 
> Not sure if this was the same thing that bothered Chris.
> 
> >  
> >  void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> >  			      enum i915_cache_level cache_level)
> > @@ -627,6 +707,23 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> >  	obj->has_global_gtt_mapping = 0;
> >  }
> >  
> > +static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
> > +{
> > +	struct drm_device *dev = vma->vm->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	gen6_ggtt_clear_range(vma->vm, entry,
> > +			      vma->obj->base.size >> PAGE_SHIFT);
> > +	vma->obj->has_global_gtt_mapping = 0;
> > +	if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
> > +		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
> > +				       entry,
> > +				       vma->obj->base.size >> PAGE_SHIFT);
> > +		vma->obj->has_aliasing_ppgtt_mapping = 0;
> > +	}
> > +}
> 
> gen6_ggtt_bind_vma() might not have added the ggtt mapping, so I
> suppose you should check has_global_gtt_mapping to avoid useless
> work. Also the dev_priv->mm.aliasing_ppgtt check is a bit
> pointless as there's no way has_aliasing_ppgtt_mapping would be
> set then if there's no aliasing ppgtt.
> 

Good points for both. Hopefully changing the first one doesn't expose
any pre-existing bugs.

> > +
> >  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> > @@ -860,7 +957,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
> >  		DRM_ERROR("Scratch setup failed\n");
> >  
> >  	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
> > +	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
> >  	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
> > +	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
> >  
> >  	return ret;
> >  }
> > @@ -892,7 +991,9 @@ static int i915_gmch_probe(struct drm_device *dev,
> >  
> >  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> >  	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
> > +	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
> >  	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
> > +	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
> >  
> >  	return 0;
> >  }
> > -- 
> > 1.8.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Ville Syrjälä
> Intel OTC
Ville Syrjala Sept. 4, 2013, 7:31 a.m. UTC | #6
On Tue, Sep 03, 2013 at 05:20:08PM -0700, Ben Widawsky wrote:
> On Mon, Sep 02, 2013 at 03:46:52PM +0300, Ville Syrjälä wrote:
> > On Fri, Aug 30, 2013 at 04:43:59PM -0700, Ben Widawsky wrote:
> > > From: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > > 
> > > Notice that this patch has no impact on functionality. I've decided to
> > > save the actual change until the next patch because I think it's easier
> > > to review that way. I'm happy to squash the two, or let Daniel do it on
> > > merge.
> > > 
> > > v2:
> > > Make ggtt handle the quirky aliasing ppgtt
> > > Add flags to bind object to support above
> > > Don't ever call bind/unbind directly for PPGTT until we have real, full
> > > PPGTT (use NULLs to assert this)
> > > Make sure we rebind the ggtt if there already is a ggtt binding. This
> > > happens on set cache levels
> > > Use VMA for bind/unbind (Daniel, Ben)
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h     |  69 +++++++++++++-----------
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 101 ++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 140 insertions(+), 30 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index c9ed77a..377ca63 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -461,6 +461,36 @@ enum i915_cache_level {
> > >  
> > >  typedef uint32_t gen6_gtt_pte_t;
> > >  
> > > +/**
> > > + * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > > + * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > > + * object into/from the address space.
> > > + *
> > > + * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > > + * will always be <= an objects lifetime. So object refcounting should cover us.
> > > + */
> > > +struct i915_vma {
> > > +	struct drm_mm_node node;
> > > +	struct drm_i915_gem_object *obj;
> > > +	struct i915_address_space *vm;
> > > +
> > > +	/** This object's place on the active/inactive lists */
> > > +	struct list_head mm_list;
> > > +
> > > +	struct list_head vma_link; /* Link in the object's VMA list */
> > > +
> > > +	/** This vma's place in the batchbuffer or on the eviction list */
> > > +	struct list_head exec_list;
> > > +
> > > +	/**
> > > +	 * Used for performing relocations during execbuffer insertion.
> > > +	 */
> > > +	struct hlist_node exec_node;
> > > +	unsigned long exec_handle;
> > > +	struct drm_i915_gem_exec_object2 *exec_entry;
> > > +
> > > +};
> > > +
> > >  struct i915_address_space {
> > >  	struct drm_mm mm;
> > >  	struct drm_device *dev;
> > > @@ -499,9 +529,18 @@ struct i915_address_space {
> > >  	/* FIXME: Need a more generic return type */
> > >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > >  				     enum i915_cache_level level);
> > > +
> > > +	/** Unmap an object from an address space. This usually consists of
> > > +	 * setting the valid PTE entries to a reserved scratch page. */
> > > +	void (*unbind_vma)(struct i915_vma *vma);
> > >  	void (*clear_range)(struct i915_address_space *vm,
> > >  			    unsigned int first_entry,
> > >  			    unsigned int num_entries);
> > > +	/* Map an object into an address space with the given cache flags. */
> > > +#define GLOBAL_BIND (1<<0)
> > > +	void (*bind_vma)(struct i915_vma *vma,
> > > +			 enum i915_cache_level cache_level,
> > > +			 u32 flags);
> > >  	void (*insert_entries)(struct i915_address_space *vm,
> > >  			       struct sg_table *st,
> > >  			       unsigned int first_entry,
> > > @@ -548,36 +587,6 @@ struct i915_hw_ppgtt {
> > >  	int (*enable)(struct drm_device *dev);
> > >  };
> > >  
> > > -/**
> > > - * A VMA represents a GEM BO that is bound into an address space. Therefore, a
> > > - * VMA's presence cannot be guaranteed before binding, or after unbinding the
> > > - * object into/from the address space.
> > > - *
> > > - * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
> > > - * will always be <= an objects lifetime. So object refcounting should cover us.
> > > - */
> > > -struct i915_vma {
> > > -	struct drm_mm_node node;
> > > -	struct drm_i915_gem_object *obj;
> > > -	struct i915_address_space *vm;
> > > -
> > > -	/** This object's place on the active/inactive lists */
> > > -	struct list_head mm_list;
> > > -
> > > -	struct list_head vma_link; /* Link in the object's VMA list */
> > > -
> > > -	/** This vma's place in the batchbuffer or on the eviction list */
> > > -	struct list_head exec_list;
> > > -
> > > -	/**
> > > -	 * Used for performing relocations during execbuffer insertion.
> > > -	 */
> > > -	struct hlist_node exec_node;
> > > -	unsigned long exec_handle;
> > > -	struct drm_i915_gem_exec_object2 *exec_entry;
> > > -
> > > -};
> > > -
> > >  struct i915_ctx_hang_stats {
> > >  	/* This context had batch pending when hang was declared */
> > >  	unsigned batch_pending;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index 212f6d8..d5a4580 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -57,6 +57,11 @@
> > >  #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
> > >  #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
> > >  
> > > +static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > > +				enum i915_cache_level cache_level,
> > > +				u32 flags);
> > > +static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
> > > +
> > >  static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
> > >  				     enum i915_cache_level level)
> > >  {
> > > @@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
> > >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > >  	ppgtt->enable = gen6_ppgtt_enable;
> > > +	ppgtt->base.unbind_vma = NULL;
> > >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > > +	ppgtt->base.bind_vma = NULL;
> > >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > > @@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  				   cache_level);
> > >  }
> > >  
> > > +static void __always_unused
> > > +gen6_ppgtt_bind_vma(struct i915_vma *vma,
> > > +		    enum i915_cache_level cache_level,
> > > +		    u32 flags)
> > > +{
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	WARN_ON(flags);
> > > +
> > > +	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> > > +}
> > > +
> > >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			      struct drm_i915_gem_object *obj)
> > >  {
> > > @@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > >  				obj->base.size >> PAGE_SHIFT);
> > >  }
> > >  
> > > +static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> > > +{
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	gen6_ppgtt_clear_range(vma->vm, entry,
> > > +			       vma->obj->base.size >> PAGE_SHIFT);
> > > +}
> > > +
> > >  extern int intel_iommu_gfx_mapped;
> > >  /* Certain Gen5 chipsets require require idling the GPU before
> > >   * unmapping anything from the GTT when VT-d is enabled.
> > > @@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> > >  
> > >  }
> > >  
> > > +static void i915_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 unused)
> > > +{
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
> > > +		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
> > > +
> > > +	BUG_ON(!i915_is_ggtt(vma->vm));
> > > +	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
> > > +	vma->obj->has_global_gtt_mapping = 1;
> > > +}
> > > +
> > >  static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > >  				  unsigned int first_entry,
> > >  				  unsigned int num_entries)
> > > @@ -599,6 +639,46 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
> > >  	intel_gtt_clear_range(first_entry, num_entries);
> > >  }
> > >  
> > > +static void i915_ggtt_unbind_vma(struct i915_vma *vma)
> > > +{
> > > +	const unsigned int first = vma->node.start >> PAGE_SHIFT;
> > > +	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
> > > +
> > > +	BUG_ON(!i915_is_ggtt(vma->vm));
> > > +	vma->obj->has_global_gtt_mapping = 0;
> > > +	intel_gtt_clear_range(first, size);
> > > +}
> > > +
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 flags)
> > > +{
> > > +	struct drm_device *dev = vma->vm->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > +	 * the global, just use aliasing */
> > > +	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
> > > +	    !obj->has_global_gtt_mapping) {
> > > +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > > +					  vma->obj->pages, entry, cache_level);
> > > +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> > > +		return;
> > > +	}
> > > +
> > > +	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
> > > +	obj->has_global_gtt_mapping = 1;
> > > +
> > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > +	 * aliasing, but the user requested global. */
> > > +	if (dev_priv->mm.aliasing_ppgtt) {
> > > +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> > > +					  vma->obj->pages, entry, cache_level);
> > > +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> > > +	}
> > > +}
> > 
> > There's a bit of duplication here. How about this:
> > {
> > 	if (!aliasing_ppgtt ||
> > 	    flags & GLOBAL_BIND ||
> > 	    has_ggtt_mapping {
> > 		gen6_ppgtt_insert_entries()
>                 ^ I'll assume you meant ggtt

Yes.

Just got a funny mental image from this incident:
"Look here boy, this is how it's done!" -> crash and burn

<snip the rest>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c9ed77a..377ca63 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -461,6 +461,36 @@  enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -499,9 +529,18 @@  struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -548,36 +587,6 @@  struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..d5a4580 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@ 
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
 
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+				enum i915_cache_level cache_level,
+				u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
@@ -332,7 +337,9 @@  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@  void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 				   cache_level);
 }
 
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -447,6 +466,14 @@  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 				obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@  static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -599,6 +639,46 @@  static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+	 * the global, just use aliasing */
+	if (dev_priv->mm.aliasing_ppgtt && !(flags & GLOBAL_BIND) &&
+	    !obj->has_global_gtt_mapping) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+		return;
+	}
+
+	gen6_ggtt_insert_entries(vma->vm, obj->pages, entry, cache_level);
+	obj->has_global_gtt_mapping = 1;
+
+	/* If put the mapping in the aliasing PPGTT as well as Global if we have
+	 * aliasing, but the user requested global. */
+	if (dev_priv->mm.aliasing_ppgtt) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -627,6 +707,23 @@  void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ggtt_clear_range(vma->vm, entry,
+			      vma->obj->base.size >> PAGE_SHIFT);
+	vma->obj->has_global_gtt_mapping = 0;
+	if (dev_priv->mm.aliasing_ppgtt && vma->obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       vma->obj->base.size >> PAGE_SHIFT);
+		vma->obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -860,7 +957,9 @@  static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
 }
@@ -892,7 +991,9 @@  static int i915_gmch_probe(struct drm_device *dev,
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
 }