Message ID | 1373350122-5118-7-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> This one is a rather large beast. Any chance we could split it into topics, e.g. convert execbuf code, convert shrinker code? Or does that get messy, fast? 
> --- > drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- > drivers/gpu/drm/i915/i915_dma.c | 4 - > drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- > drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > drivers/gpu/drm/i915/i915_irq.c | 6 +- > drivers/gpu/drm/i915/i915_trace.h | 20 +- > drivers/gpu/drm/i915/intel_fb.c | 1 - > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > 16 files changed, 468 insertions(+), 239 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index 16b2aaf..867ed07 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > seq_printf(m, " (pinned x %d)", obj->pin_count); > if (obj->fence_reg != I915_FENCE_REG_NONE) > seq_printf(m, " (fence: %d)", obj->fence_reg); > - if (i915_gem_obj_ggtt_bound(obj)) > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > + if (i915_gem_obj_bound_any(obj)) { list_for_each will short-circuit already, so this is redundant. 
> + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > + if (!i915_is_ggtt(vma->vm)) > + seq_puts(m, " (pp"); > + else > + seq_puts(m, " (g"); > + seq_printf(m, " gtt offset: %08lx, size: %08lx)", ^ that space looks superflous now > + i915_gem_obj_offset(obj, vma->vm), > + i915_gem_obj_size(obj, vma->vm)); > + } > + } > if (obj->stolen) > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > if (obj->pin_mappable || obj->fault_mappable) { > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > return 0; > } > > +/* FIXME: Support multiple VM? */ > #define count_objects(list, member) do { \ > list_for_each_entry(obj, list, member) { \ > size += i915_gem_obj_ggtt_size(obj); \ > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) > > if (val & DROP_BOUND) { > list_for_each_entry_safe(obj, next, &vm->inactive_list, > - mm_list) > - if (obj->pin_count == 0) { > - ret = i915_gem_object_unbind(obj); > - if (ret) > - goto unlock; > - } > + mm_list) { > + if (obj->pin_count) > + continue; > + > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > + if (ret) > + goto unlock; > + } > } > > if (val & DROP_UNBOUND) { > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > global_list) > if (obj->pages_pin_count == 0) { > + /* FIXME: Do this for all vms? 
*/ > ret = i915_gem_object_put_pages(obj); > if (ret) > goto unlock; > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index d13e21f..b190439 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > i915_dump_device_info(dev_priv); > > - INIT_LIST_HEAD(&dev_priv->vm_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > - > if (i915_get_bridge_dev(dev)) { > ret = -EIO; > goto free_priv; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 38cccc8..48baccc 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > -/* This is a temporary define to help transition us to real VMAs. If you see > - * this, you're either reviewing code, or bisecting it. 
*/ > -static inline struct i915_vma * > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > -{ > - if (list_empty(&obj->vma_list)) > - return NULL; > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > -} > - > -/* Whether or not this object is currently mapped by the translation tables */ > -static inline bool > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > -{ > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > - if (vma == NULL) > - return false; > - return drm_mm_node_allocated(&vma->node); > -} > - > -/* Offset of the first PTE pointing to this object */ > -static inline unsigned long > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.start; > -} > - > -/* The size used in the translation tables may be larger than the actual size of > - * the object on GEN2/GEN3 because of the way tiling is handled. See > - * i915_gem_get_gtt_size() for more details. > - */ > -static inline unsigned long > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.size; > -} > - > -static inline void > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > - enum i915_cache_level color) > -{ > - __i915_gem_obj_to_vma(o)->node.color = color; > -} > - > /** > * Request queue structure. 
> * > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > void i915_gem_vma_destroy(struct i915_vma *vma); > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking); > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > void i915_gem_lastclose(struct drm_device *dev); > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring); > > int i915_gem_dumb_create(struct drm_file *file_priv, > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > int tiling_mode, bool fenced); > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level); > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > void i915_gem_restore_fences(struct drm_device *dev); > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); 
> +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color); > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > +/* Some GGTT VM helpers */ > +#define obj_to_ggtt(obj) \ > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > +{ > + struct i915_address_space *ggtt = > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > + return vm == ggtt; > +} > + > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > +} > + > +static inline int __must_check > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > + uint32_t alignment, > + bool map_and_fenceable, > + bool nonblocking) > +{ > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > + map_and_fenceable, nonblocking); > +} > +#undef obj_to_ggtt > + > /* i915_gem_context.c */ > void i915_gem_context_init(struct drm_device *dev); > void i915_gem_context_fini(struct drm_device *dev); > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > +/* FIXME: this is never okay with full PPGTT */ > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > @@ -1928,7 +1937,9 @@ static inline void 
i915_gem_chipset_flush(struct drm_device *dev) > > > /* i915_gem_evict.c */ > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > +int __must_check i915_gem_evict_something(struct drm_device *dev, > + struct i915_address_space *vm, > + int min_size, > unsigned alignment, > unsigned cache_level, > bool mappable, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 058ad44..21015cd 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -38,10 +38,12 @@ > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking); > +static __must_check int > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking); > static int i915_gem_phys_pwrite(struct drm_device *dev, > struct drm_i915_gem_object *obj, > struct drm_i915_gem_pwrite *args, > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > static inline bool > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > { > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > + return i915_gem_obj_bound_any(obj) && !obj->active; > } > > int > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > * anyway again before the next pread happens. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, false); This is essentially a very convoluted version of "if there's gpu rendering outstanding, please wait for it". 
Maybe we should switch this to if (obj->active) wait_rendering(obj, true); Same for the shmem_pwrite case below. Would be a separate patch to prep things though. Can I volunteer you for that? The ugly part is to review whether any of the lru list updating that set_domain does in addition to wait_rendering is required, but on a quick read that's not the case. > if (ret) > return ret; > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > char __user *user_data; > int page_offset, page_length, ret; > > - ret = i915_gem_object_pin(obj, 0, true, true); > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > if (ret) > goto out; > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > * right away and we therefore have to clflush anyway. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush_after = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { ... see above. > ret = i915_gem_object_set_to_gtt_domain(obj, true); > if (ret) > return ret; > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > /* Now bind it into the GTT if needed */ > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret) > goto unlock; > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > if (obj->pages == NULL) > return 0; > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > - > if (obj->pages_pin_count) > return -EBUSY; > > + BUG_ON(i915_gem_obj_bound_any(obj)); > + > /* ->put_pages might need to allocate memory for the bit17 swizzle > * array, hence protect them from being reaped by removing them from gtt > * lists early. 
*/ > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > bool purgeable_only) > { > struct drm_i915_gem_object *obj, *next; > - struct i915_address_space *vm = &dev_priv->gtt.base; > long count = 0; > > list_for_each_entry_safe(obj, next, > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > } > } > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > - i915_gem_object_unbind(obj) == 0 && > - i915_gem_object_put_pages(obj) == 0) { > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > + global_list) { > + struct i915_vma *vma, *v; > + > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > + continue; > + > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > + if (i915_gem_object_unbind(obj, vma->vm)) > + break; > + > + if (!i915_gem_object_put_pages(obj)) > count += obj->base.size >> PAGE_SHIFT; > - if (count >= target) > - return count; > - } > + > + if (count >= target) > + return count; > } > > return count; > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > void > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_device *dev = obj->base.dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 seqno = intel_ring_get_seqno(ring); > > BUG_ON(ring == NULL); > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > } > > static void > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - struct drm_device *dev = obj->base.dev; > - struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm 
= &dev_priv->gtt.base; > - > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > BUG_ON(!obj->active); > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > spin_unlock(&file_priv->mm.lock); > } > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > + if (acthd >= i915_gem_obj_offset(obj, vm) && > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > return true; > > return false; > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > return false; > } > > +static struct i915_address_space * > +request_to_vm(struct drm_i915_gem_request *request) > +{ > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > + struct i915_address_space *vm; > + > + vm = &dev_priv->gtt.base; > + > + return vm; > +} > + > static bool i915_request_guilty(struct drm_i915_gem_request *request, > const u32 acthd, bool *inside) > { > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > * pointing inside the ring, matches the batch_obj address range. > * However this is extremely unlikely. 
> */ > - > if (request->batch_obj) { > - if (i915_head_inside_object(acthd, request->batch_obj)) { > + if (i915_head_inside_object(acthd, request->batch_obj, > + request_to_vm(request))) { > *inside = true; > return true; > } > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > { > struct i915_ctx_hang_stats *hs = NULL; > bool inside, guilty; > + unsigned long offset = 0; > > /* Innocent until proven guilty */ > guilty = false; > > + if (request->batch_obj) > + offset = i915_gem_obj_offset(request->batch_obj, > + request_to_vm(request)); > + > if (ring->hangcheck.action != wait && > i915_request_guilty(request, acthd, &inside)) { > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > ring->name, > inside ? "inside" : "flushing", > - request->batch_obj ? > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > + offset, > request->ctx ? request->ctx->id : 0, > acthd); > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > } > > while (!list_empty(&ring->active_list)) { > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > struct drm_i915_gem_object, > ring_list); > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > } > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > void i915_gem_reset(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > struct intel_ring_buffer *ring; > int i; > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) > /* Move everything out of the GPU domains to ensure we do any > * necessary invalidation upon reuse. 
> */ > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > i915_gem_restore_fences(dev); > } > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > * by the ringbuffer to the flushing/inactive lists as appropriate. > */ > while (!list_empty(&ring->active_list)) { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > break; > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > > if (unlikely(ring->trace_irq_seqno && > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > * Unbinds an object from the GTT aperture. > */ > int > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > struct i915_vma *vma; > int ret; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > return 0; > > if (obj->pin_count) > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > if (ret) > return ret; > > - trace_i915_gem_object_unbind(obj); > + trace_i915_gem_object_unbind(obj, vm); > > if (obj->has_global_gtt_mapping) > i915_gem_gtt_unbind_object(obj); > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > /* Avoid an unnecessary call to unbind on rebind. 
*/ > obj->map_and_fenceable = true; > > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > list_del(&vma->vma_link); > drm_mm_remove_node(&vma->node); > i915_gem_vma_destroy(vma); > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > i915_gem_obj_ggtt_offset(obj), size); > > + > pitch_val = obj->stride / 128; > pitch_val = ffs(pitch_val) - 1; > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > */ > static int > i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > unsigned alignment, > bool map_and_fenceable, > bool nonblocking) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 size, fence_size, fence_alignment, unfenced_alignment; > bool mappable, fenceable; > - size_t gtt_max = map_and_fenceable ? > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > + size_t gtt_max = > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > struct i915_vma *vma; > int ret; > > if (WARN_ON(!list_empty(&obj->vma_list))) > return -EBUSY; > > + BUG_ON(!i915_is_ggtt(vm)); > + > fence_size = i915_gem_get_gtt_size(dev, > obj->base.size, > obj->tiling_mode); > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > i915_gem_object_pin_pages(obj); > > vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + /* For now we only ever use 1 vma per object */ > + WARN_ON(!list_empty(&obj->vma_list)); > + > + vma = i915_gem_vma_create(obj, vm); > if (vma == NULL) { > i915_gem_object_unpin_pages(obj); > return -ENOMEM; > } > > search_free: > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > - &vma->node, > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > size, alignment, > obj->cache_level, 0, gtt_max); > if (ret) { > - ret = i915_gem_evict_something(dev, size, alignment, > + ret = i915_gem_evict_something(dev, vm, size, alignment, > obj->cache_level, > map_and_fenceable, > nonblocking); > @@ -3162,18 +3197,25 @@ search_free: > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > list_add_tail(&obj->mm_list, &vm->inactive_list); > - list_add(&vma->vma_link, &obj->vma_list); > + > + /* Keep GGTT vmas first to make debug easier */ > + if (i915_is_ggtt(vm)) > + list_add(&vma->vma_link, &obj->vma_list); > + else > + list_add_tail(&vma->vma_link, &obj->vma_list); > > fenceable = > + i915_is_ggtt(vm) && > i915_gem_obj_ggtt_size(obj) == fence_size && > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > - dev_priv->gtt.mappable_end; > + mappable = > + i915_is_ggtt(vm) && > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > obj->map_and_fenceable = mappable && fenceable; > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > 
i915_gem_verify_gtt(dev); > return 0; > } > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > int ret; > > /* Not valid to be called on unbound objects. */ > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return -EINVAL; If we're converting the shmem paths over to wait_rendering then there's only the fault handler and the set_domain ioctl left. For the later it would make sense to clflush even when an object is on the unbound list, to allow userspace to optimize when the clflushing happens. But that would only make sense in conjunction with Chris' create2 ioctl and a flag to preallocate the storage (and so putting the object onto the unbound list). So nothing to do here. > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > } > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > int ret; > > if (obj->cache_level == cache_level) > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > } > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > > - if (i915_gem_obj_ggtt_bound(obj)) { > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + if (!i915_gem_obj_bound(obj, vm)) > + continue; Hm, shouldn't we have a per-object list of vmas? Or will that follow later on? Self-correction: It exists already ... why can't we use this here? 
> + > ret = i915_gem_object_finish_gpu(obj); > if (ret) > return ret; > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > obj, cache_level); > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > + i915_gem_obj_set_color(obj, vm, cache_level); > } > > if (cache_level == I915_CACHE_NONE) { > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_gem_caching *args = data; > + struct drm_i915_private *dev_priv; > struct drm_i915_gem_object *obj; > enum i915_cache_level level; > int ret; > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > ret = -ENOENT; > goto unlock; > } > + dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_set_cache_level(obj, level); > + /* FIXME: Add interface for specific VM? */ > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > drm_gem_object_unreference(&obj->base); > unlock: > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > u32 alignment, > struct intel_ring_buffer *pipelined) > { > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > u32 old_read_domains, old_write_domain; > int ret; > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * (e.g. libkms for the bootup splash), we have to ensure that we > * always use map_and_fenceable for all scanout buffers. 
> */ > - ret = i915_gem_object_pin(obj, alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > if (ret) > return ret; > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > int > i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking) > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > return -EBUSY; > > - if (i915_gem_obj_ggtt_bound(obj)) { > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); WARN_ON, since presumably we can keep on going if we get this wrong (albeit with slightly corrupted state, so render corruptions might follow). > + > + if (i915_gem_obj_bound(obj, vm)) { > + if ((alignment && > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > (map_and_fenceable && !obj->map_and_fenceable)) { > WARN(obj->pin_count, > "bo is already pinned with incorrect alignment:" > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > " obj->map_and_fenceable=%d\n", > - i915_gem_obj_ggtt_offset(obj), alignment, > + i915_gem_obj_offset(obj, vm), alignment, > map_and_fenceable, > obj->map_and_fenceable); > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > } > > - if (!i915_gem_obj_ggtt_bound(obj)) { > + if (!i915_gem_obj_bound(obj, vm)) { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > map_and_fenceable, > nonblocking); > if (ret) > @@ -3684,7 +3739,7 @@ void > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > { > BUG_ON(obj->pin_count == 0); > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > + BUG_ON(!i915_gem_obj_bound_any(obj)); > 
> if (--obj->pin_count == 0) > obj->pin_mappable = false; > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > } > > if (obj->user_pin_count == 0) { > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > if (ret) > goto out; > } > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > + struct i915_vma *vma, *next; > > trace_i915_gem_object_destroy(obj); > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > i915_gem_detach_phys_object(dev, obj); > > obj->pin_count = 0; > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > - bool was_interruptible; > + /* NB: 0 or 1 elements */ > + WARN_ON(!list_empty(&obj->vma_list) && > + !list_is_singular(&obj->vma_list)); > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > + int ret = i915_gem_object_unbind(obj, vma->vm); > + if (WARN_ON(ret == -ERESTARTSYS)) { > + bool was_interruptible; > > - was_interruptible = dev_priv->mm.interruptible; > - dev_priv->mm.interruptible = false; > + was_interruptible = dev_priv->mm.interruptible; > + dev_priv->mm.interruptible = false; > > - WARN_ON(i915_gem_object_unbind(obj)); > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > - dev_priv->mm.interruptible = was_interruptible; > + dev_priv->mm.interruptible = was_interruptible; > + } > } > > /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > INIT_LIST_HEAD(&ring->request_list); > } > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > + struct i915_address_space *vm) > +{ > + vm->dev = dev_priv->dev; > + INIT_LIST_HEAD(&vm->active_list); > + INIT_LIST_HEAD(&vm->inactive_list); > + INIT_LIST_HEAD(&vm->global_link); > + list_add(&vm->global_link, &dev_priv->vm_list); > +} > + > void > i915_gem_load(struct drm_device *dev) > { > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > SLAB_HWCACHE_ALIGN, > NULL); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > + INIT_LIST_HEAD(&dev_priv->vm_list); > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > + > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > struct drm_i915_private, > mm.inactive_shrinker); > struct drm_device *dev = dev_priv->dev; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > - int nr_to_scan = sc->nr_to_scan; > + int nr_to_scan; > bool unlock = true; > int cnt; > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > unlock = false; > } > > + nr_to_scan = sc->nr_to_scan; > if (nr_to_scan) { > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > if (nr_to_scan > 0) > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > if (obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > - cnt += obj->base.size >> PAGE_SHIFT; > + > 
+ list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, global_list) > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > + cnt += obj->base.size >> PAGE_SHIFT; Isn't this now double-counting objects? In the shrinker we only care about how much physical RAM an object occupies, not how much virtual space it occupies. So just walking the bound list of objects here should be good enough ... > > if (unlock) > mutex_unlock(&dev->struct_mutex); > return cnt; > } > + > +/* All the new VM stuff */ > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { Imo the vma list walking here and in the other helpers below indicates that we should deal more often in vmas instead of (object, vm) pairs. Or is this again something that'll get fixed later on? I just want to avoid diff churn, and it also makes reviewing easier if the foreshadowing is correct ;-) So generally I'd vote for more liberal sprinkling of obj_to_vma in callers. 
> + if (vma->vm == vm) > + return vma->node.start; > + > + } > + return -1; > +} > + > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > +{ > + return !list_empty(&o->vma_list); > +} > + > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return true; > + } > + return false; > +} > + > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return vma->node.size; > + } > + > + return 0; > +} > + > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color) > +{ > + struct i915_vma *vma; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) { > + vma->node.color = color; > + return; > + } > + } > + > + WARN(1, "Couldn't set color for VM %p\n", vm); > +} > + > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) > + if (vma->vm == vm) > + return vma; > + > + return NULL; > +} > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 2074544..c92fd81 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > if (INTEL_INFO(dev)->gen >= 7) { > ret = i915_gem_object_set_cache_level(ctx->obj, > + &dev_priv->gtt.base, > I915_CACHE_LLC_MLC); > /* 
Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > * default context. > */ > dev_priv->ring[RCS].default_context = ctx; > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > if (ret) { > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > goto err_destroy; > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > static int do_switch(struct i915_hw_context *to) > { > struct intel_ring_buffer *ring = to->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct i915_hw_context *from = ring->last_context; > u32 hw_flags = 0; > int ret; > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > if (from == to) > return 0; > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > if (ret) > return ret; > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > */ > if (from != NULL) { > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > - i915_gem_object_move_to_active(from->obj, ring); > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > + ring); > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > * whole damn pipeline, we don't need to explicitly mark the > * object dirty. 
The only exception is that the context must be > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index df61f33..32efdc0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -32,24 +32,21 @@ > #include "i915_trace.h" > > static bool > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > +mark_free(struct i915_vma *vma, struct list_head *unwind) > { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > - > - if (obj->pin_count) > + if (vma->obj->pin_count) > return false; > > - list_add(&obj->exec_list, unwind); > + list_add(&vma->obj->exec_list, unwind); > return drm_mm_scan_add_block(&vma->node); > } > > int > -i915_gem_evict_something(struct drm_device *dev, int min_size, > - unsigned alignment, unsigned cache_level, > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > + int min_size, unsigned alignment, unsigned cache_level, > bool mappable, bool nonblocking) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct list_head eviction_list, unwind_list; > struct i915_vma *vma; > struct drm_i915_gem_object *obj; > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > */ > > INIT_LIST_HEAD(&unwind_list); > - if (mappable) > + if (mappable) { > + BUG_ON(!i915_is_ggtt(vm)); > drm_mm_init_scan_with_range(&vm->mm, min_size, > alignment, cache_level, 0, > dev_priv->gtt.mappable_end); > - else > + } else > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > /* First see if there is a large enough contiguous idle region... 
*/ > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > /* Now merge in the soon-to-be-expired objects... */ > list_for_each_entry(obj, &vm->active_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -109,7 +109,7 @@ none: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > ret = drm_mm_scan_remove_block(&vma->node); > BUG_ON(ret); > > @@ -130,7 +130,7 @@ found: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > if (drm_mm_scan_remove_block(&vma->node)) { > list_move(&obj->exec_list, &eviction_list); > drm_gem_object_reference(&obj->base); > @@ -145,7 +145,7 @@ found: > struct drm_i915_gem_object, > exec_list); > if (ret == 0) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > > list_del_init(&obj->exec_list); > drm_gem_object_unreference(&obj->base); > @@ -158,13 +158,18 @@ int > i915_gem_evict_everything(struct drm_device *dev) I suspect evict_everything eventually wants an address_space *vm argument for those cases where we only want to evict everything in a given vm. Atm we have two use-cases of this: - Called from the shrinker as a last-ditch effort. For that it should move _every_ object onto the unbound list. - Called from execbuf for badly-fragmented address spaces to clean up the mess. For that case we only care about one address space. 
> { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj, *next; > - bool lists_empty; > + bool lists_empty = true; > int ret; > > - lists_empty = (list_empty(&vm->inactive_list) && > - list_empty(&vm->active_list)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + lists_empty = (list_empty(&vm->inactive_list) && > + list_empty(&vm->active_list)); > + if (!lists_empty) > + lists_empty = false; > + } > + > if (lists_empty) > return -ENOSPC; > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > i915_gem_retire_requests(dev); > > /* Having flushed everything, unbind() should never raise an error */ > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0) > - WARN_ON(i915_gem_object_unbind(obj)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > + if (obj->pin_count == 0) > + WARN_ON(i915_gem_object_unbind(obj, vm)); > + } > > return 0; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 5aeb447..e90182d 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > } > > static void > -eb_destroy(struct eb_objects *eb) > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > { > while (!list_empty(&eb->objects)) { > struct drm_i915_gem_object *obj; > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *reloc) > + struct drm_i915_gem_relocation_entry *reloc, > + struct i915_address_space *vm) > { > struct 
drm_device *dev = obj->base.dev; > struct drm_gem_object *target_obj; > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > static int > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > - struct eb_objects *eb) > + struct eb_objects *eb, > + struct i915_address_space *vm) > { > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > do { > u64 offset = r->presumed_offset; > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > + vm); > if (ret) > return ret; > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > static int > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *relocs) > + struct drm_i915_gem_relocation_entry *relocs, > + struct i915_address_space *vm) > { > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > int i, ret; > > for (i = 0; i < entry->relocation_count; i++) { > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > + vm); > if (ret) > return ret; > } > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > } > > static int > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > + struct i915_address_space *vm) > { > struct drm_i915_gem_object *obj; > int ret = 0; > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > */ > pagefault_disable(); > list_for_each_entry(obj, &eb->objects, exec_list) { > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > + ret = i915_gem_execbuffer_relocate_object(obj, eb, 
vm); > if (ret) > break; > } > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > + struct i915_address_space *vm, > bool *need_reloc) > { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > + false); > if (ret) > return ret; > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->has_aliasing_ppgtt_mapping = 1; > } > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > - entry->offset = i915_gem_obj_ggtt_offset(obj); > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > + entry->offset = i915_gem_obj_offset(obj, vm); > *need_reloc = true; > } > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > { > struct drm_i915_gem_exec_object2 *entry; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return; > > entry = obj->exec_entry; > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > struct list_head *objects, > + struct i915_address_space *vm, > bool *need_relocs) > { > struct drm_i915_gem_object *obj; > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > list_for_each_entry(obj, objects, exec_list) { > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > bool need_fence, need_mappable; > + u32 obj_offset; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > continue; I wonder a 
bit how we could avoid the multiple (obj, vm) -> vma lookups here ... Maybe we should cache them in some pointer somewhere (either in the eb object or by adding a new pointer to the object struct, e.g. obj->eb_vma, similar to obj->eb_list). > > + obj_offset = i915_gem_obj_offset(obj, vm); > need_fence = > has_fenced_gpu_access && > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > + BUG_ON((need_mappable || need_fence) && > + !i915_is_ggtt(vm)); > + > if ((entry->alignment && > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > + obj_offset & (entry->alignment - 1)) || > (need_mappable && !obj->map_and_fenceable)) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > else > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > > /* Bind fresh objects */ > list_for_each_entry(obj, objects, exec_list) { > - if (i915_gem_obj_ggtt_bound(obj)) > + if (i915_gem_obj_bound(obj, vm)) > continue; > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > struct drm_file *file, > struct intel_ring_buffer *ring, > struct eb_objects *eb, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > struct drm_i915_gem_relocation_entry *reloc; > struct drm_i915_gem_object *obj; > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > goto err; > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, 
&need_relocs); > if (ret) > goto err; > > list_for_each_entry(obj, &eb->objects, exec_list) { > int offset = obj->exec_entry - exec; > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > - reloc + reloc_offset[offset]); > + reloc + reloc_offset[offset], > + vm); > if (ret) > goto err; > } > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > static void > i915_gem_execbuffer_move_to_active(struct list_head *objects, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_i915_gem_object *obj; > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > obj->base.read_domains = obj->base.pending_read_domains; > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > - i915_gem_object_move_to_active(obj, ring); > + i915_gem_object_move_to_active(obj, vm, ring); > if (obj->base.write_domain) { > obj->dirty = 1; > obj->last_write_seqno = intel_ring_get_seqno(ring); > @@ -836,7 +853,8 @@ static int > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file, > struct drm_i915_gem_execbuffer2 *args, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = dev->dev_private; > struct eb_objects *eb; > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > /* The objects are in their final locations, apply the relocations. 
*/ > if (need_relocs) > - ret = i915_gem_execbuffer_relocate(eb); > + ret = i915_gem_execbuffer_relocate(eb, vm); > if (ret) { > if (ret == -EFAULT) { > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > - eb, exec); > + eb, exec, vm); > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > } > if (ret) > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > goto err; > } > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > + args->batch_start_offset; > exec_len = args->batch_len; > if (cliprects) { > for (i = 0; i < args->num_cliprects; i++) { > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > err: > - eb_destroy(eb); > + eb_destroy(eb, vm); > > mutex_unlock(&dev->struct_mutex); > > @@ -1105,6 +1124,7 @@ int > i915_gem_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer *args = data; > struct drm_i915_gem_execbuffer2 exec2; > struct drm_i915_gem_exec_object *exec_list = NULL; > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > exec2.flags = I915_EXEC_RENDER; > i915_execbuffer2_set_context_id(exec2, 0); > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. 
*/ > for (i = 0; i < args->buffer_count; i++) > @@ -1186,6 +1207,7 @@ int > i915_gem_execbuffer2(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer2 *args = data; > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > int ret; > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > return -EFAULT; > } > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 298fc42..70ce2f6 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > ppgtt->base.total); > } > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > + > return ret; > } > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level) > { > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - cache_level); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->insert_entries(vm, obj->pages, > + obj_offset >> PAGE_SHIFT, > + cache_level); > } > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj) > { > - ppgtt->base.clear_range(&ppgtt->base, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - obj->base.size >> PAGE_SHIFT); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->clear_range(vm, obj_offset >> 
PAGE_SHIFT, > + obj->base.size >> PAGE_SHIFT); > } > > extern int intel_iommu_gfx_mapped; > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > dev_priv->gtt.base.start / PAGE_SIZE, > dev_priv->gtt.base.total / PAGE_SIZE); > > + if (dev_priv->mm.aliasing_ppgtt) > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > + > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > i915_gem_clflush_object(obj); > i915_gem_gtt_bind_object(obj, obj->cache_level); > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > * aperture. One page should be enough to keep any prefetching inside > * of the aperture. > */ > - drm_i915_private_t *dev_priv = dev->dev_private; > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > struct drm_mm_node *entry; > struct drm_i915_gem_object *obj; > unsigned long hole_start, hole_end; > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > BUG_ON(mappable_end > end); > > /* Subtract the guard page ... 
*/ > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > if (!HAS_LLC(dev)) > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > /* Mark any preallocated objects as occupied */ > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > int ret; > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > if (ret) > DRM_DEBUG_KMS("Reservation failed\n"); > obj->has_global_gtt_mapping = 1; > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > dev_priv->gtt.base.total = end - start; > > /* Clear any non-preallocated blocks */ > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > - hole_start, hole_end) { > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > hole_start, hole_end); > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - hole_start / PAGE_SIZE, > - count); > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > } > > /* And finally clear the reserved guard page */ > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - end / PAGE_SIZE - 1, 1); > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > } > > static bool > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index 245eb1d..bfe61fa 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device 
*dev, > if (gtt_offset == I915_GTT_OFFSET_NONE) > return obj; > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + vma = i915_gem_vma_create(obj, vm); > if (!vma) { > drm_gem_object_unreference(&obj->base); > return NULL; > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > */ > vma->node.start = gtt_offset; > vma->node.size = size; > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + if (drm_mm_initialized(&vm->mm)) { > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); These two hunks here for stolen look fishy - we only ever use the stolen preallocated stuff for objects with mappings in the global gtt. So keeping that explicit is imo the better approach. And tbh I'm confused where the local variable vm is from ... > if (ret) { > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > goto unref_out; > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 92a8d27..808ca2a 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > obj->map_and_fenceable = > !i915_gem_obj_ggtt_bound(obj) || > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > + (i915_gem_obj_ggtt_offset(obj) + > + obj->base.size <= dev_priv->gtt.mappable_end && > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > /* Rebind if we need a change of alignment */ > if (!obj->map_and_fenceable) { > - u32 unfenced_alignment = > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > + u32 unfenced_align = > i915_gem_get_gtt_alignment(dev, obj->base.size, > args->tiling_mode, > false); > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > - ret = i915_gem_object_unbind(obj); > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > + ret = 
i915_gem_object_unbind(obj, ggtt); > } > > if (ret == 0) { > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 79fbb17..28fa0ff 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, > if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { > u32 acthd = I915_READ(ACTHD); > > + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) > + return NULL; > + > if (WARN_ON(ring->id != RCS)) > return NULL; > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, > return; > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) { > + if ((error->ccid & PAGE_MASK) == > + i915_gem_obj_ggtt_offset(obj)) { > ering->ctx = i915_error_object_create_sized(dev_priv, > obj, 1); > break; > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 7d283b5..3f019d3 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > ); > > TRACE_EVENT(i915_gem_object_bind, > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > - TP_ARGS(obj, mappable), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, bool mappable), > + TP_ARGS(obj, vm, mappable), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > __field(bool, mappable) > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > __entry->mappable = mappable; > ), > > @@ -57,19 +59,21 @@ 
TRACE_EVENT(i915_gem_object_bind, > ); > > TRACE_EVENT(i915_gem_object_unbind, > - TP_PROTO(struct drm_i915_gem_object *obj), > - TP_ARGS(obj), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm), > + TP_ARGS(obj, vm), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > ), > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > ), > > TP_printk("obj=%p, offset=%08x size=%x", > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > index f3c97e0..b69cc63 100644 > --- a/drivers/gpu/drm/i915/intel_fb.c > +++ b/drivers/gpu/drm/i915/intel_fb.c > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > fb->width, fb->height, > i915_gem_obj_ggtt_offset(obj), obj); > > - > mutex_unlock(&dev->struct_mutex); > vga_switcheroo_client_fb_set(dev->pdev, info); > return 0; > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > index 81c3ca1..517e278 100644 > --- a/drivers/gpu/drm/i915/intel_overlay.c > +++ b/drivers/gpu/drm/i915/intel_overlay.c > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > } > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > } else { > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > if (ret) { > DRM_ERROR("failed to pin overlay register bo\n"); > goto out_free_bo; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 125a741..449e57c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) > return NULL; > } > > - 
ret = i915_gem_object_pin(ctx, 4096, true, false); > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > if (ret) { > DRM_ERROR("failed to pin power context: %d\n", ret); > goto err_unref; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index bc4c11b..ebed61d 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -481,6 +481,7 @@ out: > static int > init_pipe_control(struct intel_ring_buffer *ring) > { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct pipe_control *pc; > struct drm_i915_gem_object *obj; > int ret; > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret) > goto err_unref; > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > static int init_status_page(struct intel_ring_buffer *ring) > { > struct drm_device *dev = ring->dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_object *obj; > int ret; > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret != 0) { > goto err_unref; > } > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > ring->obj = obj; > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > if (ret) > goto err_unref; > > @@ -1828,7 +1832,7 @@ int 
intel_init_render_ring_buffer(struct drm_device *dev) > return -ENOMEM; > } > > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret != 0) { > drm_gem_object_unreference(&obj->base); > DRM_ERROR("Failed to ping batch bo\n"); > -- > 1.8.3.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > This patch was formerly known as: > > "drm/i915: Create VMAs (part 3) - plumbing" > > > > This patch adds a VM argument, bind/unbind, and the object > > offset/size/color getters/setters. It preserves the old ggtt helper > > functions because things still need, and will continue to need them. > > > > Some code will still need to be ported over after this. > > > > v2: Fix purge to pick an object and unbind all vmas > > This was doable because of the global bound list change. > > > > v3: With the commit to actually pin/unpin pages in place, there is no > > longer a need to check if unbind succeeded before calling put_pages(). > > Make put_pages only BUG() after checking pin count. > > > > v4: Rebased on top of the new hangcheck work by Mika > > plumbed eb_destroy also > > Many checkpatch related fixes > > > > v5: Very large rebase > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > This one is a rather large beast. Any chance we could split it into > topics, e.g. convert execbuf code, convert shrinker code? Or does that get > messy, fast? > I've thought of this... The one solution I came up with is to have two bind/unbind functions (similar to what I did with pin, and indeed it was my original plan with pin), and do the set_caching one separately. I think it won't be too messy, just a lot of typing, as Keith likes to say. However, my opinion was, since it's early in the merge cycle, we don't yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's not a big deal. At a functional level too, I felt this made more sense. So I'll defer to your request on this and start splitting it up, unless my email has changed your mind ;-). 
> > --- > > drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- > > drivers/gpu/drm/i915/i915_dma.c | 4 - > > drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- > > drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- > > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > > drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- > > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > > drivers/gpu/drm/i915/i915_irq.c | 6 +- > > drivers/gpu/drm/i915/i915_trace.h | 20 +- > > drivers/gpu/drm/i915/intel_fb.c | 1 - > > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > > 16 files changed, 468 insertions(+), 239 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > > index 16b2aaf..867ed07 100644 > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > > seq_printf(m, " (pinned x %d)", obj->pin_count); > > if (obj->fence_reg != I915_FENCE_REG_NONE) > > seq_printf(m, " (fence: %d)", obj->fence_reg); > > - if (i915_gem_obj_ggtt_bound(obj)) > > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > > + if (i915_gem_obj_bound_any(obj)) { > > list_for_each will short-circuit already, so this is redundant. 
> > > + struct i915_vma *vma; > > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > > + if (!i915_is_ggtt(vma->vm)) > > + seq_puts(m, " (pp"); > > + else > > + seq_puts(m, " (g"); > > + seq_printf(m, " gtt offset: %08lx, size: %08lx)", > > ^ that space looks superfluous now > > > + i915_gem_obj_offset(obj, vma->vm), > > + i915_gem_obj_size(obj, vma->vm)); > > + } > > + } > > if (obj->stolen) > > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > > if (obj->pin_mappable || obj->fault_mappable) { > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > > return 0; > > } > > > > +/* FIXME: Support multiple VM? */ > > #define count_objects(list, member) do { \ > > list_for_each_entry(obj, list, member) { \ > > size += i915_gem_obj_ggtt_size(obj); \ > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) > > > > if (val & DROP_BOUND) { > > list_for_each_entry_safe(obj, next, &vm->inactive_list, > > - mm_list) > > - if (obj->pin_count == 0) { > > - ret = i915_gem_object_unbind(obj); > > - if (ret) > > - goto unlock; > > - } > > + mm_list) { > > + if (obj->pin_count) > > + continue; > > + > > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > > + if (ret) > > + goto unlock; > > + } > > } > > > > if (val & DROP_UNBOUND) { > > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > > global_list) > > if (obj->pages_pin_count == 0) { > > + /* FIXME: Do this for all vms? 
*/ > > ret = i915_gem_object_put_pages(obj); > > if (ret) > > goto unlock; > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > > index d13e21f..b190439 100644 > > --- a/drivers/gpu/drm/i915/i915_dma.c > > +++ b/drivers/gpu/drm/i915/i915_dma.c > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > > > i915_dump_device_info(dev_priv); > > > > - INIT_LIST_HEAD(&dev_priv->vm_list); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > > - > > if (i915_get_bridge_dev(dev)) { > > ret = -EIO; > > goto free_priv; > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 38cccc8..48baccc 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { > > > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > > > -/* This is a temporary define to help transition us to real VMAs. If you see > > - * this, you're either reviewing code, or bisecting it. 
*/ > > -static inline struct i915_vma * > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > > -{ > > - if (list_empty(&obj->vma_list)) > > - return NULL; > > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > > -} > > - > > -/* Whether or not this object is currently mapped by the translation tables */ > > -static inline bool > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > > -{ > > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > > - if (vma == NULL) > > - return false; > > - return drm_mm_node_allocated(&vma->node); > > -} > > - > > -/* Offset of the first PTE pointing to this object */ > > -static inline unsigned long > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > > -{ > > - BUG_ON(list_empty(&o->vma_list)); > > - return __i915_gem_obj_to_vma(o)->node.start; > > -} > > - > > -/* The size used in the translation tables may be larger than the actual size of > > - * the object on GEN2/GEN3 because of the way tiling is handled. See > > - * i915_gem_get_gtt_size() for more details. > > - */ > > -static inline unsigned long > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > > -{ > > - BUG_ON(list_empty(&o->vma_list)); > > - return __i915_gem_obj_to_vma(o)->node.size; > > -} > > - > > -static inline void > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > > - enum i915_cache_level color) > > -{ > > - __i915_gem_obj_to_vma(o)->node.color = color; > > -} > > - > > /** > > * Request queue structure. 
> > * > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > > void i915_gem_vma_destroy(struct i915_vma *vma); > > > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > uint32_t alignment, > > bool map_and_fenceable, > > bool nonblocking); > > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm); > > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > > void i915_gem_lastclose(struct drm_device *dev); > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > > struct intel_ring_buffer *to); > > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring); > > > > int i915_gem_dumb_create(struct drm_file *file_priv, > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > > int tiling_mode, bool fenced); > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > enum i915_cache_level cache_level); > > > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > > > void i915_gem_restore_fences(struct drm_device *dev); > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +unsigned long 
i915_gem_obj_size(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm, > > + enum i915_cache_level color); > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm); > > +/* Some GGTT VM helpers */ > > +#define obj_to_ggtt(obj) \ > > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > > +{ > > + struct i915_address_space *ggtt = > > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > > + return vm == ggtt; > > +} > > + > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline unsigned long > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline unsigned long > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline int __must_check > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > > + uint32_t alignment, > > + bool map_and_fenceable, > > + bool nonblocking) > > +{ > > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > > + map_and_fenceable, nonblocking); > > +} > > +#undef obj_to_ggtt > > + > > /* i915_gem_context.c */ > > void i915_gem_context_init(struct drm_device *dev); > > void i915_gem_context_fini(struct drm_device *dev); > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > > +/* FIXME: this is never okay with full PPGTT */ > > void i915_gem_gtt_bind_object(struct 
drm_i915_gem_object *obj, > > enum i915_cache_level cache_level); > > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > > > > /* i915_gem_evict.c */ > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > > +int __must_check i915_gem_evict_something(struct drm_device *dev, > > + struct i915_address_space *vm, > > + int min_size, > > unsigned alignment, > > unsigned cache_level, > > bool mappable, > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 058ad44..21015cd 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -38,10 +38,12 @@ > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > - unsigned alignment, > > - bool map_and_fenceable, > > - bool nonblocking); > > +static __must_check int > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > + unsigned alignment, > > + bool map_and_fenceable, > > + bool nonblocking); > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > struct drm_i915_gem_object *obj, > > struct drm_i915_gem_pwrite *args, > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > static inline bool > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > { > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > + return i915_gem_obj_bound_any(obj) && !obj->active; > > } > > > > int > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > * anyway again before the next pread happens. 
*/ > > if (obj->cache_level == I915_CACHE_NONE) > > needs_clflush = 1; > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + if (i915_gem_obj_bound_any(obj)) { > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > This is essentially a very convoluted version of "if there's gpu rendering > outstanding, please wait for it". Maybe we should switch this to > > if (obj->active) > wait_rendering(obj, true); > > Same for the shmem_pwrite case below. Would be a separate patch to prep > things though. Can I volunteer you for that? The ugly part is to review > whether any of the lru list updating that set_domain does in addition to > wait_rendering is required, but on a quick read that's not the case. > > > if (ret) > > return ret; > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > > char __user *user_data; > > int page_offset, page_length, ret; > > > > - ret = i915_gem_object_pin(obj, 0, true, true); > > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > > if (ret) > > goto out; > > > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > > * right away and we therefore have to clflush anyway. */ > > if (obj->cache_level == I915_CACHE_NONE) > > needs_clflush_after = 1; > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + if (i915_gem_obj_bound_any(obj)) { > > ... see above. 
> > ret = i915_gem_object_set_to_gtt_domain(obj, true); > > if (ret) > > return ret; > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > > } > > > > /* Now bind it into the GTT if needed */ > > - ret = i915_gem_object_pin(obj, 0, true, false); > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > if (ret) > > goto unlock; > > > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > > if (obj->pages == NULL) > > return 0; > > > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > > - > > if (obj->pages_pin_count) > > return -EBUSY; > > > > + BUG_ON(i915_gem_obj_bound_any(obj)); > > + > > /* ->put_pages might need to allocate memory for the bit17 swizzle > > * array, hence protect them from being reaped by removing them from gtt > > * lists early. */ > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > bool purgeable_only) > > { > > struct drm_i915_gem_object *obj, *next; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > long count = 0; > > > > list_for_each_entry_safe(obj, next, > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > } > > } > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > > - i915_gem_object_unbind(obj) == 0 && > > - i915_gem_object_put_pages(obj) == 0) { > > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > > + global_list) { > > + struct i915_vma *vma, *v; > > + > > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > > + continue; > > + > > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > > + if (i915_gem_object_unbind(obj, vma->vm)) > > + break; > > + > > + if (!i915_gem_object_put_pages(obj)) > > count += obj->base.size >> PAGE_SHIFT; > > - if (count >= target) > > - return count; > > - } > > + > > + if (count >= target) > > + return count; > > 
} > > > > return count; > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > > > void > > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring) > > { > > struct drm_device *dev = obj->base.dev; > > struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > u32 seqno = intel_ring_get_seqno(ring); > > > > BUG_ON(ring == NULL); > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > } > > > > static void > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > - struct drm_device *dev = obj->base.dev; > > - struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > - > > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > > BUG_ON(!obj->active); > > > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > > spin_unlock(&file_priv->mm.lock); > > } > > > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > > + if (acthd >= i915_gem_obj_offset(obj, vm) && > > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > > return true; > > > > return false; > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > > return false; > > } > > > > +static struct i915_address_space * > > +request_to_vm(struct drm_i915_gem_request *request) > > +{ > > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > > + 
struct i915_address_space *vm; > > + > > + vm = &dev_priv->gtt.base; > > + > > + return vm; > > +} > > + > > static bool i915_request_guilty(struct drm_i915_gem_request *request, > > const u32 acthd, bool *inside) > > { > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > > * pointing inside the ring, matches the batch_obj address range. > > * However this is extremely unlikely. > > */ > > - > > if (request->batch_obj) { > > - if (i915_head_inside_object(acthd, request->batch_obj)) { > > + if (i915_head_inside_object(acthd, request->batch_obj, > > + request_to_vm(request))) { > > *inside = true; > > return true; > > } > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > > { > > struct i915_ctx_hang_stats *hs = NULL; > > bool inside, guilty; > > + unsigned long offset = 0; > > > > /* Innocent until proven guilty */ > > guilty = false; > > > > + if (request->batch_obj) > > + offset = i915_gem_obj_offset(request->batch_obj, > > + request_to_vm(request)); > > + > > if (ring->hangcheck.action != wait && > > i915_request_guilty(request, acthd, &inside)) { > > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > > ring->name, > > inside ? "inside" : "flushing", > > - request->batch_obj ? > > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > > + offset, > > request->ctx ? 
request->ctx->id : 0, > > acthd); > > > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > > } > > > > while (!list_empty(&ring->active_list)) { > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > > > obj = list_first_entry(&ring->active_list, > > struct drm_i915_gem_object, > > ring_list); > > > > - i915_gem_object_move_to_inactive(obj); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + i915_gem_object_move_to_inactive(obj, vm); > > } > > } > > > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > > void i915_gem_reset(struct drm_device *dev) > > { > > struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > struct intel_ring_buffer *ring; > > int i; > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) > > /* Move everything out of the GPU domains to ensure we do any > > * necessary invalidation upon reuse. > > */ > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > > i915_gem_restore_fences(dev); > > } > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> > */ > > while (!list_empty(&ring->active_list)) { > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > > > obj = list_first_entry(&ring->active_list, > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > > break; > > > > - i915_gem_object_move_to_inactive(obj); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + i915_gem_object_move_to_inactive(obj, vm); > > } > > > > if (unlikely(ring->trace_irq_seqno && > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > > * Unbinds an object from the GTT aperture. > > */ > > int > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > > struct i915_vma *vma; > > int ret; > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound(obj, vm)) > > return 0; > > > > if (obj->pin_count) > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > if (ret) > > return ret; > > > > - trace_i915_gem_object_unbind(obj); > > + trace_i915_gem_object_unbind(obj, vm); > > > > if (obj->has_global_gtt_mapping) > > i915_gem_gtt_unbind_object(obj); > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > /* Avoid an unnecessary call to unbind on rebind. 
*/ > > obj->map_and_fenceable = true; > > > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > list_del(&vma->vma_link); > > drm_mm_remove_node(&vma->node); > > i915_gem_vma_destroy(vma); > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > > i915_gem_obj_ggtt_offset(obj), size); > > > > + > > pitch_val = obj->stride / 128; > > pitch_val = ffs(pitch_val) - 1; > > > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > > */ > > static int > > i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > unsigned alignment, > > bool map_and_fenceable, > > bool nonblocking) > > { > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > u32 size, fence_size, fence_alignment, unfenced_alignment; > > bool mappable, fenceable; > > - size_t gtt_max = map_and_fenceable ? > > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > > + size_t gtt_max = > > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > > struct i915_vma *vma; > > int ret; > > > > if (WARN_ON(!list_empty(&obj->vma_list))) > > return -EBUSY; > > > > + BUG_ON(!i915_is_ggtt(vm)); > > + > > fence_size = i915_gem_get_gtt_size(dev, > > obj->base.size, > > obj->tiling_mode); > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > > > vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > + /* For now we only ever use 1 vma per object */ > > + WARN_ON(!list_empty(&obj->vma_list)); > > + > > + vma = i915_gem_vma_create(obj, vm); > > if (vma == NULL) { > > i915_gem_object_unpin_pages(obj); > > return -ENOMEM; > > } > > > > search_free: > > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > > - &vma->node, > > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > > size, alignment, > > obj->cache_level, 0, gtt_max); > > if (ret) { > > - ret = i915_gem_evict_something(dev, size, alignment, > > + ret = i915_gem_evict_something(dev, vm, size, alignment, > > obj->cache_level, > > map_and_fenceable, > > nonblocking); > > @@ -3162,18 +3197,25 @@ search_free: > > > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > > list_add_tail(&obj->mm_list, &vm->inactive_list); > > - list_add(&vma->vma_link, &obj->vma_list); > > + > > + /* Keep GGTT vmas first to make debug easier */ > > + if (i915_is_ggtt(vm)) > > + list_add(&vma->vma_link, &obj->vma_list); > > + else > > + list_add_tail(&vma->vma_link, &obj->vma_list); > > > > fenceable = > > + i915_is_ggtt(vm) && > > i915_gem_obj_ggtt_size(obj) == fence_size && > > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > > - dev_priv->gtt.mappable_end; > > + mappable = > > + i915_is_ggtt(vm) && > > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > > > obj->map_and_fenceable = mappable && fenceable; > > > > - 
trace_i915_gem_object_bind(obj, map_and_fenceable); > > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > > i915_gem_verify_gtt(dev); > > return 0; > > } > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > int ret; > > > > /* Not valid to be called on unbound objects. */ > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound_any(obj)) > > return -EINVAL; > > If we're converting the shmem paths over to wait_rendering then there's > only the fault handler and the set_domain ioctl left. For the later it > would make sense to clflush even when an object is on the unbound list, to > allow userspace to optimize when the clflushing happens. But that would > only make sense in conjunction with Chris' create2 ioctl and a flag to > preallocate the storage (and so putting the object onto the unbound list). > So nothing to do here. > > > > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > } > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > enum i915_cache_level cache_level) > > { > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > int ret; > > > > if (obj->cache_level == cache_level) > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > } > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > if (ret) > > return ret; > > } > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + if (!i915_gem_obj_bound(obj, vm)) > > + continue; > > Hm, shouldn't 
we have a per-object list of vmas? Or will that follow later > on? > > Self-correction: It exists already ... why can't we use this here? > > > + > > ret = i915_gem_object_finish_gpu(obj); > > if (ret) > > return ret; > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > obj, cache_level); > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > + i915_gem_obj_set_color(obj, vm, cache_level); > > } > > > > if (cache_level == I915_CACHE_NONE) { > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > struct drm_i915_gem_caching *args = data; > > + struct drm_i915_private *dev_priv; > > struct drm_i915_gem_object *obj; > > enum i915_cache_level level; > > int ret; > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > ret = -ENOENT; > > goto unlock; > > } > > + dev_priv = obj->base.dev->dev_private; > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > + /* FIXME: Add interface for specific VM? */ > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > drm_gem_object_unreference(&obj->base); > > unlock: > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > u32 alignment, > > struct intel_ring_buffer *pipelined) > > { > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > u32 old_read_domains, old_write_domain; > > int ret; > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > * of uncaching, which would allow us to flush all the LLC-cached data > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > */ > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_NONE); > > if (ret) > > return ret; > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > * (e.g. libkms for the bootup splash), we have to ensure that we > > * always use map_and_fenceable for all scanout buffers. > > */ > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > if (ret) > > return ret; > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > int > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > uint32_t alignment, > > bool map_and_fenceable, > > bool nonblocking) > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > return -EBUSY; > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > WARN_ON, since presumably we can keep on going if we get this wrong > (albeit with slightly corrupted state, so render corruptions might > follow). 
> > > + > > + if (i915_gem_obj_bound(obj, vm)) { > > + if ((alignment && > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > (map_and_fenceable && !obj->map_and_fenceable)) { > > WARN(obj->pin_count, > > "bo is already pinned with incorrect alignment:" > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > " obj->map_and_fenceable=%d\n", > > - i915_gem_obj_ggtt_offset(obj), alignment, > > + i915_gem_obj_offset(obj, vm), alignment, > > map_and_fenceable, > > obj->map_and_fenceable); > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > if (ret) > > return ret; > > } > > } > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > + if (!i915_gem_obj_bound(obj, vm)) { > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > map_and_fenceable, > > nonblocking); > > if (ret) > > @@ -3684,7 +3739,7 @@ void > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > { > > BUG_ON(obj->pin_count == 0); > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > if (--obj->pin_count == 0) > > obj->pin_mappable = false; > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > } > > > > if (obj->user_pin_count == 0) { > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > if (ret) > > goto out; > > } > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > + struct i915_vma *vma, *next; > > > > trace_i915_gem_object_destroy(obj); > > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > i915_gem_detach_phys_object(dev, obj); > 
> > > obj->pin_count = 0; > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > - bool was_interruptible; > > + /* NB: 0 or 1 elements */ > > + WARN_ON(!list_empty(&obj->vma_list) && > > + !list_is_singular(&obj->vma_list)); > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > + bool was_interruptible; > > > > - was_interruptible = dev_priv->mm.interruptible; > > - dev_priv->mm.interruptible = false; > > + was_interruptible = dev_priv->mm.interruptible; > > + dev_priv->mm.interruptible = false; > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > - dev_priv->mm.interruptible = was_interruptible; > > + dev_priv->mm.interruptible = was_interruptible; > > + } > > } > > > > /* Stolen objects don't hold a ref, but do hold pin count. Fix that up > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > INIT_LIST_HEAD(&ring->request_list); > > } > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > + struct i915_address_space *vm) > > +{ > > + vm->dev = dev_priv->dev; > > + INIT_LIST_HEAD(&vm->active_list); > > + INIT_LIST_HEAD(&vm->inactive_list); > > + INIT_LIST_HEAD(&vm->global_link); > > + list_add(&vm->global_link, &dev_priv->vm_list); > > +} > > + > > void > > i915_gem_load(struct drm_device *dev) > > { > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > SLAB_HWCACHE_ALIGN, > > NULL); > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > + > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct 
shrink_control *sc) > > struct drm_i915_private, > > mm.inactive_shrinker); > > struct drm_device *dev = dev_priv->dev; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > - int nr_to_scan = sc->nr_to_scan; > > + int nr_to_scan; > > bool unlock = true; > > int cnt; > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > unlock = false; > > } > > > > + nr_to_scan = sc->nr_to_scan; > > if (nr_to_scan) { > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > if (nr_to_scan > 0) > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > if (obj->pages_pin_count == 0) > > cnt += obj->base.size >> PAGE_SHIFT; > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > - cnt += obj->base.size >> PAGE_SHIFT; > > + > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > + cnt += obj->base.size >> PAGE_SHIFT; > > Isn't this now double-counting objects? In the shrinker we only care about > how much physical RAM an object occupies, not how much virtual space it > occupies. So just walking the bound list of objects here should be good > enough ... 
> > > > > if (unlock) > > mutex_unlock(&dev->struct_mutex); > > return cnt; > > } > > + > > +/* All the new VM stuff */ > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > + struct i915_vma *vma; > > + > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > + vm = &dev_priv->gtt.base; > > + > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > Imo the vma list walking here and in the other helpers below indicates > that we should deal more often in vmas instead of (object, vm) pairs. Or > is this again something that'll get fixed later on? > > I just want to avoid diff churn, and it also makes reviewing easier if the > foreshadowing is correct ;-) So generally I'd vote for more liberal > sprinkling of obj_to_vma in callers. > > > + if (vma->vm == vm) > > + return vma->node.start; > > + > > + } > > + return -1; > > +} > > + > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > +{ > > + return !list_empty(&o->vma_list); > > +} > > + > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct i915_vma *vma; > > + > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) > > + return true; > > + } > > + return false; > > +} > > + > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > + struct i915_vma *vma; > > + > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > + vm = &dev_priv->gtt.base; > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) > > + return vma->node.size; > > + } > > + > > + return 0; > > +} > > + > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > + struct 
i915_address_space *vm, > > + enum i915_cache_level color) > > +{ > > + struct i915_vma *vma; > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) { > > + vma->node.color = color; > > + return; > > + } > > + } > > + > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > +} > > + > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > +{ > > + struct i915_vma *vma; > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > + if (vma->vm == vm) > > + return vma; > > + > > + return NULL; > > +} > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > index 2074544..c92fd81 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > > > if (INTEL_INFO(dev)->gen >= 7) { > > ret = i915_gem_object_set_cache_level(ctx->obj, > > + &dev_priv->gtt.base, > > I915_CACHE_LLC_MLC); > > /* Failure shouldn't ever happen this early */ > > if (WARN_ON(ret)) > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > * default context. 
> > */ > > dev_priv->ring[RCS].default_context = ctx; > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > if (ret) { > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > goto err_destroy; > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > static int do_switch(struct i915_hw_context *to) > > { > > struct intel_ring_buffer *ring = to->ring; > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > struct i915_hw_context *from = ring->last_context; > > u32 hw_flags = 0; > > int ret; > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > if (from == to) > > return 0; > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > if (ret) > > return ret; > > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > > */ > > if (from != NULL) { > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > - i915_gem_object_move_to_active(from->obj, ring); > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > + ring); > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > * whole damn pipeline, we don't need to explicitly mark the > > * object dirty. 
The only exception is that the context must be > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > index df61f33..32efdc0 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > @@ -32,24 +32,21 @@ > > #include "i915_trace.h" > > > > static bool > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > { > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > - > > - if (obj->pin_count) > > + if (vma->obj->pin_count) > > return false; > > > > - list_add(&obj->exec_list, unwind); > > + list_add(&vma->obj->exec_list, unwind); > > return drm_mm_scan_add_block(&vma->node); > > } > > > > int > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > - unsigned alignment, unsigned cache_level, > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > + int min_size, unsigned alignment, unsigned cache_level, > > bool mappable, bool nonblocking) > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > struct list_head eviction_list, unwind_list; > > struct i915_vma *vma; > > struct drm_i915_gem_object *obj; > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > */ > > > > INIT_LIST_HEAD(&unwind_list); > > - if (mappable) > > + if (mappable) { > > + BUG_ON(!i915_is_ggtt(vm)); > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > alignment, cache_level, 0, > > dev_priv->gtt.mappable_end); > > - else > > + } else > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > /* First see if there is a large enough contiguous idle region... 
*/ > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > - if (mark_free(obj, &unwind_list)) > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > + if (mark_free(vma, &unwind_list)) > > goto found; > > } > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > /* Now merge in the soon-to-be-expired objects... */ > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > - if (mark_free(obj, &unwind_list)) > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > + if (mark_free(vma, &unwind_list)) > > goto found; > > } > > > > @@ -109,7 +109,7 @@ none: > > obj = list_first_entry(&unwind_list, > > struct drm_i915_gem_object, > > exec_list); > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > ret = drm_mm_scan_remove_block(&vma->node); > > BUG_ON(ret); > > > > @@ -130,7 +130,7 @@ found: > > obj = list_first_entry(&unwind_list, > > struct drm_i915_gem_object, > > exec_list); > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > if (drm_mm_scan_remove_block(&vma->node)) { > > list_move(&obj->exec_list, &eviction_list); > > drm_gem_object_reference(&obj->base); > > @@ -145,7 +145,7 @@ found: > > struct drm_i915_gem_object, > > exec_list); > > if (ret == 0) > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > > > list_del_init(&obj->exec_list); > > drm_gem_object_unreference(&obj->base); > > @@ -158,13 +158,18 @@ int > > i915_gem_evict_everything(struct drm_device *dev) > > I suspect evict_everything eventually wants an address_space *vm argument > for those cases where we only want to evict everything in a given vm. Atm > we have two use-cases of this: > - Called from the shrinker as a last-ditch effort. For that it should move > _every_ object onto the unbound list. > - Called from execbuf for badly-fragmented address spaces to clean up the > mess. For that case we only care about one address space. 
> > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj, *next; > > - bool lists_empty; > > + bool lists_empty = true; > > int ret; > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > - list_empty(&vm->active_list)); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + lists_empty = (list_empty(&vm->inactive_list) && > > + list_empty(&vm->active_list)); > > + if (!lists_empty) > > + lists_empty = false; > > + } > > + > > if (lists_empty) > > return -ENOSPC; > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > i915_gem_retire_requests(dev); > > > > /* Having flushed everything, unbind() should never raise an error */ > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > - if (obj->pin_count == 0) > > - WARN_ON(i915_gem_object_unbind(obj)); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > + if (obj->pin_count == 0) > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > + } > > > > return 0; > > } > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > index 5aeb447..e90182d 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > } > > > > static void > > -eb_destroy(struct eb_objects *eb) > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > { > > while (!list_empty(&eb->objects)) { > > struct drm_i915_gem_object *obj; > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > struct eb_objects *eb, > > - struct 
drm_i915_gem_relocation_entry *reloc) > > + struct drm_i915_gem_relocation_entry *reloc, > > + struct i915_address_space *vm) > > { > > struct drm_device *dev = obj->base.dev; > > struct drm_gem_object *target_obj; > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > static int > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > - struct eb_objects *eb) > > + struct eb_objects *eb, > > + struct i915_address_space *vm) > > { > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > do { > > u64 offset = r->presumed_offset; > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > > + vm); > > if (ret) > > return ret; > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > static int > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > struct eb_objects *eb, > > - struct drm_i915_gem_relocation_entry *relocs) > > + struct drm_i915_gem_relocation_entry *relocs, > > + struct i915_address_space *vm) > > { > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > int i, ret; > > > > for (i = 0; i < entry->relocation_count; i++) { > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > + vm); > > if (ret) > > return ret; > > } > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > } > > > > static int > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > + struct i915_address_space *vm) > > { > > struct drm_i915_gem_object *obj; > > int ret = 0; > > @@ -373,7 +379,7 @@ 
i915_gem_execbuffer_relocate(struct eb_objects *eb) > > */ > > pagefault_disable(); > > list_for_each_entry(obj, &eb->objects, exec_list) { > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > if (ret) > > break; > > } > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > struct intel_ring_buffer *ring, > > + struct i915_address_space *vm, > > bool *need_reloc) > > { > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > obj->tiling_mode != I915_TILING_NONE; > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > + false); > > if (ret) > > return ret; > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > obj->has_aliasing_ppgtt_mapping = 1; > > } > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > + entry->offset = i915_gem_obj_offset(obj, vm); > > *need_reloc = true; > > } > > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > { > > struct drm_i915_gem_exec_object2 *entry; > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound_any(obj)) > > return; > > > > entry = obj->exec_entry; > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > struct list_head *objects, > > + struct i915_address_space *vm, > > bool *need_relocs) > > { > > struct drm_i915_gem_object 
*obj; > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > list_for_each_entry(obj, objects, exec_list) { > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > bool need_fence, need_mappable; > > + u32 obj_offset; > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound(obj, vm)) > > continue; > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups > here ... Maybe we should cache them in some pointer somewhere (either in > the eb object or by adding a new pointer to the object struct, e.g. > obj->eb_vma, similar to obj->eb_list). > > > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > need_fence = > > has_fenced_gpu_access && > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > obj->tiling_mode != I915_TILING_NONE; > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > + BUG_ON((need_mappable || need_fence) && > > + !i915_is_ggtt(vm)); > > + > > if ((entry->alignment && > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > + obj_offset & (entry->alignment - 1)) || > > (need_mappable && !obj->map_and_fenceable)) > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > else > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > if (ret) > > goto err; > > } > > > > /* Bind fresh objects */ > > list_for_each_entry(obj, objects, exec_list) { > > - if (i915_gem_obj_ggtt_bound(obj)) > > + if (i915_gem_obj_bound(obj, vm)) > > continue; > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > if (ret) > > goto err; > > } > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > struct drm_file *file, > > struct intel_ring_buffer *ring, > > struct eb_objects *eb, > > - struct drm_i915_gem_exec_object2 
*exec) > > + struct drm_i915_gem_exec_object2 *exec, > > + struct i915_address_space *vm) > > { > > struct drm_i915_gem_relocation_entry *reloc; > > struct drm_i915_gem_object *obj; > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > goto err; > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > if (ret) > > goto err; > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > int offset = obj->exec_entry - exec; > > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > > - reloc + reloc_offset[offset]); > > + reloc + reloc_offset[offset], > > + vm); > > if (ret) > > goto err; > > } > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > static void > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring) > > { > > struct drm_i915_gem_object *obj; > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > obj->base.read_domains = obj->base.pending_read_domains; > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > - i915_gem_object_move_to_active(obj, ring); > > + i915_gem_object_move_to_active(obj, vm, ring); > > if (obj->base.write_domain) { > > obj->dirty = 1; > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > @@ -836,7 +853,8 @@ static int > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > struct drm_file *file, > > struct drm_i915_gem_execbuffer2 *args, > > - struct drm_i915_gem_exec_object2 *exec) > > + struct drm_i915_gem_exec_object2 *exec, > > + struct i915_address_space *vm) > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > struct eb_objects *eb; > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > /* Move the 
objects en-masse into the GTT, evicting if necessary. */ > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > if (ret) > > goto err; > > > > /* The objects are in their final locations, apply the relocations. */ > > if (need_relocs) > > - ret = i915_gem_execbuffer_relocate(eb); > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > if (ret) { > > if (ret == -EFAULT) { > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > - eb, exec); > > + eb, exec, vm); > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > } > > if (ret) > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > goto err; > > } > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > + args->batch_start_offset; > > exec_len = args->batch_len; > > if (cliprects) { > > for (i = 0; i < args->num_cliprects; i++) { > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > err: > > - eb_destroy(eb); > > + eb_destroy(eb, vm); > > > > mutex_unlock(&dev->struct_mutex); > > > > @@ -1105,6 +1124,7 @@ int > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_execbuffer *args = data; > > struct drm_i915_gem_execbuffer2 exec2; > > struct drm_i915_gem_exec_object *exec_list = NULL; > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > exec2.flags = 
I915_EXEC_RENDER; > > i915_execbuffer2_set_context_id(exec2, 0); > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > + &dev_priv->gtt.base); > > if (!ret) { > > /* Copy the new buffer offsets back to the user's exec list. */ > > for (i = 0; i < args->buffer_count; i++) > > @@ -1186,6 +1207,7 @@ int > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_execbuffer2 *args = data; > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > int ret; > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > return -EFAULT; > > } > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > + &dev_priv->gtt.base); > > if (!ret) { > > /* Copy the new buffer offsets back to the user's exec list. 
*/ > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > index 298fc42..70ce2f6 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > ppgtt->base.total); > > } > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > + > > return ret; > > } > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > struct drm_i915_gem_object *obj, > > enum i915_cache_level cache_level) > > { > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > - cache_level); > > + struct i915_address_space *vm = &ppgtt->base; > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > + > > + vm->insert_entries(vm, obj->pages, > > + obj_offset >> PAGE_SHIFT, > > + cache_level); > > } > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > struct drm_i915_gem_object *obj) > > { > > - ppgtt->base.clear_range(&ppgtt->base, > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > - obj->base.size >> PAGE_SHIFT); > > + struct i915_address_space *vm = &ppgtt->base; > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > + > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > + obj->base.size >> PAGE_SHIFT); > > } > > > > extern int intel_iommu_gfx_mapped; > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > dev_priv->gtt.base.start / PAGE_SIZE, > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > + if (dev_priv->mm.aliasing_ppgtt) > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > + > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > i915_gem_clflush_object(obj); > > i915_gem_gtt_bind_object(obj, obj->cache_level); > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, 
> > * aperture. One page should be enough to keep any prefetching inside > > * of the aperture. > > */ > > - drm_i915_private_t *dev_priv = dev->dev_private; > > + struct drm_i915_private *dev_priv = dev->dev_private; > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > struct drm_mm_node *entry; > > struct drm_i915_gem_object *obj; > > unsigned long hole_start, hole_end; > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > BUG_ON(mappable_end > end); > > > > /* Subtract the guard page ... */ > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > if (!HAS_LLC(dev)) > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > /* Mark any preallocated objects as occupied */ > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > int ret; > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > if (ret) > > DRM_DEBUG_KMS("Reservation failed\n"); > > obj->has_global_gtt_mapping = 1; > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > dev_priv->gtt.base.total = end - start; > > > > /* Clear any non-preallocated blocks */ > > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > > - hole_start, hole_end) { > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > hole_start, hole_end); > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > - hole_start / PAGE_SIZE, > > - 
count); > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > } > > > > /* And finally clear the reserved guard page */ > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > - end / PAGE_SIZE - 1, 1); > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > } > > > > static bool > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > index 245eb1d..bfe61fa 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > return obj; > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > + vma = i915_gem_vma_create(obj, vm); > > if (!vma) { > > drm_gem_object_unreference(&obj->base); > > return NULL; > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > */ > > vma->node.start = gtt_offset; > > vma->node.size = size; > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > + if (drm_mm_initialized(&vm->mm)) { > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > These two hunks here for stolen look fishy - we only ever use the stolen > preallocated stuff for objects with mappings in the global gtt. So keeping > that explicit is imo the better approach. And tbh I'm confused where the > local variable vm is from ... 
> > > if (ret) { > > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > > goto unref_out; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > > index 92a8d27..808ca2a 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > > > obj->map_and_fenceable = > > !i915_gem_obj_ggtt_bound(obj) || > > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > > + (i915_gem_obj_ggtt_offset(obj) + > > + obj->base.size <= dev_priv->gtt.mappable_end && > > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > > > /* Rebind if we need a change of alignment */ > > if (!obj->map_and_fenceable) { > > - u32 unfenced_alignment = > > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > > + u32 unfenced_align = > > i915_gem_get_gtt_alignment(dev, obj->base.size, > > args->tiling_mode, > > false); > > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > > - ret = i915_gem_object_unbind(obj); > > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > > + ret = i915_gem_object_unbind(obj, ggtt); > > } > > > > if (ret == 0) { > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > index 79fbb17..28fa0ff 100644 > > --- a/drivers/gpu/drm/i915/i915_irq.c > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, > > if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { > > u32 acthd = I915_READ(ACTHD); > > > > + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) > > + return NULL; > > + > > if (WARN_ON(ring->id != RCS)) > > return NULL; > > > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, > > return; > > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > - if ((error->ccid & PAGE_MASK) == 
i915_gem_obj_ggtt_offset(obj)) { > > + if ((error->ccid & PAGE_MASK) == > > + i915_gem_obj_ggtt_offset(obj)) { > > ering->ctx = i915_error_object_create_sized(dev_priv, > > obj, 1); > > break; > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > > index 7d283b5..3f019d3 100644 > > --- a/drivers/gpu/drm/i915/i915_trace.h > > +++ b/drivers/gpu/drm/i915/i915_trace.h > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > > ); > > > > TRACE_EVENT(i915_gem_object_bind, > > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > > - TP_ARGS(obj, mappable), > > + TP_PROTO(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, bool mappable), > > + TP_ARGS(obj, vm, mappable), > > > > TP_STRUCT__entry( > > __field(struct drm_i915_gem_object *, obj) > > + __field(struct i915_address_space *, vm) > > __field(u32, offset) > > __field(u32, size) > > __field(bool, mappable) > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > > > TP_fast_assign( > > __entry->obj = obj; > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > + __entry->size = i915_gem_obj_size(obj, vm); > > __entry->mappable = mappable; > > ), > > > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > > ); > > > > TRACE_EVENT(i915_gem_object_unbind, > > - TP_PROTO(struct drm_i915_gem_object *obj), > > - TP_ARGS(obj), > > + TP_PROTO(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm), > > + TP_ARGS(obj, vm), > > > > TP_STRUCT__entry( > > __field(struct drm_i915_gem_object *, obj) > > + __field(struct i915_address_space *, vm) > > __field(u32, offset) > > __field(u32, size) > > ), > > > > TP_fast_assign( > > __entry->obj = obj; > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > + __entry->size = 
i915_gem_obj_size(obj, vm); > > ), > > > > TP_printk("obj=%p, offset=%08x size=%x", > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > > index f3c97e0..b69cc63 100644 > > --- a/drivers/gpu/drm/i915/intel_fb.c > > +++ b/drivers/gpu/drm/i915/intel_fb.c > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > > fb->width, fb->height, > > i915_gem_obj_ggtt_offset(obj), obj); > > > > - > > mutex_unlock(&dev->struct_mutex); > > vga_switcheroo_client_fb_set(dev->pdev, info); > > return 0; > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > > index 81c3ca1..517e278 100644 > > --- a/drivers/gpu/drm/i915/intel_overlay.c > > +++ b/drivers/gpu/drm/i915/intel_overlay.c > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > > } > > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > > } else { > > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > > if (ret) { > > DRM_ERROR("failed to pin overlay register bo\n"); > > goto out_free_bo; > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > index 125a741..449e57c 100644 > > --- a/drivers/gpu/drm/i915/intel_pm.c > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) > > return NULL; > > } > > > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > > if (ret) { > > DRM_ERROR("failed to pin power context: %d\n", ret); > > goto err_unref; > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index bc4c11b..ebed61d 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -481,6 +481,7 @@ out: > > static int > > init_pipe_control(struct intel_ring_buffer *ring) > > { > > + struct 
drm_i915_private *dev_priv = ring->dev->dev_private; > > struct pipe_control *pc; > > struct drm_i915_gem_object *obj; > > int ret; > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > > goto err; > > } > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_LLC); > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > if (ret) > > goto err_unref; > > > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > > static int init_status_page(struct intel_ring_buffer *ring) > > { > > struct drm_device *dev = ring->dev; > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_object *obj; > > int ret; > > > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > > goto err; > > } > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_LLC); > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > if (ret != 0) { > > goto err_unref; > > } > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > > > ring->obj = obj; > > > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > > if (ret) > > goto err_unref; > > > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > > return -ENOMEM; > > } > > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > if (ret != 0) { > > drm_gem_object_unreference(&obj->base); > > DRM_ERROR("Failed to ping batch bo\n"); > > -- > > 1.8.3.2 > > > > _______________________________________________ > > Intel-gfx mailing list > > 
Intel-gfx@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote: > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > > This patch was formerly known as: > > > "drm/i915: Create VMAs (part 3) - plumbing" > > > > > > This patch adds a VM argument, bind/unbind, and the object > > > offset/size/color getters/setters. It preserves the old ggtt helper > > > functions because things still need, and will continue to need them. > > > > > > Some code will still need to be ported over after this. > > > > > > v2: Fix purge to pick an object and unbind all vmas > > > This was doable because of the global bound list change. > > > > > > v3: With the commit to actually pin/unpin pages in place, there is no > > > longer a need to check if unbind succeeded before calling put_pages(). > > > Make put_pages only BUG() after checking pin count. > > > > > > v4: Rebased on top of the new hangcheck work by Mika > > > plumbed eb_destroy also > > > Many checkpatch related fixes > > > > > > v5: Very large rebase > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > This one is a rather large beast. Any chance we could split it into > > topics, e.g. convert execbuf code, convert shrinker code? Or does that get > > messy, fast? > > > > I've thought of this... > > The one solution I came up with is to have two bind/unbind functions > (similar to what I did with pin, and indeed it was my original plan with > pin), and do the set_caching one separately. > > I think it won't be too messy, just a lot of typing, as Keith likes to > say. > > However, my opinion was, since it's early in the merge cycle, we don't > yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's > not a big deal. At a functional level too, I felt this made more sense. > > So I'll defer to your request on this and start splitting it up, unless > my email has changed your mind ;-). 
Well, my concern is mostly in reviewing since we need to think about each case and whether it makes sense to talk in terms of vma or objects in that function. And what exactly to test. If you've played around and concluded it'll be a mess then I don't think it'll help in reviewing. So pointless. Still, there's a bunch of questions on this patch that we need to discuss ;-) Cheers, Daniel > > > > --- > > > drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- > > > drivers/gpu/drm/i915/i915_dma.c | 4 - > > > drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- > > > drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- > > > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > > > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > > > drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- > > > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > > > drivers/gpu/drm/i915/i915_irq.c | 6 +- > > > drivers/gpu/drm/i915/i915_trace.h | 20 +- > > > drivers/gpu/drm/i915/intel_fb.c | 1 - > > > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > > > 16 files changed, 468 insertions(+), 239 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > > > index 16b2aaf..867ed07 100644 > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > > > seq_printf(m, " (pinned x %d)", obj->pin_count); > > > if (obj->fence_reg != I915_FENCE_REG_NONE) > > > seq_printf(m, " (fence: %d)", obj->fence_reg); > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > > > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > > > + if
(i915_gem_obj_bound_any(obj)) { > > > > list_for_each will short-circuit already, so this is redundant. > > > > > + struct i915_vma *vma; > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > > > + if (!i915_is_ggtt(vma->vm)) > > > + seq_puts(m, " (pp"); > > > + else > > > + seq_puts(m, " (g"); > > > + seq_printf(m, " gtt offset: %08lx, size: %08lx)", > > > > ^ that space looks superfluous now > > > > > + i915_gem_obj_offset(obj, vma->vm), > > > + i915_gem_obj_size(obj, vma->vm)); > > > + } > > > + } > > > if (obj->stolen) > > > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > > > if (obj->pin_mappable || obj->fault_mappable) { > > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > > > return 0; > > > } > > > > > > +/* FIXME: Support multiple VM? */ > > > #define count_objects(list, member) do { \ > > > list_for_each_entry(obj, list, member) { \ > > > size += i915_gem_obj_ggtt_size(obj); \ > > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) > > > > > > if (val & DROP_BOUND) { > > > list_for_each_entry_safe(obj, next, &vm->inactive_list, > > > - mm_list) > > > - if (obj->pin_count == 0) { > > > - ret = i915_gem_object_unbind(obj); > > > - if (ret) > > > - goto unlock; > > > - } > > > + mm_list) { > > > + if (obj->pin_count) > > > + continue; > > > + > > > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > > > + if (ret) > > > + goto unlock; > > > + } > > > } > > > > > > if (val & DROP_UNBOUND) { > > > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > > > global_list) > > > if (obj->pages_pin_count == 0) { > > > + /* FIXME: Do this for all vms? 
*/ > > > ret = i915_gem_object_put_pages(obj); > > > if (ret) > > > goto unlock; > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > > > index d13e21f..b190439 100644 > > > --- a/drivers/gpu/drm/i915/i915_dma.c > > > +++ b/drivers/gpu/drm/i915/i915_dma.c > > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > > > > > i915_dump_device_info(dev_priv); > > > > > > - INIT_LIST_HEAD(&dev_priv->vm_list); > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > > > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > > > - > > > if (i915_get_bridge_dev(dev)) { > > > ret = -EIO; > > > goto free_priv; > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index 38cccc8..48baccc 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { > > > > > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > > > > > -/* This is a temporary define to help transition us to real VMAs. If you see > > > - * this, you're either reviewing code, or bisecting it. 
*/ > > > -static inline struct i915_vma * > > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > > > -{ > > > - if (list_empty(&obj->vma_list)) > > > - return NULL; > > > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > > > -} > > > - > > > -/* Whether or not this object is currently mapped by the translation tables */ > > > -static inline bool > > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > > > -{ > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > > > - if (vma == NULL) > > > - return false; > > > - return drm_mm_node_allocated(&vma->node); > > > -} > > > - > > > -/* Offset of the first PTE pointing to this object */ > > > -static inline unsigned long > > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > > > -{ > > > - BUG_ON(list_empty(&o->vma_list)); > > > - return __i915_gem_obj_to_vma(o)->node.start; > > > -} > > > - > > > -/* The size used in the translation tables may be larger than the actual size of > > > - * the object on GEN2/GEN3 because of the way tiling is handled. See > > > - * i915_gem_get_gtt_size() for more details. > > > - */ > > > -static inline unsigned long > > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > > > -{ > > > - BUG_ON(list_empty(&o->vma_list)); > > > - return __i915_gem_obj_to_vma(o)->node.size; > > > -} > > > - > > > -static inline void > > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > > > - enum i915_cache_level color) > > > -{ > > > - __i915_gem_obj_to_vma(o)->node.color = color; > > > -} > > > - > > > /** > > > * Request queue structure. 
> > > * > > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > > > void i915_gem_vma_destroy(struct i915_vma *vma); > > > > > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > uint32_t alignment, > > > bool map_and_fenceable, > > > bool nonblocking); > > > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm); > > > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > > > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > > > void i915_gem_lastclose(struct drm_device *dev); > > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > > > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > > > struct intel_ring_buffer *to); > > > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > struct intel_ring_buffer *ring); > > > > > > int i915_gem_dumb_create(struct drm_file *file_priv, > > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > > > int tiling_mode, bool fenced); > > > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > enum i915_cache_level cache_level); > > > > > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > > > > > void i915_gem_restore_fences(struct drm_device *dev); > > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm); > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > > > +bool i915_gem_obj_bound(struct 
drm_i915_gem_object *o, > > > + struct i915_address_space *vm); > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm); > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm, > > > + enum i915_cache_level color); > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm); > > > +/* Some GGTT VM helpers */ > > > +#define obj_to_ggtt(obj) \ > > > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > > > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > > > +{ > > > + struct i915_address_space *ggtt = > > > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > > > + return vm == ggtt; > > > +} > > > + > > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > > > +{ > > > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > > > +} > > > + > > > +static inline unsigned long > > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > > > +{ > > > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > > > +} > > > + > > > +static inline unsigned long > > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > > > +{ > > > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > > > +} > > > + > > > +static inline int __must_check > > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > > > + uint32_t alignment, > > > + bool map_and_fenceable, > > > + bool nonblocking) > > > +{ > > > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > > > + map_and_fenceable, nonblocking); > > > +} > > > +#undef obj_to_ggtt > > > + > > > /* i915_gem_context.c */ > > > void i915_gem_context_init(struct drm_device *dev); > > > void i915_gem_context_fini(struct drm_device *dev); > > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > > > void i915_gem_restore_gtt_mappings(struct drm_device 
*dev); > > > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > > > +/* FIXME: this is never okay with full PPGTT */ > > > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > > > enum i915_cache_level cache_level); > > > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > > > > > > > /* i915_gem_evict.c */ > > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > > > +int __must_check i915_gem_evict_something(struct drm_device *dev, > > > + struct i915_address_space *vm, > > > + int min_size, > > > unsigned alignment, > > > unsigned cache_level, > > > bool mappable, > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > > index 058ad44..21015cd 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem.c > > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > > @@ -38,10 +38,12 @@ > > > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > - unsigned alignment, > > > - bool map_and_fenceable, > > > - bool nonblocking); > > > +static __must_check int > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > + unsigned alignment, > > > + bool map_and_fenceable, > > > + bool nonblocking); > > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > > struct drm_i915_gem_object *obj, > > > struct drm_i915_gem_pwrite *args, > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > > static inline bool > > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > > { > > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > > + return 
i915_gem_obj_bound_any(obj) && !obj->active; > > > } > > > > > > int > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > > * anyway again before the next pread happens. */ > > > if (obj->cache_level == I915_CACHE_NONE) > > > needs_clflush = 1; > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > + if (i915_gem_obj_bound_any(obj)) { > > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > > > This is essentially a very convoluted version of "if there's gpu rendering > > outstanding, please wait for it". Maybe we should switch this to > > > > if (obj->active) > > wait_rendering(obj, true); > > > > Same for the shmem_pwrite case below. Would be a separate patch to prep > > things though. Can I volunteer you for that? The ugly part is to review > > whether any of the lru list updating that set_domain does in addition to > > wait_rendering is required, but on a quick read that's not the case. > > > > > if (ret) > > > return ret; > > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > > > char __user *user_data; > > > int page_offset, page_length, ret; > > > > > > - ret = i915_gem_object_pin(obj, 0, true, true); > > > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > > > if (ret) > > > goto out; > > > > > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > > > * right away and we therefore have to clflush anyway. */ > > > if (obj->cache_level == I915_CACHE_NONE) > > > needs_clflush_after = 1; > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > + if (i915_gem_obj_bound_any(obj)) { > > > > ... see above. 
> > > ret = i915_gem_object_set_to_gtt_domain(obj, true); > > > if (ret) > > > return ret; > > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > > > } > > > > > > /* Now bind it into the GTT if needed */ > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > > if (ret) > > > goto unlock; > > > > > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > > > if (obj->pages == NULL) > > > return 0; > > > > > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > > > - > > > if (obj->pages_pin_count) > > > return -EBUSY; > > > > > > + BUG_ON(i915_gem_obj_bound_any(obj)); > > > + > > > /* ->put_pages might need to allocate memory for the bit17 swizzle > > > * array, hence protect them from being reaped by removing them from gtt > > > * lists early. */ > > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > > bool purgeable_only) > > > { > > > struct drm_i915_gem_object *obj, *next; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > long count = 0; > > > > > > list_for_each_entry_safe(obj, next, > > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > > } > > > } > > > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > > > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > > > - i915_gem_object_unbind(obj) == 0 && > > > - i915_gem_object_put_pages(obj) == 0) { > > > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > > > + global_list) { > > > + struct i915_vma *vma, *v; > > > + > > > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > > > + continue; > > > + > > > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > > > + if (i915_gem_object_unbind(obj, vma->vm)) > > > + break; > > > + > > > + if (!i915_gem_object_put_pages(obj)) > > > count += obj->base.size >> PAGE_SHIFT; > 
> > - if (count >= target) > > > - return count; > > > - } > > > + > > > + if (count >= target) > > > + return count; > > > } > > > > > > return count; > > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > > > > > void > > > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > struct intel_ring_buffer *ring) > > > { > > > struct drm_device *dev = obj->base.dev; > > > struct drm_i915_private *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > u32 seqno = intel_ring_get_seqno(ring); > > > > > > BUG_ON(ring == NULL); > > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > } > > > > > > static void > > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm) > > > { > > > - struct drm_device *dev = obj->base.dev; > > > - struct drm_i915_private *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > - > > > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > > > BUG_ON(!obj->active); > > > > > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > > > spin_unlock(&file_priv->mm.lock); > > > } > > > > > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm) > > > { > > > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > > > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > > > + if (acthd >= i915_gem_obj_offset(obj, vm) && > > > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > > > return true; > > > > > > return false; > > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 
acthd_unmasked, > > > return false; > > > } > > > > > > +static struct i915_address_space * > > > +request_to_vm(struct drm_i915_gem_request *request) > > > +{ > > > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > > > + struct i915_address_space *vm; > > > + > > > + vm = &dev_priv->gtt.base; > > > + > > > + return vm; > > > +} > > > + > > > static bool i915_request_guilty(struct drm_i915_gem_request *request, > > > const u32 acthd, bool *inside) > > > { > > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > > > * pointing inside the ring, matches the batch_obj address range. > > > * However this is extremely unlikely. > > > */ > > > - > > > if (request->batch_obj) { > > > - if (i915_head_inside_object(acthd, request->batch_obj)) { > > > + if (i915_head_inside_object(acthd, request->batch_obj, > > > + request_to_vm(request))) { > > > *inside = true; > > > return true; > > > } > > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > > > { > > > struct i915_ctx_hang_stats *hs = NULL; > > > bool inside, guilty; > > > + unsigned long offset = 0; > > > > > > /* Innocent until proven guilty */ > > > guilty = false; > > > > > > + if (request->batch_obj) > > > + offset = i915_gem_obj_offset(request->batch_obj, > > > + request_to_vm(request)); > > > + > > > if (ring->hangcheck.action != wait && > > > i915_request_guilty(request, acthd, &inside)) { > > > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > > > ring->name, > > > inside ? "inside" : "flushing", > > > - request->batch_obj ? > > > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > > > + offset, > > > request->ctx ? 
request->ctx->id : 0, > > > acthd); > > > > > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > > > } > > > > > > while (!list_empty(&ring->active_list)) { > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj; > > > > > > obj = list_first_entry(&ring->active_list, > > > struct drm_i915_gem_object, > > > ring_list); > > > > > > - i915_gem_object_move_to_inactive(obj); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > + i915_gem_object_move_to_inactive(obj, vm); > > > } > > > } > > > > > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > > > void i915_gem_reset(struct drm_device *dev) > > > { > > > struct drm_i915_private *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj; > > > struct intel_ring_buffer *ring; > > > int i; > > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) > > > /* Move everything out of the GPU domains to ensure we do any > > > * necessary invalidation upon reuse. > > > */ > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > > > > i915_gem_restore_fences(dev); > > > } > > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> > > */ > > > while (!list_empty(&ring->active_list)) { > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj; > > > > > > obj = list_first_entry(&ring->active_list, > > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > > > break; > > > > > > - i915_gem_object_move_to_inactive(obj); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > + i915_gem_object_move_to_inactive(obj, vm); > > > } > > > > > > if (unlikely(ring->trace_irq_seqno && > > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > > > * Unbinds an object from the GTT aperture. > > > */ > > > int > > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm) > > > { > > > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > > > struct i915_vma *vma; > > > int ret; > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > + if (!i915_gem_obj_bound(obj, vm)) > > > return 0; > > > > > > if (obj->pin_count) > > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > if (ret) > > > return ret; > > > > > > - trace_i915_gem_object_unbind(obj); > > > + trace_i915_gem_object_unbind(obj, vm); > > > > > > if (obj->has_global_gtt_mapping) > > > i915_gem_gtt_unbind_object(obj); > > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > /* Avoid an unnecessary call to unbind on rebind. 
*/ > > > obj->map_and_fenceable = true; > > > > > > - vma = __i915_gem_obj_to_vma(obj); > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > list_del(&vma->vma_link); > > > drm_mm_remove_node(&vma->node); > > > i915_gem_vma_destroy(vma); > > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > > > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > > > i915_gem_obj_ggtt_offset(obj), size); > > > > > > + > > > pitch_val = obj->stride / 128; > > > pitch_val = ffs(pitch_val) - 1; > > > > > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > > > */ > > > static int > > > i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > unsigned alignment, > > > bool map_and_fenceable, > > > bool nonblocking) > > > { > > > struct drm_device *dev = obj->base.dev; > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > u32 size, fence_size, fence_alignment, unfenced_alignment; > > > bool mappable, fenceable; > > > - size_t gtt_max = map_and_fenceable ? > > > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > > > + size_t gtt_max = > > > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > > > struct i915_vma *vma; > > > int ret; > > > > > > if (WARN_ON(!list_empty(&obj->vma_list))) > > > return -EBUSY; > > > > > > + BUG_ON(!i915_is_ggtt(vm)); > > > + > > > fence_size = i915_gem_get_gtt_size(dev, > > > obj->base.size, > > > obj->tiling_mode); > > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > i915_gem_object_pin_pages(obj); > > > > > > vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > + /* For now we only ever use 1 vma per object */ > > > + WARN_ON(!list_empty(&obj->vma_list)); > > > + > > > + vma = i915_gem_vma_create(obj, vm); > > > if (vma == NULL) { > > > i915_gem_object_unpin_pages(obj); > > > return -ENOMEM; > > > } > > > > > > search_free: > > > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > > > - &vma->node, > > > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > > > size, alignment, > > > obj->cache_level, 0, gtt_max); > > > if (ret) { > > > - ret = i915_gem_evict_something(dev, size, alignment, > > > + ret = i915_gem_evict_something(dev, vm, size, alignment, > > > obj->cache_level, > > > map_and_fenceable, > > > nonblocking); > > > @@ -3162,18 +3197,25 @@ search_free: > > > > > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > > > list_add_tail(&obj->mm_list, &vm->inactive_list); > > > - list_add(&vma->vma_link, &obj->vma_list); > > > + > > > + /* Keep GGTT vmas first to make debug easier */ > > > + if (i915_is_ggtt(vm)) > > > + list_add(&vma->vma_link, &obj->vma_list); > > > + else > > > + list_add_tail(&vma->vma_link, &obj->vma_list); > > > > > > fenceable = > > > + i915_is_ggtt(vm) && > > > i915_gem_obj_ggtt_size(obj) == fence_size && > > > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > > > > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > > > - dev_priv->gtt.mappable_end; > > > + mappable = > > > + i915_is_ggtt(vm) && > > > + vma->node.start + 
obj->base.size <= dev_priv->gtt.mappable_end; > > > > > > obj->map_and_fenceable = mappable && fenceable; > > > > > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > > > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > > > i915_gem_verify_gtt(dev); > > > return 0; > > > } > > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > > int ret; > > > > > > /* Not valid to be called on unbound objects. */ > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > + if (!i915_gem_obj_bound_any(obj)) > > > return -EINVAL; > > > > If we're converting the shmem paths over to wait_rendering then there's > > only the fault handler and the set_domain ioctl left. For the latter it > > would make sense to clflush even when an object is on the unbound list, to > > allow userspace to optimize when the clflushing happens. But that would > > only make sense in conjunction with Chris' create2 ioctl and a flag to > > preallocate the storage (and so putting the object onto the unbound list). > > So nothing to do here. 
> > > > > > > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > > } > > > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > enum i915_cache_level cache_level) > > > { > > > struct drm_device *dev = obj->base.dev; > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > int ret; > > > > > > if (obj->cache_level == cache_level) > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > } > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > if (ret) > > > return ret; > > > } > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + if (!i915_gem_obj_bound(obj, vm)) > > > + continue; > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > on? > > > > Self-correction: It exists already ... why can't we use this here? 
> > > > > + > > > ret = i915_gem_object_finish_gpu(obj); > > > if (ret) > > > return ret; > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > obj, cache_level); > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > } > > > > > > if (cache_level == I915_CACHE_NONE) { > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > struct drm_i915_gem_caching *args = data; > > > + struct drm_i915_private *dev_priv; > > > struct drm_i915_gem_object *obj; > > > enum i915_cache_level level; > > > int ret; > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > ret = -ENOENT; > > > goto unlock; > > > } > > > + dev_priv = obj->base.dev->dev_private; > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > + /* FIXME: Add interface for specific VM? */ > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > drm_gem_object_unreference(&obj->base); > > > unlock: > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > u32 alignment, > > > struct intel_ring_buffer *pipelined) > > > { > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > u32 old_read_domains, old_write_domain; > > > int ret; > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > > */ > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > + I915_CACHE_NONE); > > > if (ret) > > > return ret; > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > * always use map_and_fenceable for all scanout buffers. > > > */ > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > if (ret) > > > return ret; > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > int > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > uint32_t alignment, > > > bool map_and_fenceable, > > > bool nonblocking) > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > return -EBUSY; > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > (albeit with slightly corrupted state, so render corruptions might > > follow). 
> > > > > + > > > + if (i915_gem_obj_bound(obj, vm)) { > > > + if ((alignment && > > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > > (map_and_fenceable && !obj->map_and_fenceable)) { > > > WARN(obj->pin_count, > > > "bo is already pinned with incorrect alignment:" > > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > > " obj->map_and_fenceable=%d\n", > > > - i915_gem_obj_ggtt_offset(obj), alignment, > > > + i915_gem_obj_offset(obj, vm), alignment, > > > map_and_fenceable, > > > obj->map_and_fenceable); > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > if (ret) > > > return ret; > > > } > > > } > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > > + if (!i915_gem_obj_bound(obj, vm)) { > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > > map_and_fenceable, > > > nonblocking); > > > if (ret) > > > @@ -3684,7 +3739,7 @@ void > > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > > { > > > BUG_ON(obj->pin_count == 0); > > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > > > if (--obj->pin_count == 0) > > > obj->pin_mappable = false; > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > > } > > > > > > if (obj->user_pin_count == 0) { > > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > > if (ret) > > > goto out; > > > } > > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > > struct drm_device *dev = obj->base.dev; > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > + struct i915_vma *vma, *next; > > > > > > trace_i915_gem_object_destroy(obj); > > > > > > @@ -3964,15 
+4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > i915_gem_detach_phys_object(dev, obj); > > > > > > obj->pin_count = 0; > > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > > - bool was_interruptible; > > > + /* NB: 0 or 1 elements */ > > > + WARN_ON(!list_empty(&obj->vma_list) && > > > + !list_is_singular(&obj->vma_list)); > > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > > + bool was_interruptible; > > > > > > - was_interruptible = dev_priv->mm.interruptible; > > > - dev_priv->mm.interruptible = false; > > > + was_interruptible = dev_priv->mm.interruptible; > > > + dev_priv->mm.interruptible = false; > > > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > > > - dev_priv->mm.interruptible = was_interruptible; > > > + dev_priv->mm.interruptible = was_interruptible; > > > + } > > > } > > > > > > /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > > INIT_LIST_HEAD(&ring->request_list); > > > } > > > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > > + struct i915_address_space *vm) > > > +{ > > > + vm->dev = dev_priv->dev; > > > + INIT_LIST_HEAD(&vm->active_list); > > > + INIT_LIST_HEAD(&vm->inactive_list); > > > + INIT_LIST_HEAD(&vm->global_link); > > > + list_add(&vm->global_link, &dev_priv->vm_list); > > > +} > > > + > > > void > > > i915_gem_load(struct drm_device *dev) > > > { > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > > SLAB_HWCACHE_ALIGN, > > > NULL); > > > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > > + > > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > struct drm_i915_private, > > > mm.inactive_shrinker); > > > struct drm_device *dev = dev_priv->dev; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj; > > > - int nr_to_scan = sc->nr_to_scan; > > > + int nr_to_scan; > > > bool unlock = true; > > > int cnt; > > > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > unlock = false; > > > } > > > > > > + nr_to_scan = sc->nr_to_scan; > > > if (nr_to_scan) { > > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > > if (nr_to_scan > 0) > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > > if (obj->pages_pin_count 
== 0) > > > cnt += obj->base.size >> PAGE_SHIFT; > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > - cnt += obj->base.size >> PAGE_SHIFT; > > > + > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > + cnt += obj->base.size >> PAGE_SHIFT; > > > > Isn't this now double-counting objects? In the shrinker we only care about > > how much physical RAM an object occupies, not how much virtual space it > > occupies. So just walking the bound list of objects here should be good > > enough ... > > > > > > > > if (unlock) > > > mutex_unlock(&dev->struct_mutex); > > > return cnt; > > > } > > > + > > > +/* All the new VM stuff */ > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > + struct i915_vma *vma; > > > + > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > + vm = &dev_priv->gtt.base; > > > + > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > Imo the vma list walking here and in the other helpers below indicates > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > is this again something that'll get fixed later on? > > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > sprinkling of obj_to_vma in callers. 
> > > > > + if (vma->vm == vm) > > > + return vma->node.start; > > > + > > > + } > > > + return -1; > > > +} > > > + > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > > +{ > > > + return !list_empty(&o->vma_list); > > > +} > > > + > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct i915_vma *vma; > > > + > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) > > > + return true; > > > + } > > > + return false; > > > +} > > > + > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > + struct i915_vma *vma; > > > + > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > + vm = &dev_priv->gtt.base; > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) > > > + return vma->node.size; > > > + } > > > + > > > + return 0; > > > +} > > > + > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm, > > > + enum i915_cache_level color) > > > +{ > > > + struct i915_vma *vma; > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) { > > > + vma->node.color = color; > > > + return; > > > + } > > > + } > > > + > > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > > +} > > > + > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm) > > > +{ > > > + struct i915_vma *vma; > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > > + if (vma->vm == vm) > > > + return vma; > > > + > > > + return NULL; > > > +} > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > > index 2074544..c92fd81 100644 > > > --- 
a/drivers/gpu/drm/i915/i915_gem_context.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > > > > > if (INTEL_INFO(dev)->gen >= 7) { > > > ret = i915_gem_object_set_cache_level(ctx->obj, > > > + &dev_priv->gtt.base, > > > I915_CACHE_LLC_MLC); > > > /* Failure shouldn't ever happen this early */ > > > if (WARN_ON(ret)) > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > > * default context. > > > */ > > > dev_priv->ring[RCS].default_context = ctx; > > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > if (ret) { > > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > > goto err_destroy; > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > > static int do_switch(struct i915_hw_context *to) > > > { > > > struct intel_ring_buffer *ring = to->ring; > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > struct i915_hw_context *from = ring->last_context; > > > u32 hw_flags = 0; > > > int ret; > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > > if (from == to) > > > return 0; > > > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > > if (ret) > > > return ret; > > > > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > > > */ > > > if (from != NULL) { > > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > > - i915_gem_object_move_to_active(from->obj, ring); > > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > > + ring); > > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > > * whole damn pipeline, we don't need to explicitly mark the > > > * object dirty. 
The only exception is that the context must be > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > > index df61f33..32efdc0 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > > @@ -32,24 +32,21 @@ > > > #include "i915_trace.h" > > > > > > static bool > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > > { > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > - > > > - if (obj->pin_count) > > > + if (vma->obj->pin_count) > > > return false; > > > > > > - list_add(&obj->exec_list, unwind); > > > + list_add(&vma->obj->exec_list, unwind); > > > return drm_mm_scan_add_block(&vma->node); > > > } > > > > > > int > > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > > - unsigned alignment, unsigned cache_level, > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > > + int min_size, unsigned alignment, unsigned cache_level, > > > bool mappable, bool nonblocking) > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > struct list_head eviction_list, unwind_list; > > > struct i915_vma *vma; > > > struct drm_i915_gem_object *obj; > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > */ > > > > > > INIT_LIST_HEAD(&unwind_list); > > > - if (mappable) > > > + if (mappable) { > > > + BUG_ON(!i915_is_ggtt(vm)); > > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > > alignment, cache_level, 0, > > > dev_priv->gtt.mappable_end); > > > - else > > > + } else > > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > > > /* First see if there is a large enough contiguous idle region... 
*/ > > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > > - if (mark_free(obj, &unwind_list)) > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > + if (mark_free(vma, &unwind_list)) > > > goto found; > > > } > > > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > > > /* Now merge in the soon-to-be-expired objects... */ > > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > > - if (mark_free(obj, &unwind_list)) > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > + if (mark_free(vma, &unwind_list)) > > > goto found; > > > } > > > > > > @@ -109,7 +109,7 @@ none: > > > obj = list_first_entry(&unwind_list, > > > struct drm_i915_gem_object, > > > exec_list); > > > - vma = __i915_gem_obj_to_vma(obj); > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > ret = drm_mm_scan_remove_block(&vma->node); > > > BUG_ON(ret); > > > > > > @@ -130,7 +130,7 @@ found: > > > obj = list_first_entry(&unwind_list, > > > struct drm_i915_gem_object, > > > exec_list); > > > - vma = __i915_gem_obj_to_vma(obj); > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > if (drm_mm_scan_remove_block(&vma->node)) { > > > list_move(&obj->exec_list, &eviction_list); > > > drm_gem_object_reference(&obj->base); > > > @@ -145,7 +145,7 @@ found: > > > struct drm_i915_gem_object, > > > exec_list); > > > if (ret == 0) > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > > > > list_del_init(&obj->exec_list); > > > drm_gem_object_unreference(&obj->base); > > > @@ -158,13 +158,18 @@ int > > > i915_gem_evict_everything(struct drm_device *dev) > > > > I suspect evict_everything eventually wants an address_space *vm argument > > for those cases where we only want to evict everything in a given vm. Atm > > we have two use-cases of this: > > - Called from the shrinker as a last-ditch effort. For that it should move > > _every_ object onto the unbound list. 
> > - Called from execbuf for badly-fragmented address spaces to clean up the > > mess. For that case we only care about one address space. > > > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj, *next; > > > - bool lists_empty; > > > + bool lists_empty = true; > > > int ret; > > > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > > - list_empty(&vm->active_list)); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + lists_empty = (list_empty(&vm->inactive_list) && > > > + list_empty(&vm->active_list)); > > > + if (!lists_empty) > > > + lists_empty = false; > > > + } > > > + > > > if (lists_empty) > > > return -ENOSPC; > > > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > > i915_gem_retire_requests(dev); > > > > > > /* Having flushed everything, unbind() should never raise an error */ > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > - if (obj->pin_count == 0) > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > + if (obj->pin_count == 0) > > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > > + } > > > > > > return 0; > > > } > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > index 5aeb447..e90182d 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > > } > > > > > > static void > > > -eb_destroy(struct eb_objects *eb) > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > > { > > > while (!list_empty(&eb->objects)) { > > > struct drm_i915_gem_object *obj; 
> > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_relocation_entry *reloc) > > > + struct drm_i915_gem_relocation_entry *reloc, > > > + struct i915_address_space *vm) > > > { > > > struct drm_device *dev = obj->base.dev; > > > struct drm_gem_object *target_obj; > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > > > static int > > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > - struct eb_objects *eb) > > > + struct eb_objects *eb, > > > + struct i915_address_space *vm) > > > { > > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > do { > > > u64 offset = r->presumed_offset; > > > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > > > + vm); > > > if (ret) > > > return ret; > > > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > static int > > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_relocation_entry *relocs) > > > + struct drm_i915_gem_relocation_entry *relocs, > > > + struct i915_address_space *vm) > > > { > > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > int i, ret; > > > > > > for (i = 0; i < entry->relocation_count; i++) { > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > > + vm); > > > if (ret) > > > return ret; > > > } > > > @@ -359,7 +364,8 @@ 
i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > } > > > > > > static int > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > > + struct i915_address_space *vm) > > > { > > > struct drm_i915_gem_object *obj; > > > int ret = 0; > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > */ > > > pagefault_disable(); > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > > if (ret) > > > break; > > > } > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > struct intel_ring_buffer *ring, > > > + struct i915_address_space *vm, > > > bool *need_reloc) > > > { > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > obj->tiling_mode != I915_TILING_NONE; > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > > + false); > > > if (ret) > > > return ret; > > > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > obj->has_aliasing_ppgtt_mapping = 1; > > > } > > > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > > + entry->offset = i915_gem_obj_offset(obj, vm); > > > *need_reloc = true; > > > } > > > > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > { > > > struct drm_i915_gem_exec_object2 *entry; 
> > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > return; > > > > > > entry = obj->exec_entry; > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > struct list_head *objects, > > > + struct i915_address_space *vm, > > > bool *need_relocs) > > > { > > > struct drm_i915_gem_object *obj; > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > list_for_each_entry(obj, objects, exec_list) { > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > bool need_fence, need_mappable; > > > + u32 obj_offset; > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > + if (!i915_gem_obj_bound(obj, vm)) > > > continue; > > > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups > > here ... Maybe we should cache them in some pointer somewhere (either in > > the eb object or by adding a new pointer to the object struct, e.g. > > obj->eb_vma, similar to obj->eb_list). 
> > > > > > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > > need_fence = > > > has_fenced_gpu_access && > > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > > obj->tiling_mode != I915_TILING_NONE; > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > + BUG_ON((need_mappable || need_fence) && > > > + !i915_is_ggtt(vm)); > > > + > > > if ((entry->alignment && > > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > > + obj_offset & (entry->alignment - 1)) || > > > (need_mappable && !obj->map_and_fenceable)) > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > else > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > if (ret) > > > goto err; > > > } > > > > > > /* Bind fresh objects */ > > > list_for_each_entry(obj, objects, exec_list) { > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > + if (i915_gem_obj_bound(obj, vm)) > > > continue; > > > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > if (ret) > > > goto err; > > > } > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > struct drm_file *file, > > > struct intel_ring_buffer *ring, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_exec_object2 *exec) > > > + struct drm_i915_gem_exec_object2 *exec, > > > + struct i915_address_space *vm) > > > { > > > struct drm_i915_gem_relocation_entry *reloc; > > > struct drm_i915_gem_object *obj; > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > goto err; > > > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > if (ret) > > > goto err; 
> > > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > int offset = obj->exec_entry - exec; > > > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > > > - reloc + reloc_offset[offset]); > > > + reloc + reloc_offset[offset], > > > + vm); > > > if (ret) > > > goto err; > > > } > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > > > static void > > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > + struct i915_address_space *vm, > > > struct intel_ring_buffer *ring) > > > { > > > struct drm_i915_gem_object *obj; > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > obj->base.read_domains = obj->base.pending_read_domains; > > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > > > - i915_gem_object_move_to_active(obj, ring); > > > + i915_gem_object_move_to_active(obj, vm, ring); > > > if (obj->base.write_domain) { > > > obj->dirty = 1; > > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > > @@ -836,7 +853,8 @@ static int > > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > struct drm_file *file, > > > struct drm_i915_gem_execbuffer2 *args, > > > - struct drm_i915_gem_exec_object2 *exec) > > > + struct drm_i915_gem_exec_object2 *exec, > > > + struct i915_address_space *vm) > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > struct eb_objects *eb; > > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > if (ret) > > > goto err; > > > > > > /* The objects are in their final locations, apply the relocations. 
*/ > > > if (need_relocs) > > > - ret = i915_gem_execbuffer_relocate(eb); > > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > > if (ret) { > > > if (ret == -EFAULT) { > > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > > - eb, exec); > > > + eb, exec, vm); > > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > > } > > > if (ret) > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > goto err; > > > } > > > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > > + args->batch_start_offset; > > > exec_len = args->batch_len; > > > if (cliprects) { > > > for (i = 0; i < args->num_cliprects; i++) { > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > > > err: > > > - eb_destroy(eb); > > > + eb_destroy(eb, vm); > > > > > > mutex_unlock(&dev->struct_mutex); > > > > > > @@ -1105,6 +1124,7 @@ int > > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > struct drm_i915_gem_execbuffer *args = data; > > > struct drm_i915_gem_execbuffer2 exec2; > > > struct drm_i915_gem_exec_object *exec_list = NULL; > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > > exec2.flags = I915_EXEC_RENDER; > > > i915_execbuffer2_set_context_id(exec2, 0); > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > > + &dev_priv->gtt.base); > > > if (!ret) { > > > /* Copy the 
new buffer offsets back to the user's exec list. */ > > > for (i = 0; i < args->buffer_count; i++) > > > @@ -1186,6 +1207,7 @@ int > > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > struct drm_i915_gem_execbuffer2 *args = data; > > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > > int ret; > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > return -EFAULT; > > > } > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > > + &dev_priv->gtt.base); > > > if (!ret) { > > > /* Copy the new buffer offsets back to the user's exec list. */ > > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > index 298fc42..70ce2f6 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > > ppgtt->base.total); > > > } > > > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > > + > > > return ret; > > > } > > > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > struct drm_i915_gem_object *obj, > > > enum i915_cache_level cache_level) > > > { > > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > - cache_level); > > > + struct i915_address_space *vm = &ppgtt->base; > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > + > > > + vm->insert_entries(vm, obj->pages, > > > + obj_offset >> PAGE_SHIFT, > > > + cache_level); > > > } > > > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > struct drm_i915_gem_object *obj) > > > { > > > - 
ppgtt->base.clear_range(&ppgtt->base, > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > - obj->base.size >> PAGE_SHIFT); > > > + struct i915_address_space *vm = &ppgtt->base; > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > + > > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > > + obj->base.size >> PAGE_SHIFT); > > > } > > > > > > extern int intel_iommu_gfx_mapped; > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > > dev_priv->gtt.base.start / PAGE_SIZE, > > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > > > + if (dev_priv->mm.aliasing_ppgtt) > > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > > + > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > i915_gem_clflush_object(obj); > > > i915_gem_gtt_bind_object(obj, obj->cache_level); > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > * aperture. One page should be enough to keep any prefetching inside > > > * of the aperture. > > > */ > > > - drm_i915_private_t *dev_priv = dev->dev_private; > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > > struct drm_mm_node *entry; > > > struct drm_i915_gem_object *obj; > > > unsigned long hole_start, hole_end; > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > BUG_ON(mappable_end > end); > > > > > > /* Subtract the guard page ... 
*/ > > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > > if (!HAS_LLC(dev)) > > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > > > /* Mark any preallocated objects as occupied */ > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > > int ret; > > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > > if (ret) > > > DRM_DEBUG_KMS("Reservation failed\n"); > > > obj->has_global_gtt_mapping = 1; > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > dev_priv->gtt.base.total = end - start; > > > > > > /* Clear any non-preallocated blocks */ > > > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > > > - hole_start, hole_end) { > > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > > hole_start, hole_end); > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > - hole_start / PAGE_SIZE, > > > - count); > > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > > } > > > > > > /* And finally clear the reserved guard page */ > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > - end / PAGE_SIZE - 1, 1); > > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > > } > > > > > > static bool > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > index 245eb1d..bfe61fa 100644 > > > --- 
a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > return obj; > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > + vma = i915_gem_vma_create(obj, vm); > > > if (!vma) { > > > drm_gem_object_unreference(&obj->base); > > > return NULL; > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > */ > > > vma->node.start = gtt_offset; > > > vma->node.size = size; > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > + if (drm_mm_initialized(&vm->mm)) { > > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > preallocated stuff for objects with mappings in the global gtt. So keeping > > that explicit is imo the better approach. And tbh I'm confused where the > > local variable vm is from ... 
> > > > > if (ret) { > > > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > > > goto unref_out; > > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > index 92a8d27..808ca2a 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > > > > > obj->map_and_fenceable = > > > !i915_gem_obj_ggtt_bound(obj) || > > > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > > > + (i915_gem_obj_ggtt_offset(obj) + > > > + obj->base.size <= dev_priv->gtt.mappable_end && > > > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > > > > > /* Rebind if we need a change of alignment */ > > > if (!obj->map_and_fenceable) { > > > - u32 unfenced_alignment = > > > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > > > + u32 unfenced_align = > > > i915_gem_get_gtt_alignment(dev, obj->base.size, > > > args->tiling_mode, > > > false); > > > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > > > - ret = i915_gem_object_unbind(obj); > > > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > > > + ret = i915_gem_object_unbind(obj, ggtt); > > > } > > > > > > if (ret == 0) { > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > > index 79fbb17..28fa0ff 100644 > > > --- a/drivers/gpu/drm/i915/i915_irq.c > > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, > > > if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { > > > u32 acthd = I915_READ(ACTHD); > > > > > > + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) > > > + return NULL; > > > + > > > if (WARN_ON(ring->id != RCS)) > > > return NULL; > > > > > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, > > > return; > > > > > > 
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > - if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) { > > > + if ((error->ccid & PAGE_MASK) == > > > + i915_gem_obj_ggtt_offset(obj)) { > > > ering->ctx = i915_error_object_create_sized(dev_priv, > > > obj, 1); > > > break; > > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > > > index 7d283b5..3f019d3 100644 > > > --- a/drivers/gpu/drm/i915/i915_trace.h > > > +++ b/drivers/gpu/drm/i915/i915_trace.h > > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > > > ); > > > > > > TRACE_EVENT(i915_gem_object_bind, > > > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > > > - TP_ARGS(obj, mappable), > > > + TP_PROTO(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, bool mappable), > > > + TP_ARGS(obj, vm, mappable), > > > > > > TP_STRUCT__entry( > > > __field(struct drm_i915_gem_object *, obj) > > > + __field(struct i915_address_space *, vm) > > > __field(u32, offset) > > > __field(u32, size) > > > __field(bool, mappable) > > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > > > > > TP_fast_assign( > > > __entry->obj = obj; > > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > > + __entry->size = i915_gem_obj_size(obj, vm); > > > __entry->mappable = mappable; > > > ), > > > > > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > > > ); > > > > > > TRACE_EVENT(i915_gem_object_unbind, > > > - TP_PROTO(struct drm_i915_gem_object *obj), > > > - TP_ARGS(obj), > > > + TP_PROTO(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm), > > > + TP_ARGS(obj, vm), > > > > > > TP_STRUCT__entry( > > > __field(struct drm_i915_gem_object *, obj) > > > + __field(struct i915_address_space *, vm) > > > __field(u32, offset) > > > __field(u32, size) > > > ), > > > > > > TP_fast_assign( > 
> > __entry->obj = obj; > > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > > + __entry->size = i915_gem_obj_size(obj, vm); > > > ), > > > > > > TP_printk("obj=%p, offset=%08x size=%x", > > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > > > index f3c97e0..b69cc63 100644 > > > --- a/drivers/gpu/drm/i915/intel_fb.c > > > +++ b/drivers/gpu/drm/i915/intel_fb.c > > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > > > fb->width, fb->height, > > > i915_gem_obj_ggtt_offset(obj), obj); > > > > > > - > > > mutex_unlock(&dev->struct_mutex); > > > vga_switcheroo_client_fb_set(dev->pdev, info); > > > return 0; > > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > > > index 81c3ca1..517e278 100644 > > > --- a/drivers/gpu/drm/i915/intel_overlay.c > > > +++ b/drivers/gpu/drm/i915/intel_overlay.c > > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > > > } > > > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > > > } else { > > > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > > > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > > > if (ret) { > > > DRM_ERROR("failed to pin overlay register bo\n"); > > > goto out_free_bo; > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > > index 125a741..449e57c 100644 > > > --- a/drivers/gpu/drm/i915/intel_pm.c > > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) > > > return NULL; > > > } > > > > > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > > > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > > > if (ret) { > > > DRM_ERROR("failed to pin power context: %d\n", ret); > > > goto err_unref; > > > diff --git 
a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > > index bc4c11b..ebed61d 100644 > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > > @@ -481,6 +481,7 @@ out: > > > static int > > > init_pipe_control(struct intel_ring_buffer *ring) > > > { > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > struct pipe_control *pc; > > > struct drm_i915_gem_object *obj; > > > int ret; > > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > > > goto err; > > > } > > > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > + I915_CACHE_LLC); > > > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > > if (ret) > > > goto err_unref; > > > > > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > > > static int init_status_page(struct intel_ring_buffer *ring) > > > { > > > struct drm_device *dev = ring->dev; > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > struct drm_i915_gem_object *obj; > > > int ret; > > > > > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > > > goto err; > > > } > > > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > + I915_CACHE_LLC); > > > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > > if (ret != 0) { > > > goto err_unref; > > > } > > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > > > > > ring->obj = obj; > > > > > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > > > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > > > if (ret) > > > goto err_unref; > > > > > > @@ -1828,7 
+1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > > > return -ENOMEM; > > > } > > > > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > > if (ret != 0) { > > > drm_gem_object_unreference(&obj->base); > > > DRM_ERROR("Failed to ping batch bo\n"); > > > -- > > > 1.8.3.2 > > > > > > _______________________________________________ > > > Intel-gfx mailing list > > > Intel-gfx@lists.freedesktop.org > > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > -- > Ben Widawsky, Intel Open Source Technology Center
On Wed, Jul 10, 2013 at 07:05:52PM +0200, Daniel Vetter wrote: > On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote: > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > > > This patch was formerly known as: > > > > "drm/i915: Create VMAs (part 3) - plumbing" > > > > > > > > This patch adds a VM argument, bind/unbind, and the object > > > > offset/size/color getters/setters. It preserves the old ggtt helper > > > > functions because things still need, and will continue to need them. > > > > > > > > Some code will still need to be ported over after this. > > > > > > > > v2: Fix purge to pick an object and unbind all vmas > > > > This was doable because of the global bound list change. > > > > > > > > v3: With the commit to actually pin/unpin pages in place, there is no > > > > longer a need to check if unbind succeeded before calling put_pages(). > > > > Make put_pages only BUG() after checking pin count. > > > > > > > > v4: Rebased on top of the new hangcheck work by Mika > > > > plumbed eb_destroy also > > > > Many checkpatch related fixes > > > > > > > > v5: Very large rebase > > > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > > > This one is a rather large beast. Any chance we could split it into > > > topics, e.g. convert execbuf code, convert shrinker code? Or does that get > > > messy, fast? > > > > > > > I've thought of this... > > > > The one solution I came up with is to have two bind/unbind functions > > (similar to what I did with pin, and indeed it was my original plan with > > pin), and do the set_caching one separately. > > > > I think it won't be too messy, just a lot of typing, as Keith likes to > > say. > > > > However, my opinion was, since it's early in the merge cycle, we don't > > yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's > > not a big deal. At a functional level too, I felt this made more sense. 
> > > > So I'll defer to your request on this and start splitting it up, unless > > my email has changed your mind ;-). > > Well, my concern is mostly in reviewing since we need to think about each > case and whether it makes sense to talk in therms of vma or objects in > that function. And what exactly to test. > > If you've played around and concluded it'll be a mess then I don't think > it'll help in reviewing. So pointless. I said I don't think it will be a mess, though I feel it won't really help review too much. Can you take a crack and review and poke me if you want me to try it. I'd rather not do it if I can avoid it, so I can try to go back to my 15 patch maximum rule. > > Still, there's a bunch of questions on this patch that we need to discuss > ;-) Ready whenever. > > Cheers, Daniel > > > > > > > --- > > > > drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- > > > > drivers/gpu/drm/i915/i915_dma.c | 4 - > > > > drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- > > > > drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- > > > > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > > > > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > > > > drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- > > > > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > > > > drivers/gpu/drm/i915/i915_irq.c | 6 +- > > > > drivers/gpu/drm/i915/i915_trace.h | 20 +- > > > > drivers/gpu/drm/i915/intel_fb.c | 1 - > > > > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > > > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > > > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > > > > 16 files changed, 468 insertions(+), 239 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > > > > index 16b2aaf..867ed07 100644 > > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > 
> > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > > > > seq_printf(m, " (pinned x %d)", obj->pin_count); > > > > if (obj->fence_reg != I915_FENCE_REG_NONE) > > > > seq_printf(m, " (fence: %d)", obj->fence_reg); > > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > > > > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > > > > + if (i915_gem_obj_bound_any(obj)) { > > > > > > list_for_each will short-circuit already, so this is redundant. > > > > > > > + struct i915_vma *vma; > > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > > > > + if (!i915_is_ggtt(vma->vm)) > > > > + seq_puts(m, " (pp"); > > > > + else > > > > + seq_puts(m, " (g"); > > > > + seq_printf(m, " gtt offset: %08lx, size: %08lx)", > > > > > > ^ that space looks superflous now > > > > > > > + i915_gem_obj_offset(obj, vma->vm), > > > > + i915_gem_obj_size(obj, vma->vm)); > > > > + } > > > > + } > > > > if (obj->stolen) > > > > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > > > > if (obj->pin_mappable || obj->fault_mappable) { > > > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > > > > return 0; > > > > } > > > > > > > > +/* FIXME: Support multiple VM? 
*/ > > > > #define count_objects(list, member) do { \ > > > > list_for_each_entry(obj, list, member) { \ > > > > size += i915_gem_obj_ggtt_size(obj); \ > > > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) > > > > > > > > if (val & DROP_BOUND) { > > > > list_for_each_entry_safe(obj, next, &vm->inactive_list, > > > > - mm_list) > > > > - if (obj->pin_count == 0) { > > > > - ret = i915_gem_object_unbind(obj); > > > > - if (ret) > > > > - goto unlock; > > > > - } > > > > + mm_list) { > > > > + if (obj->pin_count) > > > > + continue; > > > > + > > > > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > > > > + if (ret) > > > > + goto unlock; > > > > + } > > > > } > > > > > > > > if (val & DROP_UNBOUND) { > > > > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > > > > global_list) > > > > if (obj->pages_pin_count == 0) { > > > > + /* FIXME: Do this for all vms? */ > > > > ret = i915_gem_object_put_pages(obj); > > > > if (ret) > > > > goto unlock; > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > > > > index d13e21f..b190439 100644 > > > > --- a/drivers/gpu/drm/i915/i915_dma.c > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c > > > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > > > > > > > i915_dump_device_info(dev_priv); > > > > > > > > - INIT_LIST_HEAD(&dev_priv->vm_list); > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > > > > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > > > > - > > > > if (i915_get_bridge_dev(dev)) { > > > > ret = -EIO; > > > > goto free_priv; > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > > index 38cccc8..48baccc 100644 > > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { > > > > > > > > #define to_intel_bo(x) container_of(x, struct 
drm_i915_gem_object, base) > > > > > > > > -/* This is a temporary define to help transition us to real VMAs. If you see > > > > - * this, you're either reviewing code, or bisecting it. */ > > > > -static inline struct i915_vma * > > > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > > > > -{ > > > > - if (list_empty(&obj->vma_list)) > > > > - return NULL; > > > > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > > > > -} > > > > - > > > > -/* Whether or not this object is currently mapped by the translation tables */ > > > > -static inline bool > > > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > > > > -{ > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > > > > - if (vma == NULL) > > > > - return false; > > > > - return drm_mm_node_allocated(&vma->node); > > > > -} > > > > - > > > > -/* Offset of the first PTE pointing to this object */ > > > > -static inline unsigned long > > > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > > > > -{ > > > > - BUG_ON(list_empty(&o->vma_list)); > > > > - return __i915_gem_obj_to_vma(o)->node.start; > > > > -} > > > > - > > > > -/* The size used in the translation tables may be larger than the actual size of > > > > - * the object on GEN2/GEN3 because of the way tiling is handled. See > > > > - * i915_gem_get_gtt_size() for more details. > > > > - */ > > > > -static inline unsigned long > > > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > > > > -{ > > > > - BUG_ON(list_empty(&o->vma_list)); > > > > - return __i915_gem_obj_to_vma(o)->node.size; > > > > -} > > > > - > > > > -static inline void > > > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > > > > - enum i915_cache_level color) > > > > -{ > > > > - __i915_gem_obj_to_vma(o)->node.color = color; > > > > -} > > > > - > > > > /** > > > > * Request queue structure. 
> > > > * > > > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > > > > void i915_gem_vma_destroy(struct i915_vma *vma); > > > > > > > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > uint32_t alignment, > > > > bool map_and_fenceable, > > > > bool nonblocking); > > > > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > > > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > > > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm); > > > > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > > > > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > > > > void i915_gem_lastclose(struct drm_device *dev); > > > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > > > > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > > > > struct intel_ring_buffer *to); > > > > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > struct intel_ring_buffer *ring); > > > > > > > > int i915_gem_dumb_create(struct drm_file *file_priv, > > > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > > > > int tiling_mode, bool fenced); > > > > > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > enum i915_cache_level cache_level); > > > > > > > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > > > > > > > void i915_gem_restore_fences(struct drm_device *dev); > > > > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm); > > > > +bool 
i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm); > > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm); > > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm, > > > > + enum i915_cache_level color); > > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm); > > > > +/* Some GGTT VM helpers */ > > > > +#define obj_to_ggtt(obj) \ > > > > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > > > > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > > > > +{ > > > > + struct i915_address_space *ggtt = > > > > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > > > > + return vm == ggtt; > > > > +} > > > > + > > > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > > > > +{ > > > > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > > > > +} > > > > + > > > > +static inline unsigned long > > > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > > > > +{ > > > > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > > > > +} > > > > + > > > > +static inline unsigned long > > > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > > > > +{ > > > > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > > > > +} > > > > + > > > > +static inline int __must_check > > > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > > > > + uint32_t alignment, > > > > + bool map_and_fenceable, > > > > + bool nonblocking) > > > > +{ > > > > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > > > > + map_and_fenceable, nonblocking); > > > > +} > > > > +#undef obj_to_ggtt > > > > + > > > > /* i915_gem_context.c */ > > > > void i915_gem_context_init(struct drm_device *dev); > > > > void 
i915_gem_context_fini(struct drm_device *dev); > > > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > > > > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > > > > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > > > > +/* FIXME: this is never okay with full PPGTT */ > > > > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > > > > enum i915_cache_level cache_level); > > > > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > > > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > > > > > > > > > > /* i915_gem_evict.c */ > > > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > +int __must_check i915_gem_evict_something(struct drm_device *dev, > > > > + struct i915_address_space *vm, > > > > + int min_size, > > > > unsigned alignment, > > > > unsigned cache_level, > > > > bool mappable, > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > > > index 058ad44..21015cd 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > > > @@ -38,10 +38,12 @@ > > > > > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > - unsigned alignment, > > > > - bool map_and_fenceable, > > > > - bool nonblocking); > > > > +static __must_check int > > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > + unsigned alignment, > > > > + bool map_and_fenceable, > > > > + bool nonblocking); > > > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > > > struct drm_i915_gem_object *obj, > > > > struct 
drm_i915_gem_pwrite *args, > > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > > > static inline bool > > > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > > > { > > > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > > > + return i915_gem_obj_bound_any(obj) && !obj->active; > > > > } > > > > > > > > int > > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > > > * anyway again before the next pread happens. */ > > > > if (obj->cache_level == I915_CACHE_NONE) > > > > needs_clflush = 1; > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > + if (i915_gem_obj_bound_any(obj)) { > > > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > > > > > This is essentially a very convoluted version of "if there's gpu rendering > > > outstanding, please wait for it". Maybe we should switch this to > > > > > > if (obj->active) > > > wait_rendering(obj, true); > > > > > > Same for the shmem_pwrite case below. Would be a separate patch to prep > > > things though. Can I volunteer you for that? The ugly part is to review > > > whether any of the lru list updating that set_domain does in addition to > > > wait_rendering is required, but on a quick read that's not the case. > > > > > > > if (ret) > > > > return ret; > > > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > > > > char __user *user_data; > > > > int page_offset, page_length, ret; > > > > > > > > - ret = i915_gem_object_pin(obj, 0, true, true); > > > > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > > > > if (ret) > > > > goto out; > > > > > > > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > > > > * right away and we therefore have to clflush anyway. */ > > > > if (obj->cache_level == I915_CACHE_NONE) > > > > needs_clflush_after = 1; > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > + if (i915_gem_obj_bound_any(obj)) { > > > > > > ... see above.
> > > > ret = i915_gem_object_set_to_gtt_domain(obj, true); > > > > if (ret) > > > > return ret; > > > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > > > > } > > > > > > > > /* Now bind it into the GTT if needed */ > > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > > > if (ret) > > > > goto unlock; > > > > > > > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > > > > if (obj->pages == NULL) > > > > return 0; > > > > > > > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > > > > - > > > > if (obj->pages_pin_count) > > > > return -EBUSY; > > > > > > > > + BUG_ON(i915_gem_obj_bound_any(obj)); > > > > + > > > > /* ->put_pages might need to allocate memory for the bit17 swizzle > > > > * array, hence protect them from being reaped by removing them from gtt > > > > * lists early. */ > > > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > > > bool purgeable_only) > > > > { > > > > struct drm_i915_gem_object *obj, *next; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > long count = 0; > > > > > > > > list_for_each_entry_safe(obj, next, > > > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > > > } > > > > } > > > > > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > > > > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > > > > - i915_gem_object_unbind(obj) == 0 && > > > > - i915_gem_object_put_pages(obj) == 0) { > > > > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > > > > + global_list) { > > > > + struct i915_vma *vma, *v; > > > > + > > > > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > > > > + continue; > > > > + > > > > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > > > > + if (i915_gem_object_unbind(obj, vma->vm)) > > > > + 
break; > > > > + > > > > + if (!i915_gem_object_put_pages(obj)) > > > > count += obj->base.size >> PAGE_SHIFT; > > > > - if (count >= target) > > > > - return count; > > > > - } > > > > + > > > > + if (count >= target) > > > > + return count; > > > > } > > > > > > > > return count; > > > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > > > > > > > void > > > > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > struct intel_ring_buffer *ring) > > > > { > > > > struct drm_device *dev = obj->base.dev; > > > > struct drm_i915_private *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > u32 seqno = intel_ring_get_seqno(ring); > > > > > > > > BUG_ON(ring == NULL); > > > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > > > } > > > > > > > > static void > > > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > > > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm) > > > > { > > > > - struct drm_device *dev = obj->base.dev; > > > > - struct drm_i915_private *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > - > > > > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > > > > BUG_ON(!obj->active); > > > > > > > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > > > > spin_unlock(&file_priv->mm.lock); > > > > } > > > > > > > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > > > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm) > > > > { > > > > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > > > > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > > > > + if (acthd >= 
i915_gem_obj_offset(obj, vm) && > > > > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > > > > return true; > > > > > > > > return false; > > > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > > > > return false; > > > > } > > > > > > > > +static struct i915_address_space * > > > > +request_to_vm(struct drm_i915_gem_request *request) > > > > +{ > > > > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > > > > + struct i915_address_space *vm; > > > > + > > > > + vm = &dev_priv->gtt.base; > > > > + > > > > + return vm; > > > > +} > > > > + > > > > static bool i915_request_guilty(struct drm_i915_gem_request *request, > > > > const u32 acthd, bool *inside) > > > > { > > > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > > > > * pointing inside the ring, matches the batch_obj address range. > > > > * However this is extremely unlikely. > > > > */ > > > > - > > > > if (request->batch_obj) { > > > > - if (i915_head_inside_object(acthd, request->batch_obj)) { > > > > + if (i915_head_inside_object(acthd, request->batch_obj, > > > > + request_to_vm(request))) { > > > > *inside = true; > > > > return true; > > > > } > > > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > > > > { > > > > struct i915_ctx_hang_stats *hs = NULL; > > > > bool inside, guilty; > > > > + unsigned long offset = 0; > > > > > > > > /* Innocent until proven guilty */ > > > > guilty = false; > > > > > > > > + if (request->batch_obj) > > > > + offset = i915_gem_obj_offset(request->batch_obj, > > > > + request_to_vm(request)); > > > > + > > > > if (ring->hangcheck.action != wait && > > > > i915_request_guilty(request, acthd, &inside)) { > > > > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > > > > ring->name, > > > > inside ? "inside" : "flushing", > > > > - request->batch_obj ? 
> > > > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > > > > + offset, > > > > request->ctx ? request->ctx->id : 0, > > > > acthd); > > > > > > > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > > > > } > > > > > > > > while (!list_empty(&ring->active_list)) { > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj; > > > > > > > > obj = list_first_entry(&ring->active_list, > > > > struct drm_i915_gem_object, > > > > ring_list); > > > > > > > > - i915_gem_object_move_to_inactive(obj); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > + i915_gem_object_move_to_inactive(obj, vm); > > > > } > > > > } > > > > > > > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > > > > void i915_gem_reset(struct drm_device *dev) > > > > { > > > > struct drm_i915_private *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj; > > > > struct intel_ring_buffer *ring; > > > > int i; > > > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) > > > > /* Move everything out of the GPU domains to ensure we do any > > > > * necessary invalidation upon reuse. > > > > */ > > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > > > > > > i915_gem_restore_fences(dev); > > > > } > > > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > > > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> > > > */ > > > > while (!list_empty(&ring->active_list)) { > > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj; > > > > > > > > obj = list_first_entry(&ring->active_list, > > > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > > > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > > > > break; > > > > > > > > - i915_gem_object_move_to_inactive(obj); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > + i915_gem_object_move_to_inactive(obj, vm); > > > > } > > > > > > > > if (unlikely(ring->trace_irq_seqno && > > > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > > > > * Unbinds an object from the GTT aperture. > > > > */ > > > > int > > > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm) > > > > { > > > > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > > > > struct i915_vma *vma; > > > > int ret; > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > return 0; > > > > > > > > if (obj->pin_count) > > > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > > if (ret) > > > > return ret; > > > > > > > > - trace_i915_gem_object_unbind(obj); > > > > + trace_i915_gem_object_unbind(obj, vm); > > > > > > > > if (obj->has_global_gtt_mapping) > > > > i915_gem_gtt_unbind_object(obj); > > > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > > > /* Avoid an unnecessary call to unbind on rebind. 
*/ > > > > obj->map_and_fenceable = true; > > > > > > > > - vma = __i915_gem_obj_to_vma(obj); > > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > > list_del(&vma->vma_link); > > > > drm_mm_remove_node(&vma->node); > > > > i915_gem_vma_destroy(vma); > > > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > > > > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > > > > i915_gem_obj_ggtt_offset(obj), size); > > > > > > > > + > > > > pitch_val = obj->stride / 128; > > > > pitch_val = ffs(pitch_val) - 1; > > > > > > > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > > > > */ > > > > static int > > > > i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > unsigned alignment, > > > > bool map_and_fenceable, > > > > bool nonblocking) > > > > { > > > > struct drm_device *dev = obj->base.dev; > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > u32 size, fence_size, fence_alignment, unfenced_alignment; > > > > bool mappable, fenceable; > > > > - size_t gtt_max = map_and_fenceable ? > > > > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > > > > + size_t gtt_max = > > > > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > > > > struct i915_vma *vma; > > > > int ret; > > > > > > > > if (WARN_ON(!list_empty(&obj->vma_list))) > > > > return -EBUSY; > > > > > > > > + BUG_ON(!i915_is_ggtt(vm)); > > > > + > > > > fence_size = i915_gem_get_gtt_size(dev, > > > > obj->base.size, > > > > obj->tiling_mode); > > > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > i915_gem_object_pin_pages(obj); > > > > > > > > vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > > + /* For now we only ever use 1 vma per object */ > > > > + WARN_ON(!list_empty(&obj->vma_list)); > > > > + > > > > + vma = i915_gem_vma_create(obj, vm); > > > > if (vma == NULL) { > > > > i915_gem_object_unpin_pages(obj); > > > > return -ENOMEM; > > > > } > > > > > > > > search_free: > > > > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > > > > - &vma->node, > > > > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > > > > size, alignment, > > > > obj->cache_level, 0, gtt_max); > > > > if (ret) { > > > > - ret = i915_gem_evict_something(dev, size, alignment, > > > > + ret = i915_gem_evict_something(dev, vm, size, alignment, > > > > obj->cache_level, > > > > map_and_fenceable, > > > > nonblocking); > > > > @@ -3162,18 +3197,25 @@ search_free: > > > > > > > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > > > > list_add_tail(&obj->mm_list, &vm->inactive_list); > > > > - list_add(&vma->vma_link, &obj->vma_list); > > > > + > > > > + /* Keep GGTT vmas first to make debug easier */ > > > > + if (i915_is_ggtt(vm)) > > > > + list_add(&vma->vma_link, &obj->vma_list); > > > > + else > > > > + list_add_tail(&vma->vma_link, &obj->vma_list); > > > > > > > > fenceable = > > > > + i915_is_ggtt(vm) && > > > > i915_gem_obj_ggtt_size(obj) == fence_size && > > > > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > > > > > > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > > > 
> - dev_priv->gtt.mappable_end; > > > > + mappable = > > > > + i915_is_ggtt(vm) && > > > > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > > > > > > > obj->map_and_fenceable = mappable && fenceable; > > > > > > > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > > > > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > > > > i915_gem_verify_gtt(dev); > > > > return 0; > > > > } > > > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > > > int ret; > > > > > > > > /* Not valid to be called on unbound objects. */ > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound_any(obj)) > > > > return -EINVAL; > > > > > > If we're converting the shmem paths over to wait_rendering then there's > > > only the fault handler and the set_domain ioctl left. For the latter it > > > would make sense to clflush even when an object is on the unbound list, to > > > allow userspace to optimize when the clflushing happens. But that would > > > only make sense in conjunction with Chris' create2 ioctl and a flag to > > > preallocate the storage (and so putting the object onto the unbound list). > > > So nothing to do here.
> > > > > > > > > > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > > > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > > > } > > > > > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > enum i915_cache_level cache_level) > > > > { > > > > struct drm_device *dev = obj->base.dev; > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > > int ret; > > > > > > > > if (obj->cache_level == cache_level) > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > } > > > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > if (ret) > > > > return ret; > > > > } > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > + continue; > > > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > > on? > > > > > > Self-correction: It exists already ... why can't we use this here? 
> > > > > > > + > > > > ret = i915_gem_object_finish_gpu(obj); > > > > if (ret) > > > > return ret; > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > > obj, cache_level); > > > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > > } > > > > > > > > if (cache_level == I915_CACHE_NONE) { > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > struct drm_i915_gem_caching *args = data; > > > > + struct drm_i915_private *dev_priv; > > > > struct drm_i915_gem_object *obj; > > > > enum i915_cache_level level; > > > > int ret; > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > ret = -ENOENT; > > > > goto unlock; > > > > } > > > > + dev_priv = obj->base.dev->dev_private; > > > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > > + /* FIXME: Add interface for specific VM? */ > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > > > drm_gem_object_unreference(&obj->base); > > > > unlock: > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > u32 alignment, > > > > struct intel_ring_buffer *pipelined) > > > > { > > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > u32 old_read_domains, old_write_domain; > > > > int ret; > > > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > > > */ > > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > + I915_CACHE_NONE); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > > * always use map_and_fenceable for all scanout buffers. > > > > */ > > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > > > int > > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > uint32_t alignment, > > > > bool map_and_fenceable, > > > > bool nonblocking) > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > > return -EBUSY; > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > > (albeit with slightly corrupted state, so render corruptions might > > > follow). 
> > > > > > > + > > > > + if (i915_gem_obj_bound(obj, vm)) { > > > > + if ((alignment && > > > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > > > (map_and_fenceable && !obj->map_and_fenceable)) { > > > > WARN(obj->pin_count, > > > > "bo is already pinned with incorrect alignment:" > > > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > > > " obj->map_and_fenceable=%d\n", > > > > - i915_gem_obj_ggtt_offset(obj), alignment, > > > > + i915_gem_obj_offset(obj, vm), alignment, > > > > map_and_fenceable, > > > > obj->map_and_fenceable); > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > if (ret) > > > > return ret; > > > > } > > > > } > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > > > + if (!i915_gem_obj_bound(obj, vm)) { > > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > > > map_and_fenceable, > > > > nonblocking); > > > > if (ret) > > > > @@ -3684,7 +3739,7 @@ void > > > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > > > { > > > > BUG_ON(obj->pin_count == 0); > > > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > > > > > if (--obj->pin_count == 0) > > > > obj->pin_mappable = false; > > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > > > } > > > > > > > > if (obj->user_pin_count == 0) { > > > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > > > if (ret) > > > > goto out; > > > > } > > > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > > > struct drm_device *dev = obj->base.dev; > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > 
> + struct i915_vma *vma, *next; > > > > > > > > trace_i915_gem_object_destroy(obj); > > > > > > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > > i915_gem_detach_phys_object(dev, obj); > > > > > > > > obj->pin_count = 0; > > > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > > > - bool was_interruptible; > > > > + /* NB: 0 or 1 elements */ > > > > + WARN_ON(!list_empty(&obj->vma_list) && > > > > + !list_is_singular(&obj->vma_list)); > > > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > > > + bool was_interruptible; > > > > > > > > - was_interruptible = dev_priv->mm.interruptible; > > > > - dev_priv->mm.interruptible = false; > > > > + was_interruptible = dev_priv->mm.interruptible; > > > > + dev_priv->mm.interruptible = false; > > > > > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > > > > > - dev_priv->mm.interruptible = was_interruptible; > > > > + dev_priv->mm.interruptible = was_interruptible; > > > > + } > > > > } > > > > > > > > /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up > > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > > > INIT_LIST_HEAD(&ring->request_list); > > > > } > > > > > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + vm->dev = dev_priv->dev; > > > > + INIT_LIST_HEAD(&vm->active_list); > > > > + INIT_LIST_HEAD(&vm->inactive_list); > > > > + INIT_LIST_HEAD(&vm->global_link); > > > > + list_add(&vm->global_link, &dev_priv->vm_list); > > > > +} > > > > + > > > > void > > > > i915_gem_load(struct drm_device *dev) > > > > { > > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > > > SLAB_HWCACHE_ALIGN, > > > > NULL); > > > > > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > > > + > > > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > > struct drm_i915_private, > > > > mm.inactive_shrinker); > > > > struct drm_device *dev = dev_priv->dev; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj; > > > > - int nr_to_scan = sc->nr_to_scan; > > > > + int nr_to_scan; > > > > bool unlock = true; > > > > int cnt; > > > > > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > > unlock = false; > > > > } > > > > > > > > + nr_to_scan = sc->nr_to_scan; > > > > if (nr_to_scan) { > > > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > > > if (nr_to_scan > 0) > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > 
> > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > > > if (obj->pages_pin_count == 0) > > > > cnt += obj->base.size >> PAGE_SHIFT; > > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > - cnt += obj->base.size >> PAGE_SHIFT; > > > > + > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > + cnt += obj->base.size >> PAGE_SHIFT; > > > > > > Isn't this now double-counting objects? In the shrinker we only care about > > > how much physical RAM an object occupies, not how much virtual space it > > > occupies. So just walking the bound list of objects here should be good > > > enough ... > > > > > > > > > > > if (unlock) > > > > mutex_unlock(&dev->struct_mutex); > > > > return cnt; > > > > } > > > > + > > > > +/* All the new VM stuff */ > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > + struct i915_vma *vma; > > > > + > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > + vm = &dev_priv->gtt.base; > > > > + > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > > > Imo the vma list walking here and in the other helpers below indicates > > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > > is this again something that'll get fixed later on? > > > > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > > sprinkling of obj_to_vma in callers. 
> > > > > > > + if (vma->vm == vm) > > > > + return vma->node.start; > > > > + > > > > + } > > > > + return -1; > > > > +} > > > > + > > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > > > +{ > > > > + return !list_empty(&o->vma_list); > > > > +} > > > > + > > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct i915_vma *vma; > > > > + > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) > > > > + return true; > > > > + } > > > > + return false; > > > > +} > > > > + > > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > + struct i915_vma *vma; > > > > + > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > + vm = &dev_priv->gtt.base; > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) > > > > + return vma->node.size; > > > > + } > > > > + > > > > + return 0; > > > > +} > > > > + > > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm, > > > > + enum i915_cache_level color) > > > > +{ > > > > + struct i915_vma *vma; > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) { > > > > + vma->node.color = color; > > > > + return; > > > > + } > > > > + } > > > > + > > > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > > > +} > > > > + > > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct i915_vma *vma; > > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > > > + if (vma->vm == vm) > > > > + return vma; > > > > + > > > > + return NULL; > > > > +} > > > > diff --git 
a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > > > index 2074544..c92fd81 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > > > > > > > if (INTEL_INFO(dev)->gen >= 7) { > > > > ret = i915_gem_object_set_cache_level(ctx->obj, > > > > + &dev_priv->gtt.base, > > > > I915_CACHE_LLC_MLC); > > > > /* Failure shouldn't ever happen this early */ > > > > if (WARN_ON(ret)) > > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > > > * default context. > > > > */ > > > > dev_priv->ring[RCS].default_context = ctx; > > > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > > if (ret) { > > > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > > > goto err_destroy; > > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > > > static int do_switch(struct i915_hw_context *to) > > > > { > > > > struct intel_ring_buffer *ring = to->ring; > > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > > struct i915_hw_context *from = ring->last_context; > > > > u32 hw_flags = 0; > > > > int ret; > > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > > > if (from == to) > > > > return 0; > > > > > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > > > > */ > > > > if (from != NULL) { > > > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > > > - i915_gem_object_move_to_active(from->obj, ring); > > > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > > > + ring); > 
> > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > > > * whole damn pipeline, we don't need to explicitly mark the > > > > * object dirty. The only exception is that the context must be > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > > > index df61f33..32efdc0 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > > > @@ -32,24 +32,21 @@ > > > > #include "i915_trace.h" > > > > > > > > static bool > > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > > > { > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > > - > > > > - if (obj->pin_count) > > > > + if (vma->obj->pin_count) > > > > return false; > > > > > > > > - list_add(&obj->exec_list, unwind); > > > > + list_add(&vma->obj->exec_list, unwind); > > > > return drm_mm_scan_add_block(&vma->node); > > > > } > > > > > > > > int > > > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > - unsigned alignment, unsigned cache_level, > > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > > > + int min_size, unsigned alignment, unsigned cache_level, > > > > bool mappable, bool nonblocking) > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > struct list_head eviction_list, unwind_list; > > > > struct i915_vma *vma; > > > > struct drm_i915_gem_object *obj; > > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > */ > > > > > > > > INIT_LIST_HEAD(&unwind_list); > > > > - if (mappable) > > > > + if (mappable) { > > > > + BUG_ON(!i915_is_ggtt(vm)); > > > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > > > alignment, cache_level, 0, > > > > dev_priv->gtt.mappable_end); > > > > - else > > > > + } 
else > > > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > > > > > /* First see if there is a large enough contiguous idle region... */ > > > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > > > - if (mark_free(obj, &unwind_list)) > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > > + if (mark_free(vma, &unwind_list)) > > > > goto found; > > > > } > > > > > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > > > > > /* Now merge in the soon-to-be-expired objects... */ > > > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > > > - if (mark_free(obj, &unwind_list)) > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > > + if (mark_free(vma, &unwind_list)) > > > > goto found; > > > > } > > > > > > > > @@ -109,7 +109,7 @@ none: > > > > obj = list_first_entry(&unwind_list, > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > - vma = __i915_gem_obj_to_vma(obj); > > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > > ret = drm_mm_scan_remove_block(&vma->node); > > > > BUG_ON(ret); > > > > > > > > @@ -130,7 +130,7 @@ found: > > > > obj = list_first_entry(&unwind_list, > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > - vma = __i915_gem_obj_to_vma(obj); > > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > > if (drm_mm_scan_remove_block(&vma->node)) { > > > > list_move(&obj->exec_list, &eviction_list); > > > > drm_gem_object_reference(&obj->base); > > > > @@ -145,7 +145,7 @@ found: > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > if (ret == 0) > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > > > > > list_del_init(&obj->exec_list); > > > > drm_gem_object_unreference(&obj->base); > > > > @@ -158,13 +158,18 @@ int > > > > i915_gem_evict_everything(struct drm_device *dev) > > > > > > I suspect evict_everything eventually wants an address_space *vm argument > > > for those
cases where we only want to evict everything in a given vm. Atm > > > we have two use-cases of this: > > > - Called from the shrinker as a last-ditch effort. For that it should move > > > _every_ object onto the unbound list. > > > - Called from execbuf for badly-fragmented address spaces to clean up the > > > mess. For that case we only care about one address space. > > > > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj, *next; > > > > - bool lists_empty; > > > > + bool lists_empty = true; > > > > int ret; > > > > > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > > > - list_empty(&vm->active_list)); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + lists_empty = (list_empty(&vm->inactive_list) && > > > > + list_empty(&vm->active_list)); > > > > + if (!lists_empty) > > > > + lists_empty = false; > > > > + } > > > > + > > > > if (lists_empty) > > > > return -ENOSPC; > > > > > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > > > i915_gem_retire_requests(dev); > > > > > > > > /* Having flushed everything, unbind() should never raise an error */ > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > > - if (obj->pin_count == 0) > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > > + if (obj->pin_count == 0) > > > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > > > + } > > > > > > > > return 0; > > > > } > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > index 5aeb447..e90182d 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > 
@@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > > > } > > > > > > > > static void > > > > -eb_destroy(struct eb_objects *eb) > > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > > > { > > > > while (!list_empty(&eb->objects)) { > > > > struct drm_i915_gem_object *obj; > > > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_relocation_entry *reloc) > > > > + struct drm_i915_gem_relocation_entry *reloc, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_device *dev = obj->base.dev; > > > > struct drm_gem_object *target_obj; > > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > > > > > static int > > > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > - struct eb_objects *eb) > > > > + struct eb_objects *eb, > > > > + struct i915_address_space *vm) > > > > { > > > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > do { > > > > u64 offset = r->presumed_offset; > > > > > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > > > > + vm); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > static int > > > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_relocation_entry *relocs) > > > > + struct drm_i915_gem_relocation_entry *relocs, > > > > + struct i915_address_space *vm) 
> > > > { > > > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > int i, ret; > > > > > > > > for (i = 0; i < entry->relocation_count; i++) { > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > > > + vm); > > > > if (ret) > > > > return ret; > > > > } > > > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > > } > > > > > > > > static int > > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > int ret = 0; > > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > > */ > > > > pagefault_disable(); > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > > > if (ret) > > > > break; > > > > } > > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > > struct intel_ring_buffer *ring, > > > > + struct i915_address_space *vm, > > > > bool *need_reloc) > > > > { > > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > > obj->tiling_mode != I915_TILING_NONE; > > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > > > + false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > 
> > obj->has_aliasing_ppgtt_mapping = 1; > > > > } > > > > > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > > > + entry->offset = i915_gem_obj_offset(obj, vm); > > > > *need_reloc = true; > > > > } > > > > > > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > { > > > > struct drm_i915_gem_exec_object2 *entry; > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound_any(obj)) > > > > return; > > > > > > > > entry = obj->exec_entry; > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > struct list_head *objects, > > > > + struct i915_address_space *vm, > > > > bool *need_relocs) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > list_for_each_entry(obj, objects, exec_list) { > > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > bool need_fence, need_mappable; > > > > + u32 obj_offset; > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > continue; > > > > > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups > > > here ... Maybe we should cache them in some pointer somewhere (either in > > > the eb object or by adding a new pointer to the object struct, e.g. > > > obj->eb_vma, similar to obj->eb_list).
> > > > > > > > > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > > > need_fence = > > > > has_fenced_gpu_access && > > > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > > > obj->tiling_mode != I915_TILING_NONE; > > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > > > + BUG_ON((need_mappable || need_fence) && > > > > + !i915_is_ggtt(vm)); > > > > + > > > > if ((entry->alignment && > > > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > > > + obj_offset & (entry->alignment - 1)) || > > > > (need_mappable && !obj->map_and_fenceable)) > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > else > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > > if (ret) > > > > goto err; > > > > } > > > > > > > > /* Bind fresh objects */ > > > > list_for_each_entry(obj, objects, exec_list) { > > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > > + if (i915_gem_obj_bound(obj, vm)) > > > > continue; > > > > > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > > if (ret) > > > > goto err; > > > > } > > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > > struct drm_file *file, > > > > struct intel_ring_buffer *ring, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_exec_object2 *exec) > > > > + struct drm_i915_gem_exec_object2 *exec, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_i915_gem_relocation_entry *reloc; > > > > struct drm_i915_gem_object *obj; > > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > > goto err; > > > > > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > > + 
ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > > if (ret) > > > > goto err; > > > > > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > > int offset = obj->exec_entry - exec; > > > > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > > > > - reloc + reloc_offset[offset]); > > > > + reloc + reloc_offset[offset], > > > > + vm); > > > > if (ret) > > > > goto err; > > > > } > > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > > > > > static void > > > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > > + struct i915_address_space *vm, > > > > struct intel_ring_buffer *ring) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > > obj->base.read_domains = obj->base.pending_read_domains; > > > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > > > > > - i915_gem_object_move_to_active(obj, ring); > > > > + i915_gem_object_move_to_active(obj, vm, ring); > > > > if (obj->base.write_domain) { > > > > obj->dirty = 1; > > > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > > > @@ -836,7 +853,8 @@ static int > > > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > struct drm_file *file, > > > > struct drm_i915_gem_execbuffer2 *args, > > > > - struct drm_i915_gem_exec_object2 *exec) > > > > + struct drm_i915_gem_exec_object2 *exec, > > > > + struct i915_address_space *vm) > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > struct eb_objects *eb; > > > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > > > /* Move the objects en-masse into the GTT, evicting if necessary. 
*/ > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > > if (ret) > > > > goto err; > > > > > > > > /* The objects are in their final locations, apply the relocations. */ > > > > if (need_relocs) > > > > - ret = i915_gem_execbuffer_relocate(eb); > > > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > > > if (ret) { > > > > if (ret == -EFAULT) { > > > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > > > - eb, exec); > > > > + eb, exec, vm); > > > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > > > } > > > > if (ret) > > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > goto err; > > > > } > > > > > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > > > + args->batch_start_offset; > > > > exec_len = args->batch_len; > > > > if (cliprects) { > > > > for (i = 0; i < args->num_cliprects; i++) { > > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > > > > > err: > > > > - eb_destroy(eb); > > > > + eb_destroy(eb, vm); > > > > > > > > mutex_unlock(&dev->struct_mutex); > > > > > > > > @@ -1105,6 +1124,7 @@ int > > > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > struct drm_i915_gem_execbuffer *args = data; > > > > struct drm_i915_gem_execbuffer2 exec2; > > > > struct 
drm_i915_gem_exec_object *exec_list = NULL; > > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > > > exec2.flags = I915_EXEC_RENDER; > > > > i915_execbuffer2_set_context_id(exec2, 0); > > > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > > > + &dev_priv->gtt.base); > > > > if (!ret) { > > > > /* Copy the new buffer offsets back to the user's exec list. */ > > > > for (i = 0; i < args->buffer_count; i++) > > > > @@ -1186,6 +1207,7 @@ int > > > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > struct drm_i915_gem_execbuffer2 *args = data; > > > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > > > int ret; > > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > > return -EFAULT; > > > > } > > > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > > > + &dev_priv->gtt.base); > > > > if (!ret) { > > > > /* Copy the new buffer offsets back to the user's exec list. 
*/ > > > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > index 298fc42..70ce2f6 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > > > ppgtt->base.total); > > > > } > > > > > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > > > + > > > > return ret; > > > > } > > > > > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > > struct drm_i915_gem_object *obj, > > > > enum i915_cache_level cache_level) > > > > { > > > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > > - cache_level); > > > > + struct i915_address_space *vm = &ppgtt->base; > > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > > + > > > > + vm->insert_entries(vm, obj->pages, > > > > + obj_offset >> PAGE_SHIFT, > > > > + cache_level); > > > > } > > > > > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > struct drm_i915_gem_object *obj) > > > > { > > > > - ppgtt->base.clear_range(&ppgtt->base, > > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > > - obj->base.size >> PAGE_SHIFT); > > > > + struct i915_address_space *vm = &ppgtt->base; > > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > > + > > > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > > > + obj->base.size >> PAGE_SHIFT); > > > > } > > > > > > > > extern int intel_iommu_gfx_mapped; > > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > > > dev_priv->gtt.base.start / PAGE_SIZE, > > > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > > > > > + if (dev_priv->mm.aliasing_ppgtt) > > > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > > > + > > > > list_for_each_entry(obj, 
&dev_priv->mm.bound_list, global_list) { > > > > i915_gem_clflush_object(obj); > > > > i915_gem_gtt_bind_object(obj, obj->cache_level); > > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > * aperture. One page should be enough to keep any prefetching inside > > > > * of the aperture. > > > > */ > > > > - drm_i915_private_t *dev_priv = dev->dev_private; > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > > > struct drm_mm_node *entry; > > > > struct drm_i915_gem_object *obj; > > > > unsigned long hole_start, hole_end; > > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > BUG_ON(mappable_end > end); > > > > > > > > /* Subtract the guard page ... */ > > > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > > > if (!HAS_LLC(dev)) > > > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > > > > > /* Mark any preallocated objects as occupied */ > > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > > > int ret; > > > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > > > if (ret) > > > > DRM_DEBUG_KMS("Reservation failed\n"); > > > > obj->has_global_gtt_mapping = 1; > > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > dev_priv->gtt.base.total = end - start; > > > > > > > > /* Clear any non-preallocated blocks */ > > > > - drm_mm_for_each_hole(entry, 
&dev_priv->gtt.base.mm, > > > > - hole_start, hole_end) { > > > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > > > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > > > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > > > hole_start, hole_end); > > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > > - hole_start / PAGE_SIZE, > > > > - count); > > > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > > > } > > > > > > > > /* And finally clear the reserved guard page */ > > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > > - end / PAGE_SIZE - 1, 1); > > > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > > > } > > > > > > > > static bool > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > index 245eb1d..bfe61fa 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > > return obj; > > > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > > + vma = i915_gem_vma_create(obj, vm); > > > > if (!vma) { > > > > drm_gem_object_unreference(&obj->base); > > > > return NULL; > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > */ > > > > vma->node.start = gtt_offset; > > > > vma->node.size = size; > > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > + if (drm_mm_initialized(&vm->mm)) { > > > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > > preallocated stuff for objects with mappings in the global gtt. 
So keeping > > > that explicit is imo the better approach. And tbh I'm confused where the > > > local variable vm is from ... > > > > > > > if (ret) { > > > > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > > > > goto unref_out; > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > > index 92a8d27..808ca2a 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > > > > > > > obj->map_and_fenceable = > > > > !i915_gem_obj_ggtt_bound(obj) || > > > > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > > > > + (i915_gem_obj_ggtt_offset(obj) + > > > > + obj->base.size <= dev_priv->gtt.mappable_end && > > > > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > > > > > > > /* Rebind if we need a change of alignment */ > > > > if (!obj->map_and_fenceable) { > > > > - u32 unfenced_alignment = > > > > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > > > > + u32 unfenced_align = > > > > i915_gem_get_gtt_alignment(dev, obj->base.size, > > > > args->tiling_mode, > > > > false); > > > > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > > > > - ret = i915_gem_object_unbind(obj); > > > > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > > > > + ret = i915_gem_object_unbind(obj, ggtt); > > > > } > > > > > > > > if (ret == 0) { > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > > > index 79fbb17..28fa0ff 100644 > > > > --- a/drivers/gpu/drm/i915/i915_irq.c > > > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, > > > > if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { > > > > u32 acthd = I915_READ(ACTHD); > > > > > > > > + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) > > > > + return NULL; > > 
> > + > > > > if (WARN_ON(ring->id != RCS)) > > > > return NULL; > > > > > > > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, > > > > return; > > > > > > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > > - if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) { > > > > + if ((error->ccid & PAGE_MASK) == > > > > + i915_gem_obj_ggtt_offset(obj)) { > > > > ering->ctx = i915_error_object_create_sized(dev_priv, > > > > obj, 1); > > > > break; > > > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > > > > index 7d283b5..3f019d3 100644 > > > > --- a/drivers/gpu/drm/i915/i915_trace.h > > > > +++ b/drivers/gpu/drm/i915/i915_trace.h > > > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > > > > ); > > > > > > > > TRACE_EVENT(i915_gem_object_bind, > > > > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > > > > - TP_ARGS(obj, mappable), > > > > + TP_PROTO(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, bool mappable), > > > > + TP_ARGS(obj, vm, mappable), > > > > > > > > TP_STRUCT__entry( > > > > __field(struct drm_i915_gem_object *, obj) > > > > + __field(struct i915_address_space *, vm) > > > > __field(u32, offset) > > > > __field(u32, size) > > > > __field(bool, mappable) > > > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > > > > > > > TP_fast_assign( > > > > __entry->obj = obj; > > > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > > > + __entry->size = i915_gem_obj_size(obj, vm); > > > > __entry->mappable = mappable; > > > > ), > > > > > > > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > > > > ); > > > > > > > > TRACE_EVENT(i915_gem_object_unbind, > > > > - TP_PROTO(struct drm_i915_gem_object *obj), > > > > - TP_ARGS(obj), > > > > + TP_PROTO(struct 
drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm), > > > > + TP_ARGS(obj, vm), > > > > > > > > TP_STRUCT__entry( > > > > __field(struct drm_i915_gem_object *, obj) > > > > + __field(struct i915_address_space *, vm) > > > > __field(u32, offset) > > > > __field(u32, size) > > > > ), > > > > > > > > TP_fast_assign( > > > > __entry->obj = obj; > > > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > > > + __entry->size = i915_gem_obj_size(obj, vm); > > > > ), > > > > > > > > TP_printk("obj=%p, offset=%08x size=%x", > > > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > > > > index f3c97e0..b69cc63 100644 > > > > --- a/drivers/gpu/drm/i915/intel_fb.c > > > > +++ b/drivers/gpu/drm/i915/intel_fb.c > > > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > > > > fb->width, fb->height, > > > > i915_gem_obj_ggtt_offset(obj), obj); > > > > > > > > - > > > > mutex_unlock(&dev->struct_mutex); > > > > vga_switcheroo_client_fb_set(dev->pdev, info); > > > > return 0; > > > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > > > > index 81c3ca1..517e278 100644 > > > > --- a/drivers/gpu/drm/i915/intel_overlay.c > > > > +++ b/drivers/gpu/drm/i915/intel_overlay.c > > > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > > > > } > > > > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > > > > } else { > > > > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > > > > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > > > > if (ret) { > > > > DRM_ERROR("failed to pin overlay register bo\n"); > > > > goto out_free_bo; > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > > > index 125a741..449e57c 100644 > > > > --- a/drivers/gpu/drm/i915/intel_pm.c > > > > +++ 
b/drivers/gpu/drm/i915/intel_pm.c > > > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) > > > > return NULL; > > > > } > > > > > > > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > > > > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > > > > if (ret) { > > > > DRM_ERROR("failed to pin power context: %d\n", ret); > > > > goto err_unref; > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > > > index bc4c11b..ebed61d 100644 > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > > > @@ -481,6 +481,7 @@ out: > > > > static int > > > > init_pipe_control(struct intel_ring_buffer *ring) > > > > { > > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > > struct pipe_control *pc; > > > > struct drm_i915_gem_object *obj; > > > > int ret; > > > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > > > > goto err; > > > > } > > > > > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > + I915_CACHE_LLC); > > > > > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > > > if (ret) > > > > goto err_unref; > > > > > > > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > > > > static int init_status_page(struct intel_ring_buffer *ring) > > > > { > > > > struct drm_device *dev = ring->dev; > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > struct drm_i915_gem_object *obj; > > > > int ret; > > > > > > > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > > > > goto err; > > > > } > > > > > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > + I915_CACHE_LLC); > > > > > 
> > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > > > if (ret != 0) { > > > > goto err_unref; > > > > } > > > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > > > > > > > ring->obj = obj; > > > > > > > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > > > > if (ret) > > > > goto err_unref; > > > > > > > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > > > > return -ENOMEM; > > > > } > > > > > > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > > > if (ret != 0) { > > > > drm_gem_object_unreference(&obj->base); > > > > DRM_ERROR("Failed to ping batch bo\n"); > > > > -- > > > > 1.8.3.2 > > > > > > > > _______________________________________________ > > > > Intel-gfx mailing list > > > > Intel-gfx@lists.freedesktop.org > > > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > > > -- > > Ben Widawsky, Intel Open Source Technology Center > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Thu, Jul 11, 2013 at 12:23 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > On Wed, Jul 10, 2013 at 07:05:52PM +0200, Daniel Vetter wrote: >> On Wed, Jul 10, 2013 at 09:37:10AM -0700, Ben Widawsky wrote: >> > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: >> > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: >> > > > This patch was formerly known as: >> > > > "drm/i915: Create VMAs (part 3) - plumbing" >> > > > >> > > > This patch adds a VM argument, bind/unbind, and the object >> > > > offset/size/color getters/setters. It preserves the old ggtt helper >> > > > functions because things still need, and will continue to need them. >> > > > >> > > > Some code will still need to be ported over after this. >> > > > >> > > > v2: Fix purge to pick an object and unbind all vmas >> > > > This was doable because of the global bound list change. >> > > > >> > > > v3: With the commit to actually pin/unpin pages in place, there is no >> > > > longer a need to check if unbind succeeded before calling put_pages(). >> > > > Make put_pages only BUG() after checking pin count. >> > > > >> > > > v4: Rebased on top of the new hangcheck work by Mika >> > > > plumbed eb_destroy also >> > > > Many checkpatch related fixes >> > > > >> > > > v5: Very large rebase >> > > > >> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> >> > > >> > > This one is a rather large beast. Any chance we could split it into >> > > topics, e.g. convert execbuf code, convert shrinker code? Or does that get >> > > messy, fast? >> > > >> > >> > I've thought of this... >> > >> > The one solution I came up with is to have two bind/unbind functions >> > (similar to what I did with pin, and indeed it was my original plan with >> > pin), and do the set_caching one separately. >> > >> > I think it won't be too messy, just a lot of typing, as Keith likes to >> > say. 
>> > >> > However, my opinion was, since it's early in the merge cycle, we don't >> > yet have multiple VMs, and it's /mostly/ a copypasta kind of patch, it's >> > not a big deal. At a functional level too, I felt this made more sense. >> > >> > So I'll defer to your request on this and start splitting it up, unless >> > my email has changed your mind ;-). >> >> Well, my concern is mostly in reviewing since we need to think about each >> case and whether it makes sense to talk in terms of vma or objects in >> that function. And what exactly to test. >> >> If you've played around and concluded it'll be a mess then I don't think >> it'll help in reviewing. So pointless. > > I said I don't think it will be a mess, though I feel it won't really > help review too much. Can you take a crack and review and poke me if you > want me to try it. I'd rather not do it if I can avoid it, so I can try > to go back to my 15 patch maximum rule. > >> >> Still, there's a bunch of questions on this patch that we need to discuss >> ;-) > > Ready whenever. It's waiting for you in my first reply, just scroll down a bit ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > This patch was formerly known as: > > "drm/i915: Create VMAs (part 3) - plumbing" > > > > This patch adds a VM argument, bind/unbind, and the object > > offset/size/color getters/setters. It preserves the old ggtt helper > > functions because things still need, and will continue to need them. > > > > Some code will still need to be ported over after this. > > > > v2: Fix purge to pick an object and unbind all vmas > > This was doable because of the global bound list change. > > > > v3: With the commit to actually pin/unpin pages in place, there is no > > longer a need to check if unbind succeeded before calling put_pages(). > > Make put_pages only BUG() after checking pin count. > > > > v4: Rebased on top of the new hangcheck work by Mika > > plumbed eb_destroy also > > Many checkpatch related fixes > > > > v5: Very large rebase > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > This one is a rather large beast. Any chance we could split it into > topics, e.g. convert execbuf code, convert shrinker code? Or does that get > messy, fast? 
> > > --- > > drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- > > drivers/gpu/drm/i915/i915_dma.c | 4 - > > drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- > > drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- > > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > > drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- > > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > > drivers/gpu/drm/i915/i915_irq.c | 6 +- > > drivers/gpu/drm/i915/i915_trace.h | 20 +- > > drivers/gpu/drm/i915/intel_fb.c | 1 - > > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > > 16 files changed, 468 insertions(+), 239 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > > index 16b2aaf..867ed07 100644 > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > > seq_printf(m, " (pinned x %d)", obj->pin_count); > > if (obj->fence_reg != I915_FENCE_REG_NONE) > > seq_printf(m, " (fence: %d)", obj->fence_reg); > > - if (i915_gem_obj_ggtt_bound(obj)) > > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > > + if (i915_gem_obj_bound_any(obj)) { > > list_for_each will short-circuit already, so this is redundant. > Got it. > > + struct i915_vma *vma; > > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > > + if (!i915_is_ggtt(vma->vm)) > > + seq_puts(m, " (pp"); > > + else > > + seq_puts(m, " (g"); > > + seq_printf(m, " gtt offset: %08lx, size: %08lx)", > > ^ that space looks superfluous now Got it. 
> > > + i915_gem_obj_offset(obj, vma->vm), > > + i915_gem_obj_size(obj, vma->vm)); > > + } > > + } > > if (obj->stolen) > > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > > if (obj->pin_mappable || obj->fault_mappable) { > > @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > > return 0; > > } > > > > +/* FIXME: Support multiple VM? */ > > #define count_objects(list, member) do { \ > > list_for_each_entry(obj, list, member) { \ > > size += i915_gem_obj_ggtt_size(obj); \ > > @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) > > > > if (val & DROP_BOUND) { > > list_for_each_entry_safe(obj, next, &vm->inactive_list, > > - mm_list) > > - if (obj->pin_count == 0) { > > - ret = i915_gem_object_unbind(obj); > > - if (ret) > > - goto unlock; > > - } > > + mm_list) { > > + if (obj->pin_count) > > + continue; > > + > > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > > + if (ret) > > + goto unlock; > > + } > > } > > > > if (val & DROP_UNBOUND) { > > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > > global_list) > > if (obj->pages_pin_count == 0) { > > + /* FIXME: Do this for all vms? 
*/ > > ret = i915_gem_object_put_pages(obj); > > if (ret) > > goto unlock; > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > > index d13e21f..b190439 100644 > > --- a/drivers/gpu/drm/i915/i915_dma.c > > +++ b/drivers/gpu/drm/i915/i915_dma.c > > @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > > > i915_dump_device_info(dev_priv); > > > > - INIT_LIST_HEAD(&dev_priv->vm_list); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > > - > > if (i915_get_bridge_dev(dev)) { > > ret = -EIO; > > goto free_priv; > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 38cccc8..48baccc 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { > > > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > > > -/* This is a temporary define to help transition us to real VMAs. If you see > > - * this, you're either reviewing code, or bisecting it. 
*/ > > -static inline struct i915_vma * > > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > > -{ > > - if (list_empty(&obj->vma_list)) > > - return NULL; > > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > > -} > > - > > -/* Whether or not this object is currently mapped by the translation tables */ > > -static inline bool > > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > > -{ > > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > > - if (vma == NULL) > > - return false; > > - return drm_mm_node_allocated(&vma->node); > > -} > > - > > -/* Offset of the first PTE pointing to this object */ > > -static inline unsigned long > > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > > -{ > > - BUG_ON(list_empty(&o->vma_list)); > > - return __i915_gem_obj_to_vma(o)->node.start; > > -} > > - > > -/* The size used in the translation tables may be larger than the actual size of > > - * the object on GEN2/GEN3 because of the way tiling is handled. See > > - * i915_gem_get_gtt_size() for more details. > > - */ > > -static inline unsigned long > > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > > -{ > > - BUG_ON(list_empty(&o->vma_list)); > > - return __i915_gem_obj_to_vma(o)->node.size; > > -} > > - > > -static inline void > > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > > - enum i915_cache_level color) > > -{ > > - __i915_gem_obj_to_vma(o)->node.color = color; > > -} > > - > > /** > > * Request queue structure. 
> > * > > @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > > void i915_gem_vma_destroy(struct i915_vma *vma); > > > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > uint32_t alignment, > > bool map_and_fenceable, > > bool nonblocking); > > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm); > > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > > void i915_gem_lastclose(struct drm_device *dev); > > @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > > struct intel_ring_buffer *to); > > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring); > > > > int i915_gem_dumb_create(struct drm_file *file_priv, > > @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > > int tiling_mode, bool fenced); > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > enum i915_cache_level cache_level); > > > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > > > void i915_gem_restore_fences(struct drm_device *dev); > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +unsigned long 
i915_gem_obj_size(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm); > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm, > > + enum i915_cache_level color); > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm); > > +/* Some GGTT VM helpers */ > > +#define obj_to_ggtt(obj) \ > > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > > +{ > > + struct i915_address_space *ggtt = > > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > > + return vm == ggtt; > > +} > > + > > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline unsigned long > > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline unsigned long > > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > > +{ > > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > > +} > > + > > +static inline int __must_check > > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > > + uint32_t alignment, > > + bool map_and_fenceable, > > + bool nonblocking) > > +{ > > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > > + map_and_fenceable, nonblocking); > > +} > > +#undef obj_to_ggtt > > + > > /* i915_gem_context.c */ > > void i915_gem_context_init(struct drm_device *dev); > > void i915_gem_context_fini(struct drm_device *dev); > > @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > > +/* FIXME: this is never okay with full PPGTT */ > > void i915_gem_gtt_bind_object(struct 
drm_i915_gem_object *obj, > > enum i915_cache_level cache_level); > > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > > @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > > > > /* i915_gem_evict.c */ > > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > > +int __must_check i915_gem_evict_something(struct drm_device *dev, > > + struct i915_address_space *vm, > > + int min_size, > > unsigned alignment, > > unsigned cache_level, > > bool mappable, > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 058ad44..21015cd 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -38,10 +38,12 @@ > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > - unsigned alignment, > > - bool map_and_fenceable, > > - bool nonblocking); > > +static __must_check int > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > + unsigned alignment, > > + bool map_and_fenceable, > > + bool nonblocking); > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > struct drm_i915_gem_object *obj, > > struct drm_i915_gem_pwrite *args, > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > static inline bool > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > { > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > + return i915_gem_obj_bound_any(obj) && !obj->active; > > } > > > > int > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > * anyway again before the next pread happens. 
*/ > > if (obj->cache_level == I915_CACHE_NONE) > > needs_clflush = 1; > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + if (i915_gem_obj_bound_any(obj)) { > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > This is essentially a very convoluted version of "if there's gpu rendering > outstanding, please wait for it". Maybe we should switch this to > > if (obj->active) > wait_rendering(obj, true); > > Same for the shmem_pwrite case below. Would be a separate patch to prep > things though. Can I volunteer you for that? The ugly part is to review > whether any of the lru list updating that set_domain does in addition to > wait_rendering is required, but on a quick read that's not the case. Just reading the comment above it says we need the clflush. I don't actually understand why we do that even after reading the comment, but meh. You tell me, I don't mind doing this as a prep first. > > > if (ret) > > return ret; > > @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > > char __user *user_data; > > int page_offset, page_length, ret; > > > > - ret = i915_gem_object_pin(obj, 0, true, true); > > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > > if (ret) > > goto out; > > > > @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > > * right away and we therefore have to clflush anyway. */ > > if (obj->cache_level == I915_CACHE_NONE) > > needs_clflush_after = 1; > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + if (i915_gem_obj_bound_any(obj)) { > > ... see above. 
> > ret = i915_gem_object_set_to_gtt_domain(obj, true); > > if (ret) > > return ret; > > @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > > } > > > > /* Now bind it into the GTT if needed */ > > - ret = i915_gem_object_pin(obj, 0, true, false); > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > if (ret) > > goto unlock; > > > > @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > > if (obj->pages == NULL) > > return 0; > > > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > > - > > if (obj->pages_pin_count) > > return -EBUSY; > > > > + BUG_ON(i915_gem_obj_bound_any(obj)); > > + > > /* ->put_pages might need to allocate memory for the bit17 swizzle > > * array, hence protect them from being reaped by removing them from gtt > > * lists early. */ > > @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > bool purgeable_only) > > { > > struct drm_i915_gem_object *obj, *next; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > long count = 0; > > > > list_for_each_entry_safe(obj, next, > > @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > > } > > } > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > > - i915_gem_object_unbind(obj) == 0 && > > - i915_gem_object_put_pages(obj) == 0) { > > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > > + global_list) { > > + struct i915_vma *vma, *v; > > + > > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > > + continue; > > + > > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > > + if (i915_gem_object_unbind(obj, vma->vm)) > > + break; > > + > > + if (!i915_gem_object_put_pages(obj)) > > count += obj->base.size >> PAGE_SHIFT; > > - if (count >= target) > > - return count; > > - } > > + > > + if (count >= target) > > + return count; > > 
} > > > > return count; > > @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > > > void > > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring) > > { > > struct drm_device *dev = obj->base.dev; > > struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > u32 seqno = intel_ring_get_seqno(ring); > > > > BUG_ON(ring == NULL); > > @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > > } > > > > static void > > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > - struct drm_device *dev = obj->base.dev; > > - struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > - > > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > > BUG_ON(!obj->active); > > > > @@ -2117,10 +2123,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > > spin_unlock(&file_priv->mm.lock); > > } > > > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > > + if (acthd >= i915_gem_obj_offset(obj, vm) && > > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > > return true; > > > > return false; > > @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > > return false; > > } > > > > +static struct i915_address_space * > > +request_to_vm(struct drm_i915_gem_request *request) > > +{ > > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > > + 
struct i915_address_space *vm; > > + > > + vm = &dev_priv->gtt.base; > > + > > + return vm; > > +} > > + > > static bool i915_request_guilty(struct drm_i915_gem_request *request, > > const u32 acthd, bool *inside) > > { > > @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > > * pointing inside the ring, matches the batch_obj address range. > > * However this is extremely unlikely. > > */ > > - > > if (request->batch_obj) { > > - if (i915_head_inside_object(acthd, request->batch_obj)) { > > + if (i915_head_inside_object(acthd, request->batch_obj, > > + request_to_vm(request))) { > > *inside = true; > > return true; > > } > > @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > > { > > struct i915_ctx_hang_stats *hs = NULL; > > bool inside, guilty; > > + unsigned long offset = 0; > > > > /* Innocent until proven guilty */ > > guilty = false; > > > > + if (request->batch_obj) > > + offset = i915_gem_obj_offset(request->batch_obj, > > + request_to_vm(request)); > > + > > if (ring->hangcheck.action != wait && > > i915_request_guilty(request, acthd, &inside)) { > > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > > ring->name, > > inside ? "inside" : "flushing", > > - request->batch_obj ? > > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > > + offset, > > request->ctx ? 
request->ctx->id : 0, > > acthd); > > > > @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > > } > > > > while (!list_empty(&ring->active_list)) { > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > > > obj = list_first_entry(&ring->active_list, > > struct drm_i915_gem_object, > > ring_list); > > > > - i915_gem_object_move_to_inactive(obj); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + i915_gem_object_move_to_inactive(obj, vm); > > } > > } > > > > @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > > void i915_gem_reset(struct drm_device *dev) > > { > > struct drm_i915_private *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > struct intel_ring_buffer *ring; > > int i; > > @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) > > /* Move everything out of the GPU domains to ensure we do any > > * necessary invalidation upon reuse. > > */ > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > > > i915_gem_restore_fences(dev); > > } > > @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> > */ > > while (!list_empty(&ring->active_list)) { > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > > > obj = list_first_entry(&ring->active_list, > > @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > > break; > > > > - i915_gem_object_move_to_inactive(obj); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + i915_gem_object_move_to_inactive(obj, vm); > > } > > > > if (unlikely(ring->trace_irq_seqno && > > @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > > * Unbinds an object from the GTT aperture. > > */ > > int > > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > { > > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > > struct i915_vma *vma; > > int ret; > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound(obj, vm)) > > return 0; > > > > if (obj->pin_count) > > @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > if (ret) > > return ret; > > > > - trace_i915_gem_object_unbind(obj); > > + trace_i915_gem_object_unbind(obj, vm); > > > > if (obj->has_global_gtt_mapping) > > i915_gem_gtt_unbind_object(obj); > > @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > > /* Avoid an unnecessary call to unbind on rebind. 
*/ > > obj->map_and_fenceable = true; > > > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > list_del(&vma->vma_link); > > drm_mm_remove_node(&vma->node); > > i915_gem_vma_destroy(vma); > > @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > > i915_gem_obj_ggtt_offset(obj), size); > > > > + > > pitch_val = obj->stride / 128; > > pitch_val = ffs(pitch_val) - 1; > > > > @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > > */ > > static int > > i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > unsigned alignment, > > bool map_and_fenceable, > > bool nonblocking) > > { > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > u32 size, fence_size, fence_alignment, unfenced_alignment; > > bool mappable, fenceable; > > - size_t gtt_max = map_and_fenceable ? > > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > > + size_t gtt_max = > > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > > struct i915_vma *vma; > > int ret; > > > > if (WARN_ON(!list_empty(&obj->vma_list))) > > return -EBUSY; > > > > + BUG_ON(!i915_is_ggtt(vm)); > > + > > fence_size = i915_gem_get_gtt_size(dev, > > obj->base.size, > > obj->tiling_mode); > > @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > > > vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > + /* For now we only ever use 1 vma per object */ > > + WARN_ON(!list_empty(&obj->vma_list)); > > + > > + vma = i915_gem_vma_create(obj, vm); > > if (vma == NULL) { > > i915_gem_object_unpin_pages(obj); > > return -ENOMEM; > > } > > > > search_free: > > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > > - &vma->node, > > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > > size, alignment, > > obj->cache_level, 0, gtt_max); > > if (ret) { > > - ret = i915_gem_evict_something(dev, size, alignment, > > + ret = i915_gem_evict_something(dev, vm, size, alignment, > > obj->cache_level, > > map_and_fenceable, > > nonblocking); > > @@ -3162,18 +3197,25 @@ search_free: > > > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > > list_add_tail(&obj->mm_list, &vm->inactive_list); > > - list_add(&vma->vma_link, &obj->vma_list); > > + > > + /* Keep GGTT vmas first to make debug easier */ > > + if (i915_is_ggtt(vm)) > > + list_add(&vma->vma_link, &obj->vma_list); > > + else > > + list_add_tail(&vma->vma_link, &obj->vma_list); > > > > fenceable = > > + i915_is_ggtt(vm) && > > i915_gem_obj_ggtt_size(obj) == fence_size && > > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > > - dev_priv->gtt.mappable_end; > > + mappable = > > + i915_is_ggtt(vm) && > > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > > > obj->map_and_fenceable = mappable && fenceable; > > > > - 
trace_i915_gem_object_bind(obj, map_and_fenceable); > > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > > i915_gem_verify_gtt(dev); > > return 0; > > } > > @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > int ret; > > > > /* Not valid to be called on unbound objects. */ > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound_any(obj)) > > return -EINVAL; > > If we're converting the shmem paths over to wait_rendering then there's > only the fault handler and the set_domain ioctl left. For the later it > would make sense to clflush even when an object is on the unbound list, to > allow userspace to optimize when the clflushing happens. But that would > only make sense in conjunction with Chris' create2 ioctl and a flag to > preallocate the storage (and so putting the object onto the unbound list). > So nothing to do here. > > > > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > > @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > } > > > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > enum i915_cache_level cache_level) > > { > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > int ret; > > > > if (obj->cache_level == cache_level) > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > } > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > if (ret) > > return ret; > > } > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + if (!i915_gem_obj_bound(obj, vm)) > > + continue; > > Hm, shouldn't 
we have a per-object list of vmas? Or will that follow later > on? > > Self-correction: It exists already ... why can't we use this here? Yes. That should work, I'll fix it and test it. It looks slightly worse IMO in terms of code clarity, but I don't mind the change. > > > + > > ret = i915_gem_object_finish_gpu(obj); > > if (ret) > > return ret; > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > obj, cache_level); > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > + i915_gem_obj_set_color(obj, vm, cache_level); > > } > > > > if (cache_level == I915_CACHE_NONE) { > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > struct drm_i915_gem_caching *args = data; > > + struct drm_i915_private *dev_priv; > > struct drm_i915_gem_object *obj; > > enum i915_cache_level level; > > int ret; > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > ret = -ENOENT; > > goto unlock; > > } > > + dev_priv = obj->base.dev->dev_private; > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > + /* FIXME: Add interface for specific VM? */ > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > drm_gem_object_unreference(&obj->base); > > unlock: > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > u32 alignment, > > struct intel_ring_buffer *pipelined) > > { > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > u32 old_read_domains, old_write_domain; > > int ret; > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > * of uncaching, which would allow us to flush all the LLC-cached data > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > */ > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_NONE); > > if (ret) > > return ret; > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > * (e.g. libkms for the bootup splash), we have to ensure that we > > * always use map_and_fenceable for all scanout buffers. > > */ > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > if (ret) > > return ret; > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > int > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, > > uint32_t alignment, > > bool map_and_fenceable, > > bool nonblocking) > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > return -EBUSY; > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > WARN_ON, since presumably we can keep on going if we get this wrong > (albeit with slightly corrupted state, so render corruptions might > follow). Can we make a deal, can we leave this as BUG_ON with a FIXME to convert it at the end of merging? 
> > > + > > + if (i915_gem_obj_bound(obj, vm)) { > > + if ((alignment && > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > (map_and_fenceable && !obj->map_and_fenceable)) { > > WARN(obj->pin_count, > > "bo is already pinned with incorrect alignment:" > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > " obj->map_and_fenceable=%d\n", > > - i915_gem_obj_ggtt_offset(obj), alignment, > > + i915_gem_obj_offset(obj, vm), alignment, > > map_and_fenceable, > > obj->map_and_fenceable); > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > if (ret) > > return ret; > > } > > } > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > + if (!i915_gem_obj_bound(obj, vm)) { > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > map_and_fenceable, > > nonblocking); > > if (ret) > > @@ -3684,7 +3739,7 @@ void > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > { > > BUG_ON(obj->pin_count == 0); > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > if (--obj->pin_count == 0) > > obj->pin_mappable = false; > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > } > > > > if (obj->user_pin_count == 0) { > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > if (ret) > > goto out; > > } > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > struct drm_device *dev = obj->base.dev; > > drm_i915_private_t *dev_priv = dev->dev_private; > > + struct i915_vma *vma, *next; > > > > trace_i915_gem_object_destroy(obj); > > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > i915_gem_detach_phys_object(dev, obj); > 
> > > obj->pin_count = 0; > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > - bool was_interruptible; > > + /* NB: 0 or 1 elements */ > > + WARN_ON(!list_empty(&obj->vma_list) && > > + !list_is_singular(&obj->vma_list)); > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > + bool was_interruptible; > > > > - was_interruptible = dev_priv->mm.interruptible; > > - dev_priv->mm.interruptible = false; > > + was_interruptible = dev_priv->mm.interruptible; > > + dev_priv->mm.interruptible = false; > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > - dev_priv->mm.interruptible = was_interruptible; > > + dev_priv->mm.interruptible = was_interruptible; > > + } > > } > > > > /* Stolen objects don't hold a ref, but do hold pin count. Fix that up > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > INIT_LIST_HEAD(&ring->request_list); > > } > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > + struct i915_address_space *vm) > > +{ > > + vm->dev = dev_priv->dev; > > + INIT_LIST_HEAD(&vm->active_list); > > + INIT_LIST_HEAD(&vm->inactive_list); > > + INIT_LIST_HEAD(&vm->global_link); > > + list_add(&vm->global_link, &dev_priv->vm_list); > > +} > > + > > void > > i915_gem_load(struct drm_device *dev) > > { > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > SLAB_HWCACHE_ALIGN, > > NULL); > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > + > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct 
shrink_control *sc) > > struct drm_i915_private, > > mm.inactive_shrinker); > > struct drm_device *dev = dev_priv->dev; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj; > > - int nr_to_scan = sc->nr_to_scan; > > + int nr_to_scan; > > bool unlock = true; > > int cnt; > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > unlock = false; > > } > > > > + nr_to_scan = sc->nr_to_scan; > > if (nr_to_scan) { > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > if (nr_to_scan > 0) > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > if (obj->pages_pin_count == 0) > > cnt += obj->base.size >> PAGE_SHIFT; > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > - cnt += obj->base.size >> PAGE_SHIFT; > > + > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > + cnt += obj->base.size >> PAGE_SHIFT; > > Isn't this now double-counting objects? In the shrinker we only care about > how much physical RAM an object occupies, not how much virtual space it > occupies. So just walking the bound list of objects here should be good > enough ... > Maybe I've misunderstood you. My code is wrong, but I think your idea requires a prep patch because it changes functionality, right? So let me know if I've understood you. 
> > > > if (unlock) > > mutex_unlock(&dev->struct_mutex); > > return cnt; > > } > > + > > +/* All the new VM stuff */ > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > + struct i915_vma *vma; > > + > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > + vm = &dev_priv->gtt.base; > > + > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > Imo the vma list walking here and in the other helpers below indicates > that we should deal more often in vmas instead of (object, vm) pairs. Or > is this again something that'll get fixed later on? > > I just want to avoid diff churn, and it also makes reviewing easier if the > foreshadowing is correct ;-) So generally I'd vote for more liberal > sprinkling of obj_to_vma in callers. It's not something I fixed in the whole series. I think it makes sense conceptually, to keep some things as <obj,vm> and others as direct vma. If you want me to change something, you need to be more specific since no action specifically comes to mind at this point in the series. 
> > > + if (vma->vm == vm) > > + return vma->node.start; > > + > > + } > > + return -1; > > +} > > + > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > +{ > > + return !list_empty(&o->vma_list); > > +} > > + > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct i915_vma *vma; > > + > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) > > + return true; > > + } > > + return false; > > +} > > + > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm) > > +{ > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > + struct i915_vma *vma; > > + > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > + vm = &dev_priv->gtt.base; > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) > > + return vma->node.size; > > + } > > + > > + return 0; > > +} > > + > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > + struct i915_address_space *vm, > > + enum i915_cache_level color) > > +{ > > + struct i915_vma *vma; > > + BUG_ON(list_empty(&o->vma_list)); > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > + if (vma->vm == vm) { > > + vma->node.color = color; > > + return; > > + } > > + } > > + > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > +} > > + > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm) > > +{ > > + struct i915_vma *vma; > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > + if (vma->vm == vm) > > + return vma; > > + > > + return NULL; > > +} > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > index 2074544..c92fd81 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device 
*dev, > > > > if (INTEL_INFO(dev)->gen >= 7) { > > ret = i915_gem_object_set_cache_level(ctx->obj, > > + &dev_priv->gtt.base, > > I915_CACHE_LLC_MLC); > > /* Failure shouldn't ever happen this early */ > > if (WARN_ON(ret)) > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > * default context. > > */ > > dev_priv->ring[RCS].default_context = ctx; > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > if (ret) { > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > goto err_destroy; > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > static int do_switch(struct i915_hw_context *to) > > { > > struct intel_ring_buffer *ring = to->ring; > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > struct i915_hw_context *from = ring->last_context; > > u32 hw_flags = 0; > > int ret; > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > if (from == to) > > return 0; > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > if (ret) > > return ret; > > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > > */ > > if (from != NULL) { > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > - i915_gem_object_move_to_active(from->obj, ring); > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > + ring); > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > * whole damn pipeline, we don't need to explicitly mark the > > * object dirty. 
The only exception is that the context must be > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > index df61f33..32efdc0 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > @@ -32,24 +32,21 @@ > > #include "i915_trace.h" > > > > static bool > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > { > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > - > > - if (obj->pin_count) > > + if (vma->obj->pin_count) > > return false; > > > > - list_add(&obj->exec_list, unwind); > > + list_add(&vma->obj->exec_list, unwind); > > return drm_mm_scan_add_block(&vma->node); > > } > > > > int > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > - unsigned alignment, unsigned cache_level, > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > + int min_size, unsigned alignment, unsigned cache_level, > > bool mappable, bool nonblocking) > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > struct list_head eviction_list, unwind_list; > > struct i915_vma *vma; > > struct drm_i915_gem_object *obj; > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > */ > > > > INIT_LIST_HEAD(&unwind_list); > > - if (mappable) > > + if (mappable) { > > + BUG_ON(!i915_is_ggtt(vm)); > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > alignment, cache_level, 0, > > dev_priv->gtt.mappable_end); > > - else > > + } else > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > /* First see if there is a large enough contiguous idle region... 
*/ > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > - if (mark_free(obj, &unwind_list)) > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > + if (mark_free(vma, &unwind_list)) > > goto found; > > } > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > /* Now merge in the soon-to-be-expired objects... */ > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > - if (mark_free(obj, &unwind_list)) > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > + if (mark_free(vma, &unwind_list)) > > goto found; > > } > > > > @@ -109,7 +109,7 @@ none: > > obj = list_first_entry(&unwind_list, > > struct drm_i915_gem_object, > > exec_list); > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > ret = drm_mm_scan_remove_block(&vma->node); > > BUG_ON(ret); > > > > @@ -130,7 +130,7 @@ found: > > obj = list_first_entry(&unwind_list, > > struct drm_i915_gem_object, > > exec_list); > > - vma = __i915_gem_obj_to_vma(obj); > > + vma = i915_gem_obj_to_vma(obj, vm); > > if (drm_mm_scan_remove_block(&vma->node)) { > > list_move(&obj->exec_list, &eviction_list); > > drm_gem_object_reference(&obj->base); > > @@ -145,7 +145,7 @@ found: > > struct drm_i915_gem_object, > > exec_list); > > if (ret == 0) > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > > > list_del_init(&obj->exec_list); > > drm_gem_object_unreference(&obj->base); > > @@ -158,13 +158,18 @@ int > > i915_gem_evict_everything(struct drm_device *dev) > > I suspect evict_everything eventually wants a address_space *vm argument > for those cases where we only want to evict everything in a given vm. Atm > we have two use-cases of this: > - Called from the shrinker as a last-ditch effort. For that it should move > _every_ object onto the unbound list. > - Called from execbuf for badly-fragmented address spaces to clean up the > mess. For that case we only care about one address space. 
The current thing is more or less a result of Chris' suggestions. A non-posted iteration did plumb the vm, and after reworking to the suggestion made by Chris, the vm didn't make much sense anymore. For point #1, it requires VM prioritization I think. I don't really see any other way to fairly manage it. For point #2, that I agree it might be useful, but we can easily create a new function, and not call it "shrinker" to do it. > > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > + struct i915_address_space *vm; > > struct drm_i915_gem_object *obj, *next; > > - bool lists_empty; > > + bool lists_empty = true; > > int ret; > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > - list_empty(&vm->active_list)); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + lists_empty = (list_empty(&vm->inactive_list) && > > + list_empty(&vm->active_list)); > > + if (!lists_empty) > > + lists_empty = false; > > + } > > + > > if (lists_empty) > > return -ENOSPC; > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > i915_gem_retire_requests(dev); > > > > /* Having flushed everything, unbind() should never raise an error */ > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > - if (obj->pin_count == 0) > > - WARN_ON(i915_gem_object_unbind(obj)); > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > + if (obj->pin_count == 0) > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > + } > > > > return 0; > > } > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > index 5aeb447..e90182d 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > } > > > > static void > 
> -eb_destroy(struct eb_objects *eb) > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > { > > while (!list_empty(&eb->objects)) { > > struct drm_i915_gem_object *obj; > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > struct eb_objects *eb, > > - struct drm_i915_gem_relocation_entry *reloc) > > + struct drm_i915_gem_relocation_entry *reloc, > > + struct i915_address_space *vm) > > { > > struct drm_device *dev = obj->base.dev; > > struct drm_gem_object *target_obj; > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > static int > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > - struct eb_objects *eb) > > + struct eb_objects *eb, > > + struct i915_address_space *vm) > > { > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > do { > > u64 offset = r->presumed_offset; > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > > + vm); > > if (ret) > > return ret; > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > static int > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > struct eb_objects *eb, > > - struct drm_i915_gem_relocation_entry *relocs) > > + struct drm_i915_gem_relocation_entry *relocs, > > + struct i915_address_space *vm) > > { > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > int i, ret; > > > > for (i = 0; i < entry->relocation_count; i++) { > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > 
+ vm); > > if (ret) > > return ret; > > } > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > } > > > > static int > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > + struct i915_address_space *vm) > > { > > struct drm_i915_gem_object *obj; > > int ret = 0; > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > > */ > > pagefault_disable(); > > list_for_each_entry(obj, &eb->objects, exec_list) { > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > if (ret) > > break; > > } > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > struct intel_ring_buffer *ring, > > + struct i915_address_space *vm, > > bool *need_reloc) > > { > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > obj->tiling_mode != I915_TILING_NONE; > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > + false); > > if (ret) > > return ret; > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > obj->has_aliasing_ppgtt_mapping = 1; > > } > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > + entry->offset = i915_gem_obj_offset(obj, vm); > > *need_reloc = true; > > } > > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > { > > struct drm_i915_gem_exec_object2 *entry; > > > > - if 
(!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound_any(obj)) > > return; > > > > entry = obj->exec_entry; > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > static int > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > struct list_head *objects, > > + struct i915_address_space *vm, > > bool *need_relocs) > > { > > struct drm_i915_gem_object *obj; > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > list_for_each_entry(obj, objects, exec_list) { > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > bool need_fence, need_mappable; > > + u32 obj_offset; > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > + if (!i915_gem_obj_bound(obj, vm)) > > continue; > > I wonder a bit how we could avoid the multiple (obj, vm) -> vma lookups > here ... Maybe we should cache them in some pointer somewhere (either in > the eb object or by adding a new pointer to the object struct, e.g. > obj->eb_vma, similar to obj->eb_list). > I agree, and even did this in one unposted patch too. However, I think it's a premature optimization which risks code correctness. So I think somewhere a FIXME needs to happen to address that issue. (Or if Chris complains bitterly about some perf hit). 
> > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > need_fence = > > has_fenced_gpu_access && > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > obj->tiling_mode != I915_TILING_NONE; > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > + BUG_ON((need_mappable || need_fence) && > > + !i915_is_ggtt(vm)); > > + > > if ((entry->alignment && > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > + obj_offset & (entry->alignment - 1)) || > > (need_mappable && !obj->map_and_fenceable)) > > - ret = i915_gem_object_unbind(obj); > > + ret = i915_gem_object_unbind(obj, vm); > > else > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > if (ret) > > goto err; > > } > > > > /* Bind fresh objects */ > > list_for_each_entry(obj, objects, exec_list) { > > - if (i915_gem_obj_ggtt_bound(obj)) > > + if (i915_gem_obj_bound(obj, vm)) > > continue; > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > if (ret) > > goto err; > > } > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > struct drm_file *file, > > struct intel_ring_buffer *ring, > > struct eb_objects *eb, > > - struct drm_i915_gem_exec_object2 *exec) > > + struct drm_i915_gem_exec_object2 *exec, > > + struct i915_address_space *vm) > > { > > struct drm_i915_gem_relocation_entry *reloc; > > struct drm_i915_gem_object *obj; > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > goto err; > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > if (ret) > > goto err; > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > int offset = obj->exec_entry - exec; > > ret 
= i915_gem_execbuffer_relocate_object_slow(obj, eb, > > - reloc + reloc_offset[offset]); > > + reloc + reloc_offset[offset], > > + vm); > > if (ret) > > goto err; > > } > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > static void > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > + struct i915_address_space *vm, > > struct intel_ring_buffer *ring) > > { > > struct drm_i915_gem_object *obj; > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > obj->base.read_domains = obj->base.pending_read_domains; > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > - i915_gem_object_move_to_active(obj, ring); > > + i915_gem_object_move_to_active(obj, vm, ring); > > if (obj->base.write_domain) { > > obj->dirty = 1; > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > @@ -836,7 +853,8 @@ static int > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > struct drm_file *file, > > struct drm_i915_gem_execbuffer2 *args, > > - struct drm_i915_gem_exec_object2 *exec) > > + struct drm_i915_gem_exec_object2 *exec, > > + struct i915_address_space *vm) > > { > > drm_i915_private_t *dev_priv = dev->dev_private; > > struct eb_objects *eb; > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > if (ret) > > goto err; > > > > /* The objects are in their final locations, apply the relocations. 
*/ > > if (need_relocs) > > - ret = i915_gem_execbuffer_relocate(eb); > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > if (ret) { > > if (ret == -EFAULT) { > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > - eb, exec); > > + eb, exec, vm); > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > } > > if (ret) > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > goto err; > > } > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > + args->batch_start_offset; > > exec_len = args->batch_len; > > if (cliprects) { > > for (i = 0; i < args->num_cliprects; i++) { > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > err: > > - eb_destroy(eb); > > + eb_destroy(eb, vm); > > > > mutex_unlock(&dev->struct_mutex); > > > > @@ -1105,6 +1124,7 @@ int > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_execbuffer *args = data; > > struct drm_i915_gem_execbuffer2 exec2; > > struct drm_i915_gem_exec_object *exec_list = NULL; > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > exec2.flags = I915_EXEC_RENDER; > > i915_execbuffer2_set_context_id(exec2, 0); > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > + &dev_priv->gtt.base); > > if (!ret) { > > /* Copy the new buffer offsets back to the user's exec list. 
*/ > > for (i = 0; i < args->buffer_count; i++) > > @@ -1186,6 +1207,7 @@ int > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_execbuffer2 *args = data; > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > int ret; > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > return -EFAULT; > > } > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > + &dev_priv->gtt.base); > > if (!ret) { > > /* Copy the new buffer offsets back to the user's exec list. */ > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > index 298fc42..70ce2f6 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > ppgtt->base.total); > > } > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > + > > return ret; > > } > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > struct drm_i915_gem_object *obj, > > enum i915_cache_level cache_level) > > { > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > - cache_level); > > + struct i915_address_space *vm = &ppgtt->base; > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > + > > + vm->insert_entries(vm, obj->pages, > > + obj_offset >> PAGE_SHIFT, > > + cache_level); > > } > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > struct drm_i915_gem_object *obj) > > { > > - ppgtt->base.clear_range(&ppgtt->base, > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > - obj->base.size >> PAGE_SHIFT); > > + struct i915_address_space *vm = 
&ppgtt->base; > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > + > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > + obj->base.size >> PAGE_SHIFT); > > } > > > > extern int intel_iommu_gfx_mapped; > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > dev_priv->gtt.base.start / PAGE_SIZE, > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > + if (dev_priv->mm.aliasing_ppgtt) > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > + > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > i915_gem_clflush_object(obj); > > i915_gem_gtt_bind_object(obj, obj->cache_level); > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > * aperture. One page should be enough to keep any prefetching inside > > * of the aperture. > > */ > > - drm_i915_private_t *dev_priv = dev->dev_private; > > + struct drm_i915_private *dev_priv = dev->dev_private; > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > struct drm_mm_node *entry; > > struct drm_i915_gem_object *obj; > > unsigned long hole_start, hole_end; > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > BUG_ON(mappable_end > end); > > > > /* Subtract the guard page ... 
*/ > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > if (!HAS_LLC(dev)) > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > /* Mark any preallocated objects as occupied */ > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > int ret; > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > if (ret) > > DRM_DEBUG_KMS("Reservation failed\n"); > > obj->has_global_gtt_mapping = 1; > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > dev_priv->gtt.base.total = end - start; > > > > /* Clear any non-preallocated blocks */ > > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > > - hole_start, hole_end) { > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > hole_start, hole_end); > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > - hole_start / PAGE_SIZE, > > - count); > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > } > > > > /* And finally clear the reserved guard page */ > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > - end / PAGE_SIZE - 1, 1); > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > } > > > > static bool > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > index 245eb1d..bfe61fa 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c 
> > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > return obj; > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > + vma = i915_gem_vma_create(obj, vm); > > if (!vma) { > > drm_gem_object_unreference(&obj->base); > > return NULL; > > } > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > */ > > vma->node.start = gtt_offset; > > vma->node.size = size; > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > + if (drm_mm_initialized(&vm->mm)) { > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > These two hunks here for stolen look fishy - we only ever use the stolen > preallocated stuff for objects with mappings in the global gtt. So keeping > that explicit is imo the better approach. And tbh I'm confused where the > local variable vm is from ... If we don't create a vma for it, we potentially have to special case a bunch of places, I think. I'm not actually sure of this, but the overhead to do it is quite small. Anyway, I'll look this over again and see what I think. 
> > > if (ret) { > > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > > goto unref_out; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > > index 92a8d27..808ca2a 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > > > obj->map_and_fenceable = > > !i915_gem_obj_ggtt_bound(obj) || > > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > > + (i915_gem_obj_ggtt_offset(obj) + > > + obj->base.size <= dev_priv->gtt.mappable_end && > > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > > > /* Rebind if we need a change of alignment */ > > if (!obj->map_and_fenceable) { > > - u32 unfenced_alignment = > > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > > + u32 unfenced_align = > > i915_gem_get_gtt_alignment(dev, obj->base.size, > > args->tiling_mode, > > false); > > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > > - ret = i915_gem_object_unbind(obj); > > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > > + ret = i915_gem_object_unbind(obj, ggtt); > > } > > > > if (ret == 0) { > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > index 79fbb17..28fa0ff 100644 > > --- a/drivers/gpu/drm/i915/i915_irq.c > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, > > if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { > > u32 acthd = I915_READ(ACTHD); > > > > + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) > > + return NULL; > > + > > if (WARN_ON(ring->id != RCS)) > > return NULL; > > > > @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, > > return; > > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > - if ((error->ccid & PAGE_MASK) == 
i915_gem_obj_ggtt_offset(obj)) { > > + if ((error->ccid & PAGE_MASK) == > > + i915_gem_obj_ggtt_offset(obj)) { > > ering->ctx = i915_error_object_create_sized(dev_priv, > > obj, 1); > > break; > > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > > index 7d283b5..3f019d3 100644 > > --- a/drivers/gpu/drm/i915/i915_trace.h > > +++ b/drivers/gpu/drm/i915/i915_trace.h > > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > > ); > > > > TRACE_EVENT(i915_gem_object_bind, > > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > > - TP_ARGS(obj, mappable), > > + TP_PROTO(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm, bool mappable), > > + TP_ARGS(obj, vm, mappable), > > > > TP_STRUCT__entry( > > __field(struct drm_i915_gem_object *, obj) > > + __field(struct i915_address_space *, vm) > > __field(u32, offset) > > __field(u32, size) > > __field(bool, mappable) > > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > > > TP_fast_assign( > > __entry->obj = obj; > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > + __entry->size = i915_gem_obj_size(obj, vm); > > __entry->mappable = mappable; > > ), > > > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > > ); > > > > TRACE_EVENT(i915_gem_object_unbind, > > - TP_PROTO(struct drm_i915_gem_object *obj), > > - TP_ARGS(obj), > > + TP_PROTO(struct drm_i915_gem_object *obj, > > + struct i915_address_space *vm), > > + TP_ARGS(obj, vm), > > > > TP_STRUCT__entry( > > __field(struct drm_i915_gem_object *, obj) > > + __field(struct i915_address_space *, vm) > > __field(u32, offset) > > __field(u32, size) > > ), > > > > TP_fast_assign( > > __entry->obj = obj; > > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > > - __entry->size = i915_gem_obj_ggtt_size(obj); > > + __entry->offset = i915_gem_obj_offset(obj, vm); > > + __entry->size = 
i915_gem_obj_size(obj, vm); > > ), > > > > TP_printk("obj=%p, offset=%08x size=%x", > > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > > index f3c97e0..b69cc63 100644 > > --- a/drivers/gpu/drm/i915/intel_fb.c > > +++ b/drivers/gpu/drm/i915/intel_fb.c > > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > > fb->width, fb->height, > > i915_gem_obj_ggtt_offset(obj), obj); > > > > - > > mutex_unlock(&dev->struct_mutex); > > vga_switcheroo_client_fb_set(dev->pdev, info); > > return 0; > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > > index 81c3ca1..517e278 100644 > > --- a/drivers/gpu/drm/i915/intel_overlay.c > > +++ b/drivers/gpu/drm/i915/intel_overlay.c > > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > > } > > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > > } else { > > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > > if (ret) { > > DRM_ERROR("failed to pin overlay register bo\n"); > > goto out_free_bo; > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > index 125a741..449e57c 100644 > > --- a/drivers/gpu/drm/i915/intel_pm.c > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) > > return NULL; > > } > > > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > > if (ret) { > > DRM_ERROR("failed to pin power context: %d\n", ret); > > goto err_unref; > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index bc4c11b..ebed61d 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -481,6 +481,7 @@ out: > > static int > > init_pipe_control(struct intel_ring_buffer *ring) > > { > > + struct 
drm_i915_private *dev_priv = ring->dev->dev_private; > > struct pipe_control *pc; > > struct drm_i915_gem_object *obj; > > int ret; > > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > > goto err; > > } > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_LLC); > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > if (ret) > > goto err_unref; > > > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > > static int init_status_page(struct intel_ring_buffer *ring) > > { > > struct drm_device *dev = ring->dev; > > + struct drm_i915_private *dev_priv = dev->dev_private; > > struct drm_i915_gem_object *obj; > > int ret; > > > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > > goto err; > > } > > > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > + I915_CACHE_LLC); > > > > - ret = i915_gem_object_pin(obj, 4096, true, false); > > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > > if (ret != 0) { > > goto err_unref; > > } > > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > > > ring->obj = obj; > > > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > > if (ret) > > goto err_unref; > > > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > > return -ENOMEM; > > } > > > > - ret = i915_gem_object_pin(obj, 0, true, false); > > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > > if (ret != 0) { > > drm_gem_object_unreference(&obj->base); > > DRM_ERROR("Failed to ping batch bo\n"); > > -- > > 1.8.3.2 > > > > _______________________________________________ > > Intel-gfx mailing list > > 
Intel-gfx@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote: > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: [snip] > > > index 058ad44..21015cd 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem.c > > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > > @@ -38,10 +38,12 @@ > > > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > - unsigned alignment, > > > - bool map_and_fenceable, > > > - bool nonblocking); > > > +static __must_check int > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > + unsigned alignment, > > > + bool map_and_fenceable, > > > + bool nonblocking); > > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > > struct drm_i915_gem_object *obj, > > > struct drm_i915_gem_pwrite *args, > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > > static inline bool > > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > > { > > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > > + return i915_gem_obj_bound_any(obj) && !obj->active; > > > } > > > > > > int > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > > * anyway again before the next pread happens. */ > > > if (obj->cache_level == I915_CACHE_NONE) > > > needs_clflush = 1; > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > + if (i915_gem_obj_bound_any(obj)) { > > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > > > This is essentially a very convoluted version of "if there's gpu rendering > > outstanding, please wait for it". 
Maybe we should switch this to > > > > if (obj->active) > > wait_rendering(obj, true); > > > > Same for the shmem_pwrite case below. Would be a separate patch to prep > > things though. Can I volunteer you for that? The ugly part is to review > > whether any of the lru list updating that set_domain does in addition to > > wait_rendering is required, but on a quick read that's not the case. > > Just reading the comment above it says we need the clflush. I don't > actually understand why we do that even after reading the comment, but > meh. You tell me, I don't mind doing this as a prep first. The comment right above is just for the needs_clflush = 1 assignment, the set_to_gtt_domain call afterwards is just to sync up with the gpu. The code is confusing and tricky and the lack of a white line in between the two things plus a comment explaining that we only care about the wait_rendering side-effect of set_to_gtt_domain doesn't help. If you do the proposed conversion (and add a white line) that should help a lot in unconfusing readers. [snip] > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > } > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > if (ret) > > > return ret; > > > } > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + if (!i915_gem_obj_bound(obj, vm)) > > > + continue; > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > on? > > > > Self-correction: It exists already ... why can't we use this here? > > Yes. That should work, I'll fix it and test it. It looks slightly worse > IMO in terms of code clarity, but I don't mind the change. Actually I think it'd gain in clarity, doing pte updatest (which set_cache_level does) on the vma instead of the (obj, vm) pair feels more natural. 
And we'd be able to drop lots of (obj, vm) -> vma lookups here. > > > > > > + > > > ret = i915_gem_object_finish_gpu(obj); > > > if (ret) > > > return ret; > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > obj, cache_level); > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > } > > > > > > if (cache_level == I915_CACHE_NONE) { > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > struct drm_i915_gem_caching *args = data; > > > + struct drm_i915_private *dev_priv; > > > struct drm_i915_gem_object *obj; > > > enum i915_cache_level level; > > > int ret; > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > ret = -ENOENT; > > > goto unlock; > > > } > > > + dev_priv = obj->base.dev->dev_private; > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > + /* FIXME: Add interface for specific VM? */ > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > drm_gem_object_unreference(&obj->base); > > > unlock: > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > u32 alignment, > > > struct intel_ring_buffer *pipelined) > > > { > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > u32 old_read_domains, old_write_domain; > > > int ret; > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > > */ > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > + I915_CACHE_NONE); > > > if (ret) > > > return ret; > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > * always use map_and_fenceable for all scanout buffers. > > > */ > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > if (ret) > > > return ret; > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > int > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm, > > > uint32_t alignment, > > > bool map_and_fenceable, > > > bool nonblocking) > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > return -EBUSY; > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > (albeit with slightly corrupted state, so render corruptions might > > follow). > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert > it at the end of merging? Adding a FIXME right above it will cause equal amounts of conflicts, so I don't see the point that much ... 
> > > > > > + > > > + if (i915_gem_obj_bound(obj, vm)) { > > > + if ((alignment && > > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > > (map_and_fenceable && !obj->map_and_fenceable)) { > > > WARN(obj->pin_count, > > > "bo is already pinned with incorrect alignment:" > > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > > " obj->map_and_fenceable=%d\n", > > > - i915_gem_obj_ggtt_offset(obj), alignment, > > > + i915_gem_obj_offset(obj, vm), alignment, > > > map_and_fenceable, > > > obj->map_and_fenceable); > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > if (ret) > > > return ret; > > > } > > > } > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > > + if (!i915_gem_obj_bound(obj, vm)) { > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > > map_and_fenceable, > > > nonblocking); > > > if (ret) > > > @@ -3684,7 +3739,7 @@ void > > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > > { > > > BUG_ON(obj->pin_count == 0); > > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > > > if (--obj->pin_count == 0) > > > obj->pin_mappable = false; > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > > } > > > > > > if (obj->user_pin_count == 0) { > > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > > if (ret) > > > goto out; > > > } > > > @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > > struct drm_device *dev = obj->base.dev; > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > + struct i915_vma *vma, *next; > > > > > > trace_i915_gem_object_destroy(obj); > > > > > > @@ -3964,15 
+4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > i915_gem_detach_phys_object(dev, obj); > > > > > > obj->pin_count = 0; > > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > > - bool was_interruptible; > > > + /* NB: 0 or 1 elements */ > > > + WARN_ON(!list_empty(&obj->vma_list) && > > > + !list_is_singular(&obj->vma_list)); > > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > > + bool was_interruptible; > > > > > > - was_interruptible = dev_priv->mm.interruptible; > > > - dev_priv->mm.interruptible = false; > > > + was_interruptible = dev_priv->mm.interruptible; > > > + dev_priv->mm.interruptible = false; > > > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > > > - dev_priv->mm.interruptible = was_interruptible; > > > + dev_priv->mm.interruptible = was_interruptible; > > > + } > > > } > > > > > > /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > > INIT_LIST_HEAD(&ring->request_list); > > > } > > > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > > + struct i915_address_space *vm) > > > +{ > > > + vm->dev = dev_priv->dev; > > > + INIT_LIST_HEAD(&vm->active_list); > > > + INIT_LIST_HEAD(&vm->inactive_list); > > > + INIT_LIST_HEAD(&vm->global_link); > > > + list_add(&vm->global_link, &dev_priv->vm_list); > > > +} > > > + > > > void > > > i915_gem_load(struct drm_device *dev) > > > { > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > > SLAB_HWCACHE_ALIGN, > > > NULL); > > > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > > + > > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > struct drm_i915_private, > > > mm.inactive_shrinker); > > > struct drm_device *dev = dev_priv->dev; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj; > > > - int nr_to_scan = sc->nr_to_scan; > > > + int nr_to_scan; > > > bool unlock = true; > > > int cnt; > > > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > unlock = false; > > > } > > > > > > + nr_to_scan = sc->nr_to_scan; > > > if (nr_to_scan) { > > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > > if (nr_to_scan > 0) > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > > if (obj->pages_pin_count 
== 0) > > > cnt += obj->base.size >> PAGE_SHIFT; > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > - cnt += obj->base.size >> PAGE_SHIFT; > > > + > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > + cnt += obj->base.size >> PAGE_SHIFT; > > > > Isn't this now double-counting objects? In the shrinker we only care about > > how much physical RAM an object occupies, not how much virtual space it > > occupies. So just walking the bound list of objects here should be good > > enough ... > > > > Maybe I've misunderstood you. My code is wrong, but I think you're idea > requires a prep patch because it changes functionality, right? > > So let me know if I've understood you. Don't we have both the bound and unbound list? So we could just switch over to counting the bound objects here ... Otherwise yes, we need a prep patch to create the bound list first. > > > > > > > if (unlock) > > > mutex_unlock(&dev->struct_mutex); > > > return cnt; > > > } > > > + > > > +/* All the new VM stuff */ > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > + struct i915_vma *vma; > > > + > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > + vm = &dev_priv->gtt.base; > > > + > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > Imo the vma list walking here and in the other helpers below indicates > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > is this again something that'll get fixed later on? 
> > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > sprinkling of obj_to_vma in callers. > > It's not something I fixed in the whole series. I think it makes sense > conceptually, to keep some things as <obj,vm> and others as direct vma. > > If you want me to change something, you need to be more specific since > no action specifically comes to mind at this point in the series. It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we should try to avoid it whenever possible. Since the vma has both and obj and a vm pointer the vma is imo strictly better than the (obj, vm) pair. And the look-up should be pushed down the callchain as much as possible. So I think generally we want to pass the vma around to functions everywhere, and the (obj, vm) pair would be the exception (which needs special justification). > > > > > > + if (vma->vm == vm) > > > + return vma->node.start; > > > + > > > + } > > > + return -1; > > > +} > > > + > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > > +{ > > > + return !list_empty(&o->vma_list); > > > +} > > > + > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct i915_vma *vma; > > > + > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) > > > + return true; > > > + } > > > + return false; > > > +} > > > + > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm) > > > +{ > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > + struct i915_vma *vma; > > > + > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > + vm = &dev_priv->gtt.base; > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) > > > + return vma->node.size; > > > + } > > > + > > > + return 
0; > > > +} > > > + > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > + struct i915_address_space *vm, > > > + enum i915_cache_level color) > > > +{ > > > + struct i915_vma *vma; > > > + BUG_ON(list_empty(&o->vma_list)); > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > + if (vma->vm == vm) { > > > + vma->node.color = color; > > > + return; > > > + } > > > + } > > > + > > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > > +} > > > + > > > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > + struct i915_address_space *vm) > > > +{ > > > + struct i915_vma *vma; > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > > + if (vma->vm == vm) > > > + return vma; > > > + > > > + return NULL; > > > +} > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > > index 2074544..c92fd81 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > > > > > if (INTEL_INFO(dev)->gen >= 7) { > > > ret = i915_gem_object_set_cache_level(ctx->obj, > > > + &dev_priv->gtt.base, > > > I915_CACHE_LLC_MLC); > > > /* Failure shouldn't ever happen this early */ > > > if (WARN_ON(ret)) > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > > * default context. 
> > > */ > > > dev_priv->ring[RCS].default_context = ctx; > > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > if (ret) { > > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > > goto err_destroy; > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > > static int do_switch(struct i915_hw_context *to) > > > { > > > struct intel_ring_buffer *ring = to->ring; > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > struct i915_hw_context *from = ring->last_context; > > > u32 hw_flags = 0; > > > int ret; > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > > if (from == to) > > > return 0; > > > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > > if (ret) > > > return ret; > > > > > > @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) > > > */ > > > if (from != NULL) { > > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > > - i915_gem_object_move_to_active(from->obj, ring); > > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > > + ring); > > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > > * whole damn pipeline, we don't need to explicitly mark the > > > * object dirty. 
The only exception is that the context must be > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > > index df61f33..32efdc0 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > > @@ -32,24 +32,21 @@ > > > #include "i915_trace.h" > > > > > > static bool > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > > { > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > - > > > - if (obj->pin_count) > > > + if (vma->obj->pin_count) > > > return false; > > > > > > - list_add(&obj->exec_list, unwind); > > > + list_add(&vma->obj->exec_list, unwind); > > > return drm_mm_scan_add_block(&vma->node); > > > } > > > > > > int > > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > > - unsigned alignment, unsigned cache_level, > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > > + int min_size, unsigned alignment, unsigned cache_level, > > > bool mappable, bool nonblocking) > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > struct list_head eviction_list, unwind_list; > > > struct i915_vma *vma; > > > struct drm_i915_gem_object *obj; > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > */ > > > > > > INIT_LIST_HEAD(&unwind_list); > > > - if (mappable) > > > + if (mappable) { > > > + BUG_ON(!i915_is_ggtt(vm)); > > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > > alignment, cache_level, 0, > > > dev_priv->gtt.mappable_end); > > > - else > > > + } else > > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > > > /* First see if there is a large enough contiguous idle region... 
*/ > > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > > - if (mark_free(obj, &unwind_list)) > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > + if (mark_free(vma, &unwind_list)) > > > goto found; > > > } > > > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > > > /* Now merge in the soon-to-be-expired objects... */ > > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > > - if (mark_free(obj, &unwind_list)) > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > + if (mark_free(vma, &unwind_list)) > > > goto found; > > > } > > > > > > @@ -109,7 +109,7 @@ none: > > > obj = list_first_entry(&unwind_list, > > > struct drm_i915_gem_object, > > > exec_list); > > > - vma = __i915_gem_obj_to_vma(obj); > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > ret = drm_mm_scan_remove_block(&vma->node); > > > BUG_ON(ret); > > > > > > @@ -130,7 +130,7 @@ found: > > > obj = list_first_entry(&unwind_list, > > > struct drm_i915_gem_object, > > > exec_list); > > > - vma = __i915_gem_obj_to_vma(obj); > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > if (drm_mm_scan_remove_block(&vma->node)) { > > > list_move(&obj->exec_list, &eviction_list); > > > drm_gem_object_reference(&obj->base); > > > @@ -145,7 +145,7 @@ found: > > > struct drm_i915_gem_object, > > > exec_list); > > > if (ret == 0) > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > > > > list_del_init(&obj->exec_list); > > > drm_gem_object_unreference(&obj->base); > > > @@ -158,13 +158,18 @@ int > > > i915_gem_evict_everything(struct drm_device *dev) > > > > I suspect evict_everything eventually wants a address_space *vm argument > > for those cases where we only want to evict everything in a given vm. Atm > > we have two use-cases of this: > > - Called from the shrinker as a last-ditch effort. For that it should move > > _every_ object onto the unbound list. 
> > - Called from execbuf for badly-fragmented address spaces to clean up the > > mess. For that case we only care about one address space. > > The current thing is more or less a result of Chris' suggestions. A > non-posted iteration did plumb the vm, and after reworking to the > suggestion made by Chris, the vm didn't make much sense anymore. > > For point #1, it requires VM prioritization I think. I don't really see > any other way to fairly manage it. The shrinker will rip out objects in lru order by walking first unbound and then bound objects. That's imo as fair as it gets, we don't need priorities between vms. > For point #2, that I agree it might be useful, but we can easily create > a new function, and not call it "shrinker" to do it. Well my point was that this function is called i915_gem_evict_everything(dev, vm) and for the first use case we simply pass in vm = NULL. But essentially thrashing the vm should be rare enough that for now we don't need to care. > > > > > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > + struct i915_address_space *vm; > > > struct drm_i915_gem_object *obj, *next; > > > - bool lists_empty; > > > + bool lists_empty = true; > > > int ret; > > > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > > - list_empty(&vm->active_list)); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + lists_empty = (list_empty(&vm->inactive_list) && > > > + list_empty(&vm->active_list)); > > > + if (!lists_empty) > > > + lists_empty = false; > > > + } > > > + > > > if (lists_empty) > > > return -ENOSPC; > > > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > > i915_gem_retire_requests(dev); > > > > > > /* Having flushed everything, unbind() should never raise an error */ > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > - if (obj->pin_count == 0) > > > - 
WARN_ON(i915_gem_object_unbind(obj)); > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > + if (obj->pin_count == 0) > > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > > + } > > > > > > return 0; > > > } > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > index 5aeb447..e90182d 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > > } > > > > > > static void > > > -eb_destroy(struct eb_objects *eb) > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > > { > > > while (!list_empty(&eb->objects)) { > > > struct drm_i915_gem_object *obj; > > > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_relocation_entry *reloc) > > > + struct drm_i915_gem_relocation_entry *reloc, > > > + struct i915_address_space *vm) > > > { > > > struct drm_device *dev = obj->base.dev; > > > struct drm_gem_object *target_obj; > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > > > static int > > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > - struct eb_objects *eb) > > > + struct eb_objects *eb, > > > + struct i915_address_space *vm) > > > { > > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > do { > > > u64 offset = r->presumed_offset; > > > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > > + ret = 
i915_gem_execbuffer_relocate_entry(obj, eb, r, > > > + vm); > > > if (ret) > > > return ret; > > > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > static int > > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_relocation_entry *relocs) > > > + struct drm_i915_gem_relocation_entry *relocs, > > > + struct i915_address_space *vm) > > > { > > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > int i, ret; > > > > > > for (i = 0; i < entry->relocation_count; i++) { > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > > + vm); > > > if (ret) > > > return ret; > > > } > > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > } > > > > > > static int > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > > + struct i915_address_space *vm) > > > { > > > struct drm_i915_gem_object *obj; > > > int ret = 0; > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > */ > > > pagefault_disable(); > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > > if (ret) > > > break; > > > } > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > struct intel_ring_buffer *ring, > > > + struct i915_address_space *vm, > > > bool *need_reloc) > > > { > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > obj->tiling_mode != I915_TILING_NONE; > > > 
need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > > + false); > > > if (ret) > > > return ret; > > > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > obj->has_aliasing_ppgtt_mapping = 1; > > > } > > > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > > + entry->offset = i915_gem_obj_offset(obj, vm); > > > *need_reloc = true; > > > } > > > > > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > { > > > struct drm_i915_gem_exec_object2 *entry; > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > + if (!i915_gem_obj_bound_any(obj)) > > > return; > > > > > > entry = obj->exec_entry; > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > static int > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > struct list_head *objects, > > > + struct i915_address_space *vm, > > > bool *need_relocs) > > > { > > > struct drm_i915_gem_object *obj; > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > list_for_each_entry(obj, objects, exec_list) { > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > bool need_fence, need_mappable; > > > + u32 obj_offset; > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > + if (!i915_gem_obj_bound(obj, vm)) > > > continue; > > > > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups > > here ... Maybe we should cache them in some pointer somewhere (either in > > the eb object or by adding a new pointer to the object struct, e.g. > > obj->eb_vma, similar to obj->eb_list). 
> > > > I agree, and even did this at one unposted patch too. However, I think > it's a premature optimization which risks code correctness. So I think > somewhere a FIXME needs to happen to address that issue. (Or if Chris > complains bitterly about some perf hit). If you bring up code correctness I'd vote strongly in favour of using vmas everywhere - vma has the (obj, vm) pair locked down, doing the lookup all the thing risks us mixing them up eventually and creating a hella lot of confusion ;-) > > > > > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > > need_fence = > > > has_fenced_gpu_access && > > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > > obj->tiling_mode != I915_TILING_NONE; > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > + BUG_ON((need_mappable || need_fence) && > > > + !i915_is_ggtt(vm)); > > > + > > > if ((entry->alignment && > > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > > + obj_offset & (entry->alignment - 1)) || > > > (need_mappable && !obj->map_and_fenceable)) > > > - ret = i915_gem_object_unbind(obj); > > > + ret = i915_gem_object_unbind(obj, vm); > > > else > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > if (ret) > > > goto err; > > > } > > > > > > /* Bind fresh objects */ > > > list_for_each_entry(obj, objects, exec_list) { > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > + if (i915_gem_obj_bound(obj, vm)) > > > continue; > > > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > if (ret) > > > goto err; > > > } > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > struct drm_file *file, > > > struct intel_ring_buffer *ring, > > > struct eb_objects *eb, > > > - struct drm_i915_gem_exec_object2 *exec) > > > + struct drm_i915_gem_exec_object2 
*exec, > > > + struct i915_address_space *vm) > > > { > > > struct drm_i915_gem_relocation_entry *reloc; > > > struct drm_i915_gem_object *obj; > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > goto err; > > > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > if (ret) > > > goto err; > > > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > int offset = obj->exec_entry - exec; > > > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > > > - reloc + reloc_offset[offset]); > > > + reloc + reloc_offset[offset], > > > + vm); > > > if (ret) > > > goto err; > > > } > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > > > static void > > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > + struct i915_address_space *vm, > > > struct intel_ring_buffer *ring) > > > { > > > struct drm_i915_gem_object *obj; > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > obj->base.read_domains = obj->base.pending_read_domains; > > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > > > - i915_gem_object_move_to_active(obj, ring); > > > + i915_gem_object_move_to_active(obj, vm, ring); > > > if (obj->base.write_domain) { > > > obj->dirty = 1; > > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > > @@ -836,7 +853,8 @@ static int > > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > struct drm_file *file, > > > struct drm_i915_gem_execbuffer2 *args, > > > - struct drm_i915_gem_exec_object2 *exec) > > > + struct drm_i915_gem_exec_object2 *exec, > > > + struct i915_address_space *vm) > > > { > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > struct eb_objects *eb; > > > @@ -998,17 +1016,17 @@ 
i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > if (ret) > > > goto err; > > > > > > /* The objects are in their final locations, apply the relocations. */ > > > if (need_relocs) > > > - ret = i915_gem_execbuffer_relocate(eb); > > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > > if (ret) { > > > if (ret == -EFAULT) { > > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > > - eb, exec); > > > + eb, exec, vm); > > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > > } > > > if (ret) > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > goto err; > > > } > > > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > > + args->batch_start_offset; > > > exec_len = args->batch_len; > > > if (cliprects) { > > > for (i = 0; i < args->num_cliprects; i++) { > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > > > err: > > > - eb_destroy(eb); > > > + eb_destroy(eb, vm); > > > > > > mutex_unlock(&dev->struct_mutex); > > > > > > @@ -1105,6 +1124,7 @@ int > > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > struct drm_i915_gem_execbuffer *args = data; > > > struct 
drm_i915_gem_execbuffer2 exec2; > > > struct drm_i915_gem_exec_object *exec_list = NULL; > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > > exec2.flags = I915_EXEC_RENDER; > > > i915_execbuffer2_set_context_id(exec2, 0); > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > > + &dev_priv->gtt.base); > > > if (!ret) { > > > /* Copy the new buffer offsets back to the user's exec list. */ > > > for (i = 0; i < args->buffer_count; i++) > > > @@ -1186,6 +1207,7 @@ int > > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > struct drm_file *file) > > > { > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > struct drm_i915_gem_execbuffer2 *args = data; > > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > > int ret; > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > return -EFAULT; > > > } > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > > + &dev_priv->gtt.base); > > > if (!ret) { > > > /* Copy the new buffer offsets back to the user's exec list. 
*/ > > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > index 298fc42..70ce2f6 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > > ppgtt->base.total); > > > } > > > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > > + > > > return ret; > > > } > > > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > struct drm_i915_gem_object *obj, > > > enum i915_cache_level cache_level) > > > { > > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > - cache_level); > > > + struct i915_address_space *vm = &ppgtt->base; > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > + > > > + vm->insert_entries(vm, obj->pages, > > > + obj_offset >> PAGE_SHIFT, > > > + cache_level); > > > } > > > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > struct drm_i915_gem_object *obj) > > > { > > > - ppgtt->base.clear_range(&ppgtt->base, > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > - obj->base.size >> PAGE_SHIFT); > > > + struct i915_address_space *vm = &ppgtt->base; > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > + > > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > > + obj->base.size >> PAGE_SHIFT); > > > } > > > > > > extern int intel_iommu_gfx_mapped; > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > > dev_priv->gtt.base.start / PAGE_SIZE, > > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > > > + if (dev_priv->mm.aliasing_ppgtt) > > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > > + > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > i915_gem_clflush_object(obj); > > > 
i915_gem_gtt_bind_object(obj, obj->cache_level); > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > * aperture. One page should be enough to keep any prefetching inside > > > * of the aperture. > > > */ > > > - drm_i915_private_t *dev_priv = dev->dev_private; > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > > struct drm_mm_node *entry; > > > struct drm_i915_gem_object *obj; > > > unsigned long hole_start, hole_end; > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > BUG_ON(mappable_end > end); > > > > > > /* Subtract the guard page ... */ > > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > > if (!HAS_LLC(dev)) > > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > > > /* Mark any preallocated objects as occupied */ > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > > int ret; > > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > > if (ret) > > > DRM_DEBUG_KMS("Reservation failed\n"); > > > obj->has_global_gtt_mapping = 1; > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > dev_priv->gtt.base.total = end - start; > > > > > > /* Clear any non-preallocated blocks */ > > > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > > > - hole_start, hole_end) { > > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > > const unsigned long count = (hole_end 
- hole_start) / PAGE_SIZE; > > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > > hole_start, hole_end); > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > - hole_start / PAGE_SIZE, > > > - count); > > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > > } > > > > > > /* And finally clear the reserved guard page */ > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > - end / PAGE_SIZE - 1, 1); > > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > > } > > > > > > static bool > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > index 245eb1d..bfe61fa 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > return obj; > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > + vma = i915_gem_vma_create(obj, vm); > > > if (!vma) { > > > drm_gem_object_unreference(&obj->base); > > > return NULL; > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > */ > > > vma->node.start = gtt_offset; > > > vma->node.size = size; > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > + if (drm_mm_initialized(&vm->mm)) { > > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > preallocated stuff for objects with mappings in the global gtt. So keeping > > that explicit is imo the better approach. And tbh I'm confused where the > > local variable vm is from ... > > If we don't create a vma for it, we potentially have to special case a > bunch of places, I think. I'm not actually sure of this, but the > overhead to do it is quite small. 
> > Anyway, I'll look this over again and see what I think. I'm not against the vma, I've just wondered why you do the /dev_priv->gtt.base/vm/ replacement here since - it's never gonna be used with another vm than ggtt - this patch doesn't add the vm variable, so I'm even more confused where this started ;-) Cheers, Daniel
On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote: > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote: > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > [snip] > > > > > index 058ad44..21015cd 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > > > @@ -38,10 +38,12 @@ > > > > > > > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > > > > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > > > > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > - unsigned alignment, > > > > - bool map_and_fenceable, > > > > - bool nonblocking); > > > > +static __must_check int > > > > +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > + unsigned alignment, > > > > + bool map_and_fenceable, > > > > + bool nonblocking); > > > > static int i915_gem_phys_pwrite(struct drm_device *dev, > > > > struct drm_i915_gem_object *obj, > > > > struct drm_i915_gem_pwrite *args, > > > > @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > > > > static inline bool > > > > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > > > > { > > > > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > > > > + return i915_gem_obj_bound_any(obj) && !obj->active; > > > > } > > > > > > > > int > > > > @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > > > > * anyway again before the next pread happens. 
*/ > > > > if (obj->cache_level == I915_CACHE_NONE) > > > > needs_clflush = 1; > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > + if (i915_gem_obj_bound_any(obj)) { > > > > ret = i915_gem_object_set_to_gtt_domain(obj, false); > > > > > > This is essentially a very convoluted version of "if there's gpu rendering > > > outstanding, please wait for it". Maybe we should switch this to > > > > > > if (obj->active) > > > wait_rendering(obj, true); > > > > > > Same for the shmem_pwrite case below. Would be a separate patch to prep > > > things though. Can I volunteer you for that? The ugly part is to review > > > whether any of the lru list updating that set_domain does in addition to > > > wait_rendering is required, but on a quick read that's not the case. > > > > Just reading the comment above it says we need the clflush. I don't > > actually understand why we do that even after reading the comment, but > > meh. You tell me, I don't mind doing this as a prep first. > > The comment right above is just for the needs_clflush = 1 assignment, the > set_to_gtt_domain call afterwards is just to sync up with the gpu. The > code is confusing and tricky and the lack of a white line in between the > two things plus a comment explaining that we only care about the > wait_rendering side-effect of set_to_gtt_domain doesn't help. If you do > the proposed conversion (and add a white line) that should help a lot in > unconfusing readers. 
> > [snip] > > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > } > > > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > if (ret) > > > > return ret; > > > > } > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > + continue; > > > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > > on? > > > > > > Self-correction: It exists already ... why can't we use this here? > > > > Yes. That should work, I'll fix it and test it. It looks slightly worse > > IMO in terms of code clarity, but I don't mind the change. > > Actually I think it'd gain in clarity, doing pte updatest (which > set_cache_level does) on the vma instead of the (obj, vm) pair feels more > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here. That sounds good to me. Would you mind a patch on top? 
> > > > > > > > > > + > > > > ret = i915_gem_object_finish_gpu(obj); > > > > if (ret) > > > > return ret; > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > > obj, cache_level); > > > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > > } > > > > > > > > if (cache_level == I915_CACHE_NONE) { > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > struct drm_i915_gem_caching *args = data; > > > > + struct drm_i915_private *dev_priv; > > > > struct drm_i915_gem_object *obj; > > > > enum i915_cache_level level; > > > > int ret; > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > ret = -ENOENT; > > > > goto unlock; > > > > } > > > > + dev_priv = obj->base.dev->dev_private; > > > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > > + /* FIXME: Add interface for specific VM? */ > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > > > drm_gem_object_unreference(&obj->base); > > > > unlock: > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > u32 alignment, > > > > struct intel_ring_buffer *pipelined) > > > > { > > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > u32 old_read_domains, old_write_domain; > > > > int ret; > > > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > > > */ > > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > + I915_CACHE_NONE); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > > * always use map_and_fenceable for all scanout buffers. > > > > */ > > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > > > int > > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm, > > > > uint32_t alignment, > > > > bool map_and_fenceable, > > > > bool nonblocking) > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > > return -EBUSY; > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > > (albeit with slightly corrupted state, so render corruptions might > > > follow). > > > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert > > it at the end of merging? > > Adding a FIXME right above it will cause equal amounts of conflicts, so I > don't see the point that much ... I'm just really fearful that in doing the reworks I will end up with this condition, and I am afraid I will miss them if it's a WARN_ON. Definitely it's more likely to miss than a BUG. 
Also, and we've disagreed on this a few times by now, this is an internal interface which I think should carry such a fatal error for this level of mistake. In any case I've made the change locally. Will yell at you later if I was right. > > > > > > > > > > + > > > > + if (i915_gem_obj_bound(obj, vm)) { > > > > + if ((alignment && > > > > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > > > > (map_and_fenceable && !obj->map_and_fenceable)) { > > > > WARN(obj->pin_count, > > > > "bo is already pinned with incorrect alignment:" > > > > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > > > > " obj->map_and_fenceable=%d\n", > > > > - i915_gem_obj_ggtt_offset(obj), alignment, > > > > + i915_gem_obj_offset(obj, vm), alignment, > > > > map_and_fenceable, > > > > obj->map_and_fenceable); > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > if (ret) > > > > return ret; > > > > } > > > > } > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) { > > > > + if (!i915_gem_obj_bound(obj, vm)) { > > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > > > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > > > > + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, > > > > map_and_fenceable, > > > > nonblocking); > > > > if (ret) > > > > @@ -3684,7 +3739,7 @@ void > > > > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > > > > { > > > > BUG_ON(obj->pin_count == 0); > > > > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > > > > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > > > > > > > if (--obj->pin_count == 0) > > > > obj->pin_mappable = false; > > > > @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > > > > } > > > > > > > > if (obj->user_pin_count == 0) { > > > > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > > > > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > > > > if (ret) > > > > goto out; > > > > } > > > > @@ -3957,6 
+4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > > > > struct drm_device *dev = obj->base.dev; > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > + struct i915_vma *vma, *next; > > > > > > > > trace_i915_gem_object_destroy(obj); > > > > > > > > @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > > > > i915_gem_detach_phys_object(dev, obj); > > > > > > > > obj->pin_count = 0; > > > > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > > > > - bool was_interruptible; > > > > + /* NB: 0 or 1 elements */ > > > > + WARN_ON(!list_empty(&obj->vma_list) && > > > > + !list_is_singular(&obj->vma_list)); > > > > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > > > > + int ret = i915_gem_object_unbind(obj, vma->vm); > > > > + if (WARN_ON(ret == -ERESTARTSYS)) { > > > > + bool was_interruptible; > > > > > > > > - was_interruptible = dev_priv->mm.interruptible; > > > > - dev_priv->mm.interruptible = false; > > > > + was_interruptible = dev_priv->mm.interruptible; > > > > + dev_priv->mm.interruptible = false; > > > > > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > > > > > > > - dev_priv->mm.interruptible = was_interruptible; > > > > + dev_priv->mm.interruptible = was_interruptible; > > > > + } > > > > } > > > > > > > > /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up > > > > @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > > > > INIT_LIST_HEAD(&ring->request_list); > > > > } > > > > > > > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + vm->dev = dev_priv->dev; > > > > + INIT_LIST_HEAD(&vm->active_list); > > > > + INIT_LIST_HEAD(&vm->inactive_list); > > > > + INIT_LIST_HEAD(&vm->global_link); > > > > + list_add(&vm->global_link, &dev_priv->vm_list); > > > > +} > > > > + > > > > void > > > > i915_gem_load(struct drm_device *dev) > > > > { > > > > @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) > > > > SLAB_HWCACHE_ALIGN, > > > > NULL); > > > > > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > > > > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > > > > + INIT_LIST_HEAD(&dev_priv->vm_list); > > > > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > > > > + > > > > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > > > > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > > > > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > > > > @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > > struct drm_i915_private, > > > > mm.inactive_shrinker); > > > > struct drm_device *dev = dev_priv->dev; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj; > > > > - int nr_to_scan = sc->nr_to_scan; > > > > + int nr_to_scan; > > > > bool unlock = true; > > > > int cnt; > > > > > > > > @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > > unlock = false; > > > > } > > > > > > > > + nr_to_scan = sc->nr_to_scan; > > > > if (nr_to_scan) { > > > > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > > > > if (nr_to_scan > 0) > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > 
> > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > > > if (obj->pages_pin_count == 0) > > > > cnt += obj->base.size >> PAGE_SHIFT; > > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > - cnt += obj->base.size >> PAGE_SHIFT; > > > > + > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > + cnt += obj->base.size >> PAGE_SHIFT; > > > > > > Isn't this now double-counting objects? In the shrinker we only care about > > > how much physical RAM an object occupies, not how much virtual space it > > > occupies. So just walking the bound list of objects here should be good > > > enough ... > > > > > > > Maybe I've misunderstood you. My code is wrong, but I think your idea > > > requires a prep patch because it changes functionality, right? > > > > So let me know if I've understood you. > > Don't we have both the bound and unbound list? So we could just switch > over to counting the bound objects here ... Otherwise yes, we need a prep > patch to create the bound list first. Of course there is a bound list. The old code automatically added the size of unbound objects with unpinned pages, and unpinned inactive objects with unpinned pages. The latter check, for inactive objects, needs to be done for all VMAs. That was my point. 
> > > > > > > > > > > if (unlock) > > > > mutex_unlock(&dev->struct_mutex); > > > > return cnt; > > > > } > > > > + > > > > +/* All the new VM stuff */ > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > + struct i915_vma *vma; > > > > + > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > + vm = &dev_priv->gtt.base; > > > > + > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > > > Imo the vma list walking here and in the other helpers below indicates > > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > > is this again something that'll get fixed later on? > > > > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > > sprinkling of obj_to_vma in callers. > > > > It's not something I fixed in the whole series. I think it makes sense > > conceptually, to keep some things as <obj,vm> and others as direct vma. > > > > If you want me to change something, you need to be more specific since > > no action specifically comes to mind at this point in the series. > > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we > should try to avoid it whenever possible. Since the vma has both and obj > and a vm pointer the vma is imo strictly better than the (obj, vm) pair. > And the look-up should be pushed down the callchain as much as possible. > > So I think generally we want to pass the vma around to functions > everywhere, and the (obj, vm) pair would be the exception (which needs > special justification). > Without actually coding it, I am not sure. I think there are probably a decent number of reasonable exceptions where we want the object (ie. it's not really that much of a special case). 
In any case, I think we'll find you have to do this list walk at some point in the call chain anyway, but I can try to start changing around the code as a patch on top of this. I really want to leave as much of this patch in place as is, since it's decently tested (pre-rebase at least). > > > > > > > > > > > + if (vma->vm == vm) > > > > + return vma->node.start; > > > > + > > > > + } > > > > + return -1; > > > > +} > > > > + > > > > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > > > > +{ > > > > + return !list_empty(&o->vma_list); > > > > +} > > > > + > > > > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct i915_vma *vma; > > > > + > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) > > > > + return true; > > > > + } > > > > + return false; > > > > +} > > > > + > > > > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > + struct i915_vma *vma; > > > > + > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > + vm = &dev_priv->gtt.base; > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) > > > > + return vma->node.size; > > > > + } > > > > + > > > > + return 0; > > > > +} > > > > + > > > > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > > > > + struct i915_address_space *vm, > > > > + enum i915_cache_level color) > > > > +{ > > > > + struct i915_vma *vma; > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > + if (vma->vm == vm) { > > > > + vma->node.color = color; > > > > + return; > > > > + } > > > > + } > > > > + > > > > + WARN(1, "Couldn't set color for VM %p\n", vm); > > > > +} > > > > + > > > > +struct i915_vma 
*i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > > > > + struct i915_address_space *vm) > > > > +{ > > > > + struct i915_vma *vma; > > > > + list_for_each_entry(vma, &obj->vma_list, vma_link) > > > > + if (vma->vm == vm) > > > > + return vma; > > > > + > > > > + return NULL; > > > > +} > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > > > index 2074544..c92fd81 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > > > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > > > > > > > if (INTEL_INFO(dev)->gen >= 7) { > > > > ret = i915_gem_object_set_cache_level(ctx->obj, > > > > + &dev_priv->gtt.base, > > > > I915_CACHE_LLC_MLC); > > > > /* Failure shouldn't ever happen this early */ > > > > if (WARN_ON(ret)) > > > > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > > > > * default context. > > > > */ > > > > dev_priv->ring[RCS].default_context = ctx; > > > > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > > > > if (ret) { > > > > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > > > > goto err_destroy; > > > > @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, > > > > static int do_switch(struct i915_hw_context *to) > > > > { > > > > struct intel_ring_buffer *ring = to->ring; > > > > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > > > > struct i915_hw_context *from = ring->last_context; > > > > u32 hw_flags = 0; > > > > int ret; > > > > @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) > > > > if (from == to) > > > > return 0; > > > > > > > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > > > > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -444,7 
+446,8 @@ static int do_switch(struct i915_hw_context *to) > > > > */ > > > > if (from != NULL) { > > > > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > > > - i915_gem_object_move_to_active(from->obj, ring); > > > > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > > > > + ring); > > > > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > > > > * whole damn pipeline, we don't need to explicitly mark the > > > > * object dirty. The only exception is that the context must be > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > > > > index df61f33..32efdc0 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > > > > @@ -32,24 +32,21 @@ > > > > #include "i915_trace.h" > > > > > > > > static bool > > > > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > > > > +mark_free(struct i915_vma *vma, struct list_head *unwind) > > > > { > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > > - > > > > - if (obj->pin_count) > > > > + if (vma->obj->pin_count) > > > > return false; > > > > > > > > - list_add(&obj->exec_list, unwind); > > > > + list_add(&vma->obj->exec_list, unwind); > > > > return drm_mm_scan_add_block(&vma->node); > > > > } > > > > > > > > int > > > > -i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > - unsigned alignment, unsigned cache_level, > > > > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > > > > + int min_size, unsigned alignment, unsigned cache_level, > > > > bool mappable, bool nonblocking) > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > struct list_head eviction_list, unwind_list; > > > > struct i915_vma *vma; > > > > struct drm_i915_gem_object *obj; > > > > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device 
*dev, int min_size, > > > > */ > > > > > > > > INIT_LIST_HEAD(&unwind_list); > > > > - if (mappable) > > > > + if (mappable) { > > > > + BUG_ON(!i915_is_ggtt(vm)); > > > > drm_mm_init_scan_with_range(&vm->mm, min_size, > > > > alignment, cache_level, 0, > > > > dev_priv->gtt.mappable_end); > > > > - else > > > > + } else > > > > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > > > > > > > /* First see if there is a large enough contiguous idle region... */ > > > > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > > > > - if (mark_free(obj, &unwind_list)) > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > > + if (mark_free(vma, &unwind_list)) > > > > goto found; > > > > } > > > > > > > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > > > > > > > /* Now merge in the soon-to-be-expired objects... */ > > > > list_for_each_entry(obj, &vm->active_list, mm_list) { > > > > - if (mark_free(obj, &unwind_list)) > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > > > > + if (mark_free(vma, &unwind_list)) > > > > goto found; > > > > } > > > > > > > > @@ -109,7 +109,7 @@ none: > > > > obj = list_first_entry(&unwind_list, > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > - vma = __i915_gem_obj_to_vma(obj); > > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > > ret = drm_mm_scan_remove_block(&vma->node); > > > > BUG_ON(ret); > > > > > > > > @@ -130,7 +130,7 @@ found: > > > > obj = list_first_entry(&unwind_list, > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > - vma = __i915_gem_obj_to_vma(obj); > > > > + vma = i915_gem_obj_to_vma(obj, vm); > > > > if (drm_mm_scan_remove_block(&vma->node)) { > > > > list_move(&obj->exec_list, &eviction_list); > > > > drm_gem_object_reference(&obj->base); > > > > @@ -145,7 +145,7 @@ found: > > > > struct drm_i915_gem_object, > > > > exec_list); > > > > if (ret == 0) > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = 
i915_gem_object_unbind(obj, vm); > > > > > > > > list_del_init(&obj->exec_list); > > > > drm_gem_object_unreference(&obj->base); > > > > @@ -158,13 +158,18 @@ int > > > > i915_gem_evict_everything(struct drm_device *dev) > > > > > > I suspect evict_everything eventually wants a address_space *vm argument > > > for those cases where we only want to evict everything in a given vm. Atm > > > we have two use-cases of this: > > > - Called from the shrinker as a last-ditch effort. For that it should move > > > _every_ object onto the unbound list. > > > - Called from execbuf for badly-fragmented address spaces to clean up the > > > mess. For that case we only care about one address space. > > > > The current thing is more or less a result of Chris' suggestions. A > > non-posted iteration did plumb the vm, and after reworking to the > > suggestion made by Chris, the vm didn't make much sense anymore. > > > > For point #1, it requires VM prioritization I think. I don't really see > > any other way to fairly manage it. > > The shrinker will rip out objects in lru order by walking first unbound > and then bound objects. That's imo as fair as it gets, we don't need > priorities between vms. If you pass in a vm, the semantics would be, evict everything for the vm, right? > > > For point #2, that I agree it might be useful, but we can easily create > > a new function, and not call it "shrinker" to do it. > > Well my point was that this function is called > i915_gem_evict_everything(dev, vm) and for the first use case we simply > pass in vm = NULL. But essentially thrashing the vm should be rare enough > that for now we don't need to care. > IIRC, this is exactly how my original patch worked pre-Chris. 
> > > > > > > > > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > - struct i915_address_space *vm = &dev_priv->gtt.base; > > > > + struct i915_address_space *vm; > > > > struct drm_i915_gem_object *obj, *next; > > > > - bool lists_empty; > > > > + bool lists_empty = true; > > > > int ret; > > > > > > > > - lists_empty = (list_empty(&vm->inactive_list) && > > > > - list_empty(&vm->active_list)); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + lists_empty = (list_empty(&vm->inactive_list) && > > > > + list_empty(&vm->active_list)); > > > > + if (!lists_empty) > > > > + lists_empty = false; > > > > + } > > > > + > > > > if (lists_empty) > > > > return -ENOSPC; > > > > > > > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > > > > i915_gem_retire_requests(dev); > > > > > > > > /* Having flushed everything, unbind() should never raise an error */ > > > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > > - if (obj->pin_count == 0) > > > > - WARN_ON(i915_gem_object_unbind(obj)); > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > > > > + if (obj->pin_count == 0) > > > > + WARN_ON(i915_gem_object_unbind(obj, vm)); > > > > + } > > > > > > > > return 0; > > > > } > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > index 5aeb447..e90182d 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > > > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > > > > } > > > > > > > > static void > > > > -eb_destroy(struct eb_objects *eb) > > > > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > > > > { > > > > while (!list_empty(&eb->objects)) { > > > > struct drm_i915_gem_object *obj; > > > > @@ -174,7 +174,8 @@ 
static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_relocation_entry *reloc) > > > > + struct drm_i915_gem_relocation_entry *reloc, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_device *dev = obj->base.dev; > > > > struct drm_gem_object *target_obj; > > > > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > > > > > > > static int > > > > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > - struct eb_objects *eb) > > > > + struct eb_objects *eb, > > > > + struct i915_address_space *vm) > > > > { > > > > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > > > > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > > > > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > do { > > > > u64 offset = r->presumed_offset; > > > > > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > > > > + vm); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > > > > static int > > > > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_relocation_entry *relocs) > > > > + struct drm_i915_gem_relocation_entry *relocs, > > > > + struct i915_address_space *vm) > > > > { > > > > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > int i, ret; > > > > > > > > for (i = 0; i < entry->relocation_count; i++) { > > > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > > > > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > > > > + vm); > > > > if (ret) > > > > 
return ret; > > > > } > > > > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > > > > } > > > > > > > > static int > > > > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > int ret = 0; > > > > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > > > > */ > > > > pagefault_disable(); > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > > > > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > > > > if (ret) > > > > break; > > > > } > > > > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > > struct intel_ring_buffer *ring, > > > > + struct i915_address_space *vm, > > > > bool *need_reloc) > > > > { > > > > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > > obj->tiling_mode != I915_TILING_NONE; > > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > > > > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > > > > + false); > > > > if (ret) > > > > return ret; > > > > > > > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > > > > obj->has_aliasing_ppgtt_mapping = 1; > > > > } > > > > > > > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > > > > - entry->offset = i915_gem_obj_ggtt_offset(obj); > > > > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > > > > + entry->offset = i915_gem_obj_offset(obj, vm); > > > > *need_reloc = true; > > > > } > > > > > > > 
> @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > { > > > > struct drm_i915_gem_exec_object2 *entry; > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound_any(obj)) > > > > return; > > > > > > > > entry = obj->exec_entry; > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > static int > > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > struct list_head *objects, > > > > + struct i915_address_space *vm, > > > > bool *need_relocs) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > list_for_each_entry(obj, objects, exec_list) { > > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > bool need_fence, need_mappable; > > > > + u32 obj_offset; > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > continue; > > > > > > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups > > > here ... Maybe we should cache them in some pointer somewhere (either in > > > the eb object or by adding a new pointer to the object struct, e.g. > > > obj->eb_vma, similar to obj->eb_list). > > > > > > > I agree, and even did this at one unposted patch too. However, I think > > it's a premature optimization which risks code correctness. So I think > > somewhere a FIXME needs to happen to address that issue. (Or if Chris > > complains bitterly about some perf hit). > > If you bring up code correctness I'd vote strongly in favour of using vmas > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all > the thing risks us mixing them up eventually and creating a hella lot of > confusion ;-) I think this is addressed with the previous comments. 
> > > > > > > > > > > + obj_offset = i915_gem_obj_offset(obj, vm); > > > > need_fence = > > > > has_fenced_gpu_access && > > > > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > > > > obj->tiling_mode != I915_TILING_NONE; > > > > need_mappable = need_fence || need_reloc_mappable(obj); > > > > > > > > + BUG_ON((need_mappable || need_fence) && > > > > + !i915_is_ggtt(vm)); > > > > + > > > > if ((entry->alignment && > > > > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > > > > + obj_offset & (entry->alignment - 1)) || > > > > (need_mappable && !obj->map_and_fenceable)) > > > > - ret = i915_gem_object_unbind(obj); > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > else > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > > if (ret) > > > > goto err; > > > > } > > > > > > > > /* Bind fresh objects */ > > > > list_for_each_entry(obj, objects, exec_list) { > > > > - if (i915_gem_obj_ggtt_bound(obj)) > > > > + if (i915_gem_obj_bound(obj, vm)) > > > > continue; > > > > > > > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > > > > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > > > > if (ret) > > > > goto err; > > > > } > > > > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > > struct drm_file *file, > > > > struct intel_ring_buffer *ring, > > > > struct eb_objects *eb, > > > > - struct drm_i915_gem_exec_object2 *exec) > > > > + struct drm_i915_gem_exec_object2 *exec, > > > > + struct i915_address_space *vm) > > > > { > > > > struct drm_i915_gem_relocation_entry *reloc; > > > > struct drm_i915_gem_object *obj; > > > > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > > > > goto err; > > > > > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > > + 
ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > > if (ret) > > > > goto err; > > > > > > > > list_for_each_entry(obj, &eb->objects, exec_list) { > > > > int offset = obj->exec_entry - exec; > > > > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > > > > - reloc + reloc_offset[offset]); > > > > + reloc + reloc_offset[offset], > > > > + vm); > > > > if (ret) > > > > goto err; > > > > } > > > > @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > > > > > > > static void > > > > i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > > + struct i915_address_space *vm, > > > > struct intel_ring_buffer *ring) > > > > { > > > > struct drm_i915_gem_object *obj; > > > > @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > > > > obj->base.read_domains = obj->base.pending_read_domains; > > > > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > > > > > > > - i915_gem_object_move_to_active(obj, ring); > > > > + i915_gem_object_move_to_active(obj, vm, ring); > > > > if (obj->base.write_domain) { > > > > obj->dirty = 1; > > > > obj->last_write_seqno = intel_ring_get_seqno(ring); > > > > @@ -836,7 +853,8 @@ static int > > > > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > struct drm_file *file, > > > > struct drm_i915_gem_execbuffer2 *args, > > > > - struct drm_i915_gem_exec_object2 *exec) > > > > + struct drm_i915_gem_exec_object2 *exec, > > > > + struct i915_address_space *vm) > > > > { > > > > drm_i915_private_t *dev_priv = dev->dev_private; > > > > struct eb_objects *eb; > > > > @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > > > /* Move the objects en-masse into the GTT, evicting if necessary. 
*/ > > > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > > > > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > > > > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > > > > if (ret) > > > > goto err; > > > > > > > > /* The objects are in their final locations, apply the relocations. */ > > > > if (need_relocs) > > > > - ret = i915_gem_execbuffer_relocate(eb); > > > > + ret = i915_gem_execbuffer_relocate(eb, vm); > > > > if (ret) { > > > > if (ret == -EFAULT) { > > > > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > > > > - eb, exec); > > > > + eb, exec, vm); > > > > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > > > > } > > > > if (ret) > > > > @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > goto err; > > > > } > > > > > > > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > > > > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > > > > + args->batch_start_offset; > > > > exec_len = args->batch_len; > > > > if (cliprects) { > > > > for (i = 0; i < args->num_cliprects; i++) { > > > > @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > > > > > > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > > > > > > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > > > > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > > > > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > > > > > > > err: > > > > - eb_destroy(eb); > > > > + eb_destroy(eb, vm); > > > > > > > > mutex_unlock(&dev->struct_mutex); > > > > > > > > @@ -1105,6 +1124,7 @@ int > > > > i915_gem_execbuffer(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > struct drm_i915_gem_execbuffer *args = data; > > > > struct drm_i915_gem_execbuffer2 exec2; > > > > struct 
drm_i915_gem_exec_object *exec_list = NULL; > > > > @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > > > > exec2.flags = I915_EXEC_RENDER; > > > > i915_execbuffer2_set_context_id(exec2, 0); > > > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > > > > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > > > > + &dev_priv->gtt.base); > > > > if (!ret) { > > > > /* Copy the new buffer offsets back to the user's exec list. */ > > > > for (i = 0; i < args->buffer_count; i++) > > > > @@ -1186,6 +1207,7 @@ int > > > > i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > > struct drm_file *file) > > > > { > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > struct drm_i915_gem_execbuffer2 *args = data; > > > > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > > > > int ret; > > > > @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > > > > return -EFAULT; > > > > } > > > > > > > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > > > > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > > > > + &dev_priv->gtt.base); > > > > if (!ret) { > > > > /* Copy the new buffer offsets back to the user's exec list. 
*/ > > > > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > index 298fc42..70ce2f6 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > > > > ppgtt->base.total); > > > > } > > > > > > > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > > > > + > > > > return ret; > > > > } > > > > > > > > @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > > struct drm_i915_gem_object *obj, > > > > enum i915_cache_level cache_level) > > > > { > > > > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > > - cache_level); > > > > + struct i915_address_space *vm = &ppgtt->base; > > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > > + > > > > + vm->insert_entries(vm, obj->pages, > > > > + obj_offset >> PAGE_SHIFT, > > > > + cache_level); > > > > } > > > > > > > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > > > struct drm_i915_gem_object *obj) > > > > { > > > > - ppgtt->base.clear_range(&ppgtt->base, > > > > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > > > > - obj->base.size >> PAGE_SHIFT); > > > > + struct i915_address_space *vm = &ppgtt->base; > > > > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > > > > + > > > > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > > > > + obj->base.size >> PAGE_SHIFT); > > > > } > > > > > > > > extern int intel_iommu_gfx_mapped; > > > > @@ -447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > > > > dev_priv->gtt.base.start / PAGE_SIZE, > > > > dev_priv->gtt.base.total / PAGE_SIZE); > > > > > > > > + if (dev_priv->mm.aliasing_ppgtt) > > > > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > > > > + > > > > list_for_each_entry(obj, 
&dev_priv->mm.bound_list, global_list) { > > > > i915_gem_clflush_object(obj); > > > > i915_gem_gtt_bind_object(obj, obj->cache_level); > > > > @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > * aperture. One page should be enough to keep any prefetching inside > > > > * of the aperture. > > > > */ > > > > - drm_i915_private_t *dev_priv = dev->dev_private; > > > > + struct drm_i915_private *dev_priv = dev->dev_private; > > > > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > > > > struct drm_mm_node *entry; > > > > struct drm_i915_gem_object *obj; > > > > unsigned long hole_start, hole_end; > > > > @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > BUG_ON(mappable_end > end); > > > > > > > > /* Subtract the guard page ... */ > > > > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > > > > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > > > > if (!HAS_LLC(dev)) > > > > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > > > > > > > /* Mark any preallocated objects as occupied */ > > > > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > > > > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > > > > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > > > > int ret; > > > > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > > > > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > > > > > > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > > > > if (ret) > > > > DRM_DEBUG_KMS("Reservation failed\n"); > > > > obj->has_global_gtt_mapping = 1; > > > > @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > > > > dev_priv->gtt.base.total = end - start; > > > > > > > > /* Clear any non-preallocated blocks */ > > > > - drm_mm_for_each_hole(entry, 
&dev_priv->gtt.base.mm, > > > > - hole_start, hole_end) { > > > > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > > > > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > > > > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > > > > hole_start, hole_end); > > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > > - hole_start / PAGE_SIZE, > > > > - count); > > > > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > > > > } > > > > > > > > /* And finally clear the reserved guard page */ > > > > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > > > > - end / PAGE_SIZE - 1, 1); > > > > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > > > > } > > > > > > > > static bool > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > index 245eb1d..bfe61fa 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > > return obj; > > > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > > + vma = i915_gem_vma_create(obj, vm); > > > > if (!vma) { > > > > drm_gem_object_unreference(&obj->base); > > > > return NULL; > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > */ > > > > vma->node.start = gtt_offset; > > > > vma->node.size = size; > > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > + if (drm_mm_initialized(&vm->mm)) { > > > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > > preallocated stuff for objects with mappings in the global gtt. 
So keeping > > > that explicit is imo the better approach. And tbh I'm confused where the > > > local variable vm is from ... > > > > If we don't create a vma for it, we potentially have to special case a > > bunch of places, I think. I'm not actually sure of this, but the > > overhead to do it is quite small. > > > > Anyway, I'll look this over again nd see what I think. > > I'm not against the vma, I've just wonedered why you do the > /dev_priv->gtt.base/vm/ replacement here since > - it's never gonna be used with another vm than ggtt > - this patch doesn't add the vm variable, so I'm even more confused where > this started ;-) It started from the rebase. In the original series, I did that "deferred_offset" thing, and having a vm variable made that code pass the checkpatch.pl. There wasn't a particular reason for naming it vm other than I had done it all over the place. I've fixed this locally, leaving the vma, and renamed the local variable ggtt. It still has 3 uses in this function, so it's a bit less typing. > > Cheers, Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Fri, Jul 12, 2013 at 08:46:48AM -0700, Ben Widawsky wrote: > On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote: > > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote: > > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: [snip] > > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > > } > > > > > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > > > - ret = i915_gem_object_unbind(obj); > > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > > if (ret) > > > > > return ret; > > > > > } > > > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > > + continue; > > > > > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > > > on? > > > > > > > > Self-correction: It exists already ... why can't we use this here? > > > > > > Yes. That should work, I'll fix it and test it. It looks slightly worse > > > IMO in terms of code clarity, but I don't mind the change. > > > > Actually I think it'd gain in clarity, doing pte updatest (which > > set_cache_level does) on the vma instead of the (obj, vm) pair feels more > > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here. > > That sounds good to me. Would you mind a patch on top? If you want I guess we can refactor this after everything has settled. Has the upside that assessing whether using vma or (obj, vm) is much easier. So fine with me. 
> > > > > > > > > > > > > > > + > > > > > ret = i915_gem_object_finish_gpu(obj); > > > > > if (ret) > > > > > return ret; > > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > > > obj, cache_level); > > > > > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > > > } > > > > > > > > > > if (cache_level == I915_CACHE_NONE) { > > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > > struct drm_file *file) > > > > > { > > > > > struct drm_i915_gem_caching *args = data; > > > > > + struct drm_i915_private *dev_priv; > > > > > struct drm_i915_gem_object *obj; > > > > > enum i915_cache_level level; > > > > > int ret; > > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > > ret = -ENOENT; > > > > > goto unlock; > > > > > } > > > > > + dev_priv = obj->base.dev->dev_private; > > > > > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > > > + /* FIXME: Add interface for specific VM? */ > > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > > > > > drm_gem_object_unreference(&obj->base); > > > > > unlock: > > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > u32 alignment, > > > > > struct intel_ring_buffer *pipelined) > > > > > { > > > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > u32 old_read_domains, old_write_domain; > > > > > int ret; > > > > > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. 
> > > > > */ > > > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > > + I915_CACHE_NONE); > > > > > if (ret) > > > > > return ret; > > > > > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > > > * always use map_and_fenceable for all scanout buffers. > > > > > */ > > > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > > > if (ret) > > > > > return ret; > > > > > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > > > > > int > > > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > > + struct i915_address_space *vm, > > > > > uint32_t alignment, > > > > > bool map_and_fenceable, > > > > > bool nonblocking) > > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > > > return -EBUSY; > > > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > > > (albeit with slightly corrupted state, so render corruptions might > > > > follow). > > > > > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert > > > it at the end of merging? > > > > Adding a FIXME right above it will cause equal amounts of conflicts, so I > > don't see the point that much ... > > I'm just really fearful that in doing the reworks I will end up with > this condition, and I am afraid I will miss them if it's a WARN_ON. 
> Definitely it's more likely to miss than a BUG. > > Also, and we've disagreed on this a few times by now, this is an > internal interface which I think should carry such a fatal error for > this level of mistake. Ime every time I argue this with myself and state your case it ends up biting me horribly because I'm regularly too incompetent and hit my very on BUG_ONs ;-) Hence why I insist so much on using WARN_ON wherever possible. Of course if people don't check they're logs that's a different matter (*cough* Jesse *cough*) ... > In any case I've made the change locally. Will yell at you later if I > was right. Getting yelled at is part of my job, so bring it on ;-) [snip] > > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > > > > > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > > > > > if (obj->pages_pin_count == 0) > > > > > cnt += obj->base.size >> PAGE_SHIFT; > > > > > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > > > > > - if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > > - cnt += obj->base.size >> PAGE_SHIFT; > > > > > + > > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > > > > > + list_for_each_entry(obj, &vm->inactive_list, global_list) > > > > > + if (obj->pin_count == 0 && obj->pages_pin_count == 0) > > > > > + cnt += obj->base.size >> PAGE_SHIFT; > > > > > > > > Isn't this now double-counting objects? In the shrinker we only care about > > > > how much physical RAM an object occupies, not how much virtual space it > > > > occupies. So just walking the bound list of objects here should be good > > > > enough ... > > > > > > > > > > Maybe I've misunderstood you. My code is wrong, but I think you're idea > > > requires a prep patch because it changes functionality, right? > > > > > > So let me know if I've understood you. > > > > Don't we have both the bound and unbound list? 
So we could just switch > > over to counting the bound objects here ... Otherwise yes, we need a prep > > patch to create the bound list first. > > Of course there is a bound list. > > The old code automatically added the size of unbound objects with > unpinned pages, and unpinned inactive objects with unpinned pages. > > The latter check for inactive, needs to be checked for all VMAs. That > was my point. Oh right. The thing is that technically there's no reason to not also scan the active objects, i.e. just the unbound list. So yeah, sounds like we need a prep patch to switch to the unbound list here first. My apologies for being dense and not fully grasphing this right away. > > > > > > > > > > > > > > > > if (unlock) > > > > > mutex_unlock(&dev->struct_mutex); > > > > > return cnt; > > > > > } > > > > > + > > > > > +/* All the new VM stuff */ > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > > > + struct i915_address_space *vm) > > > > > +{ > > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > > + struct i915_vma *vma; > > > > > + > > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > > + vm = &dev_priv->gtt.base; > > > > > + > > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > > > > > Imo the vma list walking here and in the other helpers below indicates > > > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > > > is this again something that'll get fixed later on? > > > > > > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > > > sprinkling of obj_to_vma in callers. > > > > > > It's not something I fixed in the whole series. I think it makes sense > > > conceptually, to keep some things as <obj,vm> and others as direct vma. 
> > > > > > If you want me to change something, you need to be more specific since > > > no action specifically comes to mind at this point in the series. > > > > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we > > should try to avoid it whenever possible. Since the vma has both and obj > > and a vm pointer the vma is imo strictly better than the (obj, vm) pair. > > And the look-up should be pushed down the callchain as much as possible. > > > > So I think generally we want to pass the vma around to functions > > everywhere, and the (obj, vm) pair would be the exception (which needs > > special justification). > > > > Without actually coding it, I am not sure. I think there are probably a > decent number of reasonable exceptions where we want the object (ie. > it's not really that much of a special case). In any case, I think we'll > find you have to do this list walk at some point in the call chain > anyway, but I can try to start changing around the code as a patch on > top of this. I really want to leave as much as this patch in place as > is, since it's decently tested (pre-rebase at least). Ok, I can life with this if we clean things up afterwards. But imo vma->obj isn't worse for readability than just obj, and passing pairs of (obj,vm) around all the time just feels wrong conceptually. In C we have structs for this, and since we already have a suitable one created we might as well use it. Aside: I know that we have the (ring, seqno) pair splattered all over the code. It's been on my todo to fix that ever since I've proposed to add a i915_gpu_sync_cookie with the original multi-ring enabling. As you can see I've been ignored, but I hope we can finally fix this with the dma_buf fence rework. And we did just recently discover a bug where such a (ring, seqno) pair got mixed up, so imo not using vma is fraught with unecessary peril. 
[snip] > > > > > @@ -158,13 +158,18 @@ int > > > > > i915_gem_evict_everything(struct drm_device *dev) > > > > I suspect evict_everything eventually wants an address_space *vm argument > > > > for those cases where we only want to evict everything in a given vm. Atm > > > > we have two use-cases of this: > > > > - Called from the shrinker as a last-ditch effort. For that it should move > > > > _every_ object onto the unbound list. > > > > - Called from execbuf for badly-fragmented address spaces to clean up the > > > > mess. For that case we only care about one address space. > > > The current thing is more or less a result of Chris' suggestions. A > > > non-posted iteration did plumb the vm, and after reworking to the > > > suggestion made by Chris, the vm didn't make much sense anymore. > > > For point #1, it requires VM prioritization I think. I don't really see > > > any other way to fairly manage it. > > The shrinker will rip out objects in lru order by walking first unbound > > and then bound objects. That's imo as fair as it gets, we don't need > > priorities between vms. > If you pass in a vm, the semantics would be, evict everything for the > vm, right? Yes. > > > For point #2, that I agree it might be useful, but we can easily create > > > a new function, and not call it "shrinker" to do it. > > Well my point was that this function is called > > i915_gem_evict_everything(dev, vm) and for the first use case we simply > > pass in vm = NULL. But essentially thrashing the vm should be rare enough > > that for now we don't need to care. > IIRC, this is exactly how my original patch worked pre-Chris. Oops. Do you or Chris still know the argument for changing things? Maybe I just don't see another facet of the issue at hand ... 
[snip] > > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > > static int > > > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > > struct list_head *objects, > > > > > + struct i915_address_space *vm, > > > > > bool *need_relocs) > > > > > { > > > > > struct drm_i915_gem_object *obj; > > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > > list_for_each_entry(obj, objects, exec_list) { > > > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > > bool need_fence, need_mappable; > > > > > + u32 obj_offset; > > > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > > continue; > > > > > > > > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups > > > > here ... Maybe we should cache them in some pointer somewhere (either in > > > > the eb object or by adding a new pointer to the object struct, e.g. > > > > obj->eb_vma, similar to obj->eb_list). > > > > > > > > > > I agree, and even did this at one unposted patch too. However, I think > > > it's a premature optimization which risks code correctness. So I think > > > somewhere a FIXME needs to happen to address that issue. (Or if Chris > > > complains bitterly about some perf hit). > > > > If you bring up code correctness I'd vote strongly in favour of using vmas > > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all > > the thing risks us mixing them up eventually and creating a hella lot of > > confusion ;-) > > I think this is addressed with the previous comments. See my example for (ring, seqno). I really strongly believe passing pairs is the wrong thing and passing structs is the right thing. Especially if we have one at hand. But I'm ok if you want to clean this up afterwards. 
Ofc if the cleanup afterwards doesn't happend I'll be a bit pissed ;-) [snip] > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > index 245eb1d..bfe61fa 100644 > > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > > > return obj; > > > > > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > > > + vma = i915_gem_vma_create(obj, vm); > > > > > if (!vma) { > > > > > drm_gem_object_unreference(&obj->base); > > > > > return NULL; > > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > > */ > > > > > vma->node.start = gtt_offset; > > > > > vma->node.size = size; > > > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > > + if (drm_mm_initialized(&vm->mm)) { > > > > > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > > > preallocated stuff for objects with mappings in the global gtt. So keeping > > > > that explicit is imo the better approach. And tbh I'm confused where the > > > > local variable vm is from ... > > > > > > If we don't create a vma for it, we potentially have to special case a > > > bunch of places, I think. I'm not actually sure of this, but the > > > overhead to do it is quite small. > > > > > > Anyway, I'll look this over again nd see what I think. 
> > > > I'm not against the vma, I've just wondered why you do the > > /dev_priv->gtt.base/vm/ replacement here since > > - it's never gonna be used with another vm than ggtt > > - this patch doesn't add the vm variable, so I'm even more confused where > > this started ;-) > > It started from the rebase. In the original series, I did that > "deferred_offset" thing, and having a vm variable made that code pass > the checkpatch.pl. There wasn't a particular reason for naming it vm > other than I had done it all over the place. > > I've fixed this locally, leaving the vma, and renamed the local variable > ggtt. It still has 3 uses in this function, so it's a bit less typing. Yeah, makes sense to keep it then. Cheers, Daniel
On Fri, Jul 12, 2013 at 06:46:51PM +0200, Daniel Vetter wrote: > On Fri, Jul 12, 2013 at 08:46:48AM -0700, Ben Widawsky wrote: > > On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote: > > > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote: > > > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote: > > > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote: > > [snip] > > > > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > > > } > > > > > > > > > > > > if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > > > > > > - ret = i915_gem_object_unbind(obj); > > > > > > + ret = i915_gem_object_unbind(obj, vm); > > > > > > if (ret) > > > > > > return ret; > > > > > > } > > > > > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > > > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > > > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > > > + continue; > > > > > > > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later > > > > > on? > > > > > > > > > > Self-correction: It exists already ... why can't we use this here? > > > > > > > > Yes. That should work, I'll fix it and test it. It looks slightly worse > > > > IMO in terms of code clarity, but I don't mind the change. > > > > > > Actually I think it'd gain in clarity, doing pte updatest (which > > > set_cache_level does) on the vma instead of the (obj, vm) pair feels more > > > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here. > > > > That sounds good to me. Would you mind a patch on top? > > If you want I guess we can refactor this after everything has settled. Has > the upside that assessing whether using vma or (obj, vm) is much easier. > So fine with me. I think our time to get these merged in is quickly evaporating. I am sure you know this - just reminding you. 
> > > > > > > > > > > > > > > > > > > > > + > > > > > > ret = i915_gem_object_finish_gpu(obj); > > > > > > if (ret) > > > > > > return ret; > > > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > > > > > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > > > > > > obj, cache_level); > > > > > > > > > > > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > > > > > > + i915_gem_obj_set_color(obj, vm, cache_level); > > > > > > } > > > > > > > > > > > > if (cache_level == I915_CACHE_NONE) { > > > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > > > struct drm_file *file) > > > > > > { > > > > > > struct drm_i915_gem_caching *args = data; > > > > > > + struct drm_i915_private *dev_priv; > > > > > > struct drm_i915_gem_object *obj; > > > > > > enum i915_cache_level level; > > > > > > int ret; > > > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > > > > > > ret = -ENOENT; > > > > > > goto unlock; > > > > > > } > > > > > > + dev_priv = obj->base.dev->dev_private; > > > > > > > > > > > > - ret = i915_gem_object_set_cache_level(obj, level); > > > > > > + /* FIXME: Add interface for specific VM? 
*/ > > > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > > > > > > > > > > > drm_gem_object_unreference(&obj->base); > > > > > > unlock: > > > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > > u32 alignment, > > > > > > struct intel_ring_buffer *pipelined) > > > > > > { > > > > > > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > > > > > u32 old_read_domains, old_write_domain; > > > > > > int ret; > > > > > > > > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > > * of uncaching, which would allow us to flush all the LLC-cached data > > > > > > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > > > > > > */ > > > > > > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > > > > > > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > > > > > > + I915_CACHE_NONE); > > > > > > if (ret) > > > > > > return ret; > > > > > > > > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > > > > > * (e.g. libkms for the bootup splash), we have to ensure that we > > > > > > * always use map_and_fenceable for all scanout buffers. 
> > > > > > */ > > > > > > - ret = i915_gem_object_pin(obj, alignment, true, false); > > > > > > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > > > > > > if (ret) > > > > > > return ret; > > > > > > > > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > > > > > > > > > > > int > > > > > > i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > > > + struct i915_address_space *vm, > > > > > > uint32_t alignment, > > > > > > bool map_and_fenceable, > > > > > > bool nonblocking) > > > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > > > > > > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > > > > > > return -EBUSY; > > > > > > > > > > > > - if (i915_gem_obj_ggtt_bound(obj)) { > > > > > > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > > > > > > + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); > > > > > > > > > > WARN_ON, since presumably we can keep on going if we get this wrong > > > > > (albeit with slightly corrupted state, so render corruptions might > > > > > follow). > > > > > > > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert > > > > it at the end of merging? > > > > > > Adding a FIXME right above it will cause equal amounts of conflicts, so I > > > don't see the point that much ... > > > > I'm just really fearful that in doing the reworks I will end up with > > this condition, and I am afraid I will miss them if it's a WARN_ON. > > Definitely it's more likely to miss than a BUG. > > > > Also, and we've disagreed on this a few times by now, this is an > > internal interface which I think should carry such a fatal error for > > this level of mistake. > > Ime every time I argue this with myself and state your case it ends up > biting me horribly because I'm regularly too incompetent and hit my very > on BUG_ONs ;-) Hence why I insist so much on using WARN_ON wherever > possible. 
Of course if people don't check their logs that's a different > matter
That > > was my point. > > Oh right. The thing is that technically there's no reason to not also scan > the active objects, i.e. just the unbound list. So yeah, sounds like we > need a prep patch to switch to the unbound list here first. My apologies > for being dense and not fully grasphing this right away. > > > > > > > > > > > > > > > > > > > > > > if (unlock) > > > > > > mutex_unlock(&dev->struct_mutex); > > > > > > return cnt; > > > > > > } > > > > > > + > > > > > > +/* All the new VM stuff */ > > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > > > > > > + struct i915_address_space *vm) > > > > > > +{ > > > > > > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > > > > > > + struct i915_vma *vma; > > > > > > + > > > > > > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > > > > > > + vm = &dev_priv->gtt.base; > > > > > > + > > > > > > + BUG_ON(list_empty(&o->vma_list)); > > > > > > + list_for_each_entry(vma, &o->vma_list, vma_link) { > > > > > > > > > > Imo the vma list walking here and in the other helpers below indicates > > > > > that we should deal more often in vmas instead of (object, vm) pairs. Or > > > > > is this again something that'll get fixed later on? > > > > > > > > > > I just want to avoid diff churn, and it also makes reviewing easier if the > > > > > foreshadowing is correct ;-) So generally I'd vote for more liberal > > > > > sprinkling of obj_to_vma in callers. > > > > > > > > It's not something I fixed in the whole series. I think it makes sense > > > > conceptually, to keep some things as <obj,vm> and others as direct vma. > > > > > > > > If you want me to change something, you need to be more specific since > > > > no action specifically comes to mind at this point in the series. > > > > > > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we > > > should try to avoid it whenever possible. 
Since the vma has both and obj > > > and a vm pointer the vma is imo strictly better than the (obj, vm) pair. > > > And the look-up should be pushed down the callchain as much as possible. > > > > > > So I think generally we want to pass the vma around to functions > > > everywhere, and the (obj, vm) pair would be the exception (which needs > > > special justification). > > > > > > > Without actually coding it, I am not sure. I think there are probably a > > decent number of reasonable exceptions where we want the object (ie. > > it's not really that much of a special case). In any case, I think we'll > > find you have to do this list walk at some point in the call chain > > anyway, but I can try to start changing around the code as a patch on > > top of this. I really want to leave as much as this patch in place as > > is, since it's decently tested (pre-rebase at least). > > Ok, I can life with this if we clean things up afterwards. But imo > vma->obj isn't worse for readability than just obj, and passing pairs of > (obj,vm) around all the time just feels wrong conceptually. In C we have > structs for this, and since we already have a suitable one created we > might as well use it. > > Aside: I know that we have the (ring, seqno) pair splattered all over the > code. It's been on my todo to fix that ever since I've proposed to add a > i915_gpu_sync_cookie with the original multi-ring enabling. As you can see > I've been ignored, but I hope we can finally fix this with the dma_buf > fence rework. > > And we did just recently discover a bug where such a (ring, seqno) pair > got mixed up, so imo not using vma is fraught with unecessary peril. > > [snip] As I wrote this code, I would say things as, "do this operation on that object, in so and so address space." That's more or less why I kept everything as obj, vm pairs. Since GEM is about BOs, at some fundamental level we're always talking about objects, and you need to translate an <obj,VM> pair into a VMA. 
The other motivation was of course to make the diffs more manageable and easier to write. It's way easier to add a new VM argument, than it is to do even more plumbing. As stated now in several places, I agree it very well might make sense to convert many functions in the call chain to VMAs, and do the lookup at the entry point from userspace, or whenever it's logical; and indeed in the long run I think it will lead to less buggy code. > > > > > > > @@ -158,13 +158,18 @@ int > > > > > > i915_gem_evict_everything(struct drm_device *dev) > > > > > > > > > > I suspect evict_everything eventually wants a address_space *vm argument > > > > > for those cases where we only want to evict everything in a given vm. Atm > > > > > we have two use-cases of this: > > > > > - Called from the shrinker as a last-ditch effort. For that it should move > > > > > _every_ object onto the unbound list. > > > > > - Called from execbuf for badly-fragmented address spaces to clean up the > > > > > mess. For that case we only care about one address space. > > > > > > > > The current thing is more or less a result of Chris' suggestions. A > > > > non-posted iteration did plumb the vm, and after reworking to the > > > > suggestion made by Chris, the vm didn't make much sense anymore. > > > > > > > > For point #1, it requires VM prioritization I think. I don't really see > > > > any other way to fairly manage it. > > > > > > The shrinker will rip out objects in lru order by walking first unbound > > > and then bound objects. That's imo as fair as it gets, we don't need > > > priorities between vms. > > > > If you pass in a vm, the semantics would be, evict everything for the > > vm, right? > > Yes. > > > > > > > > > > For point #2, that I agree it might be useful, but we can easily create > > > > a new function, and not call it "shrinker" to do it. 
> > > > > > Well my point was that this function is called > > > i915_gem_evict_everything(dev, vm) and for the first use case we simply > > > pass in vm = NULL. But essentially thrashing the vm should be rare enough > > > that for now we don't need to care. > > > > > > > IIRC, this is exactly how my original patch worked pre-Chris. > > Oops. Do you or Chris still now the argument for changing things? Maybe I > just don't see another facet of the issue at hand ... > > [snip] > > > > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > > > > > > static int > > > > > > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > > > struct list_head *objects, > > > > > > + struct i915_address_space *vm, > > > > > > bool *need_relocs) > > > > > > { > > > > > > struct drm_i915_gem_object *obj; > > > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > > > > > > list_for_each_entry(obj, objects, exec_list) { > > > > > > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > > > > > > bool need_fence, need_mappable; > > > > > > + u32 obj_offset; > > > > > > > > > > > > - if (!i915_gem_obj_ggtt_bound(obj)) > > > > > > + if (!i915_gem_obj_bound(obj, vm)) > > > > > > continue; > > > > > > > > > > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups > > > > > here ... Maybe we should cache them in some pointer somewhere (either in > > > > > the eb object or by adding a new pointer to the object struct, e.g. > > > > > obj->eb_vma, similar to obj->eb_list). > > > > > > > > > > > > > I agree, and even did this at one unposted patch too. However, I think > > > > it's a premature optimization which risks code correctness. So I think > > > > somewhere a FIXME needs to happen to address that issue. (Or if Chris > > > > complains bitterly about some perf hit). 
> > > > > > If you bring up code correctness I'd vote strongly in favour of using vmas > > > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all > > > the thing risks us mixing them up eventually and creating a hella lot of > > > confusion ;-) > > > > I think this is addressed with the previous comments. > > See my example for (ring, seqno). I really strongly believe passing pairs > is the wrong thing and passing structs is the right thing. Especially if > we have one at hand. But I'm ok if you want to clean this up afterwards. > > Ofc if the cleanup afterwards doesn't happend I'll be a bit pissed ;-) > > [snip] I think it's fair to not merge anything until I do it. I'm just asking that you let me do it as a patch on top of the series as opposed to doing a bunch of rebasing which I've shown historically time and again to screw up. Whether or not I should be better at rebasing isn't up for debate. > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > > index 245eb1d..bfe61fa 100644 > > > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > > > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > > > if (gtt_offset == I915_GTT_OFFSET_NONE) > > > > > > return obj; > > > > > > > > > > > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > > > > > > + vma = i915_gem_vma_create(obj, vm); > > > > > > if (!vma) { > > > > > > drm_gem_object_unreference(&obj->base); > > > > > > return NULL; > > > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > > > > > > */ > > > > > > vma->node.start = gtt_offset; > > > > > > vma->node.size = size; > > > > > > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > > > > > > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > > > > > > + if (drm_mm_initialized(&vm->mm)) { > > > > > 
> + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > > > > > > > > > > These two hunks here for stolen look fishy - we only ever use the stolen > > > > > preallocated stuff for objects with mappings in the global gtt. So keeping > > > > > that explicit is imo the better approach. And tbh I'm confused where the > > > > > local variable vm is from ... > > > > > > > > If we don't create a vma for it, we potentially have to special case a > > > > bunch of places, I think. I'm not actually sure of this, but the > > > > overhead to do it is quite small. > > > > > > > > Anyway, I'll look this over again nd see what I think. > > > > > > I'm not against the vma, I've just wonedered why you do the > > > /dev_priv->gtt.base/vm/ replacement here since > > > - it's never gonna be used with another vm than ggtt > > > - this patch doesn't add the vm variable, so I'm even more confused where > > > this started ;-) > > > > It started from the rebase. In the original series, I did that > > "deferred_offset" thing, and having a vm variable made that code pass > > the checkpatch.pl. There wasn't a particular reason for naming it vm > > other than I had done it all over the place. > > > > I've fixed this locally, leaving the vma, and renamed the local variable > > ggtt. It still has 3 uses in this function, so it's a bit less typing. > > Yeah, make sense to keep it then. > > Cheers, Daniel So here is the plan summing up from this mail thread and the other one... I'm calling the series done until we get feedback from Chris on the eviction/shrinker code. Basically whatever he signs off on, I am willing to implement (assuming you agree). I am confident enough in my rebase abilities to at least fix that up when we get to it. On top of this series, I am going smash the 3 patches which introduce bind/unbind function pointers. I am doing this because all this code already exists, plays nicely together, and has been tested. 
Finally on top of this, I am going to have a [fairly large] patch to squash <obj,vm> pair into <vma> where I feel its appropriate. My plan for this is to basically look at a squashed version of all the patches where I introduce both arguments in one call, think about if it makes sense to me to think about the operation in terms of an object, or a vma, and then change as appropriate. So the feedback I need from you: 1. Does that sound reasonable? If you want to call it weird, that's fine, but I think the diff churn is acceptable, and the series should be quite bisectable. 2. Are you aware of anything else I missed in other patch review which I haven't mentioned. I cannot find anything, and I think except for this patch I had updated everything locally. You can see the 14 patches (11 here + 3 from the others series): http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2 Hopefully this was all taken in good humor. I am not upset or anything, we just need to figure out how we can make some progress on this soon without making total crap, and without taking all my time (since I have a few other things to work on). I know you warned me about the latter a few weeks ago, but well, reality.
On Mon, Jul 15, 2013 at 08:57:13PM -0700, Ben Widawsky wrote: > So here is the plan summing up from this mail thread and the other > one... > > I'm calling the series done until we get feedback from Chris on the > eviction/shrinker code. Basically whatever he signs off on, I am willing > to implement (assuming you agree). I am confident enough in my rebase > abilities to at least fix that up when we get to it. I guess we could shovel this around after merging, shrinker gets frobbed around once a while for anyway ;-) > On top of this series, I am going smash the 3 patches which introduce > bind/unbind function pointers. I am doing this because all this code > already exists, plays nicely together, and has been tested. > > Finally on top of this, I am going to have a [fairly large] patch to > squash <obj,vm> pair into <vma> where I feel its appropriate. My plan > for this is to basically look at a squashed version of all the patches > where I introduce both arguments in one call, think about if it makes > sense to me to think about the operation in terms of an object, or a > vma, and then change as appropriate. > > So the feedback I need from you: > 1. Does that sound reasonable? If you want to call it weird, that's fine, > but I think the diff churn is acceptable, and the series should be quite > bisectable. I guess I'll live ;-) > 2. Are you aware of anything else I missed in other patch review which I > haven't mentioned. I cannot find anything, and I think except for this > patch I had updated everything locally. Imo my review on the semantics of ->bind/unbind is fairly important, at least for discussion. Atm you just add some new abstraction (and kill some old one for which we've established some good use) without putting it to real use imo. Haven't read any responses yet though, it's early. 
> You can see the 14 patches (11 here + 3 from the others series): > http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2 > > Hopefully this was all taken in good humor. I am not upset or anything, > we just need to figure out how we can make some progress on this soon > without making total crap, and without taking all my time (since I have > a few other things to work on). I know you warned me about the latter a > few weeks ago, but well, reality. Imo the first five patches should get the small reviews addressed and then merged. Atm that's what I'm kinda stalling on ... Then we can tackle the next little bit of this major work. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 16b2aaf..867ed07 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -122,9 +122,18 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " (pinned x %d)", obj->pin_count); if (obj->fence_reg != I915_FENCE_REG_NONE) seq_printf(m, " (fence: %d)", obj->fence_reg); - if (i915_gem_obj_ggtt_bound(obj)) - seq_printf(m, " (gtt offset: %08lx, size: %08x)", - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); + if (i915_gem_obj_bound_any(obj)) { + struct i915_vma *vma; + list_for_each_entry(vma, &obj->vma_list, vma_link) { + if (!i915_is_ggtt(vma->vm)) + seq_puts(m, " (pp"); + else + seq_puts(m, " (g"); + seq_printf(m, " gtt offset: %08lx, size: %08lx)", + i915_gem_obj_offset(obj, vma->vm), + i915_gem_obj_size(obj, vma->vm)); + } + } if (obj->stolen) seq_printf(m, " (stolen: %08lx)", obj->stolen->start); if (obj->pin_mappable || obj->fault_mappable) { @@ -186,6 +195,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) return 0; } +/* FIXME: Support multiple VM? */ #define count_objects(list, member) do { \ list_for_each_entry(obj, list, member) { \ size += i915_gem_obj_ggtt_size(obj); \ @@ -2049,18 +2059,21 @@ i915_drop_caches_set(void *data, u64 val) if (val & DROP_BOUND) { list_for_each_entry_safe(obj, next, &vm->inactive_list, - mm_list) - if (obj->pin_count == 0) { - ret = i915_gem_object_unbind(obj); - if (ret) - goto unlock; - } + mm_list) { + if (obj->pin_count) + continue; + + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); + if (ret) + goto unlock; + } } if (val & DROP_UNBOUND) { list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) { + /* FIXME: Do this for all vms? 
*/ ret = i915_gem_object_put_pages(obj); if (ret) goto unlock; diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index d13e21f..b190439 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) i915_dump_device_info(dev_priv); - INIT_LIST_HEAD(&dev_priv->vm_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); - if (i915_get_bridge_dev(dev)) { ret = -EIO; goto free_priv; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 38cccc8..48baccc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1363,52 +1363,6 @@ struct drm_i915_gem_object { #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) -/* This is a temporary define to help transition us to real VMAs. If you see - * this, you're either reviewing code, or bisecting it. */ -static inline struct i915_vma * -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) -{ - if (list_empty(&obj->vma_list)) - return NULL; - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); -} - -/* Whether or not this object is currently mapped by the translation tables */ -static inline bool -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) -{ - struct i915_vma *vma = __i915_gem_obj_to_vma(o); - if (vma == NULL) - return false; - return drm_mm_node_allocated(&vma->node); -} - -/* Offset of the first PTE pointing to this object */ -static inline unsigned long -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.start; -} - -/* The size used in the translation tables may be larger than the actual size of - * the object on GEN2/GEN3 because of the way tiling is handled. See - * i915_gem_get_gtt_size() for more details. 
- */ -static inline unsigned long -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.size; -} - -static inline void -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, - enum i915_cache_level color) -{ - __i915_gem_obj_to_vma(o)->node.color = color; -} - /** * Request queue structure. * @@ -1726,11 +1680,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, void i915_gem_vma_destroy(struct i915_vma *vma); int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking); void i915_gem_object_unpin(struct drm_i915_gem_object *obj); -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); void i915_gem_release_mmap(struct drm_i915_gem_object *obj); void i915_gem_lastclose(struct drm_device *dev); @@ -1760,6 +1716,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, struct intel_ring_buffer *to); void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring); int i915_gem_dumb_create(struct drm_file *file_priv, @@ -1866,6 +1823,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, int tiling_mode, bool fenced); int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level); struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, @@ -1876,6 +1834,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, void i915_gem_restore_fences(struct drm_device *dev); +unsigned long i915_gem_obj_offset(struct 
drm_i915_gem_object *o, + struct i915_address_space *vm); +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color); +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); +/* Some GGTT VM helpers */ +#define obj_to_ggtt(obj) \ + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) +static inline bool i915_is_ggtt(struct i915_address_space *vm) +{ + struct i915_address_space *ggtt = + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; + return vm == ggtt; +} + +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); +} + +static inline int __must_check +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, + uint32_t alignment, + bool map_and_fenceable, + bool nonblocking) +{ + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, + map_and_fenceable, nonblocking); +} +#undef obj_to_ggtt + /* i915_gem_context.c */ void i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); @@ -1912,6 +1920,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, void i915_gem_restore_gtt_mappings(struct drm_device *dev); int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); +/* FIXME: this is never okay with full PPGTT */ void 
i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level); void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); @@ -1928,7 +1937,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) /* i915_gem_evict.c */ -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, +int __must_check i915_gem_evict_something(struct drm_device *dev, + struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 058ad44..21015cd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -38,10 +38,12 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, - unsigned alignment, - bool map_and_fenceable, - bool nonblocking); +static __must_check int +i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + unsigned alignment, + bool map_and_fenceable, + bool nonblocking); static int i915_gem_phys_pwrite(struct drm_device *dev, struct drm_i915_gem_object *obj, struct drm_i915_gem_pwrite *args, @@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) static inline bool i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) { - return i915_gem_obj_ggtt_bound(obj) && !obj->active; + return i915_gem_obj_bound_any(obj) && !obj->active; } int @@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev, * anyway again before the next pread happens. 
*/ if (obj->cache_level == I915_CACHE_NONE) needs_clflush = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, false); if (ret) return ret; @@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, char __user *user_data; int page_offset, page_length, ret; - ret = i915_gem_object_pin(obj, 0, true, true); + ret = i915_gem_ggtt_pin(obj, 0, true, true); if (ret) goto out; @@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, * right away and we therefore have to clflush anyway. */ if (obj->cache_level == I915_CACHE_NONE) needs_clflush_after = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) return ret; @@ -1346,7 +1348,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } /* Now bind it into the GTT if needed */ - ret = i915_gem_object_pin(obj, 0, true, false); + ret = i915_gem_ggtt_pin(obj, 0, true, false); if (ret) goto unlock; @@ -1668,11 +1670,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) if (obj->pages == NULL) return 0; - BUG_ON(i915_gem_obj_ggtt_bound(obj)); - if (obj->pages_pin_count) return -EBUSY; + BUG_ON(i915_gem_obj_bound_any(obj)); + /* ->put_pages might need to allocate memory for the bit17 swizzle * array, hence protect them from being reaped by removing them from gtt * lists early. 
*/ @@ -1692,7 +1694,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, bool purgeable_only) { struct drm_i915_gem_object *obj, *next; - struct i915_address_space *vm = &dev_priv->gtt.base; long count = 0; list_for_each_entry_safe(obj, next, @@ -1706,14 +1707,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, } } - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && - i915_gem_object_unbind(obj) == 0 && - i915_gem_object_put_pages(obj) == 0) { + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, + global_list) { + struct i915_vma *vma, *v; + + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) + continue; + + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) + if (i915_gem_object_unbind(obj, vma->vm)) + break; + + if (!i915_gem_object_put_pages(obj)) count += obj->base.size >> PAGE_SHIFT; - if (count >= target) - return count; - } + + if (count >= target) + return count; } return count; @@ -1873,11 +1882,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_device *dev = obj->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 seqno = intel_ring_get_seqno(ring); BUG_ON(ring == NULL); @@ -1910,12 +1919,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, } static void -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; - BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); BUG_ON(!obj->active); @@ -2117,10 +2123,11 
@@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) spin_unlock(&file_priv->mm.lock); } -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - if (acthd >= i915_gem_obj_ggtt_offset(obj) && - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) + if (acthd >= i915_gem_obj_offset(obj, vm) && + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) return true; return false; @@ -2143,6 +2150,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, return false; } +static struct i915_address_space * +request_to_vm(struct drm_i915_gem_request *request) +{ + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; + struct i915_address_space *vm; + + vm = &dev_priv->gtt.base; + + return vm; +} + static bool i915_request_guilty(struct drm_i915_gem_request *request, const u32 acthd, bool *inside) { @@ -2150,9 +2168,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, * pointing inside the ring, matches the batch_obj address range. * However this is extremely unlikely. */ - if (request->batch_obj) { - if (i915_head_inside_object(acthd, request->batch_obj)) { + if (i915_head_inside_object(acthd, request->batch_obj, + request_to_vm(request))) { *inside = true; return true; } @@ -2172,17 +2190,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, { struct i915_ctx_hang_stats *hs = NULL; bool inside, guilty; + unsigned long offset = 0; /* Innocent until proven guilty */ guilty = false; + if (request->batch_obj) + offset = i915_gem_obj_offset(request->batch_obj, + request_to_vm(request)); + if (ring->hangcheck.action != wait && i915_request_guilty(request, acthd, &inside)) { DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", ring->name, inside ? "inside" : "flushing", - request->batch_obj ? 
- i915_gem_obj_ggtt_offset(request->batch_obj) : 0, + offset, request->ctx ? request->ctx->id : 0, acthd); @@ -2239,13 +2261,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, } while (!list_empty(&ring->active_list)) { + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, struct drm_i915_gem_object, ring_list); - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } } @@ -2263,7 +2287,7 @@ void i915_gem_restore_fences(struct drm_device *dev) void i915_gem_reset(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; struct intel_ring_buffer *ring; int i; @@ -2274,8 +2298,9 @@ void i915_gem_reset(struct drm_device *dev) /* Move everything out of the GPU domains to ensure we do any * necessary invalidation upon reuse. */ - list_for_each_entry(obj, &vm->inactive_list, mm_list) - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + list_for_each_entry(obj, &vm->inactive_list, mm_list) + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; i915_gem_restore_fences(dev); } @@ -2320,6 +2345,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) * by the ringbuffer to the flushing/inactive lists as appropriate. 
*/ while (!list_empty(&ring->active_list)) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, @@ -2329,7 +2356,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) if (!i915_seqno_passed(seqno, obj->last_read_seqno)) break; - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } if (unlikely(ring->trace_irq_seqno && @@ -2575,13 +2603,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) * Unbinds an object from the GTT aperture. */ int -i915_gem_object_unbind(struct drm_i915_gem_object *obj) +i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = obj->base.dev->dev_private; struct i915_vma *vma; int ret; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) return 0; if (obj->pin_count) @@ -2604,7 +2633,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) if (ret) return ret; - trace_i915_gem_object_unbind(obj); + trace_i915_gem_object_unbind(obj, vm); if (obj->has_global_gtt_mapping) i915_gem_gtt_unbind_object(obj); @@ -2619,7 +2648,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) /* Avoid an unnecessary call to unbind on rebind. 
*/ obj->map_and_fenceable = true; - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); list_del(&vma->vma_link); drm_mm_remove_node(&vma->node); i915_gem_vma_destroy(vma); @@ -2748,6 +2777,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", i915_gem_obj_ggtt_offset(obj), size); + pitch_val = obj->stride / 128; pitch_val = ffs(pitch_val) - 1; @@ -3069,23 +3099,25 @@ static void i915_gem_verify_gtt(struct drm_device *dev) */ static int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, unsigned alignment, bool map_and_fenceable, bool nonblocking) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 size, fence_size, fence_alignment, unfenced_alignment; bool mappable, fenceable; - size_t gtt_max = map_and_fenceable ? - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; + size_t gtt_max = + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; struct i915_vma *vma; int ret; if (WARN_ON(!list_empty(&obj->vma_list))) return -EBUSY; + BUG_ON(!i915_is_ggtt(vm)); + fence_size = i915_gem_get_gtt_size(dev, obj->base.size, obj->tiling_mode); @@ -3125,18 +3157,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, i915_gem_object_pin_pages(obj); vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); + /* For now we only ever use 1 vma per object */ + WARN_ON(!list_empty(&obj->vma_list)); + + vma = i915_gem_vma_create(obj, vm); if (vma == NULL) { i915_gem_object_unpin_pages(obj); return -ENOMEM; } search_free: - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, - &vma->node, + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, size, alignment, obj->cache_level, 0, gtt_max); if (ret) { - ret = i915_gem_evict_something(dev, size, alignment, + ret = i915_gem_evict_something(dev, vm, size, alignment, obj->cache_level, map_and_fenceable, nonblocking); @@ -3162,18 +3197,25 @@ search_free: list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); list_add_tail(&obj->mm_list, &vm->inactive_list); - list_add(&vma->vma_link, &obj->vma_list); + + /* Keep GGTT vmas first to make debug easier */ + if (i915_is_ggtt(vm)) + list_add(&vma->vma_link, &obj->vma_list); + else + list_add_tail(&vma->vma_link, &obj->vma_list); fenceable = + i915_is_ggtt(vm) && i915_gem_obj_ggtt_size(obj) == fence_size && (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= - dev_priv->gtt.mappable_end; + mappable = + i915_is_ggtt(vm) && + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; obj->map_and_fenceable = mappable && fenceable; - trace_i915_gem_object_bind(obj, map_and_fenceable); + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); i915_gem_verify_gtt(dev); return 0; } @@ -3271,7 +3313,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) int ret; 
/* Not valid to be called on unbound objects. */ - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return -EINVAL; if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) @@ -3317,11 +3359,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) } int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); int ret; if (obj->cache_level == cache_level) @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, } if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } - if (i915_gem_obj_ggtt_bound(obj)) { + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + if (!i915_gem_obj_bound(obj, vm)) + continue; + ret = i915_gem_object_finish_gpu(obj); if (ret) return ret; @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, obj, cache_level); - i915_gem_obj_ggtt_set_color(obj, cache_level); + i915_gem_obj_set_color(obj, vm, cache_level); } if (cache_level == I915_CACHE_NONE) { @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_gem_caching *args = data; + struct drm_i915_private *dev_priv; struct drm_i915_gem_object *obj; enum i915_cache_level level; int ret; @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, ret = -ENOENT; goto unlock; } + dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_set_cache_level(obj, level); + /* FIXME: Add interface for specific VM? 
*/ + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); drm_gem_object_unreference(&obj->base); unlock: @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, struct intel_ring_buffer *pipelined) { + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; u32 old_read_domains, old_write_domain; int ret; @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_NONE); if (ret) return ret; @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * (e.g. libkms for the bootup splash), we have to ensure that we * always use map_and_fenceable for all scanout buffers. 
*/ - ret = i915_gem_object_pin(obj, alignment, true, false); + ret = i915_gem_ggtt_pin(obj, alignment, true, false); if (ret) return ret; @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) int i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking) @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) return -EBUSY; - if (i915_gem_obj_ggtt_bound(obj)) { - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || + BUG_ON(map_and_fenceable && !i915_is_ggtt(vm)); + + if (i915_gem_obj_bound(obj, vm)) { + if ((alignment && + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || (map_and_fenceable && !obj->map_and_fenceable)) { WARN(obj->pin_count, "bo is already pinned with incorrect alignment:" " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," " obj->map_and_fenceable=%d\n", - i915_gem_obj_ggtt_offset(obj), alignment, + i915_gem_obj_offset(obj, vm), alignment, map_and_fenceable, obj->map_and_fenceable); - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } } - if (!i915_gem_obj_ggtt_bound(obj)) { + if (!i915_gem_obj_bound(obj, vm)) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_bind_to_gtt(obj, alignment, + ret = i915_gem_object_bind_to_gtt(obj, vm, alignment, map_and_fenceable, nonblocking); if (ret) @@ -3684,7 +3739,7 @@ void i915_gem_object_unpin(struct drm_i915_gem_object *obj) { BUG_ON(obj->pin_count == 0); - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); + BUG_ON(!i915_gem_obj_bound_any(obj)); if (--obj->pin_count == 0) obj->pin_mappable = false; @@ -3722,7 +3777,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, } if (obj->user_pin_count == 0) { - ret = i915_gem_object_pin(obj, args->alignment, true, false); + ret = 
i915_gem_ggtt_pin(obj, args->alignment, true, false); if (ret) goto out; } @@ -3957,6 +4012,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; + struct i915_vma *vma, *next; trace_i915_gem_object_destroy(obj); @@ -3964,15 +4020,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) i915_gem_detach_phys_object(dev, obj); obj->pin_count = 0; - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { - bool was_interruptible; + /* NB: 0 or 1 elements */ + WARN_ON(!list_empty(&obj->vma_list) && + !list_is_singular(&obj->vma_list)); + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { + int ret = i915_gem_object_unbind(obj, vma->vm); + if (WARN_ON(ret == -ERESTARTSYS)) { + bool was_interruptible; - was_interruptible = dev_priv->mm.interruptible; - dev_priv->mm.interruptible = false; + was_interruptible = dev_priv->mm.interruptible; + dev_priv->mm.interruptible = false; - WARN_ON(i915_gem_object_unbind(obj)); + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); - dev_priv->mm.interruptible = was_interruptible; + dev_priv->mm.interruptible = was_interruptible; + } } /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up @@ -4332,6 +4394,16 @@ init_ring_lists(struct intel_ring_buffer *ring) INIT_LIST_HEAD(&ring->request_list); } +static void i915_init_vm(struct drm_i915_private *dev_priv, + struct i915_address_space *vm) +{ + vm->dev = dev_priv->dev; + INIT_LIST_HEAD(&vm->active_list); + INIT_LIST_HEAD(&vm->inactive_list); + INIT_LIST_HEAD(&vm->global_link); + list_add(&vm->global_link, &dev_priv->vm_list); +} + void i915_gem_load(struct drm_device *dev) { @@ -4344,8 +4416,9 @@ i915_gem_load(struct drm_device *dev) SLAB_HWCACHE_ALIGN, NULL); - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); + INIT_LIST_HEAD(&dev_priv->vm_list); + i915_init_vm(dev_priv, &dev_priv->gtt.base); + INIT_LIST_HEAD(&dev_priv->mm.unbound_list); INIT_LIST_HEAD(&dev_priv->mm.bound_list); INIT_LIST_HEAD(&dev_priv->mm.fence_list); @@ -4616,9 +4689,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) struct drm_i915_private, mm.inactive_shrinker); struct drm_device *dev = dev_priv->dev; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; - int nr_to_scan = sc->nr_to_scan; + int nr_to_scan; bool unlock = true; int cnt; @@ -4632,6 +4705,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) unlock = false; } + nr_to_scan = sc->nr_to_scan; if (nr_to_scan) { nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); if (nr_to_scan > 0) @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) cnt += obj->base.size >> PAGE_SHIFT; - list_for_each_entry(obj, &vm->inactive_list, mm_list) - if (obj->pin_count == 0 && obj->pages_pin_count == 0) - cnt += obj->base.size >> PAGE_SHIFT; + + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + list_for_each_entry(obj, &vm->inactive_list, 
global_list) + if (obj->pin_count == 0 && obj->pages_pin_count == 0) + cnt += obj->base.size >> PAGE_SHIFT; if (unlock) mutex_unlock(&dev->struct_mutex); return cnt; } + +/* All the new VM stuff */ +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) + return vma->node.start; + + } + return -1; +} + +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) +{ + return !list_empty(&o->vma_list); +} + +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) + return true; + } + return false; +} + +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) + return vma->node.size; + } + + return 0; +} + +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color) +{ + struct i915_vma *vma; + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) { + vma->node.color = color; + return; + } + } + + WARN(1, "Couldn't set color for VM %p\n", vm); +} + +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + list_for_each_entry(vma, &obj->vma_list, vma_link) + if (vma->vm == vm) + return vma; + + return NULL; +} diff 
--git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 2074544..c92fd81 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, if (INTEL_INFO(dev)->gen >= 7) { ret = i915_gem_object_set_cache_level(ctx->obj, + &dev_priv->gtt.base, I915_CACHE_LLC_MLC); /* Failure shouldn't ever happen this early */ if (WARN_ON(ret)) @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) * default context. */ dev_priv->ring[RCS].default_context = ctx; - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); if (ret) { DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); goto err_destroy; @@ -398,6 +399,7 @@ mi_set_context(struct intel_ring_buffer *ring, static int do_switch(struct i915_hw_context *to) { struct intel_ring_buffer *ring = to->ring; + struct drm_i915_private *dev_priv = ring->dev->dev_private; struct i915_hw_context *from = ring->last_context; u32 hw_flags = 0; int ret; @@ -407,7 +409,7 @@ static int do_switch(struct i915_hw_context *to) if (from == to) return 0; - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); if (ret) return ret; @@ -444,7 +446,8 @@ static int do_switch(struct i915_hw_context *to) */ if (from != NULL) { from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; - i915_gem_object_move_to_active(from->obj, ring); + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, + ring); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the * whole damn pipeline, we don't need to explicitly mark the * object dirty. 
The only exception is that the context must be diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index df61f33..32efdc0 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -32,24 +32,21 @@ #include "i915_trace.h" static bool -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) +mark_free(struct i915_vma *vma, struct list_head *unwind) { - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); - - if (obj->pin_count) + if (vma->obj->pin_count) return false; - list_add(&obj->exec_list, unwind); + list_add(&vma->obj->exec_list, unwind); return drm_mm_scan_add_block(&vma->node); } int -i915_gem_evict_something(struct drm_device *dev, int min_size, - unsigned alignment, unsigned cache_level, +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, bool nonblocking) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; struct list_head eviction_list, unwind_list; struct i915_vma *vma; struct drm_i915_gem_object *obj; @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, */ INIT_LIST_HEAD(&unwind_list); - if (mappable) + if (mappable) { + BUG_ON(!i915_is_ggtt(vm)); drm_mm_init_scan_with_range(&vm->mm, min_size, alignment, cache_level, 0, dev_priv->gtt.mappable_end); - else + } else drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); /* First see if there is a large enough contiguous idle region... */ list_for_each_entry(obj, &vm->inactive_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, /* Now merge in the soon-to-be-expired objects... 
*/ list_for_each_entry(obj, &vm->active_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -109,7 +109,7 @@ none: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); ret = drm_mm_scan_remove_block(&vma->node); BUG_ON(ret); @@ -130,7 +130,7 @@ found: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); if (drm_mm_scan_remove_block(&vma->node)) { list_move(&obj->exec_list, &eviction_list); drm_gem_object_reference(&obj->base); @@ -145,7 +145,7 @@ found: struct drm_i915_gem_object, exec_list); if (ret == 0) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); list_del_init(&obj->exec_list); drm_gem_object_unreference(&obj->base); @@ -158,13 +158,18 @@ int i915_gem_evict_everything(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj, *next; - bool lists_empty; + bool lists_empty = true; int ret; - lists_empty = (list_empty(&vm->inactive_list) && - list_empty(&vm->active_list)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + lists_empty = (list_empty(&vm->inactive_list) && + list_empty(&vm->active_list)); + if (!lists_empty) + lists_empty = false; + } + if (lists_empty) return -ENOSPC; @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) i915_gem_retire_requests(dev); /* Having flushed everything, unbind() should never raise an error */ - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) - if (obj->pin_count == 0) - WARN_ON(i915_gem_object_unbind(obj)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + list_for_each_entry_safe(obj, next, 
&vm->inactive_list, mm_list) + if (obj->pin_count == 0) + WARN_ON(i915_gem_object_unbind(obj, vm)); + } return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 5aeb447..e90182d 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) } static void -eb_destroy(struct eb_objects *eb) +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) { while (!list_empty(&eb->objects)) { struct drm_i915_gem_object *obj; @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *reloc) + struct drm_i915_gem_relocation_entry *reloc, + struct i915_address_space *vm) { struct drm_device *dev = obj->base.dev; struct drm_gem_object *target_obj; @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, - struct eb_objects *eb) + struct eb_objects *eb, + struct i915_address_space *vm) { #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, do { u64 offset = r->presumed_offset; - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, + vm); if (ret) return ret; @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *relocs) + struct drm_i915_gem_relocation_entry *relocs, + struct i915_address_space *vm) { const struct 
drm_i915_gem_exec_object2 *entry = obj->exec_entry; int i, ret; for (i = 0; i < entry->relocation_count; i++) { - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], + vm); if (ret) return ret; } @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, } static int -i915_gem_execbuffer_relocate(struct eb_objects *eb) +i915_gem_execbuffer_relocate(struct eb_objects *eb, + struct i915_address_space *vm) { struct drm_i915_gem_object *obj; int ret = 0; @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) */ pagefault_disable(); list_for_each_entry(obj, &eb->objects, exec_list) { - ret = i915_gem_execbuffer_relocate_object(obj, eb); + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); if (ret) break; } @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, struct intel_ring_buffer *ring, + struct i915_address_space *vm, bool *need_reloc) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, + false); if (ret) return ret; @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->has_aliasing_ppgtt_mapping = 1; } - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { - entry->offset = i915_gem_obj_ggtt_offset(obj); + if (entry->offset != i915_gem_obj_offset(obj, vm)) { + entry->offset = i915_gem_obj_offset(obj, vm); *need_reloc = true; } @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) { struct drm_i915_gem_exec_object2 *entry; - if 
(!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return; entry = obj->exec_entry; @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, struct list_head *objects, + struct i915_address_space *vm, bool *need_relocs) { struct drm_i915_gem_object *obj; @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, list_for_each_entry(obj, objects, exec_list) { struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; bool need_fence, need_mappable; + u32 obj_offset; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) continue; + obj_offset = i915_gem_obj_offset(obj, vm); need_fence = has_fenced_gpu_access && entry->flags & EXEC_OBJECT_NEEDS_FENCE && obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); + BUG_ON((need_mappable || need_fence) && + !i915_is_ggtt(vm)); + if ((entry->alignment && - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || + obj_offset & (entry->alignment - 1)) || (need_mappable && !obj->map_and_fenceable)) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); else - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } /* Bind fresh objects */ list_for_each_entry(obj, objects, exec_list) { - if (i915_gem_obj_ggtt_bound(obj)) + if (i915_gem_obj_bound(obj, vm)) continue; - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, struct drm_file *file, struct intel_ring_buffer *ring, struct eb_objects *eb, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { struct 
drm_i915_gem_relocation_entry *reloc; struct drm_i915_gem_object *obj; @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, goto err; need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; list_for_each_entry(obj, &eb->objects, exec_list) { int offset = obj->exec_entry - exec; ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, - reloc + reloc_offset[offset]); + reloc + reloc_offset[offset], + vm); if (ret) goto err; } @@ -768,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, static void i915_gem_execbuffer_move_to_active(struct list_head *objects, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_i915_gem_object *obj; @@ -782,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, obj->base.read_domains = obj->base.pending_read_domains; obj->fenced_gpu_access = obj->pending_fenced_gpu_access; - i915_gem_object_move_to_active(obj, ring); + i915_gem_object_move_to_active(obj, vm, ring); if (obj->base.write_domain) { obj->dirty = 1; obj->last_write_seqno = intel_ring_get_seqno(ring); @@ -836,7 +853,8 @@ static int i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_file *file, struct drm_i915_gem_execbuffer2 *args, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = dev->dev_private; struct eb_objects *eb; @@ -998,17 +1016,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, /* Move the objects en-masse into the GTT, evicting if necessary. 
*/ need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; /* The objects are in their final locations, apply the relocations. */ if (need_relocs) - ret = i915_gem_execbuffer_relocate(eb); + ret = i915_gem_execbuffer_relocate(eb, vm); if (ret) { if (ret == -EFAULT) { ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, - eb, exec); + eb, exec, vm); BUG_ON(!mutex_is_locked(&dev->struct_mutex)); } if (ret) @@ -1059,7 +1077,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; + exec_start = i915_gem_obj_offset(batch_obj, vm) + + args->batch_start_offset; exec_len = args->batch_len; if (cliprects) { for (i = 0; i < args->num_cliprects; i++) { @@ -1084,11 +1103,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); - i915_gem_execbuffer_move_to_active(&eb->objects, ring); + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); err: - eb_destroy(eb); + eb_destroy(eb, vm); mutex_unlock(&dev->struct_mutex); @@ -1105,6 +1124,7 @@ int i915_gem_execbuffer(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer *args = data; struct drm_i915_gem_execbuffer2 exec2; struct drm_i915_gem_exec_object *exec_list = NULL; @@ -1160,7 +1180,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, exec2.flags = I915_EXEC_RENDER; i915_execbuffer2_set_context_id(exec2, 0); - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets 
back to the user's exec list. */ for (i = 0; i < args->buffer_count; i++) @@ -1186,6 +1207,7 @@ int i915_gem_execbuffer2(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer2 *args = data; struct drm_i915_gem_exec_object2 *exec2_list = NULL; int ret; @@ -1216,7 +1238,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, return -EFAULT; } - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets back to the user's exec list. */ ret = copy_to_user(to_user_ptr(args->buffers_ptr), diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 298fc42..70ce2f6 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -367,6 +367,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) ppgtt->base.total); } + /* i915_init_vm(dev_priv, &ppgtt->base) */ + return ret; } @@ -386,17 +388,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) { - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - cache_level); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->insert_entries(vm, obj->pages, + obj_offset >> PAGE_SHIFT, + cache_level); } void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj) { - ppgtt->base.clear_range(&ppgtt->base, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - obj->base.size >> PAGE_SHIFT); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, + obj->base.size >> PAGE_SHIFT); } extern int intel_iommu_gfx_mapped; @@ 
-447,6 +454,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) dev_priv->gtt.base.start / PAGE_SIZE, dev_priv->gtt.base.total / PAGE_SIZE); + if (dev_priv->mm.aliasing_ppgtt) + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { i915_gem_clflush_object(obj); i915_gem_gtt_bind_object(obj, obj->cache_level); @@ -625,7 +635,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, * aperture. One page should be enough to keep any prefetching inside * of the aperture. */ - drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_i915_private *dev_priv = dev->dev_private; + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; struct drm_mm_node *entry; struct drm_i915_gem_object *obj; unsigned long hole_start, hole_end; @@ -633,19 +644,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, BUG_ON(mappable_end > end); /* Subtract the guard page ... */ - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); if (!HAS_LLC(dev)) dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; /* Mark any preallocated objects as occupied */ list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); int ret; DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", i915_gem_obj_ggtt_offset(obj), obj->base.size); WARN_ON(i915_gem_obj_ggtt_bound(obj)); - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); if (ret) DRM_DEBUG_KMS("Reservation failed\n"); obj->has_global_gtt_mapping = 1; @@ -656,19 +667,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, dev_priv->gtt.base.total = end - start; /* Clear any non-preallocated blocks */ - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, - hole_start, hole_end) { + 
drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", hole_start, hole_end); - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, - hole_start / PAGE_SIZE, - count); + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); } /* And finally clear the reserved guard page */ - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, - end / PAGE_SIZE - 1, 1); + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); } static bool diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c index 245eb1d..bfe61fa 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, if (gtt_offset == I915_GTT_OFFSET_NONE) return obj; - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); + vma = i915_gem_vma_create(obj, vm); if (!vma) { drm_gem_object_unreference(&obj->base); return NULL; @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, */ vma->node.start = gtt_offset; vma->node.size = size; - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); + if (drm_mm_initialized(&vm->mm)) { + ret = drm_mm_reserve_node(&vm->mm, &vma->node); if (ret) { DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); goto unref_out; diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index 92a8d27..808ca2a 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, obj->map_and_fenceable = !i915_gem_obj_ggtt_bound(obj) || - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && + (i915_gem_obj_ggtt_offset(obj) + + obj->base.size 
<= dev_priv->gtt.mappable_end && i915_gem_object_fence_ok(obj, args->tiling_mode)); /* Rebind if we need a change of alignment */ if (!obj->map_and_fenceable) { - u32 unfenced_alignment = + struct i915_address_space *ggtt = &dev_priv->gtt.base; + u32 unfenced_align = i915_gem_get_gtt_alignment(dev, obj->base.size, args->tiling_mode, false); - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) - ret = i915_gem_object_unbind(obj); + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) + ret = i915_gem_object_unbind(obj, ggtt); } if (ret == 0) { diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 79fbb17..28fa0ff 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1716,6 +1716,9 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv, if (HAS_BROKEN_CS_TLB(dev_priv->dev)) { u32 acthd = I915_READ(ACTHD); + if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev))) + return NULL; + if (WARN_ON(ring->id != RCS)) return NULL; @@ -1802,7 +1805,8 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring, return; list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) { + if ((error->ccid & PAGE_MASK) == + i915_gem_obj_ggtt_offset(obj)) { ering->ctx = i915_error_object_create_sized(dev_priv, obj, 1); break; diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 7d283b5..3f019d3 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, ); TRACE_EVENT(i915_gem_object_bind, - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), - TP_ARGS(obj, mappable), + TP_PROTO(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, bool mappable), + TP_ARGS(obj, vm, mappable), TP_STRUCT__entry( __field(struct drm_i915_gem_object *, obj) + __field(struct i915_address_space *, vm) 
__field(u32, offset) __field(u32, size) __field(bool, mappable) @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, TP_fast_assign( __entry->obj = obj; - __entry->offset = i915_gem_obj_ggtt_offset(obj); - __entry->size = i915_gem_obj_ggtt_size(obj); + __entry->offset = i915_gem_obj_offset(obj, vm); + __entry->size = i915_gem_obj_size(obj, vm); __entry->mappable = mappable; ), @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, ); TRACE_EVENT(i915_gem_object_unbind, - TP_PROTO(struct drm_i915_gem_object *obj), - TP_ARGS(obj), + TP_PROTO(struct drm_i915_gem_object *obj, + struct i915_address_space *vm), + TP_ARGS(obj, vm), TP_STRUCT__entry( __field(struct drm_i915_gem_object *, obj) + __field(struct i915_address_space *, vm) __field(u32, offset) __field(u32, size) ), TP_fast_assign( __entry->obj = obj; - __entry->offset = i915_gem_obj_ggtt_offset(obj); - __entry->size = i915_gem_obj_ggtt_size(obj); + __entry->offset = i915_gem_obj_offset(obj, vm); + __entry->size = i915_gem_obj_size(obj, vm); ), TP_printk("obj=%p, offset=%08x size=%x", diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c index f3c97e0..b69cc63 100644 --- a/drivers/gpu/drm/i915/intel_fb.c +++ b/drivers/gpu/drm/i915/intel_fb.c @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, fb->width, fb->height, i915_gem_obj_ggtt_offset(obj), obj); - mutex_unlock(&dev->struct_mutex); vga_switcheroo_client_fb_set(dev->pdev, info); return 0; diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 81c3ca1..517e278 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) } overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; } else { - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); if (ret) { DRM_ERROR("failed to pin overlay register bo\n"); goto 
out_free_bo; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 125a741..449e57c 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2858,7 +2858,7 @@ intel_alloc_context_page(struct drm_device *dev) return NULL; } - ret = i915_gem_object_pin(ctx, 4096, true, false); + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); if (ret) { DRM_ERROR("failed to pin power context: %d\n", ret); goto err_unref; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index bc4c11b..ebed61d 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -481,6 +481,7 @@ out: static int init_pipe_control(struct intel_ring_buffer *ring) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; struct pipe_control *pc; struct drm_i915_gem_object *obj; int ret; @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) goto err; } - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_LLC); - ret = i915_gem_object_pin(obj, 4096, true, false); + ret = i915_gem_ggtt_pin(obj, 4096, true, false); if (ret) goto err_unref; @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) static int init_status_page(struct intel_ring_buffer *ring) { struct drm_device *dev = ring->dev; + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_object *obj; int ret; @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) goto err; } - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_LLC); - ret = i915_gem_object_pin(obj, 4096, true, false); + ret = i915_gem_ggtt_pin(obj, 4096, true, false); if (ret != 0) { goto err_unref; } @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, ring->obj = obj; - 
ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); if (ret) goto err_unref; @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) return -ENOMEM; } - ret = i915_gem_object_pin(obj, 0, true, false); + ret = i915_gem_ggtt_pin(obj, 0, true, false); if (ret != 0) { drm_gem_object_unreference(&obj->base); DRM_ERROR("Failed to ping batch bo\n");
This patch was formerly known as: "drm/i915: Create VMAs (part 3) - plumbing" This patch adds a VM argument, bind/unbind, and the object offset/size/color getters/setters. It preserves the old ggtt helper functions because things still need them, and will continue to need them. Some code will still need to be ported over after this. v2: Fix purge to pick an object and unbind all vmas This was doable because of the global bound list change. v3: With the commit to actually pin/unpin pages in place, there is no longer a need to check if unbind succeeded before calling put_pages(). Make put_pages only BUG() after checking pin count. v4: Rebased on top of the new hangcheck work by Mika plumbed eb_destroy also Many checkpatch related fixes v5: Very large rebase Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_debugfs.c | 31 ++- drivers/gpu/drm/i915/i915_dma.c | 4 - drivers/gpu/drm/i915/i915_drv.h | 107 +++++----- drivers/gpu/drm/i915/i915_gem.c | 316 +++++++++++++++++++++-------- drivers/gpu/drm/i915/i915_gem_context.c | 9 +- drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- drivers/gpu/drm/i915/i915_gem_stolen.c | 6 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- drivers/gpu/drm/i915/i915_irq.c | 6 +- drivers/gpu/drm/i915/i915_trace.h | 20 +- drivers/gpu/drm/i915/intel_fb.c | 1 - drivers/gpu/drm/i915/intel_overlay.c | 2 +- drivers/gpu/drm/i915/intel_pm.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- 16 files changed, 468 insertions(+), 239 deletions(-)