Message ID | 1374458899-8635-2-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
On Sun, Jul 21, 2013 at 07:08:08PM -0700, Ben Widawsky wrote:
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Ok, I think this patch is too big and needs to be split up. Atm there's
way too many changes in here to be able to do a real review. Things I've
noticed while reading through it:

- The set_color interface looks really strange. We loop over all vmas,
  but then pass in the (obj, vm) pair so that we _again_ loop over all
  vmas to figure out the right one again to finally set the color.

- The function renaming should imo be split out as much as possible.

- There's some variable renaming like s/alignment/align/. Imo just drop
  that part.

- Some localized prep work without changing function interfaces should
  also go in separate patches imo, like using ggtt_vm pointers more.

Overall I still think that the little attribute helpers should accept a
vma parameter, not an (obj, vm) pair.
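For illustration, a vma-based variant of those helpers could look roughly
like the sketch below (helper names are invented here and this is not
compile-tested; it only relies on the i915_vma fields the patch already
uses):

/* Rough sketch only, not part of the patch: attribute helpers that take
 * the vma directly instead of an (obj, vm) pair, so callers that already
 * hold a vma don't have to re-walk obj->vma_list. */
static inline unsigned long i915_vma_offset(struct i915_vma *vma)
{
	return vma->node.start;
}

static inline unsigned long i915_vma_size(struct i915_vma *vma)
{
	return vma->node.size;
}

static inline void i915_vma_set_color(struct i915_vma *vma,
				      enum i915_cache_level color)
{
	vma->node.color = color;
}

With something like this, the loop in i915_gem_object_set_cache_level()
could call i915_vma_set_color(vma, cache_level) on the vma it is already
iterating over, instead of going through i915_gem_obj_set_color(obj,
vma->vm, cache_level) and walking obj->vma_list a second time.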
-Daniel > --- > drivers/gpu/drm/i915/i915_debugfs.c | 29 ++- > drivers/gpu/drm/i915/i915_dma.c | 4 - > drivers/gpu/drm/i915/i915_drv.h | 107 +++++---- > drivers/gpu/drm/i915/i915_gem.c | 337 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +- > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > drivers/gpu/drm/i915/i915_trace.h | 20 +- > drivers/gpu/drm/i915/intel_fb.c | 1 - > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > 15 files changed, 479 insertions(+), 245 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index be69807..f8e590f 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) > static void > describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > { > + struct i915_vma *vma; > seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", > &obj->base, > get_pin_flag(obj), > @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > seq_printf(m, " (pinned x %d)", obj->pin_count); > if (obj->fence_reg != I915_FENCE_REG_NONE) > seq_printf(m, " (fence: %d)", obj->fence_reg); > - if (i915_gem_obj_ggtt_bound(obj)) > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > + if (!i915_is_ggtt(vma->vm)) > + seq_puts(m, " (pp"); > + else > + seq_puts(m, " (g"); > + seq_printf(m, "gtt offset: %08lx, size: %08lx)", > + i915_gem_obj_offset(obj, vma->vm), > + i915_gem_obj_size(obj, vma->vm)); > + } > if (obj->stolen) > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > if (obj->pin_mappable || obj->fault_mappable) { > @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > return 0; > } > > +/* FIXME: Support multiple VM? */ > #define count_objects(list, member) do { \ > list_for_each_entry(obj, list, member) { \ > size += i915_gem_obj_ggtt_size(obj); \ > @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) > > if (val & DROP_BOUND) { > list_for_each_entry_safe(obj, next, &vm->inactive_list, > - mm_list) > - if (obj->pin_count == 0) { > - ret = i915_gem_object_unbind(obj); > - if (ret) > - goto unlock; > - } > + mm_list) { > + if (obj->pin_count) > + continue; > + > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > + if (ret) > + goto unlock; > + } > } > > if (val & DROP_UNBOUND) { > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > global_list) > if (obj->pages_pin_count == 0) { > + /* FIXME: Do this for all vms? 
*/ > ret = i915_gem_object_put_pages(obj); > if (ret) > goto unlock; > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 1449d06..4650519 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > i915_dump_device_info(dev_priv); > > - INIT_LIST_HEAD(&dev_priv->vm_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > - > if (i915_get_bridge_dev(dev)) { > ret = -EIO; > goto free_priv; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 8b3167e..681cb41 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > -/* This is a temporary define to help transition us to real VMAs. If you see > - * this, you're either reviewing code, or bisecting it. */ > -static inline struct i915_vma * > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > -{ > - if (list_empty(&obj->vma_list)) > - return NULL; > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > -} > - > -/* Whether or not this object is currently mapped by the translation tables */ > -static inline bool > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > -{ > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > - if (vma == NULL) > - return false; > - return drm_mm_node_allocated(&vma->node); > -} > - > -/* Offset of the first PTE pointing to this object */ > -static inline unsigned long > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.start; > -} > - > -/* The size used in the translation tables may be larger than the actual size of > - * the object on GEN2/GEN3 because of the way tiling is handled. See > - * i915_gem_get_gtt_size() for more details. > - */ > -static inline unsigned long > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.size; > -} > - > -static inline void > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > - enum i915_cache_level color) > -{ > - __i915_gem_obj_to_vma(o)->node.color = color; > -} > - > /** > * Request queue structure. 
> * > @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > void i915_gem_vma_destroy(struct i915_vma *vma); > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking); > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > void i915_gem_lastclose(struct drm_device *dev); > @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring); > > int i915_gem_dumb_create(struct drm_file *file_priv, > @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > int tiling_mode, bool fenced); > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level); > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > void i915_gem_restore_fences(struct drm_device *dev); > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color); > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > +/* Some GGTT VM helpers */ > +#define obj_to_ggtt(obj) \ > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > +{ > + struct i915_address_space *ggtt = > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > + return vm == ggtt; > +} > + > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > +} > + > +static inline int __must_check > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > + uint32_t alignment, > + bool map_and_fenceable, > + bool nonblocking) > +{ > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > + map_and_fenceable, nonblocking); > +} > +#undef obj_to_ggtt > + > /* i915_gem_context.c */ > void i915_gem_context_init(struct drm_device *dev); > void i915_gem_context_fini(struct drm_device *dev); > @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct 
i915_hw_ppgtt *ppgtt, > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > +/* FIXME: this is never okay with full PPGTT */ > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > /* i915_gem_evict.c */ > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > +int __must_check i915_gem_evict_something(struct drm_device *dev, > + struct i915_address_space *vm, > + int min_size, > unsigned alignment, > unsigned cache_level, > bool mappable, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 2283765..0111554 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -38,10 +38,12 @@ > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking); > +static __must_check int > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking); > static int i915_gem_phys_pwrite(struct drm_device *dev, > struct drm_i915_gem_object *obj, > struct drm_i915_gem_pwrite *args, > @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > static inline bool > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > { > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > + return i915_gem_obj_bound_any(obj) && !obj->active; > } > > int > @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > * anyway again before the next pread happens. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, false); > if (ret) > return ret; > @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > char __user *user_data; > int page_offset, page_length, ret; > > - ret = i915_gem_object_pin(obj, 0, true, true); > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > if (ret) > goto out; > > @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > * right away and we therefore have to clflush anyway. 
*/ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush_after = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, true); > if (ret) > return ret; > @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > /* Now bind it into the GTT if needed */ > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret) > goto unlock; > > @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > if (obj->pages == NULL) > return 0; > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > - > if (obj->pages_pin_count) > return -EBUSY; > > + BUG_ON(i915_gem_obj_bound_any(obj)); > + > /* ->put_pages might need to allocate memory for the bit17 swizzle > * array, hence protect them from being reaped by removing them from gtt > * lists early. */ > @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > bool purgeable_only) > { > struct drm_i915_gem_object *obj, *next; > - struct i915_address_space *vm = &dev_priv->gtt.base; > long count = 0; > > list_for_each_entry_safe(obj, next, > @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > } > } > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > - i915_gem_object_unbind(obj) == 0 && > - i915_gem_object_put_pages(obj) == 0) { > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > + global_list) { > + struct i915_vma *vma, *v; > + > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > + continue; > + > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > + if (i915_gem_object_unbind(obj, vma->vm)) > + break; > + > + if (!i915_gem_object_put_pages(obj)) > count += obj->base.size >> PAGE_SHIFT; > - if (count >= target) > - return count; > - } > + > + if (count >= target) > + return count; > } > > return count; > @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > void > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_device *dev = obj->base.dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 seqno = intel_ring_get_seqno(ring); > > BUG_ON(ring == NULL); > @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > } > > static void > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - struct drm_device *dev = obj->base.dev; > - struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > - > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > BUG_ON(!obj->active); > > @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > spin_unlock(&file_priv->mm.lock); > } > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > + if (acthd >= i915_gem_obj_offset(obj, vm) && > + acthd < 
i915_gem_obj_offset(obj, vm) + obj->base.size) > return true; > > return false; > @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > return false; > } > > +static struct i915_address_space * > +request_to_vm(struct drm_i915_gem_request *request) > +{ > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > + struct i915_address_space *vm; > + > + vm = &dev_priv->gtt.base; > + > + return vm; > +} > + > static bool i915_request_guilty(struct drm_i915_gem_request *request, > const u32 acthd, bool *inside) > { > @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > * pointing inside the ring, matches the batch_obj address range. > * However this is extremely unlikely. > */ > - > if (request->batch_obj) { > - if (i915_head_inside_object(acthd, request->batch_obj)) { > + if (i915_head_inside_object(acthd, request->batch_obj, > + request_to_vm(request))) { > *inside = true; > return true; > } > @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > { > struct i915_ctx_hang_stats *hs = NULL; > bool inside, guilty; > + unsigned long offset = 0; > > /* Innocent until proven guilty */ > guilty = false; > > + if (request->batch_obj) > + offset = i915_gem_obj_offset(request->batch_obj, > + request_to_vm(request)); > + > if (ring->hangcheck.action != wait && > i915_request_guilty(request, acthd, &inside)) { > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > ring->name, > inside ? "inside" : "flushing", > - request->batch_obj ? > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > + offset, > request->ctx ? request->ctx->id : 0, > acthd); > > @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > } > > while (!list_empty(&ring->active_list)) { > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > struct drm_i915_gem_object, > ring_list); > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > } > > @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > void i915_gem_reset(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > struct intel_ring_buffer *ring; > int i; > @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) > /* Move everything out of the GPU domains to ensure we do any > * necessary invalidation upon reuse. > */ > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > i915_gem_restore_fences(dev); > } > @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> */ > while (!list_empty(&ring->active_list)) { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > break; > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > > if (unlikely(ring->trace_irq_seqno && > @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > * Unbinds an object from the GTT aperture. > */ > int > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > struct i915_vma *vma; > int ret; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > return 0; > > if (obj->pin_count) > @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > if (ret) > return ret; > > - trace_i915_gem_object_unbind(obj); > + trace_i915_gem_object_unbind(obj, vm); > > if (obj->has_global_gtt_mapping) > i915_gem_gtt_unbind_object(obj); > @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > /* Avoid an unnecessary call to unbind on rebind. */ > obj->map_and_fenceable = true; > > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > list_del(&vma->vma_link); > drm_mm_remove_node(&vma->node); > i915_gem_vma_destroy(vma); > @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > i915_gem_obj_ggtt_offset(obj), size); > > + > pitch_val = obj->stride / 128; > pitch_val = ffs(pitch_val) - 1; > > @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > * Finds free space in the GTT aperture and binds the object there. > */ > static int > -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking) > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 size, fence_size, fence_alignment, unfenced_alignment; > bool mappable, fenceable; > - size_t gtt_max = map_and_fenceable ? > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > + size_t gtt_max = > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > struct i915_vma *vma; > int ret; > > if (WARN_ON(!list_empty(&obj->vma_list))) > return -EBUSY; > > + BUG_ON(!i915_is_ggtt(vm)); > + > fence_size = i915_gem_get_gtt_size(dev, > obj->base.size, > obj->tiling_mode); > @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + /* For now we only ever use 1 vma per object */ > + WARN_ON(!list_empty(&obj->vma_list)); > + > + vma = i915_gem_vma_create(obj, vm); > if (IS_ERR(vma)) { > i915_gem_object_unpin_pages(obj); > return PTR_ERR(vma); > } > > search_free: > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > - &vma->node, > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > size, alignment, > obj->cache_level, 0, gtt_max); > if (ret) { > - ret = i915_gem_evict_something(dev, size, alignment, > + ret = i915_gem_evict_something(dev, vm, size, alignment, > obj->cache_level, > map_and_fenceable, > nonblocking); > @@ -3138,18 +3172,25 @@ search_free: > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > list_add_tail(&obj->mm_list, &vm->inactive_list); > - list_add(&vma->vma_link, &obj->vma_list); > + > + /* Keep GGTT vmas first to make debug easier */ > + if (i915_is_ggtt(vm)) > + list_add(&vma->vma_link, &obj->vma_list); > + else > + list_add_tail(&vma->vma_link, &obj->vma_list); > > fenceable = > + i915_is_ggtt(vm) && > i915_gem_obj_ggtt_size(obj) == fence_size && > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > - dev_priv->gtt.mappable_end; > + mappable = > + i915_is_ggtt(vm) && > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > obj->map_and_fenceable = mappable && fenceable; > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > i915_gem_verify_gtt(dev); > return 0; > > @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > int ret; > > /* Not valid to be called on unbound objects. 
*/ > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return -EINVAL; > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > } > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > int ret; > > if (obj->cache_level == cache_level) > @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > } > > if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > > - if (i915_gem_obj_ggtt_bound(obj)) { > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > ret = i915_gem_object_finish_gpu(obj); > if (ret) > return ret; > @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > obj, cache_level); > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > + i915_gem_obj_set_color(obj, vma->vm, cache_level); > } > > if (cache_level == I915_CACHE_NONE) { > @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_gem_caching *args = data; > + struct drm_i915_private *dev_priv; > struct drm_i915_gem_object *obj; > enum i915_cache_level level; > int ret; > @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > ret = -ENOENT; > goto unlock; > } > + dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_set_cache_level(obj, level); > + /* FIXME: Add interface for specific VM? */ > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > drm_gem_object_unreference(&obj->base); > unlock: > @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > u32 alignment, > struct intel_ring_buffer *pipelined) > { > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > u32 old_read_domains, old_write_domain; > int ret; > > @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * (e.g. libkms for the bootup splash), we have to ensure that we > * always use map_and_fenceable for all scanout buffers. 
> */ > - ret = i915_gem_object_pin(obj, alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > if (ret) > return ret; > > @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > int > i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking) > @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > return -EBUSY; > > - if (i915_gem_obj_ggtt_bound(obj)) { > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); > + > + if (i915_gem_obj_bound(obj, vm)) { > + if ((alignment && > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > (map_and_fenceable && !obj->map_and_fenceable)) { > WARN(obj->pin_count, > "bo is already pinned with incorrect alignment:" > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > " obj->map_and_fenceable=%d\n", > - i915_gem_obj_ggtt_offset(obj), alignment, > + i915_gem_obj_offset(obj, vm), alignment, > map_and_fenceable, > obj->map_and_fenceable); > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > } > > - if (!i915_gem_obj_ggtt_bound(obj)) { > + if (!i915_gem_obj_bound(obj, vm)) { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > - map_and_fenceable, > - nonblocking); > + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, > + map_and_fenceable, > + nonblocking); > if (ret) > return ret; > > @@ -3666,7 +3717,7 @@ void > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > { > BUG_ON(obj->pin_count == 0); > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > if (--obj->pin_count == 0) > obj->pin_mappable = false; > @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > } > > if (obj->user_pin_count == 0) { > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > if (ret) > goto out; > } > @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > + struct i915_vma *vma, *next; > > trace_i915_gem_object_destroy(obj); > > @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > i915_gem_detach_phys_object(dev, obj); > > obj->pin_count = 0; > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > - bool was_interruptible; > + /* NB: 0 or 1 elements */ > + WARN_ON(!list_empty(&obj->vma_list) && > + !list_is_singular(&obj->vma_list)); > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > + int ret = i915_gem_object_unbind(obj, vma->vm); > + if (WARN_ON(ret == -ERESTARTSYS)) { > + bool was_interruptible; > > - was_interruptible = dev_priv->mm.interruptible; > - dev_priv->mm.interruptible = false; > + was_interruptible = dev_priv->mm.interruptible; > + dev_priv->mm.interruptible = false; > > - WARN_ON(i915_gem_object_unbind(obj)); > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > - dev_priv->mm.interruptible = was_interruptible; > + dev_priv->mm.interruptible = was_interruptible; > + } > } > > /* Stolen objects don't hold a ref, but do hold pin 
count. Fix that up > @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > INIT_LIST_HEAD(&ring->request_list); > } > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > + struct i915_address_space *vm) > +{ > + vm->dev = dev_priv->dev; > + INIT_LIST_HEAD(&vm->active_list); > + INIT_LIST_HEAD(&vm->inactive_list); > + INIT_LIST_HEAD(&vm->global_link); > + list_add(&vm->global_link, &dev_priv->vm_list); > +} > + > void > i915_gem_load(struct drm_device *dev) > { > @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) > SLAB_HWCACHE_ALIGN, > NULL); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > + INIT_LIST_HEAD(&dev_priv->vm_list); > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > + > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > struct drm_i915_private, > mm.inactive_shrinker); > struct drm_device *dev = dev_priv->dev; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > - int nr_to_scan = sc->nr_to_scan; > + int nr_to_scan; > bool unlock = true; > int cnt; > > @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > unlock = false; > } > > + nr_to_scan = sc->nr_to_scan; > if (nr_to_scan) { > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > if (nr_to_scan > 0) > @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > if (obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > + > + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > + if (obj->active) > + continue; > + > + i915_gem_object_flush_gtt_write_domain(obj); > + i915_gem_object_flush_cpu_write_domain(obj); > + /* FIXME: Can't assume global gtt */ > + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); > + > if (obj->pin_count == 0 && obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > + } > > if (unlock) > mutex_unlock(&dev->struct_mutex); > return cnt; > } > + > +/* All the new VM stuff */ > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return vma->node.start; > + > + } > + return -1; > +} > + > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return true; > + > + return false; > +} > + > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_address_space *vm; > + > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + if (i915_gem_obj_bound(o, vm)) > + return true; > + > + return false; > +} > + > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct 
drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return vma->node.size; > + > + return 0; > +} > + > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color) > +{ > + struct i915_vma *vma; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) { > + vma->node.color = color; > + return; > + } > + } > + > + WARN(1, "Couldn't set color for VM %p\n", vm); > +} > + > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) > + if (vma->vm == vm) > + return vma; > + > + return NULL; > +} > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 2470206..873577d 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > if (INTEL_INFO(dev)->gen >= 7) { > ret = i915_gem_object_set_cache_level(ctx->obj, > + &dev_priv->gtt.base, > I915_CACHE_LLC_MLC); > /* Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > * default context. > */ > dev_priv->ring[RCS].default_context = ctx; > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > if (ret) { > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > goto err_destroy; > @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, > static int do_switch(struct i915_hw_context *to) > { > struct intel_ring_buffer *ring = to->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct i915_hw_context *from = ring->last_context; > u32 hw_flags = 0; > int ret; > @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) > if (from == to) > return 0; > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > if (ret) > return ret; > > @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) > */ > if (from != NULL) { > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > - i915_gem_object_move_to_active(from->obj, ring); > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > + ring); > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > * whole damn pipeline, we don't need to explicitly mark the > * object dirty. 
The only exception is that the context must be > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index df61f33..32efdc0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -32,24 +32,21 @@ > #include "i915_trace.h" > > static bool > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > +mark_free(struct i915_vma *vma, struct list_head *unwind) > { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > - > - if (obj->pin_count) > + if (vma->obj->pin_count) > return false; > > - list_add(&obj->exec_list, unwind); > + list_add(&vma->obj->exec_list, unwind); > return drm_mm_scan_add_block(&vma->node); > } > > int > -i915_gem_evict_something(struct drm_device *dev, int min_size, > - unsigned alignment, unsigned cache_level, > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > + int min_size, unsigned alignment, unsigned cache_level, > bool mappable, bool nonblocking) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct list_head eviction_list, unwind_list; > struct i915_vma *vma; > struct drm_i915_gem_object *obj; > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > */ > > INIT_LIST_HEAD(&unwind_list); > - if (mappable) > + if (mappable) { > + BUG_ON(!i915_is_ggtt(vm)); > drm_mm_init_scan_with_range(&vm->mm, min_size, > alignment, cache_level, 0, > dev_priv->gtt.mappable_end); > - else > + } else > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > /* First see if there is a large enough contiguous idle region... */ > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > /* Now merge in the soon-to-be-expired objects... 
*/ > list_for_each_entry(obj, &vm->active_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -109,7 +109,7 @@ none: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > ret = drm_mm_scan_remove_block(&vma->node); > BUG_ON(ret); > > @@ -130,7 +130,7 @@ found: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > if (drm_mm_scan_remove_block(&vma->node)) { > list_move(&obj->exec_list, &eviction_list); > drm_gem_object_reference(&obj->base); > @@ -145,7 +145,7 @@ found: > struct drm_i915_gem_object, > exec_list); > if (ret == 0) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > > list_del_init(&obj->exec_list); > drm_gem_object_unreference(&obj->base); > @@ -158,13 +158,18 @@ int > i915_gem_evict_everything(struct drm_device *dev) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj, *next; > - bool lists_empty; > + bool lists_empty = true; > int ret; > > - lists_empty = (list_empty(&vm->inactive_list) && > - list_empty(&vm->active_list)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + lists_empty = (list_empty(&vm->inactive_list) && > + list_empty(&vm->active_list)); > + if (!lists_empty) > + lists_empty = false; > + } > + > if (lists_empty) > return -ENOSPC; > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > i915_gem_retire_requests(dev); > > /* Having flushed everything, unbind() should never raise an error */ > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0) > - WARN_ON(i915_gem_object_unbind(obj)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > + if (obj->pin_count == 0) > + WARN_ON(i915_gem_object_unbind(obj, vm)); > + } > > return 0; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 1734825..819d8d8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > } > > static void > -eb_destroy(struct eb_objects *eb) > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > { > while (!list_empty(&eb->objects)) { > struct drm_i915_gem_object *obj; > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *reloc) > + struct drm_i915_gem_relocation_entry *reloc, > + struct i915_address_space *vm) > { > struct drm_device *dev = obj->base.dev; > struct drm_gem_object *target_obj; > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > static int > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > - struct eb_objects *eb) > + struct eb_objects *eb, > + struct i915_address_space *vm) > { > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > struct 
drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > do { > u64 offset = r->presumed_offset; > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > + vm); > if (ret) > return ret; > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > static int > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *relocs) > + struct drm_i915_gem_relocation_entry *relocs, > + struct i915_address_space *vm) > { > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > int i, ret; > > for (i = 0; i < entry->relocation_count; i++) { > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > + vm); > if (ret) > return ret; > } > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > } > > static int > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > + struct i915_address_space *vm) > { > struct drm_i915_gem_object *obj; > int ret = 0; > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > */ > pagefault_disable(); > list_for_each_entry(obj, &eb->objects, exec_list) { > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > if (ret) > break; > } > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > + struct i915_address_space *vm, > bool *need_reloc) > { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > + false); > if (ret) > return ret; > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->has_aliasing_ppgtt_mapping = 1; > } > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > - entry->offset = i915_gem_obj_ggtt_offset(obj); > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > + entry->offset = i915_gem_obj_offset(obj, vm); > *need_reloc = true; > } > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > { > struct drm_i915_gem_exec_object2 *entry; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return; > > entry = obj->exec_entry; > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > struct list_head *objects, > + struct i915_address_space *vm, > bool *need_relocs) > { > struct drm_i915_gem_object *obj; > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > list_for_each_entry(obj, objects, exec_list) { > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > bool need_fence, need_mappable; > + u32 obj_offset; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > 
continue; > > + obj_offset = i915_gem_obj_offset(obj, vm); > need_fence = > has_fenced_gpu_access && > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > + BUG_ON((need_mappable || need_fence) && > + !i915_is_ggtt(vm)); > + > if ((entry->alignment && > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > + obj_offset & (entry->alignment - 1)) || > (need_mappable && !obj->map_and_fenceable)) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > else > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > > /* Bind fresh objects */ > list_for_each_entry(obj, objects, exec_list) { > - if (i915_gem_obj_ggtt_bound(obj)) > + if (i915_gem_obj_bound(obj, vm)) > continue; > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > struct drm_file *file, > struct intel_ring_buffer *ring, > struct eb_objects *eb, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > struct drm_i915_gem_relocation_entry *reloc; > struct drm_i915_gem_object *obj; > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > goto err; > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > list_for_each_entry(obj, &eb->objects, exec_list) { > int offset = obj->exec_entry - exec; > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > - reloc + reloc_offset[offset]); > + reloc + reloc_offset[offset], > + vm); > if (ret) > goto err; > } > @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > static void > i915_gem_execbuffer_move_to_active(struct list_head *objects, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_i915_gem_object *obj; > @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > obj->base.read_domains = obj->base.pending_read_domains; > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > - i915_gem_object_move_to_active(obj, ring); > + i915_gem_object_move_to_active(obj, vm, ring); > if (obj->base.write_domain) { > obj->dirty = 1; > obj->last_write_seqno = intel_ring_get_seqno(ring); > @@ -838,7 +855,8 @@ static int > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file, > struct drm_i915_gem_execbuffer2 *args, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = dev->dev_private; > struct eb_objects *eb; > @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > /* The objects are in their final locations, apply the relocations. 
*/ > if (need_relocs) > - ret = i915_gem_execbuffer_relocate(eb); > + ret = i915_gem_execbuffer_relocate(eb, vm); > if (ret) { > if (ret == -EFAULT) { > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > - eb, exec); > + eb, exec, vm); > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > } > if (ret) > @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > goto err; > } > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > + args->batch_start_offset; > exec_len = args->batch_len; > if (cliprects) { > for (i = 0; i < args->num_cliprects; i++) { > @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > err: > - eb_destroy(eb); > + eb_destroy(eb, vm); > > mutex_unlock(&dev->struct_mutex); > > @@ -1107,6 +1126,7 @@ int > i915_gem_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer *args = data; > struct drm_i915_gem_execbuffer2 exec2; > struct drm_i915_gem_exec_object *exec_list = NULL; > @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > exec2.flags = I915_EXEC_RENDER; > i915_execbuffer2_set_context_id(exec2, 0); > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > for (i = 0; i < args->buffer_count; i++) > @@ -1188,6 +1209,7 @@ int > i915_gem_execbuffer2(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer2 *args = data; > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > int ret; > @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > return -EFAULT; > } > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. 
*/ > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 3b639a9..44f3464 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > ppgtt->base.total); > } > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > + > return ret; > } > > @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level) > { > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - cache_level); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->insert_entries(vm, obj->pages, > + obj_offset >> PAGE_SHIFT, > + cache_level); > } > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj) > { > - ppgtt->base.clear_range(&ppgtt->base, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - obj->base.size >> PAGE_SHIFT); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > + obj->base.size >> PAGE_SHIFT); > } > > extern int intel_iommu_gfx_mapped; > @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > dev_priv->gtt.base.start / PAGE_SIZE, > dev_priv->gtt.base.total / PAGE_SIZE); > > + if (dev_priv->mm.aliasing_ppgtt) > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > + > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > i915_gem_clflush_object(obj); > i915_gem_gtt_bind_object(obj, obj->cache_level); > @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > * aperture. One page should be enough to keep any prefetching inside > * of the aperture. > */ > - drm_i915_private_t *dev_priv = dev->dev_private; > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > struct drm_mm_node *entry; > struct drm_i915_gem_object *obj; > unsigned long hole_start, hole_end; > @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > BUG_ON(mappable_end > end); > > /* Subtract the guard page ... 
*/ > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > if (!HAS_LLC(dev)) > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > /* Mark any preallocated objects as occupied */ > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > int ret; > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > if (ret) > DRM_DEBUG_KMS("Reservation failed\n"); > obj->has_global_gtt_mapping = 1; > @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > dev_priv->gtt.base.total = end - start; > > /* Clear any non-preallocated blocks */ > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > - hole_start, hole_end) { > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > hole_start, hole_end); > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - hole_start / PAGE_SIZE, > - count); > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > } > > /* And finally clear the reserved guard page */ > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - end / PAGE_SIZE - 1, 1); > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > } > > static bool > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index 27ffb4c..000ffbd 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > u32 size) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > struct drm_mm_node *stolen; > struct i915_vma *vma; > @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > if (gtt_offset == I915_GTT_OFFSET_NONE) > return obj; > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + vma = i915_gem_vma_create(obj, ggtt); > if (IS_ERR(vma)) { > ret = PTR_ERR(vma); > goto err_out; > @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > */ > vma->node.start = gtt_offset; > vma->node.size = size; > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + if (drm_mm_initialized(&ggtt->mm)) { > + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node); > if (ret) { > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > i915_gem_vma_destroy(vma); > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > obj->has_global_gtt_mapping = 1; > > list_add_tail(&obj->global_list, &dev_priv->mm.bound_list); > - list_add_tail(&obj->mm_list, &vm->inactive_list); > + list_add_tail(&obj->mm_list, &ggtt->inactive_list); > > return obj; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 92a8d27..808ca2a 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > 
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > obj->map_and_fenceable = > !i915_gem_obj_ggtt_bound(obj) || > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > + (i915_gem_obj_ggtt_offset(obj) + > + obj->base.size <= dev_priv->gtt.mappable_end && > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > /* Rebind if we need a change of alignment */ > if (!obj->map_and_fenceable) { > - u32 unfenced_alignment = > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > + u32 unfenced_align = > i915_gem_get_gtt_alignment(dev, obj->base.size, > args->tiling_mode, > false); > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > - ret = i915_gem_object_unbind(obj); > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > + ret = i915_gem_object_unbind(obj, ggtt); > } > > if (ret == 0) { > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 7d283b5..3f019d3 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > ); > > TRACE_EVENT(i915_gem_object_bind, > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > - TP_ARGS(obj, mappable), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, bool mappable), > + TP_ARGS(obj, vm, mappable), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > __field(bool, mappable) > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > __entry->mappable = mappable; > ), > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > ); > > TRACE_EVENT(i915_gem_object_unbind, > - TP_PROTO(struct drm_i915_gem_object *obj), > - TP_ARGS(obj), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm), > + TP_ARGS(obj, vm), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > ), > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > ), > > TP_printk("obj=%p, offset=%08x size=%x", > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > index f3c97e0..b69cc63 100644 > --- a/drivers/gpu/drm/i915/intel_fb.c > +++ b/drivers/gpu/drm/i915/intel_fb.c > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > fb->width, fb->height, > i915_gem_obj_ggtt_offset(obj), obj); > > - > mutex_unlock(&dev->struct_mutex); > vga_switcheroo_client_fb_set(dev->pdev, info); > return 0; > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > index 2abb53e..22ccb7e 100644 > --- a/drivers/gpu/drm/i915/intel_overlay.c > +++ b/drivers/gpu/drm/i915/intel_overlay.c > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > } > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > } else { > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); 
> + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > if (ret) { > DRM_ERROR("failed to pin overlay register bo\n"); > goto out_free_bo; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 008e0e0..0fb081c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev) > return NULL; > } > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > if (ret) { > DRM_ERROR("failed to pin power context: %d\n", ret); > goto err_unref; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 8527ea0..88130a3 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -481,6 +481,7 @@ out: > static int > init_pipe_control(struct intel_ring_buffer *ring) > { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct pipe_control *pc; > struct drm_i915_gem_object *obj; > int ret; > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret) > goto err_unref; > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > static int init_status_page(struct intel_ring_buffer *ring) > { > struct drm_device *dev = ring->dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_object *obj; > int ret; > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret != 0) { > goto err_unref; > } > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > ring->obj = obj; > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > if (ret) > goto err_unref; > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > return -ENOMEM; > } > > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret != 0) { > drm_gem_object_unreference(&obj->base); > DRM_ERROR("Failed to ping batch bo\n"); > -- > 1.8.3.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
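For context on the obj_set_color double-loop that comes up in the mail below: in the patch, i915_gem_object_set_cache_level() walks obj->vma_list and then calls i915_gem_obj_set_color(obj, vma->vm, cache_level), which walks that same list again to find the vma for the given vm. A rough sketch of the two shapes such a helper can take; the first body follows the patch, while the vma-based variant is only an illustration of how the second walk could be avoided (it is not code from the series), and both lean on the i915 structures rather than being standalone:

/* As plumbed in the patch: take an (obj, vm) pair and look the vma up again. */
void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
			    struct i915_address_space *vm,
			    enum i915_cache_level color)
{
	struct i915_vma *vma;

	BUG_ON(list_empty(&o->vma_list));

	list_for_each_entry(vma, &o->vma_list, vma_link) {
		if (vma->vm == vm) {
			vma->node.color = color;
			return;
		}
	}

	WARN(1, "Couldn't set color for VM %p\n", vm);
}

/* Hypothetical vma-based setter: a caller that is already iterating
 * obj->vma_list passes the vma straight in, so no second lookup is needed. */
static inline void i915_vma_set_color(struct i915_vma *vma,
				      enum i915_cache_level color)
{
	vma->node.color = color;
}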
Hi all,

So Ben & I had a bit of a private discussion, and one thing I explained in a bit more detail is what kind of review I do as maintainer. I figured this is generally useful. We've also discussed a bit that for developers without their own lab it would be nice if QA could test random branches on their set of machines. But imo that'll take quite a while; there's lots of other stuff to improve in QA land first. Anyway, here it is:

Now an explanation for why this freaked me out, which is essentially an explanation of what I do when I do maintainer reviews:

Probably the most important question I ask myself when reading a patch is "if a regression would bisect to this, and the bisect is the only useful piece of evidence, would I stand a chance of understanding it?". Your patch is big, has the appearance of doing a few unrelated things and could very well hide a bug which would take me an awful lot of time to spot. So imo the answer for your patch is a clear "no". I've merged a few such patches in the past where I've had a similar hunch and regretted it almost always. I've also sometimes split up the patch while applying, but that approach doesn't scale any more with our rather big team.

The second thing I try to figure out is whether the patch author is indeed the local expert on the topic at hand. With our team size and patch flow I don't stand a chance if I try to understand everything to the last detail. Instead I try to assess this through the proxy of convincing myself that the patch submitter understands the code much better than I do. I tend to check that by asking random questions, proposing alternative approaches and also by rating code/patch clarity. The obj_set_color double-loop very much gave me the impression that you didn't have a clear idea about how exactly this should work, so that hunk triggered this maintainer hunch.

I admit that this is all rather fluffy and very much an inexact science, but these are the only tools I have as a maintainer. The alternative of doing shit myself or checking everything myself in-depth just doesn't scale.

Cheers, Daniel

On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > v6: > Change BUG_ON to WARN_ON (Daniel) > Rename vm to ggtt in preallocate stolen, since it is always ggtt when > dealing with stolen memory. (Daniel) > list_for_each will short-circuit already (Daniel) > remove superflous space (Daniel) > Use per object list of vmas (Daniel) > Make obj_bound_any() use obj_bound for each vm (Ben) > s/bind_to_gtt/bind_to_vm/ (Ben) > > Fixed up the inactive shrinker. As Daniel noticed the code could > potentially count the same object multiple times.
While it's not > possible in the current case, since 1 object can only ever be bound into > 1 address space thus far - we may as well try to get something more > future proof in place now. With a prep patch before this to switch over > to using the bound list + inactive check, we're now able to carry that > forward for every address space an object is bound into. > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_debugfs.c | 29 ++- > drivers/gpu/drm/i915/i915_dma.c | 4 - > drivers/gpu/drm/i915/i915_drv.h | 107 +++++---- > drivers/gpu/drm/i915/i915_gem.c | 337 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +- > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > drivers/gpu/drm/i915/i915_trace.h | 20 +- > drivers/gpu/drm/i915/intel_fb.c | 1 - > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > 15 files changed, 479 insertions(+), 245 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index be69807..f8e590f 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) > static void > describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > { > + struct i915_vma *vma; > seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", > &obj->base, > get_pin_flag(obj), > @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > seq_printf(m, " (pinned x %d)", obj->pin_count); > if (obj->fence_reg != I915_FENCE_REG_NONE) > seq_printf(m, " (fence: %d)", obj->fence_reg); > - if (i915_gem_obj_ggtt_bound(obj)) > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > + if (!i915_is_ggtt(vma->vm)) > + seq_puts(m, " (pp"); > + else > + seq_puts(m, " (g"); > + seq_printf(m, "gtt offset: %08lx, size: %08lx)", > + i915_gem_obj_offset(obj, vma->vm), > + i915_gem_obj_size(obj, vma->vm)); > + } > if (obj->stolen) > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > if (obj->pin_mappable || obj->fault_mappable) { > @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > return 0; > } > > +/* FIXME: Support multiple VM? */ > #define count_objects(list, member) do { \ > list_for_each_entry(obj, list, member) { \ > size += i915_gem_obj_ggtt_size(obj); \ > @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) > > if (val & DROP_BOUND) { > list_for_each_entry_safe(obj, next, &vm->inactive_list, > - mm_list) > - if (obj->pin_count == 0) { > - ret = i915_gem_object_unbind(obj); > - if (ret) > - goto unlock; > - } > + mm_list) { > + if (obj->pin_count) > + continue; > + > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > + if (ret) > + goto unlock; > + } > } > > if (val & DROP_UNBOUND) { > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > global_list) > if (obj->pages_pin_count == 0) { > + /* FIXME: Do this for all vms? 
*/ > ret = i915_gem_object_put_pages(obj); > if (ret) > goto unlock; > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 1449d06..4650519 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > i915_dump_device_info(dev_priv); > > - INIT_LIST_HEAD(&dev_priv->vm_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > - > if (i915_get_bridge_dev(dev)) { > ret = -EIO; > goto free_priv; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 8b3167e..681cb41 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > -/* This is a temporary define to help transition us to real VMAs. If you see > - * this, you're either reviewing code, or bisecting it. */ > -static inline struct i915_vma * > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > -{ > - if (list_empty(&obj->vma_list)) > - return NULL; > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > -} > - > -/* Whether or not this object is currently mapped by the translation tables */ > -static inline bool > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > -{ > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > - if (vma == NULL) > - return false; > - return drm_mm_node_allocated(&vma->node); > -} > - > -/* Offset of the first PTE pointing to this object */ > -static inline unsigned long > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.start; > -} > - > -/* The size used in the translation tables may be larger than the actual size of > - * the object on GEN2/GEN3 because of the way tiling is handled. See > - * i915_gem_get_gtt_size() for more details. > - */ > -static inline unsigned long > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.size; > -} > - > -static inline void > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > - enum i915_cache_level color) > -{ > - __i915_gem_obj_to_vma(o)->node.color = color; > -} > - > /** > * Request queue structure. 
> * > @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > void i915_gem_vma_destroy(struct i915_vma *vma); > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking); > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > void i915_gem_lastclose(struct drm_device *dev); > @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring); > > int i915_gem_dumb_create(struct drm_file *file_priv, > @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > int tiling_mode, bool fenced); > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level); > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > void i915_gem_restore_fences(struct drm_device *dev); > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color); > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > +/* Some GGTT VM helpers */ > +#define obj_to_ggtt(obj) \ > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > +{ > + struct i915_address_space *ggtt = > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > + return vm == ggtt; > +} > + > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > +} > + > +static inline int __must_check > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > + uint32_t alignment, > + bool map_and_fenceable, > + bool nonblocking) > +{ > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > + map_and_fenceable, nonblocking); > +} > +#undef obj_to_ggtt > + > /* i915_gem_context.c */ > void i915_gem_context_init(struct drm_device *dev); > void i915_gem_context_fini(struct drm_device *dev); > @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct 
i915_hw_ppgtt *ppgtt, > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > +/* FIXME: this is never okay with full PPGTT */ > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > /* i915_gem_evict.c */ > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > +int __must_check i915_gem_evict_something(struct drm_device *dev, > + struct i915_address_space *vm, > + int min_size, > unsigned alignment, > unsigned cache_level, > bool mappable, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 2283765..0111554 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -38,10 +38,12 @@ > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking); > +static __must_check int > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking); > static int i915_gem_phys_pwrite(struct drm_device *dev, > struct drm_i915_gem_object *obj, > struct drm_i915_gem_pwrite *args, > @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > static inline bool > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > { > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > + return i915_gem_obj_bound_any(obj) && !obj->active; > } > > int > @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > * anyway again before the next pread happens. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, false); > if (ret) > return ret; > @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > char __user *user_data; > int page_offset, page_length, ret; > > - ret = i915_gem_object_pin(obj, 0, true, true); > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > if (ret) > goto out; > > @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > * right away and we therefore have to clflush anyway. 
*/ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush_after = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, true); > if (ret) > return ret; > @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > /* Now bind it into the GTT if needed */ > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret) > goto unlock; > > @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > if (obj->pages == NULL) > return 0; > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > - > if (obj->pages_pin_count) > return -EBUSY; > > + BUG_ON(i915_gem_obj_bound_any(obj)); > + > /* ->put_pages might need to allocate memory for the bit17 swizzle > * array, hence protect them from being reaped by removing them from gtt > * lists early. */ > @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > bool purgeable_only) > { > struct drm_i915_gem_object *obj, *next; > - struct i915_address_space *vm = &dev_priv->gtt.base; > long count = 0; > > list_for_each_entry_safe(obj, next, > @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > } > } > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > - i915_gem_object_unbind(obj) == 0 && > - i915_gem_object_put_pages(obj) == 0) { > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > + global_list) { > + struct i915_vma *vma, *v; > + > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > + continue; > + > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > + if (i915_gem_object_unbind(obj, vma->vm)) > + break; > + > + if (!i915_gem_object_put_pages(obj)) > count += obj->base.size >> PAGE_SHIFT; > - if (count >= target) > - return count; > - } > + > + if (count >= target) > + return count; > } > > return count; > @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > void > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_device *dev = obj->base.dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 seqno = intel_ring_get_seqno(ring); > > BUG_ON(ring == NULL); > @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > } > > static void > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - struct drm_device *dev = obj->base.dev; > - struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > - > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > BUG_ON(!obj->active); > > @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > spin_unlock(&file_priv->mm.lock); > } > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > + if (acthd >= i915_gem_obj_offset(obj, vm) && > + acthd < 
i915_gem_obj_offset(obj, vm) + obj->base.size) > return true; > > return false; > @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > return false; > } > > +static struct i915_address_space * > +request_to_vm(struct drm_i915_gem_request *request) > +{ > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > + struct i915_address_space *vm; > + > + vm = &dev_priv->gtt.base; > + > + return vm; > +} > + > static bool i915_request_guilty(struct drm_i915_gem_request *request, > const u32 acthd, bool *inside) > { > @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > * pointing inside the ring, matches the batch_obj address range. > * However this is extremely unlikely. > */ > - > if (request->batch_obj) { > - if (i915_head_inside_object(acthd, request->batch_obj)) { > + if (i915_head_inside_object(acthd, request->batch_obj, > + request_to_vm(request))) { > *inside = true; > return true; > } > @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > { > struct i915_ctx_hang_stats *hs = NULL; > bool inside, guilty; > + unsigned long offset = 0; > > /* Innocent until proven guilty */ > guilty = false; > > + if (request->batch_obj) > + offset = i915_gem_obj_offset(request->batch_obj, > + request_to_vm(request)); > + > if (ring->hangcheck.action != wait && > i915_request_guilty(request, acthd, &inside)) { > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > ring->name, > inside ? "inside" : "flushing", > - request->batch_obj ? > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > + offset, > request->ctx ? request->ctx->id : 0, > acthd); > > @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > } > > while (!list_empty(&ring->active_list)) { > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > struct drm_i915_gem_object, > ring_list); > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > } > > @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > void i915_gem_reset(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > struct intel_ring_buffer *ring; > int i; > @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) > /* Move everything out of the GPU domains to ensure we do any > * necessary invalidation upon reuse. > */ > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > i915_gem_restore_fences(dev); > } > @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > * by the ringbuffer to the flushing/inactive lists as appropriate. 
> */ > while (!list_empty(&ring->active_list)) { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > break; > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > > if (unlikely(ring->trace_irq_seqno && > @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > * Unbinds an object from the GTT aperture. > */ > int > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > struct i915_vma *vma; > int ret; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > return 0; > > if (obj->pin_count) > @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > if (ret) > return ret; > > - trace_i915_gem_object_unbind(obj); > + trace_i915_gem_object_unbind(obj, vm); > > if (obj->has_global_gtt_mapping) > i915_gem_gtt_unbind_object(obj); > @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > /* Avoid an unnecessary call to unbind on rebind. */ > obj->map_and_fenceable = true; > > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > list_del(&vma->vma_link); > drm_mm_remove_node(&vma->node); > i915_gem_vma_destroy(vma); > @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > i915_gem_obj_ggtt_offset(obj), size); > > + > pitch_val = obj->stride / 128; > pitch_val = ffs(pitch_val) - 1; > > @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > * Finds free space in the GTT aperture and binds the object there. > */ > static int > -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking) > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 size, fence_size, fence_alignment, unfenced_alignment; > bool mappable, fenceable; > - size_t gtt_max = map_and_fenceable ? > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > + size_t gtt_max = > + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; > struct i915_vma *vma; > int ret; > > if (WARN_ON(!list_empty(&obj->vma_list))) > return -EBUSY; > > + BUG_ON(!i915_is_ggtt(vm)); > + > fence_size = i915_gem_get_gtt_size(dev, > obj->base.size, > obj->tiling_mode); > @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + /* For now we only ever use 1 vma per object */ > + WARN_ON(!list_empty(&obj->vma_list)); > + > + vma = i915_gem_vma_create(obj, vm); > if (IS_ERR(vma)) { > i915_gem_object_unpin_pages(obj); > return PTR_ERR(vma); > } > > search_free: > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > - &vma->node, > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > size, alignment, > obj->cache_level, 0, gtt_max); > if (ret) { > - ret = i915_gem_evict_something(dev, size, alignment, > + ret = i915_gem_evict_something(dev, vm, size, alignment, > obj->cache_level, > map_and_fenceable, > nonblocking); > @@ -3138,18 +3172,25 @@ search_free: > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > list_add_tail(&obj->mm_list, &vm->inactive_list); > - list_add(&vma->vma_link, &obj->vma_list); > + > + /* Keep GGTT vmas first to make debug easier */ > + if (i915_is_ggtt(vm)) > + list_add(&vma->vma_link, &obj->vma_list); > + else > + list_add_tail(&vma->vma_link, &obj->vma_list); > > fenceable = > + i915_is_ggtt(vm) && > i915_gem_obj_ggtt_size(obj) == fence_size && > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > - dev_priv->gtt.mappable_end; > + mappable = > + i915_is_ggtt(vm) && > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > obj->map_and_fenceable = mappable && fenceable; > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > i915_gem_verify_gtt(dev); > return 0; > > @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > int ret; > > /* Not valid to be called on unbound objects. 
*/ > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return -EINVAL; > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > } > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > int ret; > > if (obj->cache_level == cache_level) > @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > } > > if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > > - if (i915_gem_obj_ggtt_bound(obj)) { > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > ret = i915_gem_object_finish_gpu(obj); > if (ret) > return ret; > @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > obj, cache_level); > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > + i915_gem_obj_set_color(obj, vma->vm, cache_level); > } > > if (cache_level == I915_CACHE_NONE) { > @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_gem_caching *args = data; > + struct drm_i915_private *dev_priv; > struct drm_i915_gem_object *obj; > enum i915_cache_level level; > int ret; > @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > ret = -ENOENT; > goto unlock; > } > + dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_set_cache_level(obj, level); > + /* FIXME: Add interface for specific VM? */ > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > drm_gem_object_unreference(&obj->base); > unlock: > @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > u32 alignment, > struct intel_ring_buffer *pipelined) > { > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > u32 old_read_domains, old_write_domain; > int ret; > > @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * (e.g. libkms for the bootup splash), we have to ensure that we > * always use map_and_fenceable for all scanout buffers. 
> */ > - ret = i915_gem_object_pin(obj, alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > if (ret) > return ret; > > @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > int > i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking) > @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > return -EBUSY; > > - if (i915_gem_obj_ggtt_bound(obj)) { > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); > + > + if (i915_gem_obj_bound(obj, vm)) { > + if ((alignment && > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > (map_and_fenceable && !obj->map_and_fenceable)) { > WARN(obj->pin_count, > "bo is already pinned with incorrect alignment:" > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > " obj->map_and_fenceable=%d\n", > - i915_gem_obj_ggtt_offset(obj), alignment, > + i915_gem_obj_offset(obj, vm), alignment, > map_and_fenceable, > obj->map_and_fenceable); > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > } > > - if (!i915_gem_obj_ggtt_bound(obj)) { > + if (!i915_gem_obj_bound(obj, vm)) { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > - map_and_fenceable, > - nonblocking); > + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, > + map_and_fenceable, > + nonblocking); > if (ret) > return ret; > > @@ -3666,7 +3717,7 @@ void > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > { > BUG_ON(obj->pin_count == 0); > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > if (--obj->pin_count == 0) > obj->pin_mappable = false; > @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > } > > if (obj->user_pin_count == 0) { > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > if (ret) > goto out; > } > @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > + struct i915_vma *vma, *next; > > trace_i915_gem_object_destroy(obj); > > @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > i915_gem_detach_phys_object(dev, obj); > > obj->pin_count = 0; > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > - bool was_interruptible; > + /* NB: 0 or 1 elements */ > + WARN_ON(!list_empty(&obj->vma_list) && > + !list_is_singular(&obj->vma_list)); > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > + int ret = i915_gem_object_unbind(obj, vma->vm); > + if (WARN_ON(ret == -ERESTARTSYS)) { > + bool was_interruptible; > > - was_interruptible = dev_priv->mm.interruptible; > - dev_priv->mm.interruptible = false; > + was_interruptible = dev_priv->mm.interruptible; > + dev_priv->mm.interruptible = false; > > - WARN_ON(i915_gem_object_unbind(obj)); > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > - dev_priv->mm.interruptible = was_interruptible; > + dev_priv->mm.interruptible = was_interruptible; > + } > } > > /* Stolen objects don't hold a ref, but do hold pin 
count. Fix that up > @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > INIT_LIST_HEAD(&ring->request_list); > } > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > + struct i915_address_space *vm) > +{ > + vm->dev = dev_priv->dev; > + INIT_LIST_HEAD(&vm->active_list); > + INIT_LIST_HEAD(&vm->inactive_list); > + INIT_LIST_HEAD(&vm->global_link); > + list_add(&vm->global_link, &dev_priv->vm_list); > +} > + > void > i915_gem_load(struct drm_device *dev) > { > @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) > SLAB_HWCACHE_ALIGN, > NULL); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > + INIT_LIST_HEAD(&dev_priv->vm_list); > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > + > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > struct drm_i915_private, > mm.inactive_shrinker); > struct drm_device *dev = dev_priv->dev; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > - int nr_to_scan = sc->nr_to_scan; > + int nr_to_scan; > bool unlock = true; > int cnt; > > @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > unlock = false; > } > > + nr_to_scan = sc->nr_to_scan; > if (nr_to_scan) { > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > if (nr_to_scan > 0) > @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > if (obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > + > + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > + if (obj->active) > + continue; > + > + i915_gem_object_flush_gtt_write_domain(obj); > + i915_gem_object_flush_cpu_write_domain(obj); > + /* FIXME: Can't assume global gtt */ > + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); > + > if (obj->pin_count == 0 && obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > + } > > if (unlock) > mutex_unlock(&dev->struct_mutex); > return cnt; > } > + > +/* All the new VM stuff */ > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return vma->node.start; > + > + } > + return -1; > +} > + > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return true; > + > + return false; > +} > + > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_address_space *vm; > + > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + if (i915_gem_obj_bound(o, vm)) > + return true; > + > + return false; > +} > + > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct 
drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return vma->node.size; > + > + return 0; > +} > + > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color) > +{ > + struct i915_vma *vma; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) { > + vma->node.color = color; > + return; > + } > + } > + > + WARN(1, "Couldn't set color for VM %p\n", vm); > +} > + > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) > + if (vma->vm == vm) > + return vma; > + > + return NULL; > +} > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 2470206..873577d 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > if (INTEL_INFO(dev)->gen >= 7) { > ret = i915_gem_object_set_cache_level(ctx->obj, > + &dev_priv->gtt.base, > I915_CACHE_LLC_MLC); > /* Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > * default context. > */ > dev_priv->ring[RCS].default_context = ctx; > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > if (ret) { > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > goto err_destroy; > @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, > static int do_switch(struct i915_hw_context *to) > { > struct intel_ring_buffer *ring = to->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct i915_hw_context *from = ring->last_context; > u32 hw_flags = 0; > int ret; > @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) > if (from == to) > return 0; > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > if (ret) > return ret; > > @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) > */ > if (from != NULL) { > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > - i915_gem_object_move_to_active(from->obj, ring); > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > + ring); > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > * whole damn pipeline, we don't need to explicitly mark the > * object dirty. 
The only exception is that the context must be > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index df61f33..32efdc0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -32,24 +32,21 @@ > #include "i915_trace.h" > > static bool > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > +mark_free(struct i915_vma *vma, struct list_head *unwind) > { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > - > - if (obj->pin_count) > + if (vma->obj->pin_count) > return false; > > - list_add(&obj->exec_list, unwind); > + list_add(&vma->obj->exec_list, unwind); > return drm_mm_scan_add_block(&vma->node); > } > > int > -i915_gem_evict_something(struct drm_device *dev, int min_size, > - unsigned alignment, unsigned cache_level, > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > + int min_size, unsigned alignment, unsigned cache_level, > bool mappable, bool nonblocking) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct list_head eviction_list, unwind_list; > struct i915_vma *vma; > struct drm_i915_gem_object *obj; > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > */ > > INIT_LIST_HEAD(&unwind_list); > - if (mappable) > + if (mappable) { > + BUG_ON(!i915_is_ggtt(vm)); > drm_mm_init_scan_with_range(&vm->mm, min_size, > alignment, cache_level, 0, > dev_priv->gtt.mappable_end); > - else > + } else > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > /* First see if there is a large enough contiguous idle region... */ > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > /* Now merge in the soon-to-be-expired objects... 
*/ > list_for_each_entry(obj, &vm->active_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -109,7 +109,7 @@ none: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > ret = drm_mm_scan_remove_block(&vma->node); > BUG_ON(ret); > > @@ -130,7 +130,7 @@ found: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > if (drm_mm_scan_remove_block(&vma->node)) { > list_move(&obj->exec_list, &eviction_list); > drm_gem_object_reference(&obj->base); > @@ -145,7 +145,7 @@ found: > struct drm_i915_gem_object, > exec_list); > if (ret == 0) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > > list_del_init(&obj->exec_list); > drm_gem_object_unreference(&obj->base); > @@ -158,13 +158,18 @@ int > i915_gem_evict_everything(struct drm_device *dev) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj, *next; > - bool lists_empty; > + bool lists_empty = true; > int ret; > > - lists_empty = (list_empty(&vm->inactive_list) && > - list_empty(&vm->active_list)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + lists_empty = (list_empty(&vm->inactive_list) && > + list_empty(&vm->active_list)); > + if (!lists_empty) > + lists_empty = false; > + } > + > if (lists_empty) > return -ENOSPC; > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > i915_gem_retire_requests(dev); > > /* Having flushed everything, unbind() should never raise an error */ > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0) > - WARN_ON(i915_gem_object_unbind(obj)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > + if (obj->pin_count == 0) > + WARN_ON(i915_gem_object_unbind(obj, vm)); > + } > > return 0; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 1734825..819d8d8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > } > > static void > -eb_destroy(struct eb_objects *eb) > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > { > while (!list_empty(&eb->objects)) { > struct drm_i915_gem_object *obj; > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *reloc) > + struct drm_i915_gem_relocation_entry *reloc, > + struct i915_address_space *vm) > { > struct drm_device *dev = obj->base.dev; > struct drm_gem_object *target_obj; > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > static int > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > - struct eb_objects *eb) > + struct eb_objects *eb, > + struct i915_address_space *vm) > { > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > struct 
drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > do { > u64 offset = r->presumed_offset; > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > + vm); > if (ret) > return ret; > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > static int > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *relocs) > + struct drm_i915_gem_relocation_entry *relocs, > + struct i915_address_space *vm) > { > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > int i, ret; > > for (i = 0; i < entry->relocation_count; i++) { > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > + vm); > if (ret) > return ret; > } > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > } > > static int > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > + struct i915_address_space *vm) > { > struct drm_i915_gem_object *obj; > int ret = 0; > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > */ > pagefault_disable(); > list_for_each_entry(obj, &eb->objects, exec_list) { > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > if (ret) > break; > } > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > + struct i915_address_space *vm, > bool *need_reloc) > { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > + false); > if (ret) > return ret; > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->has_aliasing_ppgtt_mapping = 1; > } > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > - entry->offset = i915_gem_obj_ggtt_offset(obj); > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > + entry->offset = i915_gem_obj_offset(obj, vm); > *need_reloc = true; > } > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > { > struct drm_i915_gem_exec_object2 *entry; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return; > > entry = obj->exec_entry; > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > struct list_head *objects, > + struct i915_address_space *vm, > bool *need_relocs) > { > struct drm_i915_gem_object *obj; > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > list_for_each_entry(obj, objects, exec_list) { > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > bool need_fence, need_mappable; > + u32 obj_offset; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > 
continue; > > + obj_offset = i915_gem_obj_offset(obj, vm); > need_fence = > has_fenced_gpu_access && > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > + BUG_ON((need_mappable || need_fence) && > + !i915_is_ggtt(vm)); > + > if ((entry->alignment && > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > + obj_offset & (entry->alignment - 1)) || > (need_mappable && !obj->map_and_fenceable)) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > else > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > > /* Bind fresh objects */ > list_for_each_entry(obj, objects, exec_list) { > - if (i915_gem_obj_ggtt_bound(obj)) > + if (i915_gem_obj_bound(obj, vm)) > continue; > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > struct drm_file *file, > struct intel_ring_buffer *ring, > struct eb_objects *eb, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > struct drm_i915_gem_relocation_entry *reloc; > struct drm_i915_gem_object *obj; > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > goto err; > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > list_for_each_entry(obj, &eb->objects, exec_list) { > int offset = obj->exec_entry - exec; > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > - reloc + reloc_offset[offset]); > + reloc + reloc_offset[offset], > + vm); > if (ret) > goto err; > } > @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > static void > i915_gem_execbuffer_move_to_active(struct list_head *objects, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_i915_gem_object *obj; > @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > obj->base.read_domains = obj->base.pending_read_domains; > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > - i915_gem_object_move_to_active(obj, ring); > + i915_gem_object_move_to_active(obj, vm, ring); > if (obj->base.write_domain) { > obj->dirty = 1; > obj->last_write_seqno = intel_ring_get_seqno(ring); > @@ -838,7 +855,8 @@ static int > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file, > struct drm_i915_gem_execbuffer2 *args, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = dev->dev_private; > struct eb_objects *eb; > @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > /* The objects are in their final locations, apply the relocations. 
*/ > if (need_relocs) > - ret = i915_gem_execbuffer_relocate(eb); > + ret = i915_gem_execbuffer_relocate(eb, vm); > if (ret) { > if (ret == -EFAULT) { > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > - eb, exec); > + eb, exec, vm); > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > } > if (ret) > @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > goto err; > } > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > + args->batch_start_offset; > exec_len = args->batch_len; > if (cliprects) { > for (i = 0; i < args->num_cliprects; i++) { > @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > err: > - eb_destroy(eb); > + eb_destroy(eb, vm); > > mutex_unlock(&dev->struct_mutex); > > @@ -1107,6 +1126,7 @@ int > i915_gem_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer *args = data; > struct drm_i915_gem_execbuffer2 exec2; > struct drm_i915_gem_exec_object *exec_list = NULL; > @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > exec2.flags = I915_EXEC_RENDER; > i915_execbuffer2_set_context_id(exec2, 0); > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > for (i = 0; i < args->buffer_count; i++) > @@ -1188,6 +1209,7 @@ int > i915_gem_execbuffer2(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer2 *args = data; > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > int ret; > @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > return -EFAULT; > } > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. 
*/ > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 3b639a9..44f3464 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > ppgtt->base.total); > } > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > + > return ret; > } > > @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level) > { > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - cache_level); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->insert_entries(vm, obj->pages, > + obj_offset >> PAGE_SHIFT, > + cache_level); > } > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj) > { > - ppgtt->base.clear_range(&ppgtt->base, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - obj->base.size >> PAGE_SHIFT); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > + obj->base.size >> PAGE_SHIFT); > } > > extern int intel_iommu_gfx_mapped; > @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > dev_priv->gtt.base.start / PAGE_SIZE, > dev_priv->gtt.base.total / PAGE_SIZE); > > + if (dev_priv->mm.aliasing_ppgtt) > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > + > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > i915_gem_clflush_object(obj); > i915_gem_gtt_bind_object(obj, obj->cache_level); > @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > * aperture. One page should be enough to keep any prefetching inside > * of the aperture. > */ > - drm_i915_private_t *dev_priv = dev->dev_private; > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > struct drm_mm_node *entry; > struct drm_i915_gem_object *obj; > unsigned long hole_start, hole_end; > @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > BUG_ON(mappable_end > end); > > /* Subtract the guard page ... 
*/ > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > if (!HAS_LLC(dev)) > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > /* Mark any preallocated objects as occupied */ > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > int ret; > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > if (ret) > DRM_DEBUG_KMS("Reservation failed\n"); > obj->has_global_gtt_mapping = 1; > @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > dev_priv->gtt.base.total = end - start; > > /* Clear any non-preallocated blocks */ > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > - hole_start, hole_end) { > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > hole_start, hole_end); > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - hole_start / PAGE_SIZE, > - count); > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > } > > /* And finally clear the reserved guard page */ > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - end / PAGE_SIZE - 1, 1); > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > } > > static bool > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index 27ffb4c..000ffbd 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > u32 size) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > struct drm_mm_node *stolen; > struct i915_vma *vma; > @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > if (gtt_offset == I915_GTT_OFFSET_NONE) > return obj; > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + vma = i915_gem_vma_create(obj, ggtt); > if (IS_ERR(vma)) { > ret = PTR_ERR(vma); > goto err_out; > @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > */ > vma->node.start = gtt_offset; > vma->node.size = size; > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + if (drm_mm_initialized(&ggtt->mm)) { > + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node); > if (ret) { > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > i915_gem_vma_destroy(vma); > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > obj->has_global_gtt_mapping = 1; > > list_add_tail(&obj->global_list, &dev_priv->mm.bound_list); > - list_add_tail(&obj->mm_list, &vm->inactive_list); > + list_add_tail(&obj->mm_list, &ggtt->inactive_list); > > return obj; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 92a8d27..808ca2a 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > 
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > obj->map_and_fenceable = > !i915_gem_obj_ggtt_bound(obj) || > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > + (i915_gem_obj_ggtt_offset(obj) + > + obj->base.size <= dev_priv->gtt.mappable_end && > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > /* Rebind if we need a change of alignment */ > if (!obj->map_and_fenceable) { > - u32 unfenced_alignment = > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > + u32 unfenced_align = > i915_gem_get_gtt_alignment(dev, obj->base.size, > args->tiling_mode, > false); > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > - ret = i915_gem_object_unbind(obj); > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > + ret = i915_gem_object_unbind(obj, ggtt); > } > > if (ret == 0) { > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 7d283b5..3f019d3 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > ); > > TRACE_EVENT(i915_gem_object_bind, > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > - TP_ARGS(obj, mappable), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, bool mappable), > + TP_ARGS(obj, vm, mappable), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > __field(bool, mappable) > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > __entry->mappable = mappable; > ), > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > ); > > TRACE_EVENT(i915_gem_object_unbind, > - TP_PROTO(struct drm_i915_gem_object *obj), > - TP_ARGS(obj), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm), > + TP_ARGS(obj, vm), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > ), > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > ), > > TP_printk("obj=%p, offset=%08x size=%x", > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > index f3c97e0..b69cc63 100644 > --- a/drivers/gpu/drm/i915/intel_fb.c > +++ b/drivers/gpu/drm/i915/intel_fb.c > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > fb->width, fb->height, > i915_gem_obj_ggtt_offset(obj), obj); > > - > mutex_unlock(&dev->struct_mutex); > vga_switcheroo_client_fb_set(dev->pdev, info); > return 0; > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > index 2abb53e..22ccb7e 100644 > --- a/drivers/gpu/drm/i915/intel_overlay.c > +++ b/drivers/gpu/drm/i915/intel_overlay.c > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > } > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > } else { > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); 
> + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > if (ret) { > DRM_ERROR("failed to pin overlay register bo\n"); > goto out_free_bo; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 008e0e0..0fb081c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev) > return NULL; > } > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > if (ret) { > DRM_ERROR("failed to pin power context: %d\n", ret); > goto err_unref; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 8527ea0..88130a3 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -481,6 +481,7 @@ out: > static int > init_pipe_control(struct intel_ring_buffer *ring) > { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct pipe_control *pc; > struct drm_i915_gem_object *obj; > int ret; > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret) > goto err_unref; > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > static int init_status_page(struct intel_ring_buffer *ring) > { > struct drm_device *dev = ring->dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_object *obj; > int ret; > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret != 0) { > goto err_unref; > } > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > ring->obj = obj; > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > if (ret) > goto err_unref; > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > return -ENOMEM; > } > > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret != 0) { > drm_gem_object_unreference(&obj->base); > DRM_ERROR("Failed to ping batch bo\n"); > -- > 1.8.3.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Fri, 26 Jul 2013 11:51:00 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > HI all, > > So Ben&I had a bit a private discussion and one thing I've explained a bit > more in detail is what kind of review I'm doing as maintainer. I've > figured this is generally useful. We've also discussed a bit that for > developers without their own lab it would be nice if QA could test random > branches on their set of machines. But imo that'll take quite a while, > there's lots of other stuff to improve in QA land first. Anyway, here's > it: > > Now an explanation for why this freaked me out, which is essentially an > explanation of what I do when I do maintainer reviews: > > Probably the most important question I ask myself when reading a patch is > "if a regression would bisect to this, and the bisect is the only useful > piece of evidence, would I stand a chance to understand it?". Your patch > is big, has the appearance of doing a few unrelated things and could very > well hide a bug which would take me an awful lot of time to spot. So imo > the answer for your patch is a clear "no". This is definitely a good point. Big patches are both hard to review and hard to debug, so should be kept as simple as possible (but no simpler!). > I've merged a few such patches in the past where I've had a similar hunch > and regretted it almost always. I've also sometimes split-up the patch > while applying, but that approach doesn't scale any more with our rather > big team. > > The second thing I try to figure out is whether the patch author is indeed > the local expert on the topic at hand now. With our team size and patch > flow I don't stand a chance if I try to understand everything to the last > detail. Instead I try to assess this through the proxy of convincing > myself the the patch submitter understands stuff much better than I do. I > tend to check that by asking random questions, proposing alternative > approaches and also by rating code/patch clarity. The obj_set_color > double-loop very much gave me the impression that you didn't have a clear > idea about how exactly this should work, so that hunk trigger this > maintainer hunch. This is the part I think is unfair (see below) when proposed alternatives aren't clearly defined. > I admit that this is all rather fluffy and very much an inexact science, > but it's the only tools I have as a maintainer. The alternative of doing > shit myself or checking everything myself in-depth just doesnt scale. I'm glad you brought this up, but I see a contradiction here: if you're just asking random questions to convince yourself the author knows what they're doing, but simultaneously you're not checking everything yourself in-depth, you'll have no way to know whether your questions are being dealt with properly. I think the way out of that contradiction is to trust reviewers, especially in specific areas. There's a downside in that the design will be a little less coherent (i.e. matching the vision of a single person), but as you said, that doesn't scale. So I'd suggest a couple of rules to help: 1) every patch gets at least two reviewed-bys 2) one of those reviewed-bys should be from a domain expert, e.g.: DP - Todd, Jani GEM - Chris, Daniel $PLATFORM - $PLATFORM owner HDMI - Paulo PSR/FBC - Rodrigo/Shobhit * - Daniel (you get to be a wildcard) etc. 
3) reviews aren't allowed to contain solely bikeshed/codingstyle change requests, if there's nothing substantial merge shouldn't be blocked (modulo egregious violations like Hungarian notation) 4) review comments should be concrete and actionable, and ideally not leave the author hanging with hints about problems the reviewer has spotted, leaving the author looking for easter eggs For the most part I think we adhere to this, though reviews from the domain experts are done more on an ad-hoc basis these days... Thoughts?
On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > 4) review comments should be concrete and actionable, and ideally not > leave the author hanging with hints about problems the reviewer > has spotted, leaving the author looking for easter eggs Where am I going to find my fun, if I am not allowed to tell you that you missed a zero in a thousand line patch but not tell you where? Spoilsport :-p -Chris
On Fri, 26 Jul 2013 18:08:48 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > > 4) review comments should be concrete and actionable, and ideally not > > leave the author hanging with hints about problems the reviewer > > has spotted, leaving the author looking for easter eggs > > Where am I going to find my fun, if I am not allowed to tell you that > you missed a zero in a thousand line patch but not tell you where? > Spoilsport :-p You'll just need to take up golf or something. :)
On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > On Fri, 26 Jul 2013 11:51:00 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > HI all, > > > > So Ben&I had a bit a private discussion and one thing I've explained a bit > > more in detail is what kind of review I'm doing as maintainer. I've > > figured this is generally useful. We've also discussed a bit that for > > developers without their own lab it would be nice if QA could test random > > branches on their set of machines. But imo that'll take quite a while, > > there's lots of other stuff to improve in QA land first. Anyway, here's > > it: > > > > Now an explanation for why this freaked me out, which is essentially an > > explanation of what I do when I do maintainer reviews: > > > > Probably the most important question I ask myself when reading a patch is > > "if a regression would bisect to this, and the bisect is the only useful > > piece of evidence, would I stand a chance to understand it?". Your patch > > is big, has the appearance of doing a few unrelated things and could very > > well hide a bug which would take me an awful lot of time to spot. So imo > > the answer for your patch is a clear "no". > > This is definitely a good point. Big patches are both hard to review > and hard to debug, so should be kept as simple as possible (but no > simpler!). > > > I've merged a few such patches in the past where I've had a similar hunch > > and regretted it almost always. I've also sometimes split-up the patch > > while applying, but that approach doesn't scale any more with our rather > > big team. > > > > The second thing I try to figure out is whether the patch author is indeed > > the local expert on the topic at hand now. With our team size and patch > > flow I don't stand a chance if I try to understand everything to the last > > detail. Instead I try to assess this through the proxy of convincing > > myself the the patch submitter understands stuff much better than I do. I > > tend to check that by asking random questions, proposing alternative > > approaches and also by rating code/patch clarity. The obj_set_color > > double-loop very much gave me the impression that you didn't have a clear > > idea about how exactly this should work, so that hunk trigger this > > maintainer hunch. > > This is the part I think is unfair (see below) when proposed > alternatives aren't clearly defined. Ben split up the patches meanwhile and imo they now look great (so fully address the first concern). I've read through them this morning and dumped a few (imo actionable) quick comments on irc. For the example here my request is to squash a double-loop over vma lists (which will also rip out a function call indirection as a bonus). > > I admit that this is all rather fluffy and very much an inexact science, > > but it's the only tools I have as a maintainer. The alternative of doing > > shit myself or checking everything myself in-depth just doesnt scale. > > I'm glad you brought this up, but I see a contradiction here: if > you're just asking random questions to convince yourself the author > knows what they're doing, but simultaneously you're not checking > everything yourself in-depth, you'll have no way to know whether your > questions are being dealt with properly. Well if the reply is unsure or inconstistent then I tend to dig in. E.g. with Paulo's pc8+ stuff I've asked a few questions about interactions with gmbus/edid reading/gem execbuf and he replied that he doesn't know. 
His 2nd patch version was still a bit thin on details in that area, so I've sat down, read through stuff and made a concrete&actionable list of corner-cases I think we should exercise. > I think the way out of that contradiction is to trust reviewers, > especially in specific areas. Imo I've already started with that, there's lots of patches where I only do a very cursory read when merging since I trust $AUTHOR and $REVIEWER to get it right. > There's a downside in that the design will be a little less coherent > (i.e. matching the vision of a single person), but as you said, that > doesn't scale. I think overall we can still achieve good consistency in the design, so that's a part where I try to chip in. But with a larger team it's clear that consistency in little details will fizzle out more, otoh doing such cleanups after big reworks (heck I've been rather inconsistent in all the refactoring in the modeset code myself) sounds like good material to drag newbies into our codebase. > So I'd suggest a couple of rules to help: > 1) every patch gets at least two reviewed-bys We have a hard time doing our current review load in a timely manner already, I don't expect this to scale if we do it formally. But ... > 2) one of those reviewed-bys should be from a domain expert, e.g.: > DP - Todd, Jani > GEM - Chris, Daniel > $PLATFORM - $PLATFORM owner > HDMI - Paulo > PSR/FBC - Rodrigo/Shobhit > * - Daniel (you get to be a wildcard) > etc. ... this is something that I've started to take into account already. E.g. when I ask someone less experienced for a given topic to do a fish-out-of-water review I'll also poke domain experts to ack it. And if there's a concern it obviously overrules an r-b tag from someone else. > 3) reviews aren't allowed to contain solely bikeshed/codingstyle > change requests, if there's nothing substantial merge shouldn't be > blocked (modulo egregious violations like Hungarian notation) I think we're doing fairly well. Occasionally I rant around review myself, but often that's just the schlep of digging the patch out again and refining it - most often the reviewer is right, which obviously makes it worse ;-) We have a few cases where discussions tend to loop forever. Sometimes I step in but often I feel like I shouldn't be the one to make the call, e.g. the audio discussions around the hsw power well drag out often, but imo that's a topic where Paulo should make the calls. Occasionally though I block a patch on bikeshed topics simply because I think the improved consistency is worth it. One example is the gen checks so that our code matches 0-based C array semantics and our usual writing style of using genX+ and pre-genX to be inclusive/exclusive respectively. > 4) review comments should be concrete and actionable, and ideally not > leave the author hanging with hints about problems the reviewer > has spotted, leaving the author looking for easter eggs Where's the fun in that? I think the right way to look at easter egg hunting is that the clear&actionable task from the reviewer is to go easter egg hunting ;-) More seriously though, asking "what happens if?" questions is an important part of review imo, and sometimes those tend to be an easter egg hunt for both reviewer and patch author.
So I think fine-tuning of individual parts and having an occasional process discussion should be good enough to keep going. Cheers, Daniel
On Fri, Jul 26, 2013 at 11:51:00AM +0200, Daniel Vetter wrote: > HI all, > > So Ben&I had a bit a private discussion and one thing I've explained a bit > more in detail is what kind of review I'm doing as maintainer. I've > figured this is generally useful. We've also discussed a bit that for > developers without their own lab it would be nice if QA could test random > branches on their set of machines. But imo that'll take quite a while, > there's lots of other stuff to improve in QA land first. Anyway, here's > it: > > Now an explanation for why this freaked me out, which is essentially an > explanation of what I do when I do maintainer reviews: > > Probably the most important question I ask myself when reading a patch is > "if a regression would bisect to this, and the bisect is the only useful > piece of evidence, would I stand a chance to understand it?". Your patch > is big, has the appearance of doing a few unrelated things and could very > well hide a bug which would take me an awful lot of time to spot. So imo > the answer for your patch is a clear "no". > > I've merged a few such patches in the past where I've had a similar hunch > and regretted it almost always. I've also sometimes split-up the patch > while applying, but that approach doesn't scale any more with our rather > big team. You should never do this, IMO. If you require the patches to be split in your tree, the developer should do it. See below for reasons I think this sucks. > > The second thing I try to figure out is whether the patch author is indeed > the local expert on the topic at hand now. With our team size and patch > flow I don't stand a chance if I try to understand everything to the last > detail. Instead I try to assess this through the proxy of convincing > myself the the patch submitter understands stuff much better than I do. I > tend to check that by asking random questions, proposing alternative > approaches and also by rating code/patch clarity. The obj_set_color > double-loop very much gave me the impression that you didn't have a clear > idea about how exactly this should work, so that hunk trigger this > maintainer hunch. > > I admit that this is all rather fluffy and very much an inexact science, > but it's the only tools I have as a maintainer. The alternative of doing > shit myself or checking everything myself in-depth just doesnt scale. > > Cheers, Daniel > > > On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote: I think the subthread Jesse started had a bunch of good points, but concisely I see 3 problems with our current process (and these were addressed in my original mail, but I guess you didn't want to air my dirty laundry :p): 1. Delay hurts QA. Balking on patches because they're hard to review limits QA on that patch, and reduces QA time on the fixed up patches. I agree this is something which is fixable within QA, but it doesn't exist at present. 2. We don't have a way to bound review/merge. I tried to do this on this series. After your initial review, I gave a list of things I was going to fix, and asked you for an ack that if I fixed those, you would merge. IMO, you didn't stick to this agreement, and came back with rework requests on a patch I had already submitted. I don't know how to fix this one because I think you should be entitled to change your mind. A caveat to this: I did make some mistakes on rebase that needed addressing. ie. the ends justified the means. 3a. Reworking code introduces bugs. 
I feel I am more guilty here than most, but, consider even in the best case of those new bugs being caught in review. In such a case, you've now introduced at least 2 extra revs, and 2 extra lag cycles waiting for review. That assumes further work doesn't spiral into more requested fixups, or more bugs. In the less ideal case, you've simply introduced a new bug in addition to the delay. 3b. Patch splitting is art not science. There is a really delicate balance between splitting patches because it's logically a functional split vs. splitting things up to make things easier to chew on. Now in my case specifically, I think overall the series has improved, and I found some crud that got squashed in which shouldn't have been there. I also believe a lot of the splitting really doesn't make much sense other than for review purposes and sometimes that is okay. In my case, I had a huge patch, but a lot of that patch was actually a sed job of "s/obj/obj,vm/." You came back with, "you're doing a bunch of extra lookups." That was exactly the point of the patch; the extra lookups should have made the review simpler, and could be cleaned up later. My point is: A larger quantity of small patches is not always easier to review than a small quantity of large patches. Large patch series review often requires the reviewer to keep a lot of context as they review. *4. The result of all this is I think a lot of the time we (the developers) end up writing your patch for you. While I respect your opinion very highly, and I think more often than not that your way is better, it's just inefficient. I'll wrap this all up with, I don't envy you. On a bunch of emails, I've seen you be apologetic for putting developers in between a rock, and a hard place (you, and program management). I recognize you have the same dilemma with Dave/Linus, and the rest of us developers. I think the overall strategy should be to improve QA, but then you have to take the leap of limiting your requests for reworks, and accepting QAs stamp of approval.
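To make the "s/obj/obj,vm/" point concrete: the mechanical core of that kind of plumbing patch is turning a GGTT-only offset helper into a per-address-space lookup, which is where the "extra lookups" come from. A deliberately simplified, stand-alone C sketch of that shape follows; the struct and helper names here are invented for illustration and are not the actual i915 code.

    #include <stdio.h>

    /* Toy model: an object can be bound into several address spaces,
     * each binding recording its own offset. */
    struct address_space { const char *name; };

    struct binding {
        const struct address_space *vm;
        unsigned long offset;
    };

    struct object {
        struct binding bindings[4];
        int nbindings;
    };

    /* Old-style helper: assumes the object lives in the one global GTT. */
    static unsigned long obj_ggtt_offset(const struct object *obj)
    {
        return obj->bindings[0].offset;   /* slot 0 is the GGTT by convention */
    }

    /* New-style helper: the caller says which address space it means; this
     * walk is the "extra lookup" the plumbing patch introduces. */
    static unsigned long obj_offset(const struct object *obj,
                                    const struct address_space *vm)
    {
        for (int i = 0; i < obj->nbindings; i++)
            if (obj->bindings[i].vm == vm)
                return obj->bindings[i].offset;
        return ~0UL;                      /* not bound in that address space */
    }

    int main(void)
    {
        struct address_space ggtt = { "ggtt" }, ppgtt = { "ppgtt" };
        struct object obj = {
            .bindings = { { &ggtt, 0x1000 }, { &ppgtt, 0x2000 } },
            .nbindings = 2,
        };

        printf("%s offset: %#lx\n", ggtt.name, obj_ggtt_offset(&obj));
        printf("%s offset: %#lx\n", ppgtt.name, obj_offset(&obj, &ppgtt));
        return 0;
    }

Keeping the first patch that close to a mechanical substitution means the reviewer mostly has to convince themselves the lookup is equivalent; trimming redundant lookups can then be a small follow-on patch.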
On Fri, Jul 26, 2013 at 10:15 PM, Ben Widawsky <ben@bwidawsk.net> wrote: > I think the subthread Jesse started had a bunch of good points, but > concisely I see 3 problems with our current process (and these were > addressed in my original mail, but I guess you didn't want to air my > dirty laundry :p): I've cut out some of the later discussion in my mail (and that thread) since I've figured it's not the main point I wanted to make. No fear of dirty laundry ;-) > > 1. Delay hurts QA. Balking on patches because they're hard to review > limits QA on that patch, and reduces QA time on the fixed up patches. I > agree this is something which is fixable within QA, but it doesn't exist > at present. Yeah, I agree that this is an issue for developers without their private lab ;-) And it's also an issue for those with one, since running tests without a good fully automated system is a pain. I discussed this a bit with Jesse yesterday on irc, but my point is that currently QA doesn't have a quick enough turn-around even for testing -nightly that this would be feasible. And I also think that something like this should be started with userspace (i.e. mesa) testing first, which is already in progress. Once QA has infrastructure to test arbitrary branches and once they have enough horsepower and automation (and people to do all this) we can take a look again. But imo trying to do this early is just wishful thinking, we have to deal with what we have, not what we'd like to get for Xmas. > 2. We don't have a way to bound review/merge. I tried to do this on this > series. After your initial review, I gave a list of things I was going > to fix, and asked you for an ack that if I fixed those, you would merge. > IMO, you didn't stick to this agreement, and came back with rework > requests on a patch I had already submitted. I don't know how to fix > this one because I think you should be entitled to change your mind. > > A caveat to this: I did make some mistakes on rebase that needed > addressing. ie. the ends justified the means. Yeah, the problem is that for really big stuff like your ppgtt series the merge process is incremental: We'll do a rough plan and then pull in parts one-by-one. And then when the sub-series get reviewed new things pop up. And sometimes the reviewer is simply confused and asks for stupid things ... I don't think we can fix this since that's just how it works. But we can certainly keep this in mind when estimating the effort to get features in - big stuff will have some uncertainty (and hence need for time buffers) even after the first review. For the ppgtt work I need to blame myself too since the original plan was way too optimistic, but I really wanted to get this in before you get sucked away into the next big thing lined up (which in this case unfortunately came attached with a deadline). > 3a. Reworking code introduces bugs. I feel I am more guilty here than > most, but, consider even in the best case of those new bugs being > caught in review. In such a case, you've now introduced at least 2 extra > revs, and 2 extra lag cycles waiting for review. That assumes further > work doesn't spiral into more requested fixups, or more bugs. In the > less ideal case, you've simply introduced a new bug in addition to the > delay. I'm trying to address this by sharing rebase BKMs as much as possible. Since I'm the one on the team doing the most rebasing (with -internal) that hopefully helps. > 3b. Patch splitting is art not science.
> > There is a really delicate balance between splitting patches because > it's logically a functional split vs. splitting things up to make things > easier to chew on. Now in my case specifically, I think overall the > series has improved, and I found some crud that got squashed in which > shouldn't have been there. I also believe a lot of the splitting really > doesn't make much sense other than for review purposes and sometimes > that is okay. Imo splitting patches has two functions: Make the reviewer's life easier (not really the developer's) and have simple patches in case of a regression which bisects to it. Ime you get about a 1-in-5 regression rate in dinq, so that chance is very much not negligible. And for the ugly regressions where we have no clue we can easily blow through a few man-months of engineer time to track them down. > In my case, I had a huge patch, but a lot of that patch was actually a > sed job of "s/obj/obj,vm/." You came back with, "you're doing a bunch > of extra lookups." That was exactly the point of the patch; the extra > lookups should have made the review simpler, and could be cleaned up > later. > > My point is: A larger quantity of small patches is not always easier to > review than a small quantity of large patches. Large patch series review > often requires the reviewer to keep a lot of context as they review. I don't mind big sed jobs or moving functions to new files (well, the latter I do mind quite a bit since they're a pain for rebasing -internal). But such a big patch needs to be conceptually really simple; my rule of thumb is that patch size times complexity should stay under a constant upper limit. So a big move-stuff patch shouldn't also rename a bunch of functions (wasn't too happy about Chris' intel_uncore.c extract) since that makes comparing harder (both in review and in rebasing). If the patch is really big (like driver-wide sed jobs) the conceptual change should approach 0. For example if you want to embed an object you first create an access helper (big sed job, no change, not even in the struct layout). Then a 2nd patch which changes the access helper, but would otherwise be very small. Imo the big patch I've asked you to split up had a lot of sed-like things, but also a few potentially functional/conceptual changes in it. The combination was imo too much. But that doesn't mean I won't accept sed jobs that result in a much larger diff, just that they need to be really simple. > *4. The result of all this is I think a lot of the time we (the > developers) end up writing your patch for you. While I respect your > opinion very highly, and I think more often than not that your way is > better, it's just inefficient. Yeah, I'm aware that sometimes I go overboard with "my way or the highway" even if I don't state that explicitly. Often though when I drop random ideas or ask questions I'm ok if the patch author sticks to his way if it comes with a good explanation attached. That at least is one of the reasons why I want to always update commit messages even when the reviewer in the end did not ask for a code change. Today's discussion about the loop in one of your patches in evict_everything was a prime example: I've read through your code, decided that it looks funny and dropped a suggestion on irc. But later on I've read the end result and noticed that my suggestion is much worse than what you have.
In such cases I expect developers to stand up, explain why something is like it is and tell me that I'm full of myself ;-) This will be even more important going forward since with the growing team and code output I'll be less and less able to keep track of everything. So the chance that I'll utter complete bs in a review will only increase. If you don't call me out on it we'll end up with worse code, which I very much don't want to. > I'll wrap this all up with, I don't envy you. On a bunch of emails, I've > seen you be apologetic for putting developers in between a rock, and a > hard place (you, and program management). I recognize you have the same > dilemma with Dave/Linus, and the rest of us developers. I think the > overall strategy should be to improve QA, but then you have to take the > leap of limiting your requests for reworks, and accepting QAs stamp of > approval. Hey, overall it's actually quite a bit of fun. I do agree that QA is really important for a fastpaced process, but it's also not the only peace to get something in. Review (both of the patch itself but also of the test coverage) catches a lot of issues, and in many cases not the same ones as QA would. Especially if the testcoverage of a new feature is less than stellar, which imo is still the case for gem due to the tons of finickle cornercases. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
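For readers who want the "access helper first, then change the helper" split spelled out, here is a stand-alone C sketch of the technique. The widget example and its names are invented purely for illustration, and the two "patches" are marked in comments; this shows the splitting strategy, not any actual i915 change.

    #include <stdio.h>

    struct widget {
        /* Patch 1 leaves this field and all behaviour untouched; it is a big
         * but trivial sed that replaces every direct `w->size` in the code
         * base with widget_size(w).  Conceptual change: roughly zero. */
        unsigned int size;
    };

    /* The accessor introduced by patch 1: initially a pure pass-through. */
    static inline unsigned int widget_size(const struct widget *w)
    {
        /* Patch 2 is then tiny: only this function (plus whatever backs the
         * new layout) changes, e.g. to `return w->backing->size;`, and none
         * of the call sites converted in patch 1 are touched again. */
        return w->size;
    }

    int main(void)
    {
        struct widget w = { .size = 4096 };

        printf("size = %u\n", widget_size(&w));
        return 0;
    }

Splitting along that line keeps the big diff reviewable by inspection, while the part that can actually regress stays small enough to bisect and revert.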
> > Hey, overall it's actually quite a bit of fun. > > I do agree that QA is really important for a fastpaced process, but > it's also not the only peace to get something in. Review (both of the > patch itself but also of the test coverage) catches a lot of issues, > and in many cases not the same ones as QA would. Especially if the > testcoverage of a new feature is less than stellar, which imo is still > the case for gem due to the tons of finickle cornercases. Just my 2c worth on this topic, since I like the current process, and I believe making it too formal is probably going to make things suck too much. I'd rather Daniel was slowing you guys down up front more, I don't give a crap about Intel project management or personal manager relying on getting features merged when, I do care that you engineers when you merge something generally get transferred 100% onto something else and don't react strongly enough to issues on older code you have created that either have lain dormant since patches merged or are regressions since patches merged. So I believe the slowing down of merging features gives a better chance of QA or other random devs of finding the misc regressions while you are still focused on the code and hitting the long term bugs that you guys rarely get resourced to fix unless I threaten to stop pulling stuff. So whatever Daniel says goes as far as I'm concerned, if I even suspect he's taken some internal Intel pressure to merge some feature, I'm going to stop pulling from him faster than I stopped pulling from the previous maintainers :-), so yeah engineers should be prepared to backup what they post even if Daniel is wrong, but on the other hand they need to demonstrate they understand the code they are pushing and sometimes with ppgtt and contexts I'm not sure anyone really understands how the hw works let alone the sw :-P Dave.
On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: > > > > Hey, overall it's actually quite a bit of fun. > > > > I do agree that QA is really important for a fastpaced process, but > > it's also not the only peace to get something in. Review (both of the > > patch itself but also of the test coverage) catches a lot of issues, > > and in many cases not the same ones as QA would. Especially if the > > testcoverage of a new feature is less than stellar, which imo is still > > the case for gem due to the tons of finickle cornercases. > > Just my 2c worth on this topic, since I like the current process, and > I believe making it too formal is probably going to make things suck > too much. > > I'd rather Daniel was slowing you guys down up front more, I don't > give a crap about Intel project management or personal manager relying > on getting features merged when, I do care that you engineers when you > merge something generally get transferred 100% onto something else and > don't react strongly enough to issues on older code you have created > that either have lain dormant since patches merged or are regressions > since patches merged. So I believe the slowing down of merging > features gives a better chance of QA or other random devs of finding > the misc regressions while you are still focused on the code and > hitting the long term bugs that you guys rarely get resourced to fix > unless I threaten to stop pulling stuff. > > So whatever Daniel says goes as far as I'm concerned, if I even > suspect he's taken some internal Intel pressure to merge some feature, > I'm going to stop pulling from him faster than I stopped pulling from > the previous maintainers :-), so yeah engineers should be prepared to > backup what they post even if Daniel is wrong, but on the other hand > they need to demonstrate they understand the code they are pushing and > sometimes with ppgtt and contexts I'm not sure anyone really > understands how the hw works let alone the sw :-P > > Dave. Honestly, I wouldn't have responded if you didn't mention the Intel program management thing... The problem I am trying to emphasize, and let's use contexts/ppgtt as an example, is we have three options: 1. It's complicated, and a big change, so let's not do it. 2. I continue to rebase the massive change on top of the extremely fast paced i915 tree, with no QA coverage. 3. We get decent bits merged ASAP by putting it in a repo that both gets much wider usage than my personal branch, and gets nightly QA coverage. PPGTT + Contexts have existed for a while, and so we went with #1 for quite a while. Now we're at #2. There's two sides to your 'developer needs to defend...' I need Daniel to give succinct feedback, and agree upon steps required to get code merged. My original gripe was that it's hard to deal with the, "that patch is too big" comments almost 2 months after the first version was sent. Equally, "that looks funny" without a real explanation of what looks funny, or sufficient thought up front about what might look better is just as hard to deal with. Inevitably, yes - it's a big scary series of patches - but if we're honest with ourselves, it's almost guaranteed to blow up somewhere regardless of how much we rework it, and who reviews it. Blowing up long before you merge would always be better than the after you merge. My desire is to get to something like #3. I had a really long paragraph on why and how we could do that, but I've redacted it. Let's just leave it as, I think that should be the goal. 
Finally, let me be clear that none of the discussion I'm having with Daniel that spawned this thread is inspired by Intel program management. My personal opinion is that your firm stance has really helped us internally to fight back against stupid decisions. Honestly, I wish you had more direct input into our management and product planners.
On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: >> > >> > Hey, overall it's actually quite a bit of fun. >> > >> > I do agree that QA is really important for a fastpaced process, but >> > it's also not the only peace to get something in. Review (both of the >> > patch itself but also of the test coverage) catches a lot of issues, >> > and in many cases not the same ones as QA would. Especially if the >> > testcoverage of a new feature is less than stellar, which imo is still >> > the case for gem due to the tons of finickle cornercases. >> >> Just my 2c worth on this topic, since I like the current process, and >> I believe making it too formal is probably going to make things suck >> too much. >> >> I'd rather Daniel was slowing you guys down up front more, I don't >> give a crap about Intel project management or personal manager relying >> on getting features merged when, I do care that you engineers when you >> merge something generally get transferred 100% onto something else and >> don't react strongly enough to issues on older code you have created >> that either have lain dormant since patches merged or are regressions >> since patches merged. So I believe the slowing down of merging >> features gives a better chance of QA or other random devs of finding >> the misc regressions while you are still focused on the code and >> hitting the long term bugs that you guys rarely get resourced to fix >> unless I threaten to stop pulling stuff. >> >> So whatever Daniel says goes as far as I'm concerned, if I even >> suspect he's taken some internal Intel pressure to merge some feature, >> I'm going to stop pulling from him faster than I stopped pulling from >> the previous maintainers :-), so yeah engineers should be prepared to >> backup what they post even if Daniel is wrong, but on the other hand >> they need to demonstrate they understand the code they are pushing and >> sometimes with ppgtt and contexts I'm not sure anyone really >> understands how the hw works let alone the sw :-P >> >> Dave. > > Honestly, I wouldn't have responded if you didn't mention the Intel > program management thing... > > The problem I am trying to emphasize, and let's use contexts/ppgtt as an > example, is we have three options: > 1. It's complicated, and a big change, so let's not do it. > 2. I continue to rebase the massive change on top of the extremely fast > paced i915 tree, with no QA coverage. > 3. We get decent bits merged ASAP by putting it in a repo that both gets > much wider usage than my personal branch, and gets nightly QA coverage. > > PPGTT + Contexts have existed for a while, and so we went with #1 for > quite a while. > > Now we're at #2. There's two sides to your 'developer needs to > defend...' I need Daniel to give succinct feedback, and agree upon steps > required to get code merged. My original gripe was that it's hard to > deal with the, "that patch is too big" comments almost 2 months after > the first version was sent. Equally, "that looks funny" without a real > explanation of what looks funny, or sufficient thought up front about > what might look better is just as hard to deal with. Inevitably, yes - > it's a big scary series of patches - but if we're honest with ourselves, > it's almost guaranteed to blow up somewhere regardless of how much we > rework it, and who reviews it. Blowing up long before you merge would > always be better than the after you merge. 
> > My desire is to get to something like #3. I had a really long paragraph > on why and how we could do that, but I've redacted it. Let's just leave > it as, I think that should be the goal. > Daniel could start taking topic branches like Ingo does, however he'd have a lot of fun merging them, he's already getting closer and closer to the extreme stuff -tip does, and he'd have to feed the topics to QA and possibly -next separately, the question is when to include a branch or not include it. Maybe he can schedule a time that QA gets all the branches, and maybe not put stuff into -next until we are sure its on its way. Dave.
On Sat, 27 Jul 2013 09:13:38 +1000 Dave Airlie <airlied@gmail.com> wrote: > > > > Hey, overall it's actually quite a bit of fun. > > > > I do agree that QA is really important for a fastpaced process, but > > it's also not the only peace to get something in. Review (both of the > > patch itself but also of the test coverage) catches a lot of issues, > > and in many cases not the same ones as QA would. Especially if the > > testcoverage of a new feature is less than stellar, which imo is still > > the case for gem due to the tons of finickle cornercases. > > Just my 2c worth on this topic, since I like the current process, and > I believe making it too formal is probably going to make things suck > too much. > > I'd rather Daniel was slowing you guys down up front more, I don't > give a crap about Intel project management or personal manager relying > on getting features merged when, I do care that you engineers when you > merge something generally get transferred 100% onto something else and > don't react strongly enough to issues on older code you have created > that either have lain dormant since patches merged or are regressions > since patches merged. So I believe the slowing down of merging > features gives a better chance of QA or other random devs of finding > the misc regressions while you are still focused on the code and > hitting the long term bugs that you guys rarely get resourced to fix > unless I threaten to stop pulling stuff. > > So whatever Daniel says goes as far as I'm concerned, if I even > suspect he's taken some internal Intel pressure to merge some feature, > I'm going to stop pulling from him faster than I stopped pulling from > the previous maintainers :-), so yeah engineers should be prepared to > backup what they post even if Daniel is wrong, but on the other hand > they need to demonstrate they understand the code they are pushing and > sometimes with ppgtt and contexts I'm not sure anyone really > understands how the hw works let alone the sw :-P Some of this is driven by me, because I have one main goal in mind in getting our code upstream: I want high quality kernel support for our products upstream and released, in an official Linus release, before the product ships. That gives OSVs and other downstream consumers of the code a chance to get the bits and be ready when products start rolling out. Without a bounded time process for getting bits upstream, that can't happen. That's why I was trying to encourage reviewers to provide specific feedback, since vague feedback is more likely to leave a patchset in the doldrums and de-motivate the author. I think the "slowing things down" may hurt more than it helps here. For example all the time Paulo spends on refactoring and rebasing his PC8 stuff is time he could have spent on HSW bugs instead. Likewise with Ben's stuff (and there the rebasing is actually reducing quality rather than increasing it, at least from a bug perspective).
>> > I do agree that QA is really important for a fastpaced process, but >> > it's also not the only peace to get something in. Review (both of the >> > patch itself but also of the test coverage) catches a lot of issues, >> > and in many cases not the same ones as QA would. Especially if the >> > testcoverage of a new feature is less than stellar, which imo is still >> > the case for gem due to the tons of finickle cornercases. >> >> Just my 2c worth on this topic, since I like the current process, and >> I believe making it too formal is probably going to make things suck >> too much. >> >> I'd rather Daniel was slowing you guys down up front more, I don't >> give a crap about Intel project management or personal manager relying >> on getting features merged when, I do care that you engineers when you >> merge something generally get transferred 100% onto something else and >> don't react strongly enough to issues on older code you have created >> that either have lain dormant since patches merged or are regressions >> since patches merged. So I believe the slowing down of merging >> features gives a better chance of QA or other random devs of finding >> the misc regressions while you are still focused on the code and >> hitting the long term bugs that you guys rarely get resourced to fix >> unless I threaten to stop pulling stuff. >> >> So whatever Daniel says goes as far as I'm concerned, if I even >> suspect he's taken some internal Intel pressure to merge some feature, >> I'm going to stop pulling from him faster than I stopped pulling from >> the previous maintainers :-), so yeah engineers should be prepared to >> backup what they post even if Daniel is wrong, but on the other hand >> they need to demonstrate they understand the code they are pushing and >> sometimes with ppgtt and contexts I'm not sure anyone really >> understands how the hw works let alone the sw :-P > > Some of this is driven by me, because I have one main goal in mind in > getting our code upstream: I want high quality kernel support for our > products upstream and released, in an official Linus release, before the > product ships. That gives OSVs and other downstream consumers of the > code a chance to get the bits and be ready when products start rolling > out. Your main goal is however different than mine, my main goal is to not regress the code that is already upstream and have bugs in it fixed. Slowing down new platform merges seems to do that a lot better than merging stuff :-) I realise you guys pay lip service to my goals at times, but I often get the feeling that you'd rather merge HSW support and run away to the next platform than spend a lot of time fixing reported bugs in Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*. It would be nice to be proven wrong once in a while where someone is actually assigned a bug fix in preference to adding new features for new platforms. Dave.
On Sat, Jul 27, 2013 at 06:52:55PM +1000, Dave Airlie wrote: > On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > > On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: > >> > > >> > Hey, overall it's actually quite a bit of fun. > >> > > >> > I do agree that QA is really important for a fastpaced process, but > >> > it's also not the only peace to get something in. Review (both of the > >> > patch itself but also of the test coverage) catches a lot of issues, > >> > and in many cases not the same ones as QA would. Especially if the > >> > testcoverage of a new feature is less than stellar, which imo is still > >> > the case for gem due to the tons of finickle cornercases. > >> > >> Just my 2c worth on this topic, since I like the current process, and > >> I believe making it too formal is probably going to make things suck > >> too much. > >> > >> I'd rather Daniel was slowing you guys down up front more, I don't > >> give a crap about Intel project management or personal manager relying > >> on getting features merged when, I do care that you engineers when you > >> merge something generally get transferred 100% onto something else and > >> don't react strongly enough to issues on older code you have created > >> that either have lain dormant since patches merged or are regressions > >> since patches merged. So I believe the slowing down of merging > >> features gives a better chance of QA or other random devs of finding > >> the misc regressions while you are still focused on the code and > >> hitting the long term bugs that you guys rarely get resourced to fix > >> unless I threaten to stop pulling stuff. > >> > >> So whatever Daniel says goes as far as I'm concerned, if I even > >> suspect he's taken some internal Intel pressure to merge some feature, > >> I'm going to stop pulling from him faster than I stopped pulling from > >> the previous maintainers :-), so yeah engineers should be prepared to > >> backup what they post even if Daniel is wrong, but on the other hand > >> they need to demonstrate they understand the code they are pushing and > >> sometimes with ppgtt and contexts I'm not sure anyone really > >> understands how the hw works let alone the sw :-P > >> > >> Dave. > > > > Honestly, I wouldn't have responded if you didn't mention the Intel > > program management thing... > > > > The problem I am trying to emphasize, and let's use contexts/ppgtt as an > > example, is we have three options: > > 1. It's complicated, and a big change, so let's not do it. > > 2. I continue to rebase the massive change on top of the extremely fast > > paced i915 tree, with no QA coverage. > > 3. We get decent bits merged ASAP by putting it in a repo that both gets > > much wider usage than my personal branch, and gets nightly QA coverage. > > > > PPGTT + Contexts have existed for a while, and so we went with #1 for > > quite a while. > > > > Now we're at #2. There's two sides to your 'developer needs to > > defend...' I need Daniel to give succinct feedback, and agree upon steps > > required to get code merged. My original gripe was that it's hard to > > deal with the, "that patch is too big" comments almost 2 months after > > the first version was sent. Equally, "that looks funny" without a real > > explanation of what looks funny, or sufficient thought up front about > > what might look better is just as hard to deal with. 
Inevitably, yes - > > it's a big scary series of patches - but if we're honest with ourselves, > > it's almost guaranteed to blow up somewhere regardless of how much we > > rework it, and who reviews it. Blowing up long before you merge would > > always be better than the after you merge. > > > > My desire is to get to something like #3. I had a really long paragraph > > on why and how we could do that, but I've redacted it. Let's just leave > > it as, I think that should be the goal. > > > > Daniel could start taking topic branches like Ingo does, however he'd > have a lot of fun merging them, > he's already getting closer and closer to the extreme stuff -tip does, > and he'd have to feed the topics to QA and possibly -next separately, > the question is when to include a branch or not include it. Yeah, I guess eventually we need to go more crazy with the branching model for drm/i915. But even getting to the current model was quite some fun, so I don't want to rock the boat too much if not required ;-) Also I fear that integrating random developer branches myself will put me at an ugly spot where I partially maintain (due to the regular merge conflicts) patches I haven't yet accepted. And since I'm only human I'll then just merge patches to get rid of the merge pain. So I don't really want to do that. Similarly for the internal tree (which just contains hw enabling for platforms we're not yet allowed to talk about and some related hacks) I've put down the rule that I won't take patches which are not upstream material (minus the last bit of polish and no real review requirement). Otherwise I'll start to bend the upstream rules a bit ... ;-) > Maybe he can schedule a time that QA gets all the branches, and maybe > not put stuff into -next until we are sure its on its way. Imo the solution here is for QA to beat the nightly test infrastructure into a solid enough shape that it can run arbitrary developer branches, unattended. I think we're slowly getting there (but for obvious reasons that's no my main aim as the maintainer when working together with our QA guys). Cheers, Daniel
The nice thing with kicking off a process discussion before disappearing into vacation is that I've had a long time to come up with some well-sharpened opinions. And what better way to start than with a good old-fashioned flamewar ;-) On Tue, Jul 30, 2013 at 09:50:21AM +1000, Dave Airlie wrote: > >> > I do agree that QA is really important for a fastpaced process, but > >> > it's also not the only peace to get something in. Review (both of the > >> > patch itself but also of the test coverage) catches a lot of issues, > >> > and in many cases not the same ones as QA would. Especially if the > >> > testcoverage of a new feature is less than stellar, which imo is still > >> > the case for gem due to the tons of finickle cornercases. > >> > >> Just my 2c worth on this topic, since I like the current process, and > >> I believe making it too formal is probably going to make things suck > >> too much. > >> > >> I'd rather Daniel was slowing you guys down up front more, I don't > >> give a crap about Intel project management or personal manager relying > >> on getting features merged when, I do care that you engineers when you > >> merge something generally get transferred 100% onto something else and > >> don't react strongly enough to issues on older code you have created > >> that either have lain dormant since patches merged or are regressions > >> since patches merged. So I believe the slowing down of merging > >> features gives a better chance of QA or other random devs of finding > >> the misc regressions while you are still focused on the code and > >> hitting the long term bugs that you guys rarely get resourced to fix > >> unless I threaten to stop pulling stuff. > >> > >> So whatever Daniel says goes as far as I'm concerned, if I even > >> suspect he's taken some internal Intel pressure to merge some feature, > >> I'm going to stop pulling from him faster than I stopped pulling from > >> the previous maintainers :-), so yeah engineers should be prepared to > >> backup what they post even if Daniel is wrong, but on the other hand > >> they need to demonstrate they understand the code they are pushing and > >> sometimes with ppgtt and contexts I'm not sure anyone really > >> understands how the hw works let alone the sw :-P > > > > Some of this is driven by me, because I have one main goal in mind in > > getting our code upstream: I want high quality kernel support for our > > products upstream and released, in an official Linus release, before the > > product ships. That gives OSVs and other downstream consumers of the > > code a chance to get the bits and be ready when products start rolling > > out. Imo the "unpredictable upstream" vs. "high quality kernel support in upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ I am not willing to compromise on decent quality. I still claim that upstreaming is a fairly predictable thing (whithin some bounds of how well some tasks can be estimated up-front without doing some research or prototyping), and the blocker here is our mediocre project tracking. I've thought a bit about this (and read a few provoking books about the matter) over vacation and I fear I get to demonstrate this only by running the estimation show myself a bit. But atm I'm by far not frustrated enough yet with the current state of affairs to sign up for that - still chewing on that maintainer thing ;-) > Your main goal is however different than mine, my main goal is to > not regress the code that is already upstream and have bugs in it > fixed. 
Slowing down new platform merges seems to do that a lot > better than merging stuff :-) > > I realise you guys pay lip service to my goals at times, but I often > get the feeling that you'd rather merge HSW support and run away > to the next platform than spend a lot of time fixing reported bugs in > Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*. > > It would be nice to be proven wrong once in a while where someone is > actually assigned a bug fix in preference to adding new features for new > platforms. Well, that team is 50% Chris&me with other people (many from the community ...) rounding things off. That is quite a bit better than a year ago (and yep, we blow up stuff, too) but not great. And it's imo also true that Intel as a company doesn't care one bit once the hw is shipped. My approach here has been to be a royal jerk about test coverage for new features and blocking stuff if a regression isn't tackled in time. People scream all around, but it seems to work and we're imo getting to a "farly decent regression handling" point. I also try to push for enabling features across platforms (if the hw should work the same way) in the name of increased test coverage. That one seems to be less effective (e.g. fbc for hsw only ...). Cheers, Daniel
On Fri, Jul 26, 2013 at 10:12:43AM -0700, Jesse Barnes wrote: > On Fri, 26 Jul 2013 18:08:48 +0100 > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > > > 4) review comments should be concrete and actionable, and ideally not > > > leave the author hanging with hints about problems the reviewer > > > has spotted, leaving the author looking for easter eggs > > > > Where am I going to find my fun, if I am not allowed to tell you that > > you missed a zero in a thousand line patch but not tell you where? > > Spoilsport :-p > > You'll just need to take up golf or something. :)

Poignant opinion from the guy who bored himself on vacation: I disagree on two grounds.

Chris without the occasional easter-egg sprinkling just wouldn't be Chris anymore, at least not the Chris I know. Imo we're a bunch of individuals, quirks and all, not a pile of interchangeable cogs that just churn out code. And yes, I am as amused as the next guy when I spoil my pants by inadvertently sitting in one of Chris' easter-eggs; otoh I can't help grinning when I discover them in time ;-)

Which leads to the "where's the fun?" question. I've started hacking on drm/i915 because it's fun (despite the frustration). And the fun is what keeps me slogging through bug reports each morning. So if we ditch that in the name of efficiency, that'll affect my productivity a lot (just not in the intended direction) and you'll probably need to look for a new maintainer ...

With that out of the way, I'm obviously not advocating for unclear review - mail is an occasionally rather lossy communication medium and we need to keep that in mind all the time. I'm only against your easter egg comment, since throwing those out with the bathwater is imo bad.

Cheers, Daniel
On Sun, 4 Aug 2013 22:17:47 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > Imo the "unpredictable upstream" vs. "high quality kernel support in > upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ > I am not willing to compromise on decent quality. I still claim that > upstreaming is a fairly predictable thing (whithin some bounds of how well > some tasks can be estimated up-front without doing some research or > prototyping), and the blocker here is our mediocre project tracking. Well, I definitely disagree here. With our current (and recent past) processes, we've generally ended up with lots of hw support landing well after parts start shipping, and the quality hasn't been high (in terms of user reported bugs) despite all the delay. So while our code might look pretty, the fact is that it's late, and has hard to debug low level bugs (RC6, semaphores, etc). <rant> It's fairly easy to add support for hardware well after it ships, and in a substandard way (e.g. hard power features disabled because we can't figure them out because the hw debug folks have moved on). If we want to keep doing that, fine, but I'd really like us to do better and catch the hard bugs *before* hw ships, and make sure it's solid and complete *before* users get it. But maybe that's just me. Maybe treating our driver like any other RE or "best effort" Linux driver is the right way to go. If so, fine, let's just not change anything. </rant> > My approach here has been to be a royal jerk about test coverage for new > features and blocking stuff if a regression isn't tackled in time. People > scream all around, but it seems to work and we're imo getting to a "farly > decent regression handling" point. I also try to push for enabling > features across platforms (if the hw should work the same way) in the name > of increased test coverage. That one seems to be less effective (e.g. fbc > for hsw only ...). But code that isn't upstream *WON'T BE TESTED* reasonably. So if you're waiting for all tests to be written before going upstream, all you're doing is delaying the bug reports that will inevitably come in, both from new test programs and from general usage. On top of that, if someone is trying to refactor at the same time, things just become a mess with all sorts of regressions introduced that weren't an issue with the original patchset...
On Mon, Aug 5, 2013 at 11:33 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Sun, 4 Aug 2013 22:17:47 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: >> Imo the "unpredictable upstream" vs. "high quality kernel support in >> upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ >> I am not willing to compromise on decent quality. I still claim that >> upstreaming is a fairly predictable thing (whithin some bounds of how well >> some tasks can be estimated up-front without doing some research or >> prototyping), and the blocker here is our mediocre project tracking. > > Well, I definitely disagree here. With our current (and recent past) > processes, we've generally ended up with lots of hw support landing > well after parts start shipping, and the quality hasn't been high (in > terms of user reported bugs) despite all the delay. So while our code > might look pretty, the fact is that it's late, and has hard to debug > low level bugs (RC6, semaphores, etc). > > <rant> > It's fairly easy to add support for hardware well after it ships, and > in a substandard way (e.g. hard power features disabled because we > can't figure them out because the hw debug folks have moved on). If we > want to keep doing that, fine, but I'd really like us to do better and > catch the hard bugs *before* hw ships, and make sure it's solid and > complete *before* users get it. But maybe that's just me. Maybe > treating our driver like any other RE or "best effort" Linux driver is > the right way to go. If so, fine, let's just not change anything. > </rant> The only thing I read here, both in the paragraph above and in the rant is that we suck. I agree. My opinion is that this is because we've started late, had too few resources and didn't seriously estimate how much work is actually involved to enable something for real. The only reason I could distill from the above two paragraphs among the ranting for way we are so much late is "So while our code might look pretty, it's late and buggy". That's imo a farily shallow stab at preceived bikesheds, but not a useful angle to improve process. Now I agree that I uphold a fairly high quality standard for upstreaming, but not an unreasonable one: - drm/i915 transformed from the undisputed shittiest driver in the kernel to one that mostly just works, while picking up development pace. So I don't think I'm fully wrong on insisting on this level of quality. - we do ship the driver essentially continously, which means we can implement features only by small refactoring steps. That clearly involves more work than just stitching something together for a product. I welcome discussing whether I impose t0o high standards, but that needs to be supplied with examples and solid reasons. "It's just too hard" without more context isn't one, since yes, the work we pull off here actually is hard. Also note that Chris&me still bear the brute of fixing the random fallout all over (it's getting better). So if any proposed changes involves me blowing through even more time to track down issues I'm strongly not in favour. Same holds for Chris often-heard comment that a patch needs an improved commit message or a comment somewhere. Yes it's annoying that you need to resend it (this often bugs me myself) just to paint the bikeshed a bit different. 
But imo Chris is pretty much spot-on throughout with his requests, and a high-quality git history has, in my experience at least, been extremely valuable for tracking down the really ugly issues and the legalese around all the established precedent.

>> My approach here has been to be a royal jerk about test coverage for new >> features and blocking stuff if a regression isn't tackled in time. People >> scream all around, but it seems to work and we're imo getting to a "farly >> decent regression handling" point. I also try to push for enabling >> features across platforms (if the hw should work the same way) in the name >> of increased test coverage. That one seems to be less effective (e.g. fbc >> for hsw only ...). > > But code that isn't upstream *WON'T BE TESTED* reasonably. So if > you're waiting for all tests to be written before going upstream, all > you're doing is delaying the bug reports that will inevitably come in, > both from new test programs and from general usage. On top of that, if > someone is trying to refactor at the same time, things just become a > mess with all sorts of regressions introduced that weren't an issue > with the original patchset...

QA on my trees and the igt test coverage I demand for new features are there to catch regressions once something is merged. We've managed to break code less than a day after it was merged, on multiple occasions, so this is very real and just part of the quality standard I impose. Furthermore I don't want a new feature to regress the overall stability of our driver. And since that quality is increasing rather decently, I ask for more testcases to exercise cornercases and make sure they're all covered. This is very much orthogonal to doing review and just one more piece of the puzzle to ensure we don't go back to the neat old days of shipping half-baked crap.

Note that nowadays QA is catching a lot of the regressions even before the patches land in Dave's tree (sometimes there's the occasional brown paper bag event though, but in each such case I analyse the failure mode and work to prevent it in the future). And imo that's squarely due to much improved test coverage and the rigid test coverage requirements for new features I impose. And of course the overall improved QA process flow with much quicker regression turnaround times also greatly helps here.

Now I agree (and I think I've mentioned this a bunch of times in this thread already) that this leads to pain for developers. I see two main issues, both of which are (slowly) improving:

- Testing patch series for regressions before merging. QA just set up the developer patch test system, and despite it still being rather limited, Ben seems to be fairly happy with where it's going. So I think we're on track to improve this and avoid the need for developers to have a private lab like Chris and I essentially have.

- Rebase hell due to ongoing other work. Thus far I've only tried to help here by rechecking/delaying refactoring patches while big features are pending. I think we need to try new approaches here and imo better planning should help. E.g. the initial modeset refactor was way too big and a monolithic chunk that I've just wrestled in by extorting r-b tags from you. In contrast the pipe config rework was about equally big, but at any given time only about 30-50 patches were outstanding (in extreme cases), and multiple people contributed different parts of the overall beast.
Of course that means that occasionally, for really big stuff, we need to plan to write a first proof of concept as a landmark for where we need to go, which will pretty much be thrown away completely.

One meta-comment on top of the actual discussion: I really appreciate critique and I've grown a good maintainer-skin to also deal with really harsh critique. But I prefer less ranting and more concrete examples of where I've botched the job (there are plenty to pick from imo) and concrete suggestions for how to improve our overall process. I think these process woes are painful for everyone and due to our fast growth we're constantly pushing into new levels of ugly, but imo the way to go forward is by small (sometimes positively tiny) but continuous adjustments and improvements. I think we both agree where we'd like to be, but at least for me, in the day-to-day fight in the trenches, the rosy picture 200 miles away doesn't really help. Maybe I'm too delusional and sarcastic that way ;-)

Cheers, Daniel
On Tue, 6 Aug 2013 00:19:33 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > The only thing I read here, both in the paragraph above and in the > rant is that we suck. I agree. My opinion is that this is because > we've started late, had too few resources and didn't seriously > estimate how much work is actually involved to enable something for > real. No, it's more than that, we suck in very specific ways: 1) large (and sometimes even small) features take waaay too long to land upstream, taking valuable developer time away from other things like bug fixing, regressions, etc 2) hw support lands late, which makes it harder to get debug traction with tough bugs (e.g. RC6) > > The only reason I could distill from the above two paragraphs among > the ranting for way we are so much late is "So while our code might > look pretty, it's late and buggy". That's imo a farily shallow stab at > preceived bikesheds, but not a useful angle to improve process. No, I suggested improvements to our process earlier, and it sounded like you mostly agreed, though seemed to deny point that we spin for too long on things (point #1 above). > Now I agree that I uphold a fairly high quality standard for > upstreaming, but not an unreasonable one: > - drm/i915 transformed from the undisputed shittiest driver in the > kernel to one that mostly just works, while picking up development > pace. So I don't think I'm fully wrong on insisting on this level of > quality. > - we do ship the driver essentially continously, which means we can > implement features only by small refactoring steps. That clearly > involves more work than just stitching something together for a > product. <sarcasm> You're way off base here. We should ship a shitty driver and just land everything without review or testing. That way we can go really fast. Your quality standards are too high (in that they exist at all). </sarcasm> More seriously, quality should be measured by the end result in terms of bugs and how users actually use our stuff. I'm not sure if that's what you mean by a "high quality standard". Sometimes it seems you care more about refactoring things ad-infinitum than tested code. > Also note that Chris&me still bear the brute of fixing the random > fallout all over (it's getting better). So if any proposed changes > involves me blowing through even more time to track down issues I'm > strongly not in favour. Same holds for Chris often-heard comment that > a patch needs an improved commit message or a comment somewhere. Yes > it's annoying that you need to resend it (this often bugs me myself) > just to paint the bikeshed a bit different. But imo Chris is pretty > much throughout spot-on with his requests and a high-quality git > history has, in my experience at least, been extremely valueable to > track down the really ugly issues and legalese around all the > established precendence. Again, no one is suggesting that we have shitty changelogs or that we add comments. Not sure why you brought that up. > - Rebase hell due to ongoing other work. Thus far I've only tried to > help here by rechecking/delaying refactoring patches while big > features are pending. I think we need to try new approaches here and > imo better planing should help. E.g. the initial modeset refactor was > way too big and a monolithic junk that I've just wrestled in by > exorting r-b tags from you. 
In contrast the pipe config rework was > about equally big, but at any given time only about 30-50 patches > where outstanding (in extreme cases), and mutliple people contributed > different parts of the overall beast. Of course that means that > occasional, for really big stuff, we need to plan to write a first > proof of concept as a landmark where we need to go to, which pretty > much will be thrown away completely. This is the real issue. We don't have enough people to burn on single features for 6 months each so they can be rewritten 3 times until they look how you would have done it. If we keep doing that, you may as well write all of it, and we'll be stuck in my <rant> from the previous message. That's why I suggested the two reviewed-by tags ought to be sufficient as a merge criteria. Sure, there may be room for refactoring, but if things are understandable by other developers and well tested, why block them? > One meta-comment on top of the actual discussion: I really appreciate > critique and I've grown a good maintainer-skin to also deal with > really harsh critique. But I prefer less ranting and more concrete > examples where I've botched the job (there are plentiful to pick from > imo) and concrete suggestion for how to improve our overall process. I've suggested some already, but they've fallen on deaf ears afaict. I don't know what more I can do to convince you that you acting as a review/refactor bottleneck actively undermines the goals I think we share. But I'm done with this thread. Maybe others want to comment on things they might think improve the situation.
Like I've said in my previous mail I expect such discussions to be hard and I also think stopping now and giving up is the wrong approach. So another round. On Tue, Aug 6, 2013 at 1:34 AM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Tue, 6 Aug 2013 00:19:33 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: >> The only thing I read here, both in the paragraph above and in the >> rant is that we suck. I agree. My opinion is that this is because >> we've started late, had too few resources and didn't seriously >> estimate how much work is actually involved to enable something for >> real. > > No, it's more than that, we suck in very specific ways: > 1) large (and sometimes even small) features take waaay too long to > land upstream, taking valuable developer time away from other > things like bug fixing, regressions, etc > 2) hw support lands late, which makes it harder to get debug traction > with tough bugs (e.g. RC6) > >> >> The only reason I could distill from the above two paragraphs among >> the ranting for way we are so much late is "So while our code might >> look pretty, it's late and buggy". That's imo a farily shallow stab at >> preceived bikesheds, but not a useful angle to improve process. > > No, I suggested improvements to our process earlier, and it sounded > like you mostly agreed, though seemed to deny point that we spin for > too long on things (point #1 above). I'll cover your process suggestions below, since some of your clarifications below shine a different light onto them. But overall I agree, even that we seem to spin sometimes awfully long. >> Now I agree that I uphold a fairly high quality standard for >> upstreaming, but not an unreasonable one: >> - drm/i915 transformed from the undisputed shittiest driver in the >> kernel to one that mostly just works, while picking up development >> pace. So I don't think I'm fully wrong on insisting on this level of >> quality. >> - we do ship the driver essentially continously, which means we can >> implement features only by small refactoring steps. That clearly >> involves more work than just stitching something together for a >> product. > > <sarcasm> > You're way off base here. We should ship a shitty driver and just land > everything without review or testing. That way we can go really fast. > Your quality standards are too high (in that they exist at all). > </sarcasm> > > More seriously, quality should be measured by the end result in terms > of bugs and how users actually use our stuff. I'm not sure if that's > what you mean by a "high quality standard". Sometimes it seems you > care more about refactoring things ad-infinitum than tested code. I often throw in a refactoring suggestion when people work on a feature, that's right. Often it is also a crappy idea, but imo for long-term maintainance a neat&tidy codebase is really important. So I'll just throw them out and see what sticks with people. I realize that pretty much all of the quality standard discussion here is really fluffy, but like I explained I get to bear a large part of the "keep it going" workload. And as long as that's the case I frankly think my standards carry more weight. Furthermore in the cases where other people from our team chip in with bugfixing that's mostly in cases where a self-check or testcase clearly puts the blame on them. So if that is the only way to volunteer people I'll keep asking for those things (and delay patches indefinitely like e.g. your fastboot stuff). 
And like I've said I'm open to discuss those requirements, but I freely admit that I have a rather solid ground resolve on this topic. >> Also note that Chris&me still bear the brute of fixing the random >> fallout all over (it's getting better). So if any proposed changes >> involves me blowing through even more time to track down issues I'm >> strongly not in favour. Same holds for Chris often-heard comment that >> a patch needs an improved commit message or a comment somewhere. Yes >> it's annoying that you need to resend it (this often bugs me myself) >> just to paint the bikeshed a bit different. But imo Chris is pretty >> much throughout spot-on with his requests and a high-quality git >> history has, in my experience at least, been extremely valueable to >> track down the really ugly issues and legalese around all the >> established precendence. > > Again, no one is suggesting that we have shitty changelogs or that we > add comments. Not sure why you brought that up. I added it since I've just read through some of the patches on the android internal branch yesterday and a lot of those patches fall through on the "good enough commit message" criterion (mostly by failing to explain why the patch is needed). I've figured that's relevant since on internal irc you've said even pushing fixes to upstream is a PITA since they require 2-3 rounds to get in. To keep things concrete one such example is Kamal's recent rc6 fix where I've asked for a different approach and he sounded rather pissed that I don't just take his patch as-is. But after I've explained my reasoning he seemed to agree, at least he sent out a revised version. And the changes have all been what I guess you'd call bikesheds, since it was just shuffling the code logic around a bit and pimping the commit message. I claim that this is worth it and I think your stance is that we shouldn't delay patches like this. Or is this a bad example for a patch which you think was unduly delayed? Please bring another one up in this case, I really think process discussions are easier with concrete examples. >> - Rebase hell due to ongoing other work. Thus far I've only tried to >> help here by rechecking/delaying refactoring patches while big >> features are pending. I think we need to try new approaches here and >> imo better planing should help. E.g. the initial modeset refactor was >> way too big and a monolithic junk that I've just wrestled in by >> exorting r-b tags from you. In contrast the pipe config rework was >> about equally big, but at any given time only about 30-50 patches >> where outstanding (in extreme cases), and mutliple people contributed >> different parts of the overall beast. Of course that means that >> occasional, for really big stuff, we need to plan to write a first >> proof of concept as a landmark where we need to go to, which pretty >> much will be thrown away completely. > > This is the real issue. We don't have enough people to burn on > single features for 6 months each so they can be rewritten 3 times until > they look how you would have done it. If we keep doing that, you > may as well write all of it, and we'll be stuck in my <rant> from > the previous message. That's why I suggested the two reviewed-by tags > ought to be sufficient as a merge criteria. Sure, there may be room > for refactoring, but if things are understandable by other developers > and well tested, why block them? I'd like to see an example here for something that I blocked, really. 
One I could think up is the ips feature from Paulo where I've asked to convert it over to the pipe config tracking. But I asked for that specifically so that one of our giant long-term feature goals (atomic modeset) doesn't move further away, so I think for our long-term aims this request was justified. Otherwise I have a hard time coming up with features that had r-b tags from one of the domain expert you've listed (i.e. where understood well) and I blocked them. It's true that I often spot something small when applying a patch, but I also often fix it up while applying (mostly adding notes to the commit message) or asking for a quick follow-up fixup patch. >> One meta-comment on top of the actual discussion: I really appreciate >> critique and I've grown a good maintainer-skin to also deal with >> really harsh critique. But I prefer less ranting and more concrete >> examples where I've botched the job (there are plentiful to pick from >> imo) and concrete suggestion for how to improve our overall process. > > I've suggested some already, but they've fallen on deaf ears afaict. Your above clarification that the 2 r-b tags (one from the domain expert) should overrule my concern imo makes your original proposal a bit different - my impression was that you've asked for 2 r-b tags, period. Which would be more than what we currently have, and since we have a hard time doing even that would imo I think asking for 2 r-b tags is completely unrealistic. One prime example is Ville's watermark patches, which have been ready (he only did a very few v2 versions for bikesheds) since over a month ago. But stuck since no one bothered to review them. So your suggestions (points 1) thru 4) in your original mail in this thread) haven't fallen on deaf ears. Specifically wrt review from domain experts I'm ok with just an informal ack and letting someone else do the detailed review. That way the 2nd function of reviewing of diffusing knowledge in our distributed team works better when I pick non-domain-experts. > I don't know what more I can do to convince you that you acting as a > review/refactor bottleneck actively undermines the goals I think we > share. I disagree that I'm a bottleneck. Just yesterday I've merged roughly 50 patches because they where all nicely reviewed. And like I've said some of those patches have been stuck for a month in no-one-bothers-to-review-them limbo land. If we drag out another example and look at the ppgtt stuff from Ben which I've asked to be reworked quite a bit. Now one mistake I've done is to be way too optimistic about how much time this will take when hashing out a merge plan with Ben. I've committed the mistake of trying to fit the work that I think needs to be done into the available time Ben has and so done the same wishful thinking planning I complain about all the time. Next time around I'll try to make an honest plan first and then try to fit it into the time we have instead of the other way round. But I really think the rework was required since with the original patch series I was often left with the nagging feeling that I just don't understand what's going on, and whether I'd really be able to track down a regression if it bisected to one of the patches. So I couldn't slap an honset r-b tag onto it. The new series is imo great and a joy to review. So again please bring up an example where I've failed and we can look at it and figure out what needs to change to improve the process. Imo those little patches and adjustements to our process are the way forward. 
At least that approach worked really well for beating our kernel QA process into shape. And yes, it's tedious and results will take time to show up. > But I'm done with this thread. Maybe others want to comment on things > they might think improve the situation. I'm not letting you off the hook that easily ;-) Cheers, Daniel
Hi,

A few direct responses and my 2 cents at the end. This is all my humble opinion, feel free to disagree or ignore it :)

2013/8/6 Daniel Vetter <daniel@ffwll.ch>: > > I often throw in a refactoring suggestion when people work on a > feature, that's right. Often it is also a crappy idea, but imo for > long-term maintainance a neat&tidy codebase is really important. So > I'll just throw them out and see what sticks with people. >

The problem is that if you throw a suggestion out and it doesn't stick, then people feel you won't merge the patch. So they kinda feel they have to do it all the time.

Another thing is that sometimes the refactoring is just plain bikeshedding, and that leads to demotivated workers. People write things their own way, but then they are forced to do it in another way, which is also correct, but just different, and that wastes a lot of time. And I'm not talking specifically about Daniel's suggestions, everybody does this kind of bikeshedding (well, I'm sure I do). If someone gives a bikeshed to a patch, Daniel will see there's an unattended review comment and will not merge the patch at all, so basically a random reviewer can easily block someone else's patch. I guess we should all try to give less bikeshedding, including me.

> > One prime example is Ville's watermark patches, which have been ready > (he only did a very few v2 versions for bikesheds) since over a month > ago. But stuck since no one bothered to review them.

Actually I subscribed myself to review (on review board) and purposely waited until he was back from vacation before I would start the review. I also did early 0-day testing on real hardware, which is IMHO much more useful than just reviewing. Something that has happened many times for me in the past: I reviewed a patch, thought it was correct, then decided to boot the patch before sending the R-B email and found a bug.

And my 2 cents:

Daniel and Jesse are arguing from different premises, which means they will basically discuss forever until they realize that.

In an exaggerated view, Daniel's premises:
- Merging patches with bugs is unacceptable
- Corollary: users should never have to report bugs/regressions
- Delaying patch merging due to refactoring or review comments will always make it better

In the same exaggerated view, Jesse's premises:
- Actual user/developer testing is more valuable than review and refactoring
- Corollary: merging code with bugs is acceptable, we want the bug reports
- Endless code churn due to review/refactoring may actually introduce bugs not present in the first version

Please tell me if I'm wrong.

From my point of view, this is all about tradeoffs and you two stand on different positions in these tradeoffs. Example:
- The time you save by not doing all the refactoring/bikeshedding can be spent doing bug fixing or reviewing/testing someone else's patches.
- But the question is: which one is more worth it? An hour refactoring/rebasing so the code behaves exactly like $reviewer wants, or an hour staring at bugzilla or reviewing/testing patches?
- From my point of view, it seems Daniel assumes people will always spend 0 time fixing bugs, and that's why he requests so much refactoring from people: the tradeoff slider is completely at one side. But that's kind of a vicious/virtuous cycle: the more he increases his "quality standards", the more time we'll spend on the refactorings, so we'll spend even less time on bugzilla, so Daniel will increase the standards even more due to even less time spent on bugzilla, and so on.
One thing which we didn't discuss explicitly right now and IMHO is important is how people *feel* about all this. It seems to me that the current amount of reworking required is making some people (e.g., Jesse, Ben) demotivated and unhappy. While this is not really a measurable thing, I'm sure it negatively affects the rate we improve our code base and fix our bugs. If we bikeshed a feature to the point where the author gets fed up with it and just wants it to get merged, there's a high chance that future bugs discovered on this feature won't be solved that quickly due the stressful experience the author had with the feature. And sometimes the unavoidable "I'll just implement whatever review comments I get because I'm so tired about this series and now I just want to get it merged" attitude is a very nice way to introduce bugs. And one more thing. IMHO this discussion should all be on how we deal with the people on our team, who get paid to write this code. When external people contribute patches to us, IMHO we should give them big thanks, send emails with many smileys, and hold all our spotted bikesheds to separate patches that we'll send later. Too high quality standards doesn't seem to be a good way to encourage people who don't dominate our code base. My possible suggestions: - We already have drm-intel-next-queued as a barrier to protect against bugs in merged patches (it's a barrier to drm-intel-next, which external people should be using). Even though I do not spend that much time on bugzilla bugs, I do rebase on dinq/nightly every day and try to make sure all the regressions I spot are fixed, and I count this as "bug fixing time". What if we resist our OCDs and urge to request reworks, then merge patches to dinq more often? To compensate for this, if anybody reports a single problem in a patch or series present on dinq, it gets immediately reverted (which means dinq will either do lots of rebasing or contain many many reverts). And we try to keep drm-intel-next away from all the dinq madness. Does that sound maintainable? - Another idea I already gave a few times is to accept features more easily, but leave them disabled by default until all the required reworks are there. Daniel rejected this idea because he feels people won't do the reworks and will leave the feature disabled by default forever. My counter-argument: 99% of the features we do are somehow tracked by PMs, we should make sure the PMs know features are still disabled, and perhaps open sub-tasks on the feature tracking systems to document that the feature is not yet completed since it's not enabled by default. In other words: this problem is too hard, it's about tradeoffs and there's no perfect solution that will please everybody. My just 2 cents, I hope to not have offended anybody :( Cheers, Paulo
On Tue, Aug 6, 2013 at 4:50 PM, Paulo Zanoni <przanoni@gmail.com> wrote: > A few direct responses and my 2 cents at the end. This is all my > humble opinion, feel free to disagree or ignore it :) I think you make some excellent points, so thanks a lot for joining the discussion. > 2013/8/6 Daniel Vetter <daniel@ffwll.ch>: >> >> I often throw in a refactoring suggestion when people work on a >> feature, that's right. Often it is also a crappy idea, but imo for >> long-term maintainance a neat&tidy codebase is really important. So >> I'll just throw them out and see what sticks with people. >> > > The problem is that if you throw and it doesn't stick, then people > feel you won't merge it. So they kinda feel they have to do it all the > time. > > Another thing is that sometimes the refactoring is just plain > bikeshedding, and that leads to demotivated workers. People write > things on their way, but then they are forced to do it in another way, > which is also correct, but just different, and wastes a lot of time. > And I'm not talking specifically about Daniel's suggestions, everybody > does this kind of bikeshedding (well, I'm sure I do). If someone gives > a bikeshed to a patch, Daniel will see there's an unattended review > comment and will not merge the patch at all, so basically a random > reviewer can easily block someone else's patch. I guess we all should > try to give less bikeshedding, including me. Yeah, that happens. With all the stuff going on I relly can't keep track of everything, so if it looks like the patch author and the reviewer are still going back&forth I just wait. And like I've explained in private once I don't like stepping in as the maintainer when this happens since I'm not the topic expert by far, so my assessment will be about as good as a coin-toss. Of course if the question centers around integration issues with the overall codebase I'll happily chime in. I think the only way to reduce time wasted in such stuck discussions is to admit that the best solution isn't clear and that adding a fixme comment somewhere to look at the issue again for the next platform (bug, regression, feature, ...) that touches the same area. Or maybe reconsider once everything has landed and it's clear what then end-result really looks like. >> One prime example is Ville's watermark patches, which have been ready >> (he only did a very few v2 versions for bikesheds) since over a month >> ago. But stuck since no one bothered to review them. > > Actually I subscribed myself to review (on review board) and purposely > waited until he was back from vacation before I would start the > review. I also did early 0-day testing on real hardware, which is IMHO > way much more useful than just reviewing. Something that happened many > times for me in the past: I reviewed a patch, thought it was correct, > then decided to boot the patch before sending the R-B email and found > a bug. Imo review shouldn't require you to apply the patches and test them. Of course if it helps you to convince yourself the patch is good I'm fine with that approach. But myself if I have doubts I prefer to check whether a testcase/selfcheck exists to exercise that corner case (and so will prevent this from also ever breaking again). Testing itself should be done by the developer (or bug reporter). Hopefully the developer patch test system that QA is now rolling out will help a lot in that regard. 
> And my 2 cents: > > Daniel and Jesse are based on different premises, which means they > will basically discuss forever until they realize that. > > In an exaggerated view, Daniel's premises: > - Merging patches with bugs is unacceptable > - Colorary: users should never have to report bugs/regressions > - Delaying patch merging due to refactoring or review comments will > always make it better > > In the same exaggerated view, Jesse's premises: > - Actual user/developer testing is more valuable than review and refactoring > - Colorary: merging code with bugs is acceptable, we want the bug reports > - Endless code churn due to review/refactoring may actually introduce > bugs not present in the first version > > Please tell me if I'm wrong. At least from my pov I think this is a very accurate description of our different assumptions and how that shapes how we perceive these process issues. > From my point of view, this is all about tradeoffs and you two stand > on different positions in these tradeoffs. Example: > - Time time you save by not doing all the refactoring/bikeshedding can > be spent doing bug fixing or reviewing/testing someone else's patches. > - But the question is: which one is more worth it? An hour > refactoring/rebasing so the code behaves exactly like $reviewer wants, > or an hour staring at bugzilla or reviewing/testing patches? > - From my point of view, it seems Daniel assumes people will always > spend 0 time fixing bugs, that's why he requests people so much > refactoring: the tradeoff slider is completely at one side. But that's > kind of a vicious/virtuous cycle: the more he increases his "quality > standards", the more we'll spend time on the refactorings, so we'll > spend even less time on bugzilla", so Daniel will increase the > standards even more due to even less time spent on bugzilla, and so > on. tbh I haven't considered that I might cause a negative feedback cycle here. One thing that seems to work (at least for me) is when we have good testcase. With QA's much improved regression reporting I can then directly assign a bug to the patch auther of the offending commit. That seems to help a lot in distributing the regression handling work. But more tests aren't a magic solution since they also take a lot of time to write. And in a few areas our test coverage gaps are still so big that relying on tests only for good quality and much less on clean&clear code which is easy to review isn't really a workable approach. But I'd be willing to trade off more tests for less bikeshed in review since imo the two parts are at least partial substitutes. Thus far though writing tests seems to often come as an afterthough and not as the first thing, so I guess this doesn't work too well with our current team. Personally I don't like writing testcases too much, even though it's fun to blow up the kernel ;-) And it often helps a _lot_ with understanding the exact nature of a bug/issue, at least for me. Another approach could be if developers try to proactively work a bit on issues in they're area and take active ownership, I'm much more inclined to just merge patches in this case. Examples are how Jani wrestles around with the backlight code or how you constantly hunt down unclaimed register issues. Unfortunately that requires that people follow the bugspam and m-l mail flood, which is a major time drain :( > One thing which we didn't discuss explicitly right now and IMHO is > important is how people *feel* about all this. 
It seems to me that the > current amount of reworking required is making some people (e.g., > Jesse, Ben) demotivated and unhappy. While this is not really a > measurable thing, I'm sure it negatively affects the rate we improve > our code base and fix our bugs. If we bikeshed a feature to the point > where the author gets fed up with it and just wants it to get merged, > there's a high chance that future bugs discovered on this feature > won't be solved that quickly due the stressful experience the author > had with the feature. And sometimes the unavoidable "I'll just > implement whatever review comments I get because I'm so tired about > this series and now I just want to get it merged" attitude is a very > nice way to introduce bugs. Yep, people are the most important thing, technical issues can usually be solved much easier. Maybe we need to look for different approaches that suit people better (everyone's a bit different), like the idea above to emphasis tests more instead of code cleanliness and consistency. E.g. for your current pc8+ stuff I've somewhat decided that I'm not going to drop bikesheds, but just make sure the testcase looks good. Well throw a few ideas around while reading the patches, but those are just ideas ... again a case I guess where you can mistake my suggestions as requirements :( I need to work on making such idea-throwing clearer. Otherwise I'm running a bit low on ideas how we could change the patch polishing for upstream to better suit people and prevent fatalistic "this isn't really my work anymore" resingation. Ideas? > And one more thing. IMHO this discussion should all be on how we deal > with the people on our team, who get paid to write this code. When > external people contribute patches to us, IMHO we should give them big > thanks, send emails with many smileys, and hold all our spotted > bikesheds to separate patches that we'll send later. Too high quality > standards doesn't seem to be a good way to encourage people who don't > dominate our code base. I disagree. External contributions should follow the same standards as our own code. And just because we're paid to do this doesn't mean I won't be really happy about a tricky bugfix or a cool feature. Afaic remember the only non-intel feature that was merged that imo didn't live up to my standards was the initial i915 prime support from Dave. And I've clearly stated that I won't merge the patch through my tree and listed the reasons why I think it's not ready. > My possible suggestions: > > - We already have drm-intel-next-queued as a barrier to protect > against bugs in merged patches (it's a barrier to drm-intel-next, > which external people should be using). Even though I do not spend > that much time on bugzilla bugs, I do rebase on dinq/nightly every day > and try to make sure all the regressions I spot are fixed, and I count > this as "bug fixing time". What if we resist our OCDs and urge to > request reworks, then merge patches to dinq more often? To compensate > for this, if anybody reports a single problem in a patch or series > present on dinq, it gets immediately reverted (which means dinq will > either do lots of rebasing or contain many many reverts). And we try > to keep drm-intel-next away from all the dinq madness. Does that sound > maintainable? I occasionally botch a revert/merge/rebase and since it wouldn't scale when I ask people to cross check my tree in detail every time (or people just assume I didn't botch it) those slip out. 
So I prefer if I don't have to maintain more volatile trees. I'm also not terribly in favour of merging stuff early and hoping for reworks since often the attention moves immediately to the next thing. E.g. VECS support was merged after a long delay when finally some basic tests popped up. But then a slight change from Mika to better exercise some seqno wrap/gpu reset corner cases showed that semaphores don't work with VECS. QA dutifully reported this bug and Chris analysis the gpu hang state. Ever since then this was ignored. So I somewhat agree with Dave here, at least sometimes ... I'm also not sure that an immediate revert rule is the right approach. Often an issue is just minor (e.g. the modeset state checker trips up), dropping the patch right away might be the wrong approach. Of course if something doesn't get fixed quickly that's not great, either. > - Another idea I already gave a few times is to accept features more > easily, but leave them disabled by default until all the required > reworks are there. Daniel rejected this idea because he feels people > won't do the reworks and will leave the feature disabled by default > forever. My counter-argument: 99% of the features we do are somehow > tracked by PMs, we should make sure the PMs know features are still > disabled, and perhaps open sub-tasks on the feature tracking systems > to document that the feature is not yet completed since it's not > enabled by default. I'm not sure how much that would help. If something is disabled by default it won't getting beaten on by QA. And Jesse is right that we just need that coverage, but to discover corner case bugs but also to ensure a feature doesn't regress. If we merge something disabled by default I fear it'll bitrot as quickly as an unmerged patch series. But we leave in the delusion that it all still works. So I'm not sure it's a good approach, but with psr we kinda have this as a real-world experiment running. Let's see how it goes ... > In other words: this problem is too hard, it's about tradeoffs and > there's no perfect solution that will please everybody. Yeah, I think your approach of clearly stating this as a tradeoff issue cleared up things a lot for me. I think we need to actively hunt for opportunities and new ideas. I've added a few of my own above, but I think it's clear that there's no silver bullet. One idea I'm pondering is whether a much more detailed breakdown of a task/feature/... and how to get the test coverage and all the parts merged could help. At least from my pov a big part of the frustration seems to stem from the fact that the upstreaming process is highly variable, and like I've said a few times I think we can do much better. At least once we've tried this a few times and have some experience. But again this is not for free but involves quite some work. And I guess I need to be highly involved or even do large parts of that break-down to make sure nothing gets missed, and I kinda don't want to sign up for that work ;-) > My just 2 cents, I hope to not have offended anybody :( Not at all, and I think your input has been very valuable to the discussion. Thanks a lot, Daniel
> > In the same exaggerated view, Jesse's premises: > - Actual user/developer testing is more valuable than review and refactoring > - Corollary: merging code with bugs is acceptable, we want the bug reports > - Endless code churn due to review/refactoring may actually introduce > bugs not present in the first version > > Please tell me if I'm wrong. > > From my point of view, this is all about tradeoffs and you two stand > on different positions in these tradeoffs. Example: > - The time you save by not doing all the refactoring/bikeshedding can > be spent doing bug fixing or reviewing/testing someone else's patches. > - But the question is: which one is more worth it? An hour > refactoring/rebasing so the code behaves exactly like $reviewer wants, > or an hour staring at bugzilla or reviewing/testing patches? > - From my point of view, it seems Daniel assumes people will always > spend 0 time fixing bugs, that's why he requests people so much > refactoring: the tradeoff slider is completely at one side. But that's > kind of a vicious/virtuous cycle: the more he increases his "quality > standards", the more we'll spend time on the refactorings, so we'll > spend even less time on bugzilla, so Daniel will increase the > standards even more due to even less time spent on bugzilla, and so > on.

Here is the thing: before Daniel started making people write tests and bikeshedding, people spent 0 time on bugs. I can dig up countless times where I've had RHEL regressions and had to stop merging code to get anyone to look at them.

So Jesse likes to think that people will have more time to look at bugzilla if they aren't refactoring patches, but generally I find people just get moved onto the next task the second the code is merged by Daniel, and will fight against taking any responsibility for code that is already merged unless hit with a big stick. This is just ingrained in how people work: doing new shiny stuff is always more fun than spending 4 days or weeks to send a one-liner patch. So really, if people think that just merging most stuff faster is the solution, they are delusional, and I'll gladly stop pulling until they stop.

I've spent 2-3 weeks on single bugs in the graphics stack before and I'm sure I will again, but the incentive to go hunting for them generally comes from someone important reporting the bug, not from a misc bug report in bugzilla from someone who isn't a monetary concern.

So Jesse, if you really believe the team will focus on bugs 2-3 months after the code is merged and drop their priority for merging whatever cool feature they are on now, then maybe I'd agree, but so far history has shown this never happens.

Dave.
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index be69807..f8e590f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { + struct i915_vma *vma; seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", &obj->base, get_pin_flag(obj), @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " (pinned x %d)", obj->pin_count); if (obj->fence_reg != I915_FENCE_REG_NONE) seq_printf(m, " (fence: %d)", obj->fence_reg); - if (i915_gem_obj_ggtt_bound(obj)) - seq_printf(m, " (gtt offset: %08lx, size: %08x)", - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); + list_for_each_entry(vma, &obj->vma_list, vma_link) { + if (!i915_is_ggtt(vma->vm)) + seq_puts(m, " (pp"); + else + seq_puts(m, " (g"); + seq_printf(m, "gtt offset: %08lx, size: %08lx)", + i915_gem_obj_offset(obj, vma->vm), + i915_gem_obj_size(obj, vma->vm)); + } if (obj->stolen) seq_printf(m, " (stolen: %08lx)", obj->stolen->start); if (obj->pin_mappable || obj->fault_mappable) { @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) return 0; } +/* FIXME: Support multiple VM? */ #define count_objects(list, member) do { \ list_for_each_entry(obj, list, member) { \ size += i915_gem_obj_ggtt_size(obj); \ @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) if (val & DROP_BOUND) { list_for_each_entry_safe(obj, next, &vm->inactive_list, - mm_list) - if (obj->pin_count == 0) { - ret = i915_gem_object_unbind(obj); - if (ret) - goto unlock; - } + mm_list) { + if (obj->pin_count) + continue; + + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); + if (ret) + goto unlock; + } } if (val & DROP_UNBOUND) { list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) { + /* FIXME: Do this for all vms? */ ret = i915_gem_object_put_pages(obj); if (ret) goto unlock; diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 1449d06..4650519 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) i915_dump_device_info(dev_priv); - INIT_LIST_HEAD(&dev_priv->vm_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); - if (i915_get_bridge_dev(dev)) { ret = -EIO; goto free_priv; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8b3167e..681cb41 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) -/* This is a temporary define to help transition us to real VMAs. If you see - * this, you're either reviewing code, or bisecting it. 
*/ -static inline struct i915_vma * -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) -{ - if (list_empty(&obj->vma_list)) - return NULL; - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); -} - -/* Whether or not this object is currently mapped by the translation tables */ -static inline bool -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) -{ - struct i915_vma *vma = __i915_gem_obj_to_vma(o); - if (vma == NULL) - return false; - return drm_mm_node_allocated(&vma->node); -} - -/* Offset of the first PTE pointing to this object */ -static inline unsigned long -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.start; -} - -/* The size used in the translation tables may be larger than the actual size of - * the object on GEN2/GEN3 because of the way tiling is handled. See - * i915_gem_get_gtt_size() for more details. - */ -static inline unsigned long -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.size; -} - -static inline void -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, - enum i915_cache_level color) -{ - __i915_gem_obj_to_vma(o)->node.color = color; -} - /** * Request queue structure. * @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, void i915_gem_vma_destroy(struct i915_vma *vma); int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking); void i915_gem_object_unpin(struct drm_i915_gem_object *obj); -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); void i915_gem_release_mmap(struct drm_i915_gem_object *obj); void i915_gem_lastclose(struct drm_device *dev); @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, struct intel_ring_buffer *to); void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring); int i915_gem_dumb_create(struct drm_file *file_priv, @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, int tiling_mode, bool fenced); int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level); struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, void i915_gem_restore_fences(struct drm_device *dev); +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color); +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); +/* Some GGTT VM helpers */ +#define obj_to_ggtt(obj) \ + (&((struct drm_i915_private 
*)(obj)->base.dev->dev_private)->gtt.base) +static inline bool i915_is_ggtt(struct i915_address_space *vm) +{ + struct i915_address_space *ggtt = + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; + return vm == ggtt; +} + +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); +} + +static inline int __must_check +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, + uint32_t alignment, + bool map_and_fenceable, + bool nonblocking) +{ + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, + map_and_fenceable, nonblocking); +} +#undef obj_to_ggtt + /* i915_gem_context.c */ void i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, void i915_gem_restore_gtt_mappings(struct drm_device *dev); int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); +/* FIXME: this is never okay with full PPGTT */ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level); void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) /* i915_gem_evict.c */ -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, +int __must_check i915_gem_evict_something(struct drm_device *dev, + struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2283765..0111554 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -38,10 +38,12 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, - unsigned alignment, - bool map_and_fenceable, - bool nonblocking); +static __must_check int +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + unsigned alignment, + bool map_and_fenceable, + bool nonblocking); static int i915_gem_phys_pwrite(struct drm_device *dev, struct drm_i915_gem_object *obj, struct drm_i915_gem_pwrite *args, @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) static inline bool i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) { - return i915_gem_obj_ggtt_bound(obj) && !obj->active; + return i915_gem_obj_bound_any(obj) && !obj->active; } int @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, * anyway again before the next pread happens. 
*/ if (obj->cache_level == I915_CACHE_NONE) needs_clflush = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, false); if (ret) return ret; @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, char __user *user_data; int page_offset, page_length, ret; - ret = i915_gem_object_pin(obj, 0, true, true); + ret = i915_gem_ggtt_pin(obj, 0, true, true); if (ret) goto out; @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, * right away and we therefore have to clflush anyway. */ if (obj->cache_level == I915_CACHE_NONE) needs_clflush_after = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) return ret; @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } /* Now bind it into the GTT if needed */ - ret = i915_gem_object_pin(obj, 0, true, false); + ret = i915_gem_ggtt_pin(obj, 0, true, false); if (ret) goto unlock; @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) if (obj->pages == NULL) return 0; - BUG_ON(i915_gem_obj_ggtt_bound(obj)); - if (obj->pages_pin_count) return -EBUSY; + BUG_ON(i915_gem_obj_bound_any(obj)); + /* ->put_pages might need to allocate memory for the bit17 swizzle * array, hence protect them from being reaped by removing them from gtt * lists early. */ @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, bool purgeable_only) { struct drm_i915_gem_object *obj, *next; - struct i915_address_space *vm = &dev_priv->gtt.base; long count = 0; list_for_each_entry_safe(obj, next, @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, } } - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && - i915_gem_object_unbind(obj) == 0 && - i915_gem_object_put_pages(obj) == 0) { + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, + global_list) { + struct i915_vma *vma, *v; + + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) + continue; + + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) + if (i915_gem_object_unbind(obj, vma->vm)) + break; + + if (!i915_gem_object_put_pages(obj)) count += obj->base.size >> PAGE_SHIFT; - if (count >= target) - return count; - } + + if (count >= target) + return count; } return count; @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_device *dev = obj->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 seqno = intel_ring_get_seqno(ring); BUG_ON(ring == NULL); @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, } static void -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; - BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); BUG_ON(!obj->active); @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) spin_unlock(&file_priv->mm.lock); } 
-static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - if (acthd >= i915_gem_obj_ggtt_offset(obj) && - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) + if (acthd >= i915_gem_obj_offset(obj, vm) && + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) return true; return false; @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, return false; } +static struct i915_address_space * +request_to_vm(struct drm_i915_gem_request *request) +{ + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; + struct i915_address_space *vm; + + vm = &dev_priv->gtt.base; + + return vm; +} + static bool i915_request_guilty(struct drm_i915_gem_request *request, const u32 acthd, bool *inside) { @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, * pointing inside the ring, matches the batch_obj address range. * However this is extremely unlikely. */ - if (request->batch_obj) { - if (i915_head_inside_object(acthd, request->batch_obj)) { + if (i915_head_inside_object(acthd, request->batch_obj, + request_to_vm(request))) { *inside = true; return true; } @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, { struct i915_ctx_hang_stats *hs = NULL; bool inside, guilty; + unsigned long offset = 0; /* Innocent until proven guilty */ guilty = false; + if (request->batch_obj) + offset = i915_gem_obj_offset(request->batch_obj, + request_to_vm(request)); + if (ring->hangcheck.action != wait && i915_request_guilty(request, acthd, &inside)) { DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", ring->name, inside ? "inside" : "flushing", - request->batch_obj ? - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, + offset, request->ctx ? request->ctx->id : 0, acthd); @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, } while (!list_empty(&ring->active_list)) { + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, struct drm_i915_gem_object, ring_list); - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } } @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) void i915_gem_reset(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; struct intel_ring_buffer *ring; int i; @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) /* Move everything out of the GPU domains to ensure we do any * necessary invalidation upon reuse. */ - list_for_each_entry(obj, &vm->inactive_list, mm_list) - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + list_for_each_entry(obj, &vm->inactive_list, mm_list) + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; i915_gem_restore_fences(dev); } @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) * by the ringbuffer to the flushing/inactive lists as appropriate. 
*/ while (!list_empty(&ring->active_list)) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) if (!i915_seqno_passed(seqno, obj->last_read_seqno)) break; - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } if (unlikely(ring->trace_irq_seqno && @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) * Unbinds an object from the GTT aperture. */ int -i915_gem_object_unbind(struct drm_i915_gem_object *obj) +i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = obj->base.dev->dev_private; struct i915_vma *vma; int ret; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) return 0; if (obj->pin_count) @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) if (ret) return ret; - trace_i915_gem_object_unbind(obj); + trace_i915_gem_object_unbind(obj, vm); if (obj->has_global_gtt_mapping) i915_gem_gtt_unbind_object(obj); @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) /* Avoid an unnecessary call to unbind on rebind. */ obj->map_and_fenceable = true; - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); list_del(&vma->vma_link); drm_mm_remove_node(&vma->node); i915_gem_vma_destroy(vma); @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", i915_gem_obj_ggtt_offset(obj), size); + pitch_val = obj->stride / 128; pitch_val = ffs(pitch_val) - 1; @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) * Finds free space in the GTT aperture and binds the object there. */ static int -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, - unsigned alignment, - bool map_and_fenceable, - bool nonblocking) +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + unsigned alignment, + bool map_and_fenceable, + bool nonblocking) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 size, fence_size, fence_alignment, unfenced_alignment; bool mappable, fenceable; - size_t gtt_max = map_and_fenceable ? - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; + size_t gtt_max = + map_and_fenceable ? 
dev_priv->gtt.mappable_end : vm->total; struct i915_vma *vma; int ret; if (WARN_ON(!list_empty(&obj->vma_list))) return -EBUSY; + BUG_ON(!i915_is_ggtt(vm)); + fence_size = i915_gem_get_gtt_size(dev, obj->base.size, obj->tiling_mode); @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, i915_gem_object_pin_pages(obj); - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); + /* For now we only ever use 1 vma per object */ + WARN_ON(!list_empty(&obj->vma_list)); + + vma = i915_gem_vma_create(obj, vm); if (IS_ERR(vma)) { i915_gem_object_unpin_pages(obj); return PTR_ERR(vma); } search_free: - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, - &vma->node, + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, size, alignment, obj->cache_level, 0, gtt_max); if (ret) { - ret = i915_gem_evict_something(dev, size, alignment, + ret = i915_gem_evict_something(dev, vm, size, alignment, obj->cache_level, map_and_fenceable, nonblocking); @@ -3138,18 +3172,25 @@ search_free: list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); list_add_tail(&obj->mm_list, &vm->inactive_list); - list_add(&vma->vma_link, &obj->vma_list); + + /* Keep GGTT vmas first to make debug easier */ + if (i915_is_ggtt(vm)) + list_add(&vma->vma_link, &obj->vma_list); + else + list_add_tail(&vma->vma_link, &obj->vma_list); fenceable = + i915_is_ggtt(vm) && i915_gem_obj_ggtt_size(obj) == fence_size && (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= - dev_priv->gtt.mappable_end; + mappable = + i915_is_ggtt(vm) && + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; obj->map_and_fenceable = mappable && fenceable; - trace_i915_gem_object_bind(obj, map_and_fenceable); + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); i915_gem_verify_gtt(dev); return 0; @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) int ret; /* Not valid to be called on unbound objects. 
*/ - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return -EINVAL; if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) } int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); int ret; if (obj->cache_level == cache_level) @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, } if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } - if (i915_gem_obj_ggtt_bound(obj)) { + list_for_each_entry(vma, &obj->vma_list, vma_link) { ret = i915_gem_object_finish_gpu(obj); if (ret) return ret; @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, obj, cache_level); - i915_gem_obj_ggtt_set_color(obj, cache_level); + i915_gem_obj_set_color(obj, vma->vm, cache_level); } if (cache_level == I915_CACHE_NONE) { @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_gem_caching *args = data; + struct drm_i915_private *dev_priv; struct drm_i915_gem_object *obj; enum i915_cache_level level; int ret; @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, ret = -ENOENT; goto unlock; } + dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_set_cache_level(obj, level); + /* FIXME: Add interface for specific VM? */ + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); drm_gem_object_unreference(&obj->base); unlock: @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, struct intel_ring_buffer *pipelined) { + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; u32 old_read_domains, old_write_domain; int ret; @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_NONE); if (ret) return ret; @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * (e.g. libkms for the bootup splash), we have to ensure that we * always use map_and_fenceable for all scanout buffers. 
*/ - ret = i915_gem_object_pin(obj, alignment, true, false); + ret = i915_gem_ggtt_pin(obj, alignment, true, false); if (ret) return ret; @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) int i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking) @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) return -EBUSY; - if (i915_gem_obj_ggtt_bound(obj)) { - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); + + if (i915_gem_obj_bound(obj, vm)) { + if ((alignment && + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || (map_and_fenceable && !obj->map_and_fenceable)) { WARN(obj->pin_count, "bo is already pinned with incorrect alignment:" " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," " obj->map_and_fenceable=%d\n", - i915_gem_obj_ggtt_offset(obj), alignment, + i915_gem_obj_offset(obj, vm), alignment, map_and_fenceable, obj->map_and_fenceable); - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } } - if (!i915_gem_obj_ggtt_bound(obj)) { + if (!i915_gem_obj_bound(obj, vm)) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_bind_to_gtt(obj, alignment, - map_and_fenceable, - nonblocking); + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, + map_and_fenceable, + nonblocking); if (ret) return ret; @@ -3666,7 +3717,7 @@ void i915_gem_object_unpin(struct drm_i915_gem_object *obj) { BUG_ON(obj->pin_count == 0); - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); + BUG_ON(!i915_gem_obj_bound_any(obj)); if (--obj->pin_count == 0) obj->pin_mappable = false; @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, } if (obj->user_pin_count == 0) { - ret = i915_gem_object_pin(obj, args->alignment, true, false); + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); if (ret) goto out; } @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; + struct i915_vma *vma, *next; trace_i915_gem_object_destroy(obj); @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) i915_gem_detach_phys_object(dev, obj); obj->pin_count = 0; - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { - bool was_interruptible; + /* NB: 0 or 1 elements */ + WARN_ON(!list_empty(&obj->vma_list) && + !list_is_singular(&obj->vma_list)); + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { + int ret = i915_gem_object_unbind(obj, vma->vm); + if (WARN_ON(ret == -ERESTARTSYS)) { + bool was_interruptible; - was_interruptible = dev_priv->mm.interruptible; - dev_priv->mm.interruptible = false; + was_interruptible = dev_priv->mm.interruptible; + dev_priv->mm.interruptible = false; - WARN_ON(i915_gem_object_unbind(obj)); + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); - dev_priv->mm.interruptible = was_interruptible; + dev_priv->mm.interruptible = was_interruptible; + } } /* Stolen objects don't hold a ref, but do hold pin count. 
Fix that up @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) INIT_LIST_HEAD(&ring->request_list); } +static void i915_init_vm(struct drm_i915_private *dev_priv, + struct i915_address_space *vm) +{ + vm->dev = dev_priv->dev; + INIT_LIST_HEAD(&vm->active_list); + INIT_LIST_HEAD(&vm->inactive_list); + INIT_LIST_HEAD(&vm->global_link); + list_add(&vm->global_link, &dev_priv->vm_list); +} + void i915_gem_load(struct drm_device *dev) { @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) SLAB_HWCACHE_ALIGN, NULL); - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); + INIT_LIST_HEAD(&dev_priv->vm_list); + i915_init_vm(dev_priv, &dev_priv->gtt.base); + INIT_LIST_HEAD(&dev_priv->mm.unbound_list); INIT_LIST_HEAD(&dev_priv->mm.bound_list); INIT_LIST_HEAD(&dev_priv->mm.fence_list); @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) struct drm_i915_private, mm.inactive_shrinker); struct drm_device *dev = dev_priv->dev; - struct i915_address_space *vm = &dev_priv->gtt.base; struct drm_i915_gem_object *obj; - int nr_to_scan = sc->nr_to_scan; + int nr_to_scan; bool unlock = true; int cnt; @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) unlock = false; } + nr_to_scan = sc->nr_to_scan; if (nr_to_scan) { nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); if (nr_to_scan > 0) @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) cnt += obj->base.size >> PAGE_SHIFT; - list_for_each_entry(obj, &vm->inactive_list, mm_list) + + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { + if (obj->active) + continue; + + i915_gem_object_flush_gtt_write_domain(obj); + i915_gem_object_flush_cpu_write_domain(obj); + /* FIXME: Can't assume global gtt */ + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); + if (obj->pin_count == 0 && obj->pages_pin_count == 0) cnt += obj->base.size >> PAGE_SHIFT; + } if (unlock) mutex_unlock(&dev->struct_mutex); return cnt; } + +/* All the new VM stuff */ +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) + return vma->node.start; + + } + return -1; +} + +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + + list_for_each_entry(vma, &o->vma_list, vma_link) + if (vma->vm == vm) + return true; + + return false; +} + +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_address_space *vm; + + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + if (i915_gem_obj_bound(o, vm)) + return true; + + return false; +} + +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + + BUG_ON(list_empty(&o->vma_list)); + + list_for_each_entry(vma, &o->vma_list, vma_link) + if 
(vma->vm == vm) + return vma->node.size; + + return 0; +} + +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color) +{ + struct i915_vma *vma; + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) { + vma->node.color = color; + return; + } + } + + WARN(1, "Couldn't set color for VM %p\n", vm); +} + +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + list_for_each_entry(vma, &obj->vma_list, vma_link) + if (vma->vm == vm) + return vma; + + return NULL; +} diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 2470206..873577d 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, if (INTEL_INFO(dev)->gen >= 7) { ret = i915_gem_object_set_cache_level(ctx->obj, + &dev_priv->gtt.base, I915_CACHE_LLC_MLC); /* Failure shouldn't ever happen this early */ if (WARN_ON(ret)) @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) * default context. */ dev_priv->ring[RCS].default_context = ctx; - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); if (ret) { DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); goto err_destroy; @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, static int do_switch(struct i915_hw_context *to) { struct intel_ring_buffer *ring = to->ring; + struct drm_i915_private *dev_priv = ring->dev->dev_private; struct i915_hw_context *from = ring->last_context; u32 hw_flags = 0; int ret; @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) if (from == to) return 0; - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); if (ret) return ret; @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) */ if (from != NULL) { from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; - i915_gem_object_move_to_active(from->obj, ring); + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, + ring); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the * whole damn pipeline, we don't need to explicitly mark the * object dirty. 
The only exception is that the context must be diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index df61f33..32efdc0 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -32,24 +32,21 @@ #include "i915_trace.h" static bool -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) +mark_free(struct i915_vma *vma, struct list_head *unwind) { - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); - - if (obj->pin_count) + if (vma->obj->pin_count) return false; - list_add(&obj->exec_list, unwind); + list_add(&vma->obj->exec_list, unwind); return drm_mm_scan_add_block(&vma->node); } int -i915_gem_evict_something(struct drm_device *dev, int min_size, - unsigned alignment, unsigned cache_level, +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, bool nonblocking) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; struct list_head eviction_list, unwind_list; struct i915_vma *vma; struct drm_i915_gem_object *obj; @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, */ INIT_LIST_HEAD(&unwind_list); - if (mappable) + if (mappable) { + BUG_ON(!i915_is_ggtt(vm)); drm_mm_init_scan_with_range(&vm->mm, min_size, alignment, cache_level, 0, dev_priv->gtt.mappable_end); - else + } else drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); /* First see if there is a large enough contiguous idle region... */ list_for_each_entry(obj, &vm->inactive_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, /* Now merge in the soon-to-be-expired objects... 
*/ list_for_each_entry(obj, &vm->active_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -109,7 +109,7 @@ none: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); ret = drm_mm_scan_remove_block(&vma->node); BUG_ON(ret); @@ -130,7 +130,7 @@ found: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); if (drm_mm_scan_remove_block(&vma->node)) { list_move(&obj->exec_list, &eviction_list); drm_gem_object_reference(&obj->base); @@ -145,7 +145,7 @@ found: struct drm_i915_gem_object, exec_list); if (ret == 0) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); list_del_init(&obj->exec_list); drm_gem_object_unreference(&obj->base); @@ -158,13 +158,18 @@ int i915_gem_evict_everything(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj, *next; - bool lists_empty; + bool lists_empty = true; int ret; - lists_empty = (list_empty(&vm->inactive_list) && - list_empty(&vm->active_list)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + lists_empty = (list_empty(&vm->inactive_list) && + list_empty(&vm->active_list)); + if (!lists_empty) + lists_empty = false; + } + if (lists_empty) return -ENOSPC; @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) i915_gem_retire_requests(dev); /* Having flushed everything, unbind() should never raise an error */ - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) - if (obj->pin_count == 0) - WARN_ON(i915_gem_object_unbind(obj)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) + if (obj->pin_count == 0) + WARN_ON(i915_gem_object_unbind(obj, vm)); + } return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 1734825..819d8d8 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) } static void -eb_destroy(struct eb_objects *eb) +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) { while (!list_empty(&eb->objects)) { struct drm_i915_gem_object *obj; @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *reloc) + struct drm_i915_gem_relocation_entry *reloc, + struct i915_address_space *vm) { struct drm_device *dev = obj->base.dev; struct drm_gem_object *target_obj; @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, - struct eb_objects *eb) + struct eb_objects *eb, + struct i915_address_space *vm) { #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, do { u64 offset = r->presumed_offset; - ret = 
i915_gem_execbuffer_relocate_entry(obj, eb, r); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, + vm); if (ret) return ret; @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *relocs) + struct drm_i915_gem_relocation_entry *relocs, + struct i915_address_space *vm) { const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; int i, ret; for (i = 0; i < entry->relocation_count; i++) { - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], + vm); if (ret) return ret; } @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, } static int -i915_gem_execbuffer_relocate(struct eb_objects *eb) +i915_gem_execbuffer_relocate(struct eb_objects *eb, + struct i915_address_space *vm) { struct drm_i915_gem_object *obj; int ret = 0; @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) */ pagefault_disable(); list_for_each_entry(obj, &eb->objects, exec_list) { - ret = i915_gem_execbuffer_relocate_object(obj, eb); + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); if (ret) break; } @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, struct intel_ring_buffer *ring, + struct i915_address_space *vm, bool *need_reloc) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, + false); if (ret) return ret; @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->has_aliasing_ppgtt_mapping = 1; } - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { - entry->offset = i915_gem_obj_ggtt_offset(obj); + if (entry->offset != i915_gem_obj_offset(obj, vm)) { + entry->offset = i915_gem_obj_offset(obj, vm); *need_reloc = true; } @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) { struct drm_i915_gem_exec_object2 *entry; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return; entry = obj->exec_entry; @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, struct list_head *objects, + struct i915_address_space *vm, bool *need_relocs) { struct drm_i915_gem_object *obj; @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, list_for_each_entry(obj, objects, exec_list) { struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; bool need_fence, need_mappable; + u32 obj_offset; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) continue; + obj_offset = i915_gem_obj_offset(obj, vm); need_fence = has_fenced_gpu_access && entry->flags & EXEC_OBJECT_NEEDS_FENCE && obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); + BUG_ON((need_mappable || need_fence) && + !i915_is_ggtt(vm)); + if ((entry->alignment && - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || + obj_offset & 
(entry->alignment - 1)) || (need_mappable && !obj->map_and_fenceable)) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); else - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } /* Bind fresh objects */ list_for_each_entry(obj, objects, exec_list) { - if (i915_gem_obj_ggtt_bound(obj)) + if (i915_gem_obj_bound(obj, vm)) continue; - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, struct drm_file *file, struct intel_ring_buffer *ring, struct eb_objects *eb, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { struct drm_i915_gem_relocation_entry *reloc; struct drm_i915_gem_object *obj; @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, goto err; need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; list_for_each_entry(obj, &eb->objects, exec_list) { int offset = obj->exec_entry - exec; ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, - reloc + reloc_offset[offset]); + reloc + reloc_offset[offset], + vm); if (ret) goto err; } @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, static void i915_gem_execbuffer_move_to_active(struct list_head *objects, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_i915_gem_object *obj; @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, obj->base.read_domains = obj->base.pending_read_domains; obj->fenced_gpu_access = obj->pending_fenced_gpu_access; - i915_gem_object_move_to_active(obj, ring); + i915_gem_object_move_to_active(obj, vm, ring); if (obj->base.write_domain) { obj->dirty = 1; obj->last_write_seqno = intel_ring_get_seqno(ring); @@ -838,7 +855,8 @@ static int i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_file *file, struct drm_i915_gem_execbuffer2 *args, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = dev->dev_private; struct eb_objects *eb; @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, /* Move the objects en-masse into the GTT, evicting if necessary. */ need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; /* The objects are in their final locations, apply the relocations. 
*/ if (need_relocs) - ret = i915_gem_execbuffer_relocate(eb); + ret = i915_gem_execbuffer_relocate(eb, vm); if (ret) { if (ret == -EFAULT) { ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, - eb, exec); + eb, exec, vm); BUG_ON(!mutex_is_locked(&dev->struct_mutex)); } if (ret) @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; + exec_start = i915_gem_obj_offset(batch_obj, vm) + + args->batch_start_offset; exec_len = args->batch_len; if (cliprects) { for (i = 0; i < args->num_cliprects; i++) { @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); - i915_gem_execbuffer_move_to_active(&eb->objects, ring); + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); err: - eb_destroy(eb); + eb_destroy(eb, vm); mutex_unlock(&dev->struct_mutex); @@ -1107,6 +1126,7 @@ int i915_gem_execbuffer(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer *args = data; struct drm_i915_gem_execbuffer2 exec2; struct drm_i915_gem_exec_object *exec_list = NULL; @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, exec2.flags = I915_EXEC_RENDER; i915_execbuffer2_set_context_id(exec2, 0); - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets back to the user's exec list. */ for (i = 0; i < args->buffer_count; i++) @@ -1188,6 +1209,7 @@ int i915_gem_execbuffer2(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer2 *args = data; struct drm_i915_gem_exec_object2 *exec2_list = NULL; int ret; @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, return -EFAULT; } - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets back to the user's exec list. 
*/ ret = copy_to_user(to_user_ptr(args->buffers_ptr), diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 3b639a9..44f3464 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) ppgtt->base.total); } + /* i915_init_vm(dev_priv, &ppgtt->base) */ + return ret; } @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) { - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - cache_level); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->insert_entries(vm, obj->pages, + obj_offset >> PAGE_SHIFT, + cache_level); } void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj) { - ppgtt->base.clear_range(&ppgtt->base, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - obj->base.size >> PAGE_SHIFT); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, + obj->base.size >> PAGE_SHIFT); } extern int intel_iommu_gfx_mapped; @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) dev_priv->gtt.base.start / PAGE_SIZE, dev_priv->gtt.base.total / PAGE_SIZE); + if (dev_priv->mm.aliasing_ppgtt) + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { i915_gem_clflush_object(obj); i915_gem_gtt_bind_object(obj, obj->cache_level); @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, * aperture. One page should be enough to keep any prefetching inside * of the aperture. */ - drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_i915_private *dev_priv = dev->dev_private; + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; struct drm_mm_node *entry; struct drm_i915_gem_object *obj; unsigned long hole_start, hole_end; @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, BUG_ON(mappable_end > end); /* Subtract the guard page ... 
> */
> - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE);
> + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
> if (!HAS_LLC(dev))
> dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
>
> /* Mark any preallocated objects as occupied */
> list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> - struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
> + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
> int ret;
> DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> i915_gem_obj_ggtt_offset(obj), obj->base.size);
>
> WARN_ON(i915_gem_obj_ggtt_bound(obj));
> - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
> if (ret)
> DRM_DEBUG_KMS("Reservation failed\n");
> obj->has_global_gtt_mapping = 1;
> @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> dev_priv->gtt.base.total = end - start;
>
> /* Clear any non-preallocated blocks */
> - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm,
> - hole_start, hole_end) {
> + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
> const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
> DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> hole_start, hole_end);
> - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> - hole_start / PAGE_SIZE,
> - count);
> + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count);
> }
>
> /* And finally clear the reserved guard page */
> - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> - end / PAGE_SIZE - 1, 1);
> + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1);
> }
>
> static bool
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 27ffb4c..000ffbd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> u32 size)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct i915_address_space *vm = &dev_priv->gtt.base;
> + struct i915_address_space *ggtt = &dev_priv->gtt.base;
> struct drm_i915_gem_object *obj;
> struct drm_mm_node *stolen;
> struct i915_vma *vma;
> @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> if (gtt_offset == I915_GTT_OFFSET_NONE)
> return obj;
>
> - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> + vma = i915_gem_vma_create(obj, ggtt);
> if (IS_ERR(vma)) {
> ret = PTR_ERR(vma);
> goto err_out;
> @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> */
> vma->node.start = gtt_offset;
> vma->node.size = size;
> - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> + if (drm_mm_initialized(&ggtt->mm)) {
> + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node);
> if (ret) {
> DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
> i915_gem_vma_destroy(vma);
> @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> obj->has_global_gtt_mapping = 1;
>
> list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> - list_add_tail(&obj->mm_list, &vm->inactive_list);
> + list_add_tail(&obj->mm_list, &ggtt->inactive_list);
>
> return obj;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 92a8d27..808ca2a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
> obj->map_and_fenceable =
> !i915_gem_obj_ggtt_bound(obj) ||
> - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end &&
> + (i915_gem_obj_ggtt_offset(obj) +
> + obj->base.size <= dev_priv->gtt.mappable_end &&
> i915_gem_object_fence_ok(obj, args->tiling_mode));
>
> /* Rebind if we need a change of alignment */
> if (!obj->map_and_fenceable) {
> - u32 unfenced_alignment =
> + struct i915_address_space *ggtt = &dev_priv->gtt.base;
> + u32 unfenced_align =
> i915_gem_get_gtt_alignment(dev, obj->base.size,
> args->tiling_mode,
> false);
> - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1))
> - ret = i915_gem_object_unbind(obj);
> + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1))
> + ret = i915_gem_object_unbind(obj, ggtt);
> }
>
> if (ret == 0) {
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 7d283b5..3f019d3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
> );
>
> TRACE_EVENT(i915_gem_object_bind,
> - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
> - TP_ARGS(obj, mappable),
> + TP_PROTO(struct drm_i915_gem_object *obj,
> + struct i915_address_space *vm, bool mappable),
> + TP_ARGS(obj, vm, mappable),
>
> TP_STRUCT__entry(
> __field(struct drm_i915_gem_object *, obj)
> + __field(struct i915_address_space *, vm)
> __field(u32, offset)
> __field(u32, size)
> __field(bool, mappable)
> @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
>
> TP_fast_assign(
> __entry->obj = obj;
> - __entry->offset = i915_gem_obj_ggtt_offset(obj);
> - __entry->size = i915_gem_obj_ggtt_size(obj);
> + __entry->offset = i915_gem_obj_offset(obj, vm);
> + __entry->size = i915_gem_obj_size(obj, vm);
> __entry->mappable = mappable;
> ),
>
> @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
> );
>
> TRACE_EVENT(i915_gem_object_unbind,
> - TP_PROTO(struct drm_i915_gem_object *obj),
> - TP_ARGS(obj),
> + TP_PROTO(struct drm_i915_gem_object *obj,
> + struct i915_address_space *vm),
> + TP_ARGS(obj, vm),
>
> TP_STRUCT__entry(
> __field(struct drm_i915_gem_object *, obj)
> + __field(struct i915_address_space *, vm)
> __field(u32, offset)
> __field(u32, size)
> ),
>
> TP_fast_assign(
> __entry->obj = obj;
> - __entry->offset = i915_gem_obj_ggtt_offset(obj);
> - __entry->size = i915_gem_obj_ggtt_size(obj);
> + __entry->offset = i915_gem_obj_offset(obj, vm);
> + __entry->size = i915_gem_obj_size(obj, vm);
> ),
>
> TP_printk("obj=%p, offset=%08x size=%x",
> diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> index f3c97e0..b69cc63 100644
> --- a/drivers/gpu/drm/i915/intel_fb.c
> +++ b/drivers/gpu/drm/i915/intel_fb.c
> @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> fb->width, fb->height,
> i915_gem_obj_ggtt_offset(obj), obj);
>
> - mutex_unlock(&dev->struct_mutex);
> vga_switcheroo_client_fb_set(dev->pdev, info);
> return 0;
>
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 2abb53e..22ccb7e 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev)
> }
> overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
> } else {
> - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
> + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
> if (ret) {
> DRM_ERROR("failed to pin overlay register bo\n");
> goto out_free_bo;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 008e0e0..0fb081c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev)
> return NULL;
> }
>
> - ret = i915_gem_object_pin(ctx, 4096, true, false);
> + ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
> if (ret) {
> DRM_ERROR("failed to pin power context: %d\n", ret);
> goto err_unref;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8527ea0..88130a3 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -481,6 +481,7 @@ out:
> static int
> init_pipe_control(struct intel_ring_buffer *ring)
> {
> + struct drm_i915_private *dev_priv = ring->dev->dev_private;
> struct pipe_control *pc;
> struct drm_i915_gem_object *obj;
> int ret;
> @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring)
> goto err;
> }
>
> - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> + I915_CACHE_LLC);
>
> - ret = i915_gem_object_pin(obj, 4096, true, false);
> + ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> if (ret)
> goto err_unref;
>
> @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
> static int init_status_page(struct intel_ring_buffer *ring)
> {
> struct drm_device *dev = ring->dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_i915_gem_object *obj;
> int ret;
>
> @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring)
> goto err;
> }
>
> - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> + I915_CACHE_LLC);
>
> - ret = i915_gem_object_pin(obj, 4096, true, false);
> + ret = i915_gem_ggtt_pin(obj, 4096, true, false);
> if (ret != 0) {
> goto err_unref;
> }
> @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>
> ring->obj = obj;
>
> - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
> + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
> if (ret)
> goto err_unref;
>
> @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> return -ENOMEM;
> }
>
> - ret = i915_gem_object_pin(obj, 0, true, false);
> + ret = i915_gem_ggtt_pin(obj, 0, true, false);
> if (ret != 0) {
> drm_gem_object_unreference(&obj->base);
> DRM_ERROR("Failed to ping batch bo\n");