Message ID | 1374458899-8635-2-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Jul 21, 2013 at 07:08:08PM -0700, Ben Widawsky wrote: > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > v6: > Change BUG_ON to WARN_ON (Daniel) > Rename vm to ggtt in preallocate stolen, since it is always ggtt when > dealing with stolen memory. (Daniel) > list_for_each will short-circuit already (Daniel) > remove superflous space (Daniel) > Use per object list of vmas (Daniel) > Make obj_bound_any() use obj_bound for each vm (Ben) > s/bind_to_gtt/bind_to_vm/ (Ben) > > Fixed up the inactive shrinker. As Daniel noticed the code could > potentially count the same object multiple times. While it's not > possible in the current case, since 1 object can only ever be bound into > 1 address space thus far - we may as well try to get something more > future proof in place now. With a prep patch before this to switch over > to using the bound list + inactive check, we're now able to carry that > forward for every address space an object is bound into. > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Ok, I think this patch is too big and needs to be split up. Atm there's way too many changes in here to be able to do a real review. Things I've noticed while reading through it - The set_color interface looks really strange. We loop over all vma, but then pass in the (obj, vm) pair so that we _again_ loop over all vmas to figure out the right one again to finally set the color. - The function renaming should imo be split out as much as possible. - There's some variable renaming like s/alignment/align/. Imo just drop that part. - Some localized prep work without changing function interface should also go in separate patches imo, like using ggtt_vm pointers more. Overall I still think that the little attribute helpers should accept a vma parameter, not an (obj, vm) pair. -Daniel > --- > drivers/gpu/drm/i915/i915_debugfs.c | 29 ++- > drivers/gpu/drm/i915/i915_dma.c | 4 - > drivers/gpu/drm/i915/i915_drv.h | 107 +++++---- > drivers/gpu/drm/i915/i915_gem.c | 337 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +- > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > drivers/gpu/drm/i915/i915_trace.h | 20 +- > drivers/gpu/drm/i915/intel_fb.c | 1 - > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > 15 files changed, 479 insertions(+), 245 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index be69807..f8e590f 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) > static void > describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > { > + struct i915_vma *vma; > seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", > &obj->base, > get_pin_flag(obj), > @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > seq_printf(m, " (pinned x %d)", obj->pin_count); > if (obj->fence_reg != I915_FENCE_REG_NONE) > seq_printf(m, " (fence: %d)", obj->fence_reg); > - if (i915_gem_obj_ggtt_bound(obj)) > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > + if (!i915_is_ggtt(vma->vm)) > + seq_puts(m, " (pp"); > + else > + seq_puts(m, " (g"); > + seq_printf(m, "gtt offset: %08lx, size: %08lx)", > + i915_gem_obj_offset(obj, vma->vm), > + i915_gem_obj_size(obj, vma->vm)); > + } > if (obj->stolen) > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > if (obj->pin_mappable || obj->fault_mappable) { > @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > return 0; > } > > +/* FIXME: Support multiple VM? */ > #define count_objects(list, member) do { \ > list_for_each_entry(obj, list, member) { \ > size += i915_gem_obj_ggtt_size(obj); \ > @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) > > if (val & DROP_BOUND) { > list_for_each_entry_safe(obj, next, &vm->inactive_list, > - mm_list) > - if (obj->pin_count == 0) { > - ret = i915_gem_object_unbind(obj); > - if (ret) > - goto unlock; > - } > + mm_list) { > + if (obj->pin_count) > + continue; > + > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > + if (ret) > + goto unlock; > + } > } > > if (val & DROP_UNBOUND) { > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > global_list) > if (obj->pages_pin_count == 0) { > + /* FIXME: Do this for all vms? */ > ret = i915_gem_object_put_pages(obj); > if (ret) > goto unlock; > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 1449d06..4650519 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > i915_dump_device_info(dev_priv); > > - INIT_LIST_HEAD(&dev_priv->vm_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > - > if (i915_get_bridge_dev(dev)) { > ret = -EIO; > goto free_priv; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 8b3167e..681cb41 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > -/* This is a temporary define to help transition us to real VMAs. If you see > - * this, you're either reviewing code, or bisecting it. */ > -static inline struct i915_vma * > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > -{ > - if (list_empty(&obj->vma_list)) > - return NULL; > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > -} > - > -/* Whether or not this object is currently mapped by the translation tables */ > -static inline bool > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > -{ > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > - if (vma == NULL) > - return false; > - return drm_mm_node_allocated(&vma->node); > -} > - > -/* Offset of the first PTE pointing to this object */ > -static inline unsigned long > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.start; > -} > - > -/* The size used in the translation tables may be larger than the actual size of > - * the object on GEN2/GEN3 because of the way tiling is handled. See > - * i915_gem_get_gtt_size() for more details. > - */ > -static inline unsigned long > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.size; > -} > - > -static inline void > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > - enum i915_cache_level color) > -{ > - __i915_gem_obj_to_vma(o)->node.color = color; > -} > - > /** > * Request queue structure. > * > @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > void i915_gem_vma_destroy(struct i915_vma *vma); > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking); > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > void i915_gem_lastclose(struct drm_device *dev); > @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring); > > int i915_gem_dumb_create(struct drm_file *file_priv, > @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > int tiling_mode, bool fenced); > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level); > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > void i915_gem_restore_fences(struct drm_device *dev); > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color); > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > +/* Some GGTT VM helpers */ > +#define obj_to_ggtt(obj) \ > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > +{ > + struct i915_address_space *ggtt = > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > + return vm == ggtt; > +} > + > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > +} > + > +static inline int __must_check > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > + uint32_t alignment, > + bool map_and_fenceable, > + bool nonblocking) > +{ > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > + map_and_fenceable, nonblocking); > +} > +#undef obj_to_ggtt > + > /* i915_gem_context.c */ > void i915_gem_context_init(struct drm_device *dev); > void i915_gem_context_fini(struct drm_device *dev); > @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > +/* FIXME: this is never okay with full PPGTT */ > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > /* i915_gem_evict.c */ > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > +int __must_check i915_gem_evict_something(struct drm_device *dev, > + struct i915_address_space *vm, > + int min_size, > unsigned alignment, > unsigned cache_level, > bool mappable, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 2283765..0111554 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -38,10 +38,12 @@ > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking); > +static __must_check int > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking); > static int i915_gem_phys_pwrite(struct drm_device *dev, > struct drm_i915_gem_object *obj, > struct drm_i915_gem_pwrite *args, > @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > static inline bool > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > { > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > + return i915_gem_obj_bound_any(obj) && !obj->active; > } > > int > @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > * anyway again before the next pread happens. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, false); > if (ret) > return ret; > @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > char __user *user_data; > int page_offset, page_length, ret; > > - ret = i915_gem_object_pin(obj, 0, true, true); > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > if (ret) > goto out; > > @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > * right away and we therefore have to clflush anyway. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush_after = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, true); > if (ret) > return ret; > @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > /* Now bind it into the GTT if needed */ > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret) > goto unlock; > > @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > if (obj->pages == NULL) > return 0; > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > - > if (obj->pages_pin_count) > return -EBUSY; > > + BUG_ON(i915_gem_obj_bound_any(obj)); > + > /* ->put_pages might need to allocate memory for the bit17 swizzle > * array, hence protect them from being reaped by removing them from gtt > * lists early. */ > @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > bool purgeable_only) > { > struct drm_i915_gem_object *obj, *next; > - struct i915_address_space *vm = &dev_priv->gtt.base; > long count = 0; > > list_for_each_entry_safe(obj, next, > @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > } > } > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > - i915_gem_object_unbind(obj) == 0 && > - i915_gem_object_put_pages(obj) == 0) { > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > + global_list) { > + struct i915_vma *vma, *v; > + > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > + continue; > + > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > + if (i915_gem_object_unbind(obj, vma->vm)) > + break; > + > + if (!i915_gem_object_put_pages(obj)) > count += obj->base.size >> PAGE_SHIFT; > - if (count >= target) > - return count; > - } > + > + if (count >= target) > + return count; > } > > return count; > @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > void > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_device *dev = obj->base.dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 seqno = intel_ring_get_seqno(ring); > > BUG_ON(ring == NULL); > @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > } > > static void > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - struct drm_device *dev = obj->base.dev; > - struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > - > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > BUG_ON(!obj->active); > > @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > spin_unlock(&file_priv->mm.lock); > } > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > + if (acthd >= i915_gem_obj_offset(obj, vm) && > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > return true; > > return false; > @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > return false; > } > > +static struct i915_address_space * > +request_to_vm(struct drm_i915_gem_request *request) > +{ > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > + struct i915_address_space *vm; > + > + vm = &dev_priv->gtt.base; > + > + return vm; > +} > + > static bool i915_request_guilty(struct drm_i915_gem_request *request, > const u32 acthd, bool *inside) > { > @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > * pointing inside the ring, matches the batch_obj address range. > * However this is extremely unlikely. > */ > - > if (request->batch_obj) { > - if (i915_head_inside_object(acthd, request->batch_obj)) { > + if (i915_head_inside_object(acthd, request->batch_obj, > + request_to_vm(request))) { > *inside = true; > return true; > } > @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > { > struct i915_ctx_hang_stats *hs = NULL; > bool inside, guilty; > + unsigned long offset = 0; > > /* Innocent until proven guilty */ > guilty = false; > > + if (request->batch_obj) > + offset = i915_gem_obj_offset(request->batch_obj, > + request_to_vm(request)); > + > if (ring->hangcheck.action != wait && > i915_request_guilty(request, acthd, &inside)) { > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > ring->name, > inside ? "inside" : "flushing", > - request->batch_obj ? > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > + offset, > request->ctx ? request->ctx->id : 0, > acthd); > > @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > } > > while (!list_empty(&ring->active_list)) { > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > struct drm_i915_gem_object, > ring_list); > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > } > > @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > void i915_gem_reset(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > struct intel_ring_buffer *ring; > int i; > @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) > /* Move everything out of the GPU domains to ensure we do any > * necessary invalidation upon reuse. > */ > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > i915_gem_restore_fences(dev); > } > @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > * by the ringbuffer to the flushing/inactive lists as appropriate. > */ > while (!list_empty(&ring->active_list)) { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > break; > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > > if (unlikely(ring->trace_irq_seqno && > @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > * Unbinds an object from the GTT aperture. > */ > int > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > struct i915_vma *vma; > int ret; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > return 0; > > if (obj->pin_count) > @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > if (ret) > return ret; > > - trace_i915_gem_object_unbind(obj); > + trace_i915_gem_object_unbind(obj, vm); > > if (obj->has_global_gtt_mapping) > i915_gem_gtt_unbind_object(obj); > @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > /* Avoid an unnecessary call to unbind on rebind. */ > obj->map_and_fenceable = true; > > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > list_del(&vma->vma_link); > drm_mm_remove_node(&vma->node); > i915_gem_vma_destroy(vma); > @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > i915_gem_obj_ggtt_offset(obj), size); > > + > pitch_val = obj->stride / 128; > pitch_val = ffs(pitch_val) - 1; > > @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > * Finds free space in the GTT aperture and binds the object there. > */ > static int > -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking) > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 size, fence_size, fence_alignment, unfenced_alignment; > bool mappable, fenceable; > - size_t gtt_max = map_and_fenceable ? > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > + size_t gtt_max = > + map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total; > struct i915_vma *vma; > int ret; > > if (WARN_ON(!list_empty(&obj->vma_list))) > return -EBUSY; > > + BUG_ON(!i915_is_ggtt(vm)); > + > fence_size = i915_gem_get_gtt_size(dev, > obj->base.size, > obj->tiling_mode); > @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + /* For now we only ever use 1 vma per object */ > + WARN_ON(!list_empty(&obj->vma_list)); > + > + vma = i915_gem_vma_create(obj, vm); > if (IS_ERR(vma)) { > i915_gem_object_unpin_pages(obj); > return PTR_ERR(vma); > } > > search_free: > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > - &vma->node, > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > size, alignment, > obj->cache_level, 0, gtt_max); > if (ret) { > - ret = i915_gem_evict_something(dev, size, alignment, > + ret = i915_gem_evict_something(dev, vm, size, alignment, > obj->cache_level, > map_and_fenceable, > nonblocking); > @@ -3138,18 +3172,25 @@ search_free: > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > list_add_tail(&obj->mm_list, &vm->inactive_list); > - list_add(&vma->vma_link, &obj->vma_list); > + > + /* Keep GGTT vmas first to make debug easier */ > + if (i915_is_ggtt(vm)) > + list_add(&vma->vma_link, &obj->vma_list); > + else > + list_add_tail(&vma->vma_link, &obj->vma_list); > > fenceable = > + i915_is_ggtt(vm) && > i915_gem_obj_ggtt_size(obj) == fence_size && > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > - dev_priv->gtt.mappable_end; > + mappable = > + i915_is_ggtt(vm) && > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > obj->map_and_fenceable = mappable && fenceable; > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > i915_gem_verify_gtt(dev); > return 0; > > @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > int ret; > > /* Not valid to be called on unbound objects. */ > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return -EINVAL; > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > } > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > int ret; > > if (obj->cache_level == cache_level) > @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > } > > if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > > - if (i915_gem_obj_ggtt_bound(obj)) { > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > ret = i915_gem_object_finish_gpu(obj); > if (ret) > return ret; > @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > obj, cache_level); > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > + i915_gem_obj_set_color(obj, vma->vm, cache_level); > } > > if (cache_level == I915_CACHE_NONE) { > @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_gem_caching *args = data; > + struct drm_i915_private *dev_priv; > struct drm_i915_gem_object *obj; > enum i915_cache_level level; > int ret; > @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > ret = -ENOENT; > goto unlock; > } > + dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_set_cache_level(obj, level); > + /* FIXME: Add interface for specific VM? */ > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > drm_gem_object_unreference(&obj->base); > unlock: > @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > u32 alignment, > struct intel_ring_buffer *pipelined) > { > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > u32 old_read_domains, old_write_domain; > int ret; > > @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * (e.g. libkms for the bootup splash), we have to ensure that we > * always use map_and_fenceable for all scanout buffers. > */ > - ret = i915_gem_object_pin(obj, alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > if (ret) > return ret; > > @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > int > i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking) > @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > return -EBUSY; > > - if (i915_gem_obj_ggtt_bound(obj)) { > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); > + > + if (i915_gem_obj_bound(obj, vm)) { > + if ((alignment && > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > (map_and_fenceable && !obj->map_and_fenceable)) { > WARN(obj->pin_count, > "bo is already pinned with incorrect alignment:" > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > " obj->map_and_fenceable=%d\n", > - i915_gem_obj_ggtt_offset(obj), alignment, > + i915_gem_obj_offset(obj, vm), alignment, > map_and_fenceable, > obj->map_and_fenceable); > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > } > > - if (!i915_gem_obj_ggtt_bound(obj)) { > + if (!i915_gem_obj_bound(obj, vm)) { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > - map_and_fenceable, > - nonblocking); > + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, > + map_and_fenceable, > + nonblocking); > if (ret) > return ret; > > @@ -3666,7 +3717,7 @@ void > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > { > BUG_ON(obj->pin_count == 0); > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > if (--obj->pin_count == 0) > obj->pin_mappable = false; > @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > } > > if (obj->user_pin_count == 0) { > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > if (ret) > goto out; > } > @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > + struct i915_vma *vma, *next; > > trace_i915_gem_object_destroy(obj); > > @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > i915_gem_detach_phys_object(dev, obj); > > obj->pin_count = 0; > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > - bool was_interruptible; > + /* NB: 0 or 1 elements */ > + WARN_ON(!list_empty(&obj->vma_list) && > + !list_is_singular(&obj->vma_list)); > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > + int ret = i915_gem_object_unbind(obj, vma->vm); > + if (WARN_ON(ret == -ERESTARTSYS)) { > + bool was_interruptible; > > - was_interruptible = dev_priv->mm.interruptible; > - dev_priv->mm.interruptible = false; > + was_interruptible = dev_priv->mm.interruptible; > + dev_priv->mm.interruptible = false; > > - WARN_ON(i915_gem_object_unbind(obj)); > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > - dev_priv->mm.interruptible = was_interruptible; > + dev_priv->mm.interruptible = was_interruptible; > + } > } > > /* Stolen objects don't hold a ref, but do hold pin count. Fix that up > @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > INIT_LIST_HEAD(&ring->request_list); > } > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > + struct i915_address_space *vm) > +{ > + vm->dev = dev_priv->dev; > + INIT_LIST_HEAD(&vm->active_list); > + INIT_LIST_HEAD(&vm->inactive_list); > + INIT_LIST_HEAD(&vm->global_link); > + list_add(&vm->global_link, &dev_priv->vm_list); > +} > + > void > i915_gem_load(struct drm_device *dev) > { > @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) > SLAB_HWCACHE_ALIGN, > NULL); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > + INIT_LIST_HEAD(&dev_priv->vm_list); > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > + > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > struct drm_i915_private, > mm.inactive_shrinker); > struct drm_device *dev = dev_priv->dev; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > - int nr_to_scan = sc->nr_to_scan; > + int nr_to_scan; > bool unlock = true; > int cnt; > > @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > unlock = false; > } > > + nr_to_scan = sc->nr_to_scan; > if (nr_to_scan) { > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > if (nr_to_scan > 0) > @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > if (obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > + > + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > + if (obj->active) > + continue; > + > + i915_gem_object_flush_gtt_write_domain(obj); > + i915_gem_object_flush_cpu_write_domain(obj); > + /* FIXME: Can't assume global gtt */ > + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); > + > if (obj->pin_count == 0 && obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > + } > > if (unlock) > mutex_unlock(&dev->struct_mutex); > return cnt; > } > + > +/* All the new VM stuff */ > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return vma->node.start; > + > + } > + return -1; > +} > + > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return true; > + > + return false; > +} > + > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_address_space *vm; > + > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + if (i915_gem_obj_bound(o, vm)) > + return true; > + > + return false; > +} > + > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return vma->node.size; > + > + return 0; > +} > + > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color) > +{ > + struct i915_vma *vma; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) { > + vma->node.color = color; > + return; > + } > + } > + > + WARN(1, "Couldn't set color for VM %p\n", vm); > +} > + > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) > + if (vma->vm == vm) > + return vma; > + > + return NULL; > +} > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 2470206..873577d 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > if (INTEL_INFO(dev)->gen >= 7) { > ret = i915_gem_object_set_cache_level(ctx->obj, > + &dev_priv->gtt.base, > I915_CACHE_LLC_MLC); > /* Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > * default context. > */ > dev_priv->ring[RCS].default_context = ctx; > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > if (ret) { > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > goto err_destroy; > @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, > static int do_switch(struct i915_hw_context *to) > { > struct intel_ring_buffer *ring = to->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct i915_hw_context *from = ring->last_context; > u32 hw_flags = 0; > int ret; > @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) > if (from == to) > return 0; > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > if (ret) > return ret; > > @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) > */ > if (from != NULL) { > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > - i915_gem_object_move_to_active(from->obj, ring); > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > + ring); > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > * whole damn pipeline, we don't need to explicitly mark the > * object dirty. The only exception is that the context must be > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index df61f33..32efdc0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -32,24 +32,21 @@ > #include "i915_trace.h" > > static bool > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > +mark_free(struct i915_vma *vma, struct list_head *unwind) > { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > - > - if (obj->pin_count) > + if (vma->obj->pin_count) > return false; > > - list_add(&obj->exec_list, unwind); > + list_add(&vma->obj->exec_list, unwind); > return drm_mm_scan_add_block(&vma->node); > } > > int > -i915_gem_evict_something(struct drm_device *dev, int min_size, > - unsigned alignment, unsigned cache_level, > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > + int min_size, unsigned alignment, unsigned cache_level, > bool mappable, bool nonblocking) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct list_head eviction_list, unwind_list; > struct i915_vma *vma; > struct drm_i915_gem_object *obj; > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > */ > > INIT_LIST_HEAD(&unwind_list); > - if (mappable) > + if (mappable) { > + BUG_ON(!i915_is_ggtt(vm)); > drm_mm_init_scan_with_range(&vm->mm, min_size, > alignment, cache_level, 0, > dev_priv->gtt.mappable_end); > - else > + } else > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > /* First see if there is a large enough contiguous idle region... */ > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > /* Now merge in the soon-to-be-expired objects... */ > list_for_each_entry(obj, &vm->active_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -109,7 +109,7 @@ none: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > ret = drm_mm_scan_remove_block(&vma->node); > BUG_ON(ret); > > @@ -130,7 +130,7 @@ found: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > if (drm_mm_scan_remove_block(&vma->node)) { > list_move(&obj->exec_list, &eviction_list); > drm_gem_object_reference(&obj->base); > @@ -145,7 +145,7 @@ found: > struct drm_i915_gem_object, > exec_list); > if (ret == 0) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > > list_del_init(&obj->exec_list); > drm_gem_object_unreference(&obj->base); > @@ -158,13 +158,18 @@ int > i915_gem_evict_everything(struct drm_device *dev) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj, *next; > - bool lists_empty; > + bool lists_empty = true; > int ret; > > - lists_empty = (list_empty(&vm->inactive_list) && > - list_empty(&vm->active_list)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + lists_empty = (list_empty(&vm->inactive_list) && > + list_empty(&vm->active_list)); > + if (!lists_empty) > + lists_empty = false; > + } > + > if (lists_empty) > return -ENOSPC; > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > i915_gem_retire_requests(dev); > > /* Having flushed everything, unbind() should never raise an error */ > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0) > - WARN_ON(i915_gem_object_unbind(obj)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > + if (obj->pin_count == 0) > + WARN_ON(i915_gem_object_unbind(obj, vm)); > + } > > return 0; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 1734825..819d8d8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > } > > static void > -eb_destroy(struct eb_objects *eb) > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > { > while (!list_empty(&eb->objects)) { > struct drm_i915_gem_object *obj; > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *reloc) > + struct drm_i915_gem_relocation_entry *reloc, > + struct i915_address_space *vm) > { > struct drm_device *dev = obj->base.dev; > struct drm_gem_object *target_obj; > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > static int > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > - struct eb_objects *eb) > + struct eb_objects *eb, > + struct i915_address_space *vm) > { > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > do { > u64 offset = r->presumed_offset; > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > + vm); > if (ret) > return ret; > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > static int > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *relocs) > + struct drm_i915_gem_relocation_entry *relocs, > + struct i915_address_space *vm) > { > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > int i, ret; > > for (i = 0; i < entry->relocation_count; i++) { > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > + vm); > if (ret) > return ret; > } > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > } > > static int > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > + struct i915_address_space *vm) > { > struct drm_i915_gem_object *obj; > int ret = 0; > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > */ > pagefault_disable(); > list_for_each_entry(obj, &eb->objects, exec_list) { > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > if (ret) > break; > } > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > + struct i915_address_space *vm, > bool *need_reloc) > { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > + false); > if (ret) > return ret; > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->has_aliasing_ppgtt_mapping = 1; > } > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > - entry->offset = i915_gem_obj_ggtt_offset(obj); > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > + entry->offset = i915_gem_obj_offset(obj, vm); > *need_reloc = true; > } > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > { > struct drm_i915_gem_exec_object2 *entry; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return; > > entry = obj->exec_entry; > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > struct list_head *objects, > + struct i915_address_space *vm, > bool *need_relocs) > { > struct drm_i915_gem_object *obj; > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > list_for_each_entry(obj, objects, exec_list) { > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > bool need_fence, need_mappable; > + u32 obj_offset; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > continue; > > + obj_offset = i915_gem_obj_offset(obj, vm); > need_fence = > has_fenced_gpu_access && > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > + BUG_ON((need_mappable || need_fence) && > + !i915_is_ggtt(vm)); > + > if ((entry->alignment && > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > + obj_offset & (entry->alignment - 1)) || > (need_mappable && !obj->map_and_fenceable)) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > else > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > > /* Bind fresh objects */ > list_for_each_entry(obj, objects, exec_list) { > - if (i915_gem_obj_ggtt_bound(obj)) > + if (i915_gem_obj_bound(obj, vm)) > continue; > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > struct drm_file *file, > struct intel_ring_buffer *ring, > struct eb_objects *eb, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > struct drm_i915_gem_relocation_entry *reloc; > struct drm_i915_gem_object *obj; > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > goto err; > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > list_for_each_entry(obj, &eb->objects, exec_list) { > int offset = obj->exec_entry - exec; > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > - reloc + reloc_offset[offset]); > + reloc + reloc_offset[offset], > + vm); > if (ret) > goto err; > } > @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > static void > i915_gem_execbuffer_move_to_active(struct list_head *objects, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_i915_gem_object *obj; > @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > obj->base.read_domains = obj->base.pending_read_domains; > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > - i915_gem_object_move_to_active(obj, ring); > + i915_gem_object_move_to_active(obj, vm, ring); > if (obj->base.write_domain) { > obj->dirty = 1; > obj->last_write_seqno = intel_ring_get_seqno(ring); > @@ -838,7 +855,8 @@ static int > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file, > struct drm_i915_gem_execbuffer2 *args, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = dev->dev_private; > struct eb_objects *eb; > @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > /* The objects are in their final locations, apply the relocations. */ > if (need_relocs) > - ret = i915_gem_execbuffer_relocate(eb); > + ret = i915_gem_execbuffer_relocate(eb, vm); > if (ret) { > if (ret == -EFAULT) { > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > - eb, exec); > + eb, exec, vm); > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > } > if (ret) > @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > goto err; > } > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > + args->batch_start_offset; > exec_len = args->batch_len; > if (cliprects) { > for (i = 0; i < args->num_cliprects; i++) { > @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > err: > - eb_destroy(eb); > + eb_destroy(eb, vm); > > mutex_unlock(&dev->struct_mutex); > > @@ -1107,6 +1126,7 @@ int > i915_gem_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer *args = data; > struct drm_i915_gem_execbuffer2 exec2; > struct drm_i915_gem_exec_object *exec_list = NULL; > @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > exec2.flags = I915_EXEC_RENDER; > i915_execbuffer2_set_context_id(exec2, 0); > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > for (i = 0; i < args->buffer_count; i++) > @@ -1188,6 +1209,7 @@ int > i915_gem_execbuffer2(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer2 *args = data; > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > int ret; > @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > return -EFAULT; > } > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 3b639a9..44f3464 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > ppgtt->base.total); > } > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > + > return ret; > } > > @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level) > { > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - cache_level); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->insert_entries(vm, obj->pages, > + obj_offset >> PAGE_SHIFT, > + cache_level); > } > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj) > { > - ppgtt->base.clear_range(&ppgtt->base, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - obj->base.size >> PAGE_SHIFT); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > + obj->base.size >> PAGE_SHIFT); > } > > extern int intel_iommu_gfx_mapped; > @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > dev_priv->gtt.base.start / PAGE_SIZE, > dev_priv->gtt.base.total / PAGE_SIZE); > > + if (dev_priv->mm.aliasing_ppgtt) > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > + > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > i915_gem_clflush_object(obj); > i915_gem_gtt_bind_object(obj, obj->cache_level); > @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > * aperture. One page should be enough to keep any prefetching inside > * of the aperture. > */ > - drm_i915_private_t *dev_priv = dev->dev_private; > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > struct drm_mm_node *entry; > struct drm_i915_gem_object *obj; > unsigned long hole_start, hole_end; > @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > BUG_ON(mappable_end > end); > > /* Subtract the guard page ... */ > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > if (!HAS_LLC(dev)) > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > /* Mark any preallocated objects as occupied */ > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > int ret; > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > if (ret) > DRM_DEBUG_KMS("Reservation failed\n"); > obj->has_global_gtt_mapping = 1; > @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > dev_priv->gtt.base.total = end - start; > > /* Clear any non-preallocated blocks */ > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > - hole_start, hole_end) { > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > hole_start, hole_end); > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - hole_start / PAGE_SIZE, > - count); > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > } > > /* And finally clear the reserved guard page */ > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - end / PAGE_SIZE - 1, 1); > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > } > > static bool > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index 27ffb4c..000ffbd 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > u32 size) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > struct drm_mm_node *stolen; > struct i915_vma *vma; > @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > if (gtt_offset == I915_GTT_OFFSET_NONE) > return obj; > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + vma = i915_gem_vma_create(obj, ggtt); > if (IS_ERR(vma)) { > ret = PTR_ERR(vma); > goto err_out; > @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > */ > vma->node.start = gtt_offset; > vma->node.size = size; > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + if (drm_mm_initialized(&ggtt->mm)) { > + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node); > if (ret) { > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > i915_gem_vma_destroy(vma); > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > obj->has_global_gtt_mapping = 1; > > list_add_tail(&obj->global_list, &dev_priv->mm.bound_list); > - list_add_tail(&obj->mm_list, &vm->inactive_list); > + list_add_tail(&obj->mm_list, &ggtt->inactive_list); > > return obj; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 92a8d27..808ca2a 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > obj->map_and_fenceable = > !i915_gem_obj_ggtt_bound(obj) || > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > + (i915_gem_obj_ggtt_offset(obj) + > + obj->base.size <= dev_priv->gtt.mappable_end && > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > /* Rebind if we need a change of alignment */ > if (!obj->map_and_fenceable) { > - u32 unfenced_alignment = > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > + u32 unfenced_align = > i915_gem_get_gtt_alignment(dev, obj->base.size, > args->tiling_mode, > false); > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > - ret = i915_gem_object_unbind(obj); > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > + ret = i915_gem_object_unbind(obj, ggtt); > } > > if (ret == 0) { > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 7d283b5..3f019d3 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > ); > > TRACE_EVENT(i915_gem_object_bind, > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > - TP_ARGS(obj, mappable), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, bool mappable), > + TP_ARGS(obj, vm, mappable), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > __field(bool, mappable) > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > __entry->mappable = mappable; > ), > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > ); > > TRACE_EVENT(i915_gem_object_unbind, > - TP_PROTO(struct drm_i915_gem_object *obj), > - TP_ARGS(obj), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm), > + TP_ARGS(obj, vm), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > ), > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > ), > > TP_printk("obj=%p, offset=%08x size=%x", > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > index f3c97e0..b69cc63 100644 > --- a/drivers/gpu/drm/i915/intel_fb.c > +++ b/drivers/gpu/drm/i915/intel_fb.c > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > fb->width, fb->height, > i915_gem_obj_ggtt_offset(obj), obj); > > - > mutex_unlock(&dev->struct_mutex); > vga_switcheroo_client_fb_set(dev->pdev, info); > return 0; > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > index 2abb53e..22ccb7e 100644 > --- a/drivers/gpu/drm/i915/intel_overlay.c > +++ b/drivers/gpu/drm/i915/intel_overlay.c > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > } > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > } else { > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > if (ret) { > DRM_ERROR("failed to pin overlay register bo\n"); > goto out_free_bo; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 008e0e0..0fb081c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev) > return NULL; > } > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > if (ret) { > DRM_ERROR("failed to pin power context: %d\n", ret); > goto err_unref; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 8527ea0..88130a3 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -481,6 +481,7 @@ out: > static int > init_pipe_control(struct intel_ring_buffer *ring) > { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct pipe_control *pc; > struct drm_i915_gem_object *obj; > int ret; > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret) > goto err_unref; > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > static int init_status_page(struct intel_ring_buffer *ring) > { > struct drm_device *dev = ring->dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_object *obj; > int ret; > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret != 0) { > goto err_unref; > } > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > ring->obj = obj; > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > if (ret) > goto err_unref; > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > return -ENOMEM; > } > > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret != 0) { > drm_gem_object_unreference(&obj->base); > DRM_ERROR("Failed to ping batch bo\n"); > -- > 1.8.3.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
HI all, So Ben&I had a bit a private discussion and one thing I've explained a bit more in detail is what kind of review I'm doing as maintainer. I've figured this is generally useful. We've also discussed a bit that for developers without their own lab it would be nice if QA could test random branches on their set of machines. But imo that'll take quite a while, there's lots of other stuff to improve in QA land first. Anyway, here's it: Now an explanation for why this freaked me out, which is essentially an explanation of what I do when I do maintainer reviews: Probably the most important question I ask myself when reading a patch is "if a regression would bisect to this, and the bisect is the only useful piece of evidence, would I stand a chance to understand it?". Your patch is big, has the appearance of doing a few unrelated things and could very well hide a bug which would take me an awful lot of time to spot. So imo the answer for your patch is a clear "no". I've merged a few such patches in the past where I've had a similar hunch and regretted it almost always. I've also sometimes split-up the patch while applying, but that approach doesn't scale any more with our rather big team. The second thing I try to figure out is whether the patch author is indeed the local expert on the topic at hand now. With our team size and patch flow I don't stand a chance if I try to understand everything to the last detail. Instead I try to assess this through the proxy of convincing myself the the patch submitter understands stuff much better than I do. I tend to check that by asking random questions, proposing alternative approaches and also by rating code/patch clarity. The obj_set_color double-loop very much gave me the impression that you didn't have a clear idea about how exactly this should work, so that hunk trigger this maintainer hunch. I admit that this is all rather fluffy and very much an inexact science, but it's the only tools I have as a maintainer. The alternative of doing shit myself or checking everything myself in-depth just doesnt scale. Cheers, Daniel On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > v6: > Change BUG_ON to WARN_ON (Daniel) > Rename vm to ggtt in preallocate stolen, since it is always ggtt when > dealing with stolen memory. (Daniel) > list_for_each will short-circuit already (Daniel) > remove superflous space (Daniel) > Use per object list of vmas (Daniel) > Make obj_bound_any() use obj_bound for each vm (Ben) > s/bind_to_gtt/bind_to_vm/ (Ben) > > Fixed up the inactive shrinker. As Daniel noticed the code could > potentially count the same object multiple times. While it's not > possible in the current case, since 1 object can only ever be bound into > 1 address space thus far - we may as well try to get something more > future proof in place now. With a prep patch before this to switch over > to using the bound list + inactive check, we're now able to carry that > forward for every address space an object is bound into. > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_debugfs.c | 29 ++- > drivers/gpu/drm/i915/i915_dma.c | 4 - > drivers/gpu/drm/i915/i915_drv.h | 107 +++++---- > drivers/gpu/drm/i915/i915_gem.c | 337 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 9 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- > drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +- > drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- > drivers/gpu/drm/i915/i915_trace.h | 20 +- > drivers/gpu/drm/i915/intel_fb.c | 1 - > drivers/gpu/drm/i915/intel_overlay.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 2 +- > drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- > 15 files changed, 479 insertions(+), 245 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > index be69807..f8e590f 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) > static void > describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > { > + struct i915_vma *vma; > seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", > &obj->base, > get_pin_flag(obj), > @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) > seq_printf(m, " (pinned x %d)", obj->pin_count); > if (obj->fence_reg != I915_FENCE_REG_NONE) > seq_printf(m, " (fence: %d)", obj->fence_reg); > - if (i915_gem_obj_ggtt_bound(obj)) > - seq_printf(m, " (gtt offset: %08lx, size: %08x)", > - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > + if (!i915_is_ggtt(vma->vm)) > + seq_puts(m, " (pp"); > + else > + seq_puts(m, " (g"); > + seq_printf(m, "gtt offset: %08lx, size: %08lx)", > + i915_gem_obj_offset(obj, vma->vm), > + i915_gem_obj_size(obj, vma->vm)); > + } > if (obj->stolen) > seq_printf(m, " (stolen: %08lx)", obj->stolen->start); > if (obj->pin_mappable || obj->fault_mappable) { > @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) > return 0; > } > > +/* FIXME: Support multiple VM? */ > #define count_objects(list, member) do { \ > list_for_each_entry(obj, list, member) { \ > size += i915_gem_obj_ggtt_size(obj); \ > @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) > > if (val & DROP_BOUND) { > list_for_each_entry_safe(obj, next, &vm->inactive_list, > - mm_list) > - if (obj->pin_count == 0) { > - ret = i915_gem_object_unbind(obj); > - if (ret) > - goto unlock; > - } > + mm_list) { > + if (obj->pin_count) > + continue; > + > + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); > + if (ret) > + goto unlock; > + } > } > > if (val & DROP_UNBOUND) { > list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, > global_list) > if (obj->pages_pin_count == 0) { > + /* FIXME: Do this for all vms? */ > ret = i915_gem_object_put_pages(obj); > if (ret) > goto unlock; > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 1449d06..4650519 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) > > i915_dump_device_info(dev_priv); > > - INIT_LIST_HEAD(&dev_priv->vm_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); > - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); > - > if (i915_get_bridge_dev(dev)) { > ret = -EIO; > goto free_priv; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 8b3167e..681cb41 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { > > #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) > > -/* This is a temporary define to help transition us to real VMAs. If you see > - * this, you're either reviewing code, or bisecting it. */ > -static inline struct i915_vma * > -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) > -{ > - if (list_empty(&obj->vma_list)) > - return NULL; > - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); > -} > - > -/* Whether or not this object is currently mapped by the translation tables */ > -static inline bool > -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) > -{ > - struct i915_vma *vma = __i915_gem_obj_to_vma(o); > - if (vma == NULL) > - return false; > - return drm_mm_node_allocated(&vma->node); > -} > - > -/* Offset of the first PTE pointing to this object */ > -static inline unsigned long > -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.start; > -} > - > -/* The size used in the translation tables may be larger than the actual size of > - * the object on GEN2/GEN3 because of the way tiling is handled. See > - * i915_gem_get_gtt_size() for more details. > - */ > -static inline unsigned long > -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) > -{ > - BUG_ON(list_empty(&o->vma_list)); > - return __i915_gem_obj_to_vma(o)->node.size; > -} > - > -static inline void > -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, > - enum i915_cache_level color) > -{ > - __i915_gem_obj_to_vma(o)->node.color = color; > -} > - > /** > * Request queue structure. > * > @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, > void i915_gem_vma_destroy(struct i915_vma *vma); > > int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking); > void i915_gem_object_unpin(struct drm_i915_gem_object *obj); > -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); > +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > void i915_gem_lastclose(struct drm_device *dev); > @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int i915_gem_object_sync(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring); > > int i915_gem_dumb_create(struct drm_file *file_priv, > @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, > int tiling_mode, bool fenced); > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level); > > struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, > > void i915_gem_restore_fences(struct drm_device *dev); > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm); > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color); > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm); > +/* Some GGTT VM helpers */ > +#define obj_to_ggtt(obj) \ > + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) > +static inline bool i915_is_ggtt(struct i915_address_space *vm) > +{ > + struct i915_address_space *ggtt = > + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; > + return vm == ggtt; > +} > + > +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); > +} > + > +static inline unsigned long > +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) > +{ > + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); > +} > + > +static inline int __must_check > +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, > + uint32_t alignment, > + bool map_and_fenceable, > + bool nonblocking) > +{ > + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, > + map_and_fenceable, nonblocking); > +} > +#undef obj_to_ggtt > + > /* i915_gem_context.c */ > void i915_gem_context_init(struct drm_device *dev); > void i915_gem_context_fini(struct drm_device *dev); > @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > > void i915_gem_restore_gtt_mappings(struct drm_device *dev); > int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); > +/* FIXME: this is never okay with full PPGTT */ > void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); > @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) > > > /* i915_gem_evict.c */ > -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, > +int __must_check i915_gem_evict_something(struct drm_device *dev, > + struct i915_address_space *vm, > + int min_size, > unsigned alignment, > unsigned cache_level, > bool mappable, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 2283765..0111554 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -38,10 +38,12 @@ > > static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); > static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); > -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking); > +static __must_check int > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking); > static int i915_gem_phys_pwrite(struct drm_device *dev, > struct drm_i915_gem_object *obj, > struct drm_i915_gem_pwrite *args, > @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) > static inline bool > i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) > { > - return i915_gem_obj_ggtt_bound(obj) && !obj->active; > + return i915_gem_obj_bound_any(obj) && !obj->active; > } > > int > @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, > * anyway again before the next pread happens. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, false); > if (ret) > return ret; > @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, > char __user *user_data; > int page_offset, page_length, ret; > > - ret = i915_gem_object_pin(obj, 0, true, true); > + ret = i915_gem_ggtt_pin(obj, 0, true, true); > if (ret) > goto out; > > @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > * right away and we therefore have to clflush anyway. */ > if (obj->cache_level == I915_CACHE_NONE) > needs_clflush_after = 1; > - if (i915_gem_obj_ggtt_bound(obj)) { > + if (i915_gem_obj_bound_any(obj)) { > ret = i915_gem_object_set_to_gtt_domain(obj, true); > if (ret) > return ret; > @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > /* Now bind it into the GTT if needed */ > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret) > goto unlock; > > @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) > if (obj->pages == NULL) > return 0; > > - BUG_ON(i915_gem_obj_ggtt_bound(obj)); > - > if (obj->pages_pin_count) > return -EBUSY; > > + BUG_ON(i915_gem_obj_bound_any(obj)); > + > /* ->put_pages might need to allocate memory for the bit17 swizzle > * array, hence protect them from being reaped by removing them from gtt > * lists early. */ > @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > bool purgeable_only) > { > struct drm_i915_gem_object *obj, *next; > - struct i915_address_space *vm = &dev_priv->gtt.base; > long count = 0; > > list_for_each_entry_safe(obj, next, > @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, > } > } > > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { > - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && > - i915_gem_object_unbind(obj) == 0 && > - i915_gem_object_put_pages(obj) == 0) { > + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, > + global_list) { > + struct i915_vma *vma, *v; > + > + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) > + continue; > + > + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) > + if (i915_gem_object_unbind(obj, vma->vm)) > + break; > + > + if (!i915_gem_object_put_pages(obj)) > count += obj->base.size >> PAGE_SHIFT; > - if (count >= target) > - return count; > - } > + > + if (count >= target) > + return count; > } > > return count; > @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) > > void > i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_device *dev = obj->base.dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 seqno = intel_ring_get_seqno(ring); > > BUG_ON(ring == NULL); > @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > } > > static void > -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) > +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - struct drm_device *dev = obj->base.dev; > - struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > - > BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); > BUG_ON(!obj->active); > > @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) > spin_unlock(&file_priv->mm.lock); > } > > -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) > +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > - if (acthd >= i915_gem_obj_ggtt_offset(obj) && > - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) > + if (acthd >= i915_gem_obj_offset(obj, vm) && > + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) > return true; > > return false; > @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, > return false; > } > > +static struct i915_address_space * > +request_to_vm(struct drm_i915_gem_request *request) > +{ > + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; > + struct i915_address_space *vm; > + > + vm = &dev_priv->gtt.base; > + > + return vm; > +} > + > static bool i915_request_guilty(struct drm_i915_gem_request *request, > const u32 acthd, bool *inside) > { > @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, > * pointing inside the ring, matches the batch_obj address range. > * However this is extremely unlikely. > */ > - > if (request->batch_obj) { > - if (i915_head_inside_object(acthd, request->batch_obj)) { > + if (i915_head_inside_object(acthd, request->batch_obj, > + request_to_vm(request))) { > *inside = true; > return true; > } > @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, > { > struct i915_ctx_hang_stats *hs = NULL; > bool inside, guilty; > + unsigned long offset = 0; > > /* Innocent until proven guilty */ > guilty = false; > > + if (request->batch_obj) > + offset = i915_gem_obj_offset(request->batch_obj, > + request_to_vm(request)); > + > if (ring->hangcheck.action != wait && > i915_request_guilty(request, acthd, &inside)) { > DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", > ring->name, > inside ? "inside" : "flushing", > - request->batch_obj ? > - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, > + offset, > request->ctx ? request->ctx->id : 0, > acthd); > > @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > } > > while (!list_empty(&ring->active_list)) { > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > struct drm_i915_gem_object, > ring_list); > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > } > > @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) > void i915_gem_reset(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > struct intel_ring_buffer *ring; > int i; > @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) > /* Move everything out of the GPU domains to ensure we do any > * necessary invalidation upon reuse. > */ > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + list_for_each_entry(obj, &vm->inactive_list, mm_list) > + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; > > i915_gem_restore_fences(dev); > } > @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > * by the ringbuffer to the flushing/inactive lists as appropriate. > */ > while (!list_empty(&ring->active_list)) { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj; > > obj = list_first_entry(&ring->active_list, > @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) > if (!i915_seqno_passed(seqno, obj->last_read_seqno)) > break; > > - i915_gem_object_move_to_inactive(obj); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + i915_gem_object_move_to_inactive(obj, vm); > } > > if (unlikely(ring->trace_irq_seqno && > @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > * Unbinds an object from the GTT aperture. > */ > int > -i915_gem_object_unbind(struct drm_i915_gem_object *obj) > +i915_gem_object_unbind(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = obj->base.dev->dev_private; > struct i915_vma *vma; > int ret; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > return 0; > > if (obj->pin_count) > @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > if (ret) > return ret; > > - trace_i915_gem_object_unbind(obj); > + trace_i915_gem_object_unbind(obj, vm); > > if (obj->has_global_gtt_mapping) > i915_gem_gtt_unbind_object(obj); > @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) > /* Avoid an unnecessary call to unbind on rebind. */ > obj->map_and_fenceable = true; > > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > list_del(&vma->vma_link); > drm_mm_remove_node(&vma->node); > i915_gem_vma_destroy(vma); > @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, > "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", > i915_gem_obj_ggtt_offset(obj), size); > > + > pitch_val = obj->stride / 128; > pitch_val = ffs(pitch_val) - 1; > > @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) > * Finds free space in the GTT aperture and binds the object there. > */ > static int > -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > - unsigned alignment, > - bool map_and_fenceable, > - bool nonblocking) > +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > + unsigned alignment, > + bool map_and_fenceable, > + bool nonblocking) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > u32 size, fence_size, fence_alignment, unfenced_alignment; > bool mappable, fenceable; > - size_t gtt_max = map_and_fenceable ? > - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; > + size_t gtt_max = > + map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total; > struct i915_vma *vma; > int ret; > > if (WARN_ON(!list_empty(&obj->vma_list))) > return -EBUSY; > > + BUG_ON(!i915_is_ggtt(vm)); > + > fence_size = i915_gem_get_gtt_size(dev, > obj->base.size, > obj->tiling_mode); > @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, > > i915_gem_object_pin_pages(obj); > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + /* For now we only ever use 1 vma per object */ > + WARN_ON(!list_empty(&obj->vma_list)); > + > + vma = i915_gem_vma_create(obj, vm); > if (IS_ERR(vma)) { > i915_gem_object_unpin_pages(obj); > return PTR_ERR(vma); > } > > search_free: > - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, > - &vma->node, > + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, > size, alignment, > obj->cache_level, 0, gtt_max); > if (ret) { > - ret = i915_gem_evict_something(dev, size, alignment, > + ret = i915_gem_evict_something(dev, vm, size, alignment, > obj->cache_level, > map_and_fenceable, > nonblocking); > @@ -3138,18 +3172,25 @@ search_free: > > list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); > list_add_tail(&obj->mm_list, &vm->inactive_list); > - list_add(&vma->vma_link, &obj->vma_list); > + > + /* Keep GGTT vmas first to make debug easier */ > + if (i915_is_ggtt(vm)) > + list_add(&vma->vma_link, &obj->vma_list); > + else > + list_add_tail(&vma->vma_link, &obj->vma_list); > > fenceable = > + i915_is_ggtt(vm) && > i915_gem_obj_ggtt_size(obj) == fence_size && > (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; > > - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= > - dev_priv->gtt.mappable_end; > + mappable = > + i915_is_ggtt(vm) && > + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; > > obj->map_and_fenceable = mappable && fenceable; > > - trace_i915_gem_object_bind(obj, map_and_fenceable); > + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); > i915_gem_verify_gtt(dev); > return 0; > > @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > int ret; > > /* Not valid to be called on unbound objects. */ > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return -EINVAL; > > if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) > @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > } > > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > enum i915_cache_level cache_level) > { > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > int ret; > > if (obj->cache_level == cache_level) > @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > } > > if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > > - if (i915_gem_obj_ggtt_bound(obj)) { > + list_for_each_entry(vma, &obj->vma_list, vma_link) { > ret = i915_gem_object_finish_gpu(obj); > if (ret) > return ret; > @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, > obj, cache_level); > > - i915_gem_obj_ggtt_set_color(obj, cache_level); > + i915_gem_obj_set_color(obj, vma->vm, cache_level); > } > > if (cache_level == I915_CACHE_NONE) { > @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_gem_caching *args = data; > + struct drm_i915_private *dev_priv; > struct drm_i915_gem_object *obj; > enum i915_cache_level level; > int ret; > @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, > ret = -ENOENT; > goto unlock; > } > + dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_set_cache_level(obj, level); > + /* FIXME: Add interface for specific VM? */ > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); > > drm_gem_object_unreference(&obj->base); > unlock: > @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > u32 alignment, > struct intel_ring_buffer *pipelined) > { > + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > u32 old_read_domains, old_write_domain; > int ret; > > @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * of uncaching, which would allow us to flush all the LLC-cached data > * with that bit in the PTE to main memory with just one PIPE_CONTROL. > */ > - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); > + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_NONE); > if (ret) > return ret; > > @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * (e.g. libkms for the bootup splash), we have to ensure that we > * always use map_and_fenceable for all scanout buffers. > */ > - ret = i915_gem_object_pin(obj, alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, alignment, true, false); > if (ret) > return ret; > > @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) > > int > i915_gem_object_pin(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, > uint32_t alignment, > bool map_and_fenceable, > bool nonblocking) > @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, > if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) > return -EBUSY; > > - if (i915_gem_obj_ggtt_bound(obj)) { > - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || > + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); > + > + if (i915_gem_obj_bound(obj, vm)) { > + if ((alignment && > + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || > (map_and_fenceable && !obj->map_and_fenceable)) { > WARN(obj->pin_count, > "bo is already pinned with incorrect alignment:" > " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," > " obj->map_and_fenceable=%d\n", > - i915_gem_obj_ggtt_offset(obj), alignment, > + i915_gem_obj_offset(obj, vm), alignment, > map_and_fenceable, > obj->map_and_fenceable); > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > if (ret) > return ret; > } > } > > - if (!i915_gem_obj_ggtt_bound(obj)) { > + if (!i915_gem_obj_bound(obj, vm)) { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > > - ret = i915_gem_object_bind_to_gtt(obj, alignment, > - map_and_fenceable, > - nonblocking); > + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, > + map_and_fenceable, > + nonblocking); > if (ret) > return ret; > > @@ -3666,7 +3717,7 @@ void > i915_gem_object_unpin(struct drm_i915_gem_object *obj) > { > BUG_ON(obj->pin_count == 0); > - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); > + BUG_ON(!i915_gem_obj_bound_any(obj)); > > if (--obj->pin_count == 0) > obj->pin_mappable = false; > @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, > } > > if (obj->user_pin_count == 0) { > - ret = i915_gem_object_pin(obj, args->alignment, true, false); > + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); > if (ret) > goto out; > } > @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); > struct drm_device *dev = obj->base.dev; > drm_i915_private_t *dev_priv = dev->dev_private; > + struct i915_vma *vma, *next; > > trace_i915_gem_object_destroy(obj); > > @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) > i915_gem_detach_phys_object(dev, obj); > > obj->pin_count = 0; > - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { > - bool was_interruptible; > + /* NB: 0 or 1 elements */ > + WARN_ON(!list_empty(&obj->vma_list) && > + !list_is_singular(&obj->vma_list)); > + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { > + int ret = i915_gem_object_unbind(obj, vma->vm); > + if (WARN_ON(ret == -ERESTARTSYS)) { > + bool was_interruptible; > > - was_interruptible = dev_priv->mm.interruptible; > - dev_priv->mm.interruptible = false; > + was_interruptible = dev_priv->mm.interruptible; > + dev_priv->mm.interruptible = false; > > - WARN_ON(i915_gem_object_unbind(obj)); > + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); > > - dev_priv->mm.interruptible = was_interruptible; > + dev_priv->mm.interruptible = was_interruptible; > + } > } > > /* Stolen objects don't hold a ref, but do hold pin count. Fix that up > @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) > INIT_LIST_HEAD(&ring->request_list); > } > > +static void i915_init_vm(struct drm_i915_private *dev_priv, > + struct i915_address_space *vm) > +{ > + vm->dev = dev_priv->dev; > + INIT_LIST_HEAD(&vm->active_list); > + INIT_LIST_HEAD(&vm->inactive_list); > + INIT_LIST_HEAD(&vm->global_link); > + list_add(&vm->global_link, &dev_priv->vm_list); > +} > + > void > i915_gem_load(struct drm_device *dev) > { > @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) > SLAB_HWCACHE_ALIGN, > NULL); > > - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); > - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); > + INIT_LIST_HEAD(&dev_priv->vm_list); > + i915_init_vm(dev_priv, &dev_priv->gtt.base); > + > INIT_LIST_HEAD(&dev_priv->mm.unbound_list); > INIT_LIST_HEAD(&dev_priv->mm.bound_list); > INIT_LIST_HEAD(&dev_priv->mm.fence_list); > @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > struct drm_i915_private, > mm.inactive_shrinker); > struct drm_device *dev = dev_priv->dev; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > - int nr_to_scan = sc->nr_to_scan; > + int nr_to_scan; > bool unlock = true; > int cnt; > > @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > unlock = false; > } > > + nr_to_scan = sc->nr_to_scan; > if (nr_to_scan) { > nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); > if (nr_to_scan > 0) > @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) > if (obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > - list_for_each_entry(obj, &vm->inactive_list, mm_list) > + > + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > + if (obj->active) > + continue; > + > + i915_gem_object_flush_gtt_write_domain(obj); > + i915_gem_object_flush_cpu_write_domain(obj); > + /* FIXME: Can't assume global gtt */ > + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); > + > if (obj->pin_count == 0 && obj->pages_pin_count == 0) > cnt += obj->base.size >> PAGE_SHIFT; > + } > > if (unlock) > mutex_unlock(&dev->struct_mutex); > return cnt; > } > + > +/* All the new VM stuff */ > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) > + return vma->node.start; > + > + } > + return -1; > +} > + > +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return true; > + > + return false; > +} > + > +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_address_space *vm; > + > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) > + if (i915_gem_obj_bound(o, vm)) > + return true; > + > + return false; > +} > + > +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, > + struct i915_address_space *vm) > +{ > + struct drm_i915_private *dev_priv = o->base.dev->dev_private; > + struct i915_vma *vma; > + > + if (vm == &dev_priv->mm.aliasing_ppgtt->base) > + vm = &dev_priv->gtt.base; > + > + BUG_ON(list_empty(&o->vma_list)); > + > + list_for_each_entry(vma, &o->vma_list, vma_link) > + if (vma->vm == vm) > + return vma->node.size; > + > + return 0; > +} > + > +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, > + struct i915_address_space *vm, > + enum i915_cache_level color) > +{ > + struct i915_vma *vma; > + BUG_ON(list_empty(&o->vma_list)); > + list_for_each_entry(vma, &o->vma_list, vma_link) { > + if (vma->vm == vm) { > + vma->node.color = color; > + return; > + } > + } > + > + WARN(1, "Couldn't set color for VM %p\n", vm); > +} > + > +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + list_for_each_entry(vma, &obj->vma_list, vma_link) > + if (vma->vm == vm) > + return vma; > + > + return NULL; > +} > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 2470206..873577d 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, > > if (INTEL_INFO(dev)->gen >= 7) { > ret = i915_gem_object_set_cache_level(ctx->obj, > + &dev_priv->gtt.base, > I915_CACHE_LLC_MLC); > /* Failure shouldn't ever happen this early */ > if (WARN_ON(ret)) > @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) > * default context. > */ > dev_priv->ring[RCS].default_context = ctx; > - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); > if (ret) { > DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); > goto err_destroy; > @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, > static int do_switch(struct i915_hw_context *to) > { > struct intel_ring_buffer *ring = to->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct i915_hw_context *from = ring->last_context; > u32 hw_flags = 0; > int ret; > @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) > if (from == to) > return 0; > > - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); > + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); > if (ret) > return ret; > > @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) > */ > if (from != NULL) { > from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; > - i915_gem_object_move_to_active(from->obj, ring); > + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, > + ring); > /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the > * whole damn pipeline, we don't need to explicitly mark the > * object dirty. The only exception is that the context must be > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c > index df61f33..32efdc0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_evict.c > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c > @@ -32,24 +32,21 @@ > #include "i915_trace.h" > > static bool > -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) > +mark_free(struct i915_vma *vma, struct list_head *unwind) > { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > - > - if (obj->pin_count) > + if (vma->obj->pin_count) > return false; > > - list_add(&obj->exec_list, unwind); > + list_add(&vma->obj->exec_list, unwind); > return drm_mm_scan_add_block(&vma->node); > } > > int > -i915_gem_evict_something(struct drm_device *dev, int min_size, > - unsigned alignment, unsigned cache_level, > +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, > + int min_size, unsigned alignment, unsigned cache_level, > bool mappable, bool nonblocking) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > struct list_head eviction_list, unwind_list; > struct i915_vma *vma; > struct drm_i915_gem_object *obj; > @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > */ > > INIT_LIST_HEAD(&unwind_list); > - if (mappable) > + if (mappable) { > + BUG_ON(!i915_is_ggtt(vm)); > drm_mm_init_scan_with_range(&vm->mm, min_size, > alignment, cache_level, 0, > dev_priv->gtt.mappable_end); > - else > + } else > drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); > > /* First see if there is a large enough contiguous idle region... */ > list_for_each_entry(obj, &vm->inactive_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, > > /* Now merge in the soon-to-be-expired objects... */ > list_for_each_entry(obj, &vm->active_list, mm_list) { > - if (mark_free(obj, &unwind_list)) > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); > + if (mark_free(vma, &unwind_list)) > goto found; > } > > @@ -109,7 +109,7 @@ none: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > ret = drm_mm_scan_remove_block(&vma->node); > BUG_ON(ret); > > @@ -130,7 +130,7 @@ found: > obj = list_first_entry(&unwind_list, > struct drm_i915_gem_object, > exec_list); > - vma = __i915_gem_obj_to_vma(obj); > + vma = i915_gem_obj_to_vma(obj, vm); > if (drm_mm_scan_remove_block(&vma->node)) { > list_move(&obj->exec_list, &eviction_list); > drm_gem_object_reference(&obj->base); > @@ -145,7 +145,7 @@ found: > struct drm_i915_gem_object, > exec_list); > if (ret == 0) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > > list_del_init(&obj->exec_list); > drm_gem_object_unreference(&obj->base); > @@ -158,13 +158,18 @@ int > i915_gem_evict_everything(struct drm_device *dev) > { > drm_i915_private_t *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *vm; > struct drm_i915_gem_object *obj, *next; > - bool lists_empty; > + bool lists_empty = true; > int ret; > > - lists_empty = (list_empty(&vm->inactive_list) && > - list_empty(&vm->active_list)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + lists_empty = (list_empty(&vm->inactive_list) && > + list_empty(&vm->active_list)); > + if (!lists_empty) > + lists_empty = false; > + } > + > if (lists_empty) > return -ENOSPC; > > @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) > i915_gem_retire_requests(dev); > > /* Having flushed everything, unbind() should never raise an error */ > - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > - if (obj->pin_count == 0) > - WARN_ON(i915_gem_object_unbind(obj)); > + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { > + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) > + if (obj->pin_count == 0) > + WARN_ON(i915_gem_object_unbind(obj, vm)); > + } > > return 0; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 1734825..819d8d8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) > } > > static void > -eb_destroy(struct eb_objects *eb) > +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) > { > while (!list_empty(&eb->objects)) { > struct drm_i915_gem_object *obj; > @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *reloc) > + struct drm_i915_gem_relocation_entry *reloc, > + struct i915_address_space *vm) > { > struct drm_device *dev = obj->base.dev; > struct drm_gem_object *target_obj; > @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, > > static int > i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > - struct eb_objects *eb) > + struct eb_objects *eb, > + struct i915_address_space *vm) > { > #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) > struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; > @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > do { > u64 offset = r->presumed_offset; > > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, > + vm); > if (ret) > return ret; > > @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, > static int > i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > struct eb_objects *eb, > - struct drm_i915_gem_relocation_entry *relocs) > + struct drm_i915_gem_relocation_entry *relocs, > + struct i915_address_space *vm) > { > const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > int i, ret; > > for (i = 0; i < entry->relocation_count; i++) { > - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); > + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], > + vm); > if (ret) > return ret; > } > @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, > } > > static int > -i915_gem_execbuffer_relocate(struct eb_objects *eb) > +i915_gem_execbuffer_relocate(struct eb_objects *eb, > + struct i915_address_space *vm) > { > struct drm_i915_gem_object *obj; > int ret = 0; > @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) > */ > pagefault_disable(); > list_for_each_entry(obj, &eb->objects, exec_list) { > - ret = i915_gem_execbuffer_relocate_object(obj, eb); > + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); > if (ret) > break; > } > @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > + struct i915_address_space *vm, > bool *need_reloc) > { > struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); > + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, > + false); > if (ret) > return ret; > > @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, > obj->has_aliasing_ppgtt_mapping = 1; > } > > - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { > - entry->offset = i915_gem_obj_ggtt_offset(obj); > + if (entry->offset != i915_gem_obj_offset(obj, vm)) { > + entry->offset = i915_gem_obj_offset(obj, vm); > *need_reloc = true; > } > > @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > { > struct drm_i915_gem_exec_object2 *entry; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound_any(obj)) > return; > > entry = obj->exec_entry; > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) > static int > i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > struct list_head *objects, > + struct i915_address_space *vm, > bool *need_relocs) > { > struct drm_i915_gem_object *obj; > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, > list_for_each_entry(obj, objects, exec_list) { > struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; > bool need_fence, need_mappable; > + u32 obj_offset; > > - if (!i915_gem_obj_ggtt_bound(obj)) > + if (!i915_gem_obj_bound(obj, vm)) > continue; > > + obj_offset = i915_gem_obj_offset(obj, vm); > need_fence = > has_fenced_gpu_access && > entry->flags & EXEC_OBJECT_NEEDS_FENCE && > obj->tiling_mode != I915_TILING_NONE; > need_mappable = need_fence || need_reloc_mappable(obj); > > + BUG_ON((need_mappable || need_fence) && > + !i915_is_ggtt(vm)); > + > if ((entry->alignment && > - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || > + obj_offset & (entry->alignment - 1)) || > (need_mappable && !obj->map_and_fenceable)) > - ret = i915_gem_object_unbind(obj); > + ret = i915_gem_object_unbind(obj, vm); > else > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > > /* Bind fresh objects */ > list_for_each_entry(obj, objects, exec_list) { > - if (i915_gem_obj_ggtt_bound(obj)) > + if (i915_gem_obj_bound(obj, vm)) > continue; > > - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); > + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); > if (ret) > goto err; > } > @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > struct drm_file *file, > struct intel_ring_buffer *ring, > struct eb_objects *eb, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > struct drm_i915_gem_relocation_entry *reloc; > struct drm_i915_gem_object *obj; > @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, > goto err; > > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > list_for_each_entry(obj, &eb->objects, exec_list) { > int offset = obj->exec_entry - exec; > ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, > - reloc + reloc_offset[offset]); > + reloc + reloc_offset[offset], > + vm); > if (ret) > goto err; > } > @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, > > static void > i915_gem_execbuffer_move_to_active(struct list_head *objects, > + struct i915_address_space *vm, > struct intel_ring_buffer *ring) > { > struct drm_i915_gem_object *obj; > @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, > obj->base.read_domains = obj->base.pending_read_domains; > obj->fenced_gpu_access = obj->pending_fenced_gpu_access; > > - i915_gem_object_move_to_active(obj, ring); > + i915_gem_object_move_to_active(obj, vm, ring); > if (obj->base.write_domain) { > obj->dirty = 1; > obj->last_write_seqno = intel_ring_get_seqno(ring); > @@ -838,7 +855,8 @@ static int > i915_gem_do_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file, > struct drm_i915_gem_execbuffer2 *args, > - struct drm_i915_gem_exec_object2 *exec) > + struct drm_i915_gem_exec_object2 *exec, > + struct i915_address_space *vm) > { > drm_i915_private_t *dev_priv = dev->dev_private; > struct eb_objects *eb; > @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); > + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); > if (ret) > goto err; > > /* The objects are in their final locations, apply the relocations. */ > if (need_relocs) > - ret = i915_gem_execbuffer_relocate(eb); > + ret = i915_gem_execbuffer_relocate(eb, vm); > if (ret) { > if (ret == -EFAULT) { > ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, > - eb, exec); > + eb, exec, vm); > BUG_ON(!mutex_is_locked(&dev->struct_mutex)); > } > if (ret) > @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > goto err; > } > > - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; > + exec_start = i915_gem_obj_offset(batch_obj, vm) + > + args->batch_start_offset; > exec_len = args->batch_len; > if (cliprects) { > for (i = 0; i < args->num_cliprects; i++) { > @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > > trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); > > - i915_gem_execbuffer_move_to_active(&eb->objects, ring); > + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); > i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); > > err: > - eb_destroy(eb); > + eb_destroy(eb, vm); > > mutex_unlock(&dev->struct_mutex); > > @@ -1107,6 +1126,7 @@ int > i915_gem_execbuffer(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer *args = data; > struct drm_i915_gem_execbuffer2 exec2; > struct drm_i915_gem_exec_object *exec_list = NULL; > @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, > exec2.flags = I915_EXEC_RENDER; > i915_execbuffer2_set_context_id(exec2, 0); > > - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > for (i = 0; i < args->buffer_count; i++) > @@ -1188,6 +1209,7 @@ int > i915_gem_execbuffer2(struct drm_device *dev, void *data, > struct drm_file *file) > { > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_execbuffer2 *args = data; > struct drm_i915_gem_exec_object2 *exec2_list = NULL; > int ret; > @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, > return -EFAULT; > } > > - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); > + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, > + &dev_priv->gtt.base); > if (!ret) { > /* Copy the new buffer offsets back to the user's exec list. */ > ret = copy_to_user(to_user_ptr(args->buffers_ptr), > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 3b639a9..44f3464 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) > ppgtt->base.total); > } > > + /* i915_init_vm(dev_priv, &ppgtt->base) */ > + > return ret; > } > > @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level) > { > - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - cache_level); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->insert_entries(vm, obj->pages, > + obj_offset >> PAGE_SHIFT, > + cache_level); > } > > void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, > struct drm_i915_gem_object *obj) > { > - ppgtt->base.clear_range(&ppgtt->base, > - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, > - obj->base.size >> PAGE_SHIFT); > + struct i915_address_space *vm = &ppgtt->base; > + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); > + > + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, > + obj->base.size >> PAGE_SHIFT); > } > > extern int intel_iommu_gfx_mapped; > @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) > dev_priv->gtt.base.start / PAGE_SIZE, > dev_priv->gtt.base.total / PAGE_SIZE); > > + if (dev_priv->mm.aliasing_ppgtt) > + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); > + > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > i915_gem_clflush_object(obj); > i915_gem_gtt_bind_object(obj, obj->cache_level); > @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > * aperture. One page should be enough to keep any prefetching inside > * of the aperture. > */ > - drm_i915_private_t *dev_priv = dev->dev_private; > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; > struct drm_mm_node *entry; > struct drm_i915_gem_object *obj; > unsigned long hole_start, hole_end; > @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > BUG_ON(mappable_end > end); > > /* Subtract the guard page ... */ > - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); > + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); > if (!HAS_LLC(dev)) > dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; > > /* Mark any preallocated objects as occupied */ > list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { > - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); > + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); > int ret; > DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", > i915_gem_obj_ggtt_offset(obj), obj->base.size); > > WARN_ON(i915_gem_obj_ggtt_bound(obj)); > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); > if (ret) > DRM_DEBUG_KMS("Reservation failed\n"); > obj->has_global_gtt_mapping = 1; > @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, > dev_priv->gtt.base.total = end - start; > > /* Clear any non-preallocated blocks */ > - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, > - hole_start, hole_end) { > + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { > const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; > DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", > hole_start, hole_end); > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - hole_start / PAGE_SIZE, > - count); > + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); > } > > /* And finally clear the reserved guard page */ > - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, > - end / PAGE_SIZE - 1, 1); > + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); > } > > static bool > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c > index 27ffb4c..000ffbd 100644 > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c > @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > u32 size) > { > struct drm_i915_private *dev_priv = dev->dev_private; > - struct i915_address_space *vm = &dev_priv->gtt.base; > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > struct drm_i915_gem_object *obj; > struct drm_mm_node *stolen; > struct i915_vma *vma; > @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > if (gtt_offset == I915_GTT_OFFSET_NONE) > return obj; > > - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); > + vma = i915_gem_vma_create(obj, ggtt); > if (IS_ERR(vma)) { > ret = PTR_ERR(vma); > goto err_out; > @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > */ > vma->node.start = gtt_offset; > vma->node.size = size; > - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { > - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); > + if (drm_mm_initialized(&ggtt->mm)) { > + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node); > if (ret) { > DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); > i915_gem_vma_destroy(vma); > @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, > obj->has_global_gtt_mapping = 1; > > list_add_tail(&obj->global_list, &dev_priv->mm.bound_list); > - list_add_tail(&obj->mm_list, &vm->inactive_list); > + list_add_tail(&obj->mm_list, &ggtt->inactive_list); > > return obj; > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 92a8d27..808ca2a 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, > > obj->map_and_fenceable = > !i915_gem_obj_ggtt_bound(obj) || > - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && > + (i915_gem_obj_ggtt_offset(obj) + > + obj->base.size <= dev_priv->gtt.mappable_end && > i915_gem_object_fence_ok(obj, args->tiling_mode)); > > /* Rebind if we need a change of alignment */ > if (!obj->map_and_fenceable) { > - u32 unfenced_alignment = > + struct i915_address_space *ggtt = &dev_priv->gtt.base; > + u32 unfenced_align = > i915_gem_get_gtt_alignment(dev, obj->base.size, > args->tiling_mode, > false); > - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) > - ret = i915_gem_object_unbind(obj); > + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) > + ret = i915_gem_object_unbind(obj, ggtt); > } > > if (ret == 0) { > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 7d283b5..3f019d3 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, > ); > > TRACE_EVENT(i915_gem_object_bind, > - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), > - TP_ARGS(obj, mappable), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm, bool mappable), > + TP_ARGS(obj, vm, mappable), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > __field(bool, mappable) > @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > __entry->mappable = mappable; > ), > > @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, > ); > > TRACE_EVENT(i915_gem_object_unbind, > - TP_PROTO(struct drm_i915_gem_object *obj), > - TP_ARGS(obj), > + TP_PROTO(struct drm_i915_gem_object *obj, > + struct i915_address_space *vm), > + TP_ARGS(obj, vm), > > TP_STRUCT__entry( > __field(struct drm_i915_gem_object *, obj) > + __field(struct i915_address_space *, vm) > __field(u32, offset) > __field(u32, size) > ), > > TP_fast_assign( > __entry->obj = obj; > - __entry->offset = i915_gem_obj_ggtt_offset(obj); > - __entry->size = i915_gem_obj_ggtt_size(obj); > + __entry->offset = i915_gem_obj_offset(obj, vm); > + __entry->size = i915_gem_obj_size(obj, vm); > ), > > TP_printk("obj=%p, offset=%08x size=%x", > diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c > index f3c97e0..b69cc63 100644 > --- a/drivers/gpu/drm/i915/intel_fb.c > +++ b/drivers/gpu/drm/i915/intel_fb.c > @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, > fb->width, fb->height, > i915_gem_obj_ggtt_offset(obj), obj); > > - > mutex_unlock(&dev->struct_mutex); > vga_switcheroo_client_fb_set(dev->pdev, info); > return 0; > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c > index 2abb53e..22ccb7e 100644 > --- a/drivers/gpu/drm/i915/intel_overlay.c > +++ b/drivers/gpu/drm/i915/intel_overlay.c > @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) > } > overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; > } else { > - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); > if (ret) { > DRM_ERROR("failed to pin overlay register bo\n"); > goto out_free_bo; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 008e0e0..0fb081c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev) > return NULL; > } > > - ret = i915_gem_object_pin(ctx, 4096, true, false); > + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); > if (ret) { > DRM_ERROR("failed to pin power context: %d\n", ret); > goto err_unref; > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c > index 8527ea0..88130a3 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -481,6 +481,7 @@ out: > static int > init_pipe_control(struct intel_ring_buffer *ring) > { > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > struct pipe_control *pc; > struct drm_i915_gem_object *obj; > int ret; > @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret) > goto err_unref; > > @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) > static int init_status_page(struct intel_ring_buffer *ring) > { > struct drm_device *dev = ring->dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > struct drm_i915_gem_object *obj; > int ret; > > @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) > goto err; > } > > - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); > + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, > + I915_CACHE_LLC); > > - ret = i915_gem_object_pin(obj, 4096, true, false); > + ret = i915_gem_ggtt_pin(obj, 4096, true, false); > if (ret != 0) { > goto err_unref; > } > @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, > > ring->obj = obj; > > - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); > + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); > if (ret) > goto err_unref; > > @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) > return -ENOMEM; > } > > - ret = i915_gem_object_pin(obj, 0, true, false); > + ret = i915_gem_ggtt_pin(obj, 0, true, false); > if (ret != 0) { > drm_gem_object_unreference(&obj->base); > DRM_ERROR("Failed to ping batch bo\n"); > -- > 1.8.3.3 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Fri, 26 Jul 2013 11:51:00 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > HI all, > > So Ben&I had a bit a private discussion and one thing I've explained a bit > more in detail is what kind of review I'm doing as maintainer. I've > figured this is generally useful. We've also discussed a bit that for > developers without their own lab it would be nice if QA could test random > branches on their set of machines. But imo that'll take quite a while, > there's lots of other stuff to improve in QA land first. Anyway, here's > it: > > Now an explanation for why this freaked me out, which is essentially an > explanation of what I do when I do maintainer reviews: > > Probably the most important question I ask myself when reading a patch is > "if a regression would bisect to this, and the bisect is the only useful > piece of evidence, would I stand a chance to understand it?". Your patch > is big, has the appearance of doing a few unrelated things and could very > well hide a bug which would take me an awful lot of time to spot. So imo > the answer for your patch is a clear "no". This is definitely a good point. Big patches are both hard to review and hard to debug, so should be kept as simple as possible (but no simpler!). > I've merged a few such patches in the past where I've had a similar hunch > and regretted it almost always. I've also sometimes split-up the patch > while applying, but that approach doesn't scale any more with our rather > big team. > > The second thing I try to figure out is whether the patch author is indeed > the local expert on the topic at hand now. With our team size and patch > flow I don't stand a chance if I try to understand everything to the last > detail. Instead I try to assess this through the proxy of convincing > myself the the patch submitter understands stuff much better than I do. I > tend to check that by asking random questions, proposing alternative > approaches and also by rating code/patch clarity. The obj_set_color > double-loop very much gave me the impression that you didn't have a clear > idea about how exactly this should work, so that hunk trigger this > maintainer hunch. This is the part I think is unfair (see below) when proposed alternatives aren't clearly defined. > I admit that this is all rather fluffy and very much an inexact science, > but it's the only tools I have as a maintainer. The alternative of doing > shit myself or checking everything myself in-depth just doesnt scale. I'm glad you brought this up, but I see a contradiction here: if you're just asking random questions to convince yourself the author knows what they're doing, but simultaneously you're not checking everything yourself in-depth, you'll have no way to know whether your questions are being dealt with properly. I think the way out of that contradiction is to trust reviewers, especially in specific areas. There's a downside in that the design will be a little less coherent (i.e. matching the vision of a single person), but as you said, that doesn't scale. So I'd suggest a couple of rules to help: 1) every patch gets at least two reviewed-bys 2) one of those reviewed-bys should be from a domain expert, e.g.: DP - Todd, Jani GEM - Chris, Daniel $PLATFORM - $PLATFORM owner HDMI - Paulo PSR/FBC - Rodrigo/Shobhit * - Daniel (you get to be a wildcard) etc. 3) reviews aren't allowed to contain solely bikeshed/codingstyle change requests, if there's nothing substantial merge shouldn't be blocked (modulo egregious violations like Hungarian notation) 4) review comments should be concrete and actionable, and ideally not leave the author hanging with hints about problems the reviewer has spotted, leaving the author looking for easter eggs For the most part I think we adhere to this, though reviews from the domain experts are done more on an ad-hoc basis these days... Thoughts?
On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > 4) review comments should be concrete and actionable, and ideally not > leave the author hanging with hints about problems the reviewer > has spotted, leaving the author looking for easter eggs Where am I going to find my fun, if I am not allowed to tell you that you missed a zero in a thousand line patch but not tell you where? Spoilsport :-p -Chris
On Fri, 26 Jul 2013 18:08:48 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > > 4) review comments should be concrete and actionable, and ideally not > > leave the author hanging with hints about problems the reviewer > > has spotted, leaving the author looking for easter eggs > > Where am I going to find my fun, if I am not allowed to tell you that > you missed a zero in a thousand line patch but not tell you where? > Spoilsport :-p You'll just need to take up golf or something. :)
On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > On Fri, 26 Jul 2013 11:51:00 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > HI all, > > > > So Ben&I had a bit a private discussion and one thing I've explained a bit > > more in detail is what kind of review I'm doing as maintainer. I've > > figured this is generally useful. We've also discussed a bit that for > > developers without their own lab it would be nice if QA could test random > > branches on their set of machines. But imo that'll take quite a while, > > there's lots of other stuff to improve in QA land first. Anyway, here's > > it: > > > > Now an explanation for why this freaked me out, which is essentially an > > explanation of what I do when I do maintainer reviews: > > > > Probably the most important question I ask myself when reading a patch is > > "if a regression would bisect to this, and the bisect is the only useful > > piece of evidence, would I stand a chance to understand it?". Your patch > > is big, has the appearance of doing a few unrelated things and could very > > well hide a bug which would take me an awful lot of time to spot. So imo > > the answer for your patch is a clear "no". > > This is definitely a good point. Big patches are both hard to review > and hard to debug, so should be kept as simple as possible (but no > simpler!). > > > I've merged a few such patches in the past where I've had a similar hunch > > and regretted it almost always. I've also sometimes split-up the patch > > while applying, but that approach doesn't scale any more with our rather > > big team. > > > > The second thing I try to figure out is whether the patch author is indeed > > the local expert on the topic at hand now. With our team size and patch > > flow I don't stand a chance if I try to understand everything to the last > > detail. Instead I try to assess this through the proxy of convincing > > myself the the patch submitter understands stuff much better than I do. I > > tend to check that by asking random questions, proposing alternative > > approaches and also by rating code/patch clarity. The obj_set_color > > double-loop very much gave me the impression that you didn't have a clear > > idea about how exactly this should work, so that hunk trigger this > > maintainer hunch. > > This is the part I think is unfair (see below) when proposed > alternatives aren't clearly defined. Ben split up the patches meanwhile and imo they now look great (so fully address the first concern). I've read through them this morning and dumped a few (imo actionable) quick comments on irc. For the example here my request is to squash a double-loop over vma lists (which will also rip out a function call indirection as a bonus). > > I admit that this is all rather fluffy and very much an inexact science, > > but it's the only tools I have as a maintainer. The alternative of doing > > shit myself or checking everything myself in-depth just doesnt scale. > > I'm glad you brought this up, but I see a contradiction here: if > you're just asking random questions to convince yourself the author > knows what they're doing, but simultaneously you're not checking > everything yourself in-depth, you'll have no way to know whether your > questions are being dealt with properly. Well if the reply is unsure or inconstistent then I tend to dig in. E.g. with Paulo's pc8+ stuff I've asked a few questions about interactions with gmbus/edid reading/gem execbuf and he replied that he doesn't know. His 2nd patch version was still a bit thin on details in that area, so I've sat down read through stuff and made a concrete&actionable list of corner-cases I think we should exercise. > I think the way out of that contradiction is to trust reviewers, > especially in specific areas. Imo I've already started with that, there's lots of patches where I only do a very cursory read when merging since I trust $AUTHOR and $REVIEWER to get it right. > There's a downside in that the design will be a little less coherent > (i.e. matching the vision of a single person), but as you said, that > doesn't scale. I think overall we can still achieve good consistency in the design, so that's a part where I try to chip in. But with a larger team it's clear that consistency in little details will fizzle out more, otoh doing such cleanups after big reworks (heck I've been rather inconstinent in all the refactoring in the modeset code myself) sounds like good material to drag newbies into our codebase. > So I'd suggest a couple of rules to help: > 1) every patch gets at least two reviewed-bys We have a hard time doing our current review load in a timely manner already, I don't expect this to scale if we do it formally. But ... > 2) one of those reviewed-bys should be from a domain expert, e.g.: > DP - Todd, Jani > GEM - Chris, Daniel > $PLATFORM - $PLATFORM owner > HDMI - Paulo > PSR/FBC - Rodrigo/Shobhit > * - Daniel (you get to be a wildcard) > etc. ... this is something that I've started to take into account already. E.g. when I ask someone less experienced for a given topic to do a fish-out-of-water review I'll also poke domain experts to ack it. And if there's a concern it obviously overrules an r-b tag from someone else. > 3) reviews aren't allowed to contain solely bikeshed/codingstyle > change requests, if there's nothing substantial merge shouldn't be > blocked (modulo egregious violations like Hungarian notation) I think we're doing fairly well. Occasionally I rant around review myself, but often that's just the schlep of digging the patch out again and refining it - most often the reviewer is right, which obviously makes it worse ;-) We have a few cases where discussions tend to loop forever. Sometimes I step in but often I feel like I shouldn't be the one to make the call, e.g. the audio discussions around the hsw power well drag out often, but imo that's a topic where Paulo should make the calls. Occasionally though I block a patch on bikeshed topics simply because I think the improved consistency is worth it. One example is the gen checks so that our code matches 0-based C array semantics and our usual writing style of using genX+ and pre-genX to be inclusive/exclusive respectively. > 4) review comments should be concrete and actionable, and ideally not > leave the author hanging with hints about problems the reviewer > has spotted, leaving the author looking for easter eggs Where's the fun in that? I think the right way to look at easter egg hunting is that the clear&actionable task from the reviewer is to go easter egg hunting ;-) More seriously though asking "what happens if?" questions is an important part of review imo, and sometimes those tend to be an easter egg hunt for both reviewer and patch author." > For the most part I think we adhere to this, though reviews from the > domain experts are done more on an ad-hoc basis these days... > > Thoughts? Generally I think our overall process is a) a mess (as in not really formalized much) and b) works surprisingly well. So I think fine-tuning of individual parts and having an occasional process discussion should be good enough to keep going. Cheers, Daniel
On Fri, Jul 26, 2013 at 11:51:00AM +0200, Daniel Vetter wrote: > HI all, > > So Ben&I had a bit a private discussion and one thing I've explained a bit > more in detail is what kind of review I'm doing as maintainer. I've > figured this is generally useful. We've also discussed a bit that for > developers without their own lab it would be nice if QA could test random > branches on their set of machines. But imo that'll take quite a while, > there's lots of other stuff to improve in QA land first. Anyway, here's > it: > > Now an explanation for why this freaked me out, which is essentially an > explanation of what I do when I do maintainer reviews: > > Probably the most important question I ask myself when reading a patch is > "if a regression would bisect to this, and the bisect is the only useful > piece of evidence, would I stand a chance to understand it?". Your patch > is big, has the appearance of doing a few unrelated things and could very > well hide a bug which would take me an awful lot of time to spot. So imo > the answer for your patch is a clear "no". > > I've merged a few such patches in the past where I've had a similar hunch > and regretted it almost always. I've also sometimes split-up the patch > while applying, but that approach doesn't scale any more with our rather > big team. You should never do this, IMO. If you require the patches to be split in your tree, the developer should do it. See below for reasons I think this sucks. > > The second thing I try to figure out is whether the patch author is indeed > the local expert on the topic at hand now. With our team size and patch > flow I don't stand a chance if I try to understand everything to the last > detail. Instead I try to assess this through the proxy of convincing > myself the the patch submitter understands stuff much better than I do. I > tend to check that by asking random questions, proposing alternative > approaches and also by rating code/patch clarity. The obj_set_color > double-loop very much gave me the impression that you didn't have a clear > idea about how exactly this should work, so that hunk trigger this > maintainer hunch. > > I admit that this is all rather fluffy and very much an inexact science, > but it's the only tools I have as a maintainer. The alternative of doing > shit myself or checking everything myself in-depth just doesnt scale. > > Cheers, Daniel > > > On Mon, Jul 22, 2013 at 4:08 AM, Ben Widawsky <ben@bwidawsk.net> wrote: I think the subthread Jesse started had a bunch of good points, but concisely I see 3 problems with our current process (and these were addressed in my original mail, but I guess you didn't want to air my dirty laundry :p): 1. Delay hurts QA. Balking on patches because they're hard to review limits QA on that patch, and reduces QA time on the fixed up patches. I agree this is something which is fixable within QA, but it doesn't exist at present. 2. We don't have a way to bound review/merge. I tried to do this on this series. After your initial review, I gave a list of things I was going to fix, and asked you for an ack that if I fixed those, you would merge. IMO, you didn't stick to this agreement, and came back with rework requests on a patch I had already submitted. I don't know how to fix this one because I think you should be entitled to change your mind. A caveat to this: I did make some mistakes on rebase that needed addressing. ie. the ends justified the means. 3a. Reworking code introduces bugs. I feel I am more guilty here than most, but, consider even in the best case of those new bugs being caught in review. In such a case, you've now introduced at least 2 extra revs, and 2 extra lag cycles waiting for review. That assumes further work doesn't spiral into more requested fixups, or more bugs. In the less ideal case, you've simply introduced a new bug in addition to the delay. 3b. Patch splitting is art not science. There is a really delicate balance between splitting patches because it's logically a functional split vs. splitting things up to make things easier to chew on. Now in my case specifically, I think overall the series has improved, and I found some crud that got squashed in which shouldn't have been there. I also believe a lot of the splitting really doesn't make much sense other than for review purposes and sometimes that is okay. In my case, I had a huge patch, but a lot of that patch was actually a sed job of "s/obj/obj,vm/." You came back with, "you're doing a bunch of extra lookups." That was exactly the point of the patch; the extra lookups should have made the review simpler, and could be cleaned up later. My point is: A larger quantity of small patches is not always easier to review than a small quantity of large patches. Large patch series review often requires the reviewer to keep a lot of context as they review. *4. The result of all this is I think a lot of the time we (the developers) end up writing your patch for you. While I respect your opinion very highly, and I think more often than not that your way is better, it's just inefficient. I'll wrap this all up with, I don't envy you. On a bunch of emails, I've seen you be apologetic for putting developers in between a rock, and a hard place (you, and program management). I recognize you have the same dilemma with Dave/Linus, and the rest of us developers. I think the overall strategy should be to improve QA, but then you have to take the leap of limiting your requests for reworks, and accepting QAs stamp of approval.
On Fri, Jul 26, 2013 at 10:15 PM, Ben Widawsky <ben@bwidawsk.net> wrote: > I think the subthread Jesse started had a bunch of good points, but > concisely I see 3 problems with our current process (and these were > addressed in my original mail, but I guess you didn't want to air my > dirty laundry :p): I've cut out some of the later discussion in my mail (and that thread) since I've figured it's not the main point I wanted to make. No fear of dirty laundry ;-) > > 1. Delay hurts QA. Balking on patches because they're hard to review > limits QA on that patch, and reduces QA time on the fixed up patches. I > agree this is something which is fixable within QA, but it doesn't exist > at present. Yeah, I agree that this is an issue for developers without their private lab ;-) And it's also an issue for those with one, since running tests without a good fully automated system is a pian. With discussed this a bit with Jesse yesterday on irc, but my point is that currentl QA doesn't have a quick enough turn-around even for testing -nightly that this would be feasible. And I also think that something like this should be started with userspace (i.e. mesa) testing first, which is already in progress. Once QA has infrastructure to test arbitrary branches and once they have enough horsepower and automation (and people to do all this) we can take a look again. But imo trying to do this early is just wishful thinking, we have to deal with what we have, not what we'd like to get for Xmas. > 2. We don't have a way to bound review/merge. I tried to do this on this > series. After your initial review, I gave a list of things I was going > to fix, and asked you for an ack that if I fixed those, you would merge. > IMO, you didn't stick to this agreement, and came back with rework > requests on a patch I had already submitted. I don't know how to fix > this one because I think you should be entitled to change your mind. > > A caveat to this: I did make some mistakes on rebase that needed > addressing. ie. the ends justified the means. Yeah, the problem is that for really big stuff like your ppgtt series the merge process is incremental: We'll do a rough plan and then pull in parts one-by-one. And then when the sub-series get reviewed new things pop up. And sometimes the reviewer is simply confused and asks for stupid things ... I don't think we can fix this since that's just how it works. But we can certainly keep this in mind when estimating the effort to get features in - big stuff will have some uncertainty (and hence need for time buffers) even after the first review. For the ppgtt work I need to blame myself too since the original plan was way too optimistic, but I really wanted to get this in before you get sucked away into the next big thing lined up (which in this case unfortunately came attached with a deadline). > 3a. Reworking code introduces bugs. I feel I am more guilty here than > most, but, consider even in the best case of those new bugs being > caught in review. In such a case, you've now introduced at least 2 extra > revs, and 2 extra lag cycles waiting for review. That assumes further > work doesn't spiral into more requested fixups, or more bugs. In the > less ideal case, you've simply introduced a new bug in addition to the > delay. I'm trying to address this by sharing rebase BKMs as much as possible. Since I'm the one on the team doing the most rebasing (with -internal) that hopefully helps. > 3b. Patch splitting is art not science. > > There is a really delicate balance between splitting patches because > it's logically a functional split vs. splitting things up to make things > easier to chew on. Now in my case specifically, I think overall the > series has improved, and I found some crud that got squashed in which > shouldn't have been there. I also believe a lot of the splitting really > doesn't make much sense other than for review purposes and sometimes > that is okay. Imo splitting patches has two functions: Make the reviewer's life easier (not really the developers) and have simple patches in case of a regression which bisects to it. Ime you get about a 1-in-5 regression rate in dinq, so that chance is very much neglectable. And for the ugly regressions where we have no clue we can easily blow through a few man-months of engineer time to track them time. > In my case, I had a huge patch, but a lot of that patch was actually a > sed job of "s/obj/obj,vm/." You came back with, "you're doing a bunch > of extra lookups." That was exactly the point of the patch; the extra > lookups should have made the review simpler, and could be cleaned up > later. > > My point is: A larger quantity of small patches is not always easier to > review than a small quantity of large patches. Large patch series review > often requires the reviewer to keep a lot of context as they review. I don't mind big sed jobs or moving functions to new files (well those quite a bit since they're a pain for rebasing -internal). But such a big patch needs to be conceptually really simple, my rule of thumb is that patch size times complexity should follow a constant upper limit. So a big move stuff patch shouldn't also rename a bunch of functions (wasn't too happy about Chris' intel_uncore.c extract) since that makes comparing harder (both in review and in rebasing). If the patch is really big (like driver-wide sed jobs) the conceptual change should approach 0. For example if you want to embed an object you first create an access helper (big sed job, no change, not even in the struct layout). Then a 2nd patch which changes the access helper, but would otherwise be very small. Imo the big patch I've asked you to split up had lot of sed-like things, but also a few potentially functional/conceptual changes in it. The combination was imo too much. But that doesn't mean I won't accept sed jobs that result in a much larger diff, just that they need to be really simple. > *4. The result of all this is I think a lot of the time we (the > developers) end up writing your patch for you. While I respect your > opinion very highly, and I think more often than not that your way is > better, it's just inefficient. Yeah, I'm aware that sometimes I go overboard with "my way or the highway" even if I don't state that explicitly. Often though when I drop random ideas or ask questions I'm ok if the patch author sticks to his way if it comes with a good explanation attached. That at least is one of the reason why I want to always update commit messages even when the reviewer in the end did not ask for a code change. Todays discussion about the loop in one of your patches in evict_everything was a prime example: I've read through your code, decided that it looks funny and dropped a suggestion on irc. But later on I've read the end result and noticed that my suggestion is much worse than what you have. In such cases I expect developers to stand up, explain why something is like it is and tell me that I'm full of myself ;-) This will be even more important going forward since with the growing team and code output I'll be less and less able to keep track of everything. So the chance that I'll utter complete bs in a review will only increase. If you don't call me out on it we'll end up with worse code, which I very much don't want to. > I'll wrap this all up with, I don't envy you. On a bunch of emails, I've > seen you be apologetic for putting developers in between a rock, and a > hard place (you, and program management). I recognize you have the same > dilemma with Dave/Linus, and the rest of us developers. I think the > overall strategy should be to improve QA, but then you have to take the > leap of limiting your requests for reworks, and accepting QAs stamp of > approval. Hey, overall it's actually quite a bit of fun. I do agree that QA is really important for a fastpaced process, but it's also not the only peace to get something in. Review (both of the patch itself but also of the test coverage) catches a lot of issues, and in many cases not the same ones as QA would. Especially if the testcoverage of a new feature is less than stellar, which imo is still the case for gem due to the tons of finickle cornercases. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > Hey, overall it's actually quite a bit of fun. > > I do agree that QA is really important for a fastpaced process, but > it's also not the only peace to get something in. Review (both of the > patch itself but also of the test coverage) catches a lot of issues, > and in many cases not the same ones as QA would. Especially if the > testcoverage of a new feature is less than stellar, which imo is still > the case for gem due to the tons of finickle cornercases. Just my 2c worth on this topic, since I like the current process, and I believe making it too formal is probably going to make things suck too much. I'd rather Daniel was slowing you guys down up front more, I don't give a crap about Intel project management or personal manager relying on getting features merged when, I do care that you engineers when you merge something generally get transferred 100% onto something else and don't react strongly enough to issues on older code you have created that either have lain dormant since patches merged or are regressions since patches merged. So I believe the slowing down of merging features gives a better chance of QA or other random devs of finding the misc regressions while you are still focused on the code and hitting the long term bugs that you guys rarely get resourced to fix unless I threaten to stop pulling stuff. So whatever Daniel says goes as far as I'm concerned, if I even suspect he's taken some internal Intel pressure to merge some feature, I'm going to stop pulling from him faster than I stopped pulling from the previous maintainers :-), so yeah engineers should be prepared to backup what they post even if Daniel is wrong, but on the other hand they need to demonstrate they understand the code they are pushing and sometimes with ppgtt and contexts I'm not sure anyone really understands how the hw works let alone the sw :-P Dave.
On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: > > > > Hey, overall it's actually quite a bit of fun. > > > > I do agree that QA is really important for a fastpaced process, but > > it's also not the only peace to get something in. Review (both of the > > patch itself but also of the test coverage) catches a lot of issues, > > and in many cases not the same ones as QA would. Especially if the > > testcoverage of a new feature is less than stellar, which imo is still > > the case for gem due to the tons of finickle cornercases. > > Just my 2c worth on this topic, since I like the current process, and > I believe making it too formal is probably going to make things suck > too much. > > I'd rather Daniel was slowing you guys down up front more, I don't > give a crap about Intel project management or personal manager relying > on getting features merged when, I do care that you engineers when you > merge something generally get transferred 100% onto something else and > don't react strongly enough to issues on older code you have created > that either have lain dormant since patches merged or are regressions > since patches merged. So I believe the slowing down of merging > features gives a better chance of QA or other random devs of finding > the misc regressions while you are still focused on the code and > hitting the long term bugs that you guys rarely get resourced to fix > unless I threaten to stop pulling stuff. > > So whatever Daniel says goes as far as I'm concerned, if I even > suspect he's taken some internal Intel pressure to merge some feature, > I'm going to stop pulling from him faster than I stopped pulling from > the previous maintainers :-), so yeah engineers should be prepared to > backup what they post even if Daniel is wrong, but on the other hand > they need to demonstrate they understand the code they are pushing and > sometimes with ppgtt and contexts I'm not sure anyone really > understands how the hw works let alone the sw :-P > > Dave. Honestly, I wouldn't have responded if you didn't mention the Intel program management thing... The problem I am trying to emphasize, and let's use contexts/ppgtt as an example, is we have three options: 1. It's complicated, and a big change, so let's not do it. 2. I continue to rebase the massive change on top of the extremely fast paced i915 tree, with no QA coverage. 3. We get decent bits merged ASAP by putting it in a repo that both gets much wider usage than my personal branch, and gets nightly QA coverage. PPGTT + Contexts have existed for a while, and so we went with #1 for quite a while. Now we're at #2. There's two sides to your 'developer needs to defend...' I need Daniel to give succinct feedback, and agree upon steps required to get code merged. My original gripe was that it's hard to deal with the, "that patch is too big" comments almost 2 months after the first version was sent. Equally, "that looks funny" without a real explanation of what looks funny, or sufficient thought up front about what might look better is just as hard to deal with. Inevitably, yes - it's a big scary series of patches - but if we're honest with ourselves, it's almost guaranteed to blow up somewhere regardless of how much we rework it, and who reviews it. Blowing up long before you merge would always be better than the after you merge. My desire is to get to something like #3. I had a really long paragraph on why and how we could do that, but I've redacted it. Let's just leave it as, I think that should be the goal. Finally, let me clear that none of the discussion I'm having with Daniel that spawned this thread are inspired by Intel program management. My personal opinion is that your firm stance has really helped us internally to fight back stupid decisions. Honestly, I wish you had a more direct input into our management, and product planners.
On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: >> > >> > Hey, overall it's actually quite a bit of fun. >> > >> > I do agree that QA is really important for a fastpaced process, but >> > it's also not the only peace to get something in. Review (both of the >> > patch itself but also of the test coverage) catches a lot of issues, >> > and in many cases not the same ones as QA would. Especially if the >> > testcoverage of a new feature is less than stellar, which imo is still >> > the case for gem due to the tons of finickle cornercases. >> >> Just my 2c worth on this topic, since I like the current process, and >> I believe making it too formal is probably going to make things suck >> too much. >> >> I'd rather Daniel was slowing you guys down up front more, I don't >> give a crap about Intel project management or personal manager relying >> on getting features merged when, I do care that you engineers when you >> merge something generally get transferred 100% onto something else and >> don't react strongly enough to issues on older code you have created >> that either have lain dormant since patches merged or are regressions >> since patches merged. So I believe the slowing down of merging >> features gives a better chance of QA or other random devs of finding >> the misc regressions while you are still focused on the code and >> hitting the long term bugs that you guys rarely get resourced to fix >> unless I threaten to stop pulling stuff. >> >> So whatever Daniel says goes as far as I'm concerned, if I even >> suspect he's taken some internal Intel pressure to merge some feature, >> I'm going to stop pulling from him faster than I stopped pulling from >> the previous maintainers :-), so yeah engineers should be prepared to >> backup what they post even if Daniel is wrong, but on the other hand >> they need to demonstrate they understand the code they are pushing and >> sometimes with ppgtt and contexts I'm not sure anyone really >> understands how the hw works let alone the sw :-P >> >> Dave. > > Honestly, I wouldn't have responded if you didn't mention the Intel > program management thing... > > The problem I am trying to emphasize, and let's use contexts/ppgtt as an > example, is we have three options: > 1. It's complicated, and a big change, so let's not do it. > 2. I continue to rebase the massive change on top of the extremely fast > paced i915 tree, with no QA coverage. > 3. We get decent bits merged ASAP by putting it in a repo that both gets > much wider usage than my personal branch, and gets nightly QA coverage. > > PPGTT + Contexts have existed for a while, and so we went with #1 for > quite a while. > > Now we're at #2. There's two sides to your 'developer needs to > defend...' I need Daniel to give succinct feedback, and agree upon steps > required to get code merged. My original gripe was that it's hard to > deal with the, "that patch is too big" comments almost 2 months after > the first version was sent. Equally, "that looks funny" without a real > explanation of what looks funny, or sufficient thought up front about > what might look better is just as hard to deal with. Inevitably, yes - > it's a big scary series of patches - but if we're honest with ourselves, > it's almost guaranteed to blow up somewhere regardless of how much we > rework it, and who reviews it. Blowing up long before you merge would > always be better than the after you merge. > > My desire is to get to something like #3. I had a really long paragraph > on why and how we could do that, but I've redacted it. Let's just leave > it as, I think that should be the goal. > Daniel could start taking topic branches like Ingo does, however he'd have a lot of fun merging them, he's already getting closer and closer to the extreme stuff -tip does, and he'd have to feed the topics to QA and possibly -next separately, the question is when to include a branch or not include it. Maybe he can schedule a time that QA gets all the branches, and maybe not put stuff into -next until we are sure its on its way. Dave.
On Sat, 27 Jul 2013 09:13:38 +1000 Dave Airlie <airlied@gmail.com> wrote: > > > > Hey, overall it's actually quite a bit of fun. > > > > I do agree that QA is really important for a fastpaced process, but > > it's also not the only peace to get something in. Review (both of the > > patch itself but also of the test coverage) catches a lot of issues, > > and in many cases not the same ones as QA would. Especially if the > > testcoverage of a new feature is less than stellar, which imo is still > > the case for gem due to the tons of finickle cornercases. > > Just my 2c worth on this topic, since I like the current process, and > I believe making it too formal is probably going to make things suck > too much. > > I'd rather Daniel was slowing you guys down up front more, I don't > give a crap about Intel project management or personal manager relying > on getting features merged when, I do care that you engineers when you > merge something generally get transferred 100% onto something else and > don't react strongly enough to issues on older code you have created > that either have lain dormant since patches merged or are regressions > since patches merged. So I believe the slowing down of merging > features gives a better chance of QA or other random devs of finding > the misc regressions while you are still focused on the code and > hitting the long term bugs that you guys rarely get resourced to fix > unless I threaten to stop pulling stuff. > > So whatever Daniel says goes as far as I'm concerned, if I even > suspect he's taken some internal Intel pressure to merge some feature, > I'm going to stop pulling from him faster than I stopped pulling from > the previous maintainers :-), so yeah engineers should be prepared to > backup what they post even if Daniel is wrong, but on the other hand > they need to demonstrate they understand the code they are pushing and > sometimes with ppgtt and contexts I'm not sure anyone really > understands how the hw works let alone the sw :-P Some of this is driven by me, because I have one main goal in mind in getting our code upstream: I want high quality kernel support for our products upstream and released, in an official Linus release, before the product ships. That gives OSVs and other downstream consumers of the code a chance to get the bits and be ready when products start rolling out. Without a bounded time process for getting bits upstream, that can't happen. That's why I was trying to encourage reviewers to provide specific feedback, since vague feedback is more likely to leave a patchset in the doldrums and de-motivate the author. I think the "slowing things down" may hurt more than it helps here. For example all the time Paulo spends on refactoring and rebasing his PC8 stuff is time he could have spent on HSW bugs instead. Likewise with Ben's stuff (and there the rebasing is actually reducing quality rather than increasing it, at least from a bug perspective).
>> > I do agree that QA is really important for a fastpaced process, but >> > it's also not the only peace to get something in. Review (both of the >> > patch itself but also of the test coverage) catches a lot of issues, >> > and in many cases not the same ones as QA would. Especially if the >> > testcoverage of a new feature is less than stellar, which imo is still >> > the case for gem due to the tons of finickle cornercases. >> >> Just my 2c worth on this topic, since I like the current process, and >> I believe making it too formal is probably going to make things suck >> too much. >> >> I'd rather Daniel was slowing you guys down up front more, I don't >> give a crap about Intel project management or personal manager relying >> on getting features merged when, I do care that you engineers when you >> merge something generally get transferred 100% onto something else and >> don't react strongly enough to issues on older code you have created >> that either have lain dormant since patches merged or are regressions >> since patches merged. So I believe the slowing down of merging >> features gives a better chance of QA or other random devs of finding >> the misc regressions while you are still focused on the code and >> hitting the long term bugs that you guys rarely get resourced to fix >> unless I threaten to stop pulling stuff. >> >> So whatever Daniel says goes as far as I'm concerned, if I even >> suspect he's taken some internal Intel pressure to merge some feature, >> I'm going to stop pulling from him faster than I stopped pulling from >> the previous maintainers :-), so yeah engineers should be prepared to >> backup what they post even if Daniel is wrong, but on the other hand >> they need to demonstrate they understand the code they are pushing and >> sometimes with ppgtt and contexts I'm not sure anyone really >> understands how the hw works let alone the sw :-P > > Some of this is driven by me, because I have one main goal in mind in > getting our code upstream: I want high quality kernel support for our > products upstream and released, in an official Linus release, before the > product ships. That gives OSVs and other downstream consumers of the > code a chance to get the bits and be ready when products start rolling > out. Your main goal is however different than mine, my main goal is to not regress the code that is already upstream and have bugs in it fixed. Slowing down new platform merges seems to do that a lot better than merging stuff :-) I realise you guys pay lip service to my goals at times, but I often get the feeling that you'd rather merge HSW support and run away to the next platform than spend a lot of time fixing reported bugs in Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*. It would be nice to be proven wrong once in a while where someone is actually assigned a bug fix in preference to adding new features for new platforms. Dave.
On Sat, Jul 27, 2013 at 06:52:55PM +1000, Dave Airlie wrote: > On Sat, Jul 27, 2013 at 10:05 AM, Ben Widawsky <ben@bwidawsk.net> wrote: > > On Sat, Jul 27, 2013 at 09:13:38AM +1000, Dave Airlie wrote: > >> > > >> > Hey, overall it's actually quite a bit of fun. > >> > > >> > I do agree that QA is really important for a fastpaced process, but > >> > it's also not the only peace to get something in. Review (both of the > >> > patch itself but also of the test coverage) catches a lot of issues, > >> > and in many cases not the same ones as QA would. Especially if the > >> > testcoverage of a new feature is less than stellar, which imo is still > >> > the case for gem due to the tons of finickle cornercases. > >> > >> Just my 2c worth on this topic, since I like the current process, and > >> I believe making it too formal is probably going to make things suck > >> too much. > >> > >> I'd rather Daniel was slowing you guys down up front more, I don't > >> give a crap about Intel project management or personal manager relying > >> on getting features merged when, I do care that you engineers when you > >> merge something generally get transferred 100% onto something else and > >> don't react strongly enough to issues on older code you have created > >> that either have lain dormant since patches merged or are regressions > >> since patches merged. So I believe the slowing down of merging > >> features gives a better chance of QA or other random devs of finding > >> the misc regressions while you are still focused on the code and > >> hitting the long term bugs that you guys rarely get resourced to fix > >> unless I threaten to stop pulling stuff. > >> > >> So whatever Daniel says goes as far as I'm concerned, if I even > >> suspect he's taken some internal Intel pressure to merge some feature, > >> I'm going to stop pulling from him faster than I stopped pulling from > >> the previous maintainers :-), so yeah engineers should be prepared to > >> backup what they post even if Daniel is wrong, but on the other hand > >> they need to demonstrate they understand the code they are pushing and > >> sometimes with ppgtt and contexts I'm not sure anyone really > >> understands how the hw works let alone the sw :-P > >> > >> Dave. > > > > Honestly, I wouldn't have responded if you didn't mention the Intel > > program management thing... > > > > The problem I am trying to emphasize, and let's use contexts/ppgtt as an > > example, is we have three options: > > 1. It's complicated, and a big change, so let's not do it. > > 2. I continue to rebase the massive change on top of the extremely fast > > paced i915 tree, with no QA coverage. > > 3. We get decent bits merged ASAP by putting it in a repo that both gets > > much wider usage than my personal branch, and gets nightly QA coverage. > > > > PPGTT + Contexts have existed for a while, and so we went with #1 for > > quite a while. > > > > Now we're at #2. There's two sides to your 'developer needs to > > defend...' I need Daniel to give succinct feedback, and agree upon steps > > required to get code merged. My original gripe was that it's hard to > > deal with the, "that patch is too big" comments almost 2 months after > > the first version was sent. Equally, "that looks funny" without a real > > explanation of what looks funny, or sufficient thought up front about > > what might look better is just as hard to deal with. Inevitably, yes - > > it's a big scary series of patches - but if we're honest with ourselves, > > it's almost guaranteed to blow up somewhere regardless of how much we > > rework it, and who reviews it. Blowing up long before you merge would > > always be better than the after you merge. > > > > My desire is to get to something like #3. I had a really long paragraph > > on why and how we could do that, but I've redacted it. Let's just leave > > it as, I think that should be the goal. > > > > Daniel could start taking topic branches like Ingo does, however he'd > have a lot of fun merging them, > he's already getting closer and closer to the extreme stuff -tip does, > and he'd have to feed the topics to QA and possibly -next separately, > the question is when to include a branch or not include it. Yeah, I guess eventually we need to go more crazy with the branching model for drm/i915. But even getting to the current model was quite some fun, so I don't want to rock the boat too much if not required ;-) Also I fear that integrating random developer branches myself will put me at an ugly spot where I partially maintain (due to the regular merge conflicts) patches I haven't yet accepted. And since I'm only human I'll then just merge patches to get rid of the merge pain. So I don't really want to do that. Similarly for the internal tree (which just contains hw enabling for platforms we're not yet allowed to talk about and some related hacks) I've put down the rule that I won't take patches which are not upstream material (minus the last bit of polish and no real review requirement). Otherwise I'll start to bend the upstream rules a bit ... ;-) > Maybe he can schedule a time that QA gets all the branches, and maybe > not put stuff into -next until we are sure its on its way. Imo the solution here is for QA to beat the nightly test infrastructure into a solid enough shape that it can run arbitrary developer branches, unattended. I think we're slowly getting there (but for obvious reasons that's no my main aim as the maintainer when working together with our QA guys). Cheers, Daniel
The nice thing with kicking off a process discussion before disappearing into vacation is that I've had a long time to come up with some well-sharpened opinions. And what better way to start than with a good old-fashioned flamewar ;-) On Tue, Jul 30, 2013 at 09:50:21AM +1000, Dave Airlie wrote: > >> > I do agree that QA is really important for a fastpaced process, but > >> > it's also not the only peace to get something in. Review (both of the > >> > patch itself but also of the test coverage) catches a lot of issues, > >> > and in many cases not the same ones as QA would. Especially if the > >> > testcoverage of a new feature is less than stellar, which imo is still > >> > the case for gem due to the tons of finickle cornercases. > >> > >> Just my 2c worth on this topic, since I like the current process, and > >> I believe making it too formal is probably going to make things suck > >> too much. > >> > >> I'd rather Daniel was slowing you guys down up front more, I don't > >> give a crap about Intel project management or personal manager relying > >> on getting features merged when, I do care that you engineers when you > >> merge something generally get transferred 100% onto something else and > >> don't react strongly enough to issues on older code you have created > >> that either have lain dormant since patches merged or are regressions > >> since patches merged. So I believe the slowing down of merging > >> features gives a better chance of QA or other random devs of finding > >> the misc regressions while you are still focused on the code and > >> hitting the long term bugs that you guys rarely get resourced to fix > >> unless I threaten to stop pulling stuff. > >> > >> So whatever Daniel says goes as far as I'm concerned, if I even > >> suspect he's taken some internal Intel pressure to merge some feature, > >> I'm going to stop pulling from him faster than I stopped pulling from > >> the previous maintainers :-), so yeah engineers should be prepared to > >> backup what they post even if Daniel is wrong, but on the other hand > >> they need to demonstrate they understand the code they are pushing and > >> sometimes with ppgtt and contexts I'm not sure anyone really > >> understands how the hw works let alone the sw :-P > > > > Some of this is driven by me, because I have one main goal in mind in > > getting our code upstream: I want high quality kernel support for our > > products upstream and released, in an official Linus release, before the > > product ships. That gives OSVs and other downstream consumers of the > > code a chance to get the bits and be ready when products start rolling > > out. Imo the "unpredictable upstream" vs. "high quality kernel support in upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ I am not willing to compromise on decent quality. I still claim that upstreaming is a fairly predictable thing (whithin some bounds of how well some tasks can be estimated up-front without doing some research or prototyping), and the blocker here is our mediocre project tracking. I've thought a bit about this (and read a few provoking books about the matter) over vacation and I fear I get to demonstrate this only by running the estimation show myself a bit. But atm I'm by far not frustrated enough yet with the current state of affairs to sign up for that - still chewing on that maintainer thing ;-) > Your main goal is however different than mine, my main goal is to > not regress the code that is already upstream and have bugs in it > fixed. Slowing down new platform merges seems to do that a lot > better than merging stuff :-) > > I realise you guys pay lip service to my goals at times, but I often > get the feeling that you'd rather merge HSW support and run away > to the next platform than spend a lot of time fixing reported bugs in > Ironlake/Sandybridge/Ivybridge *cough RC6 after suspend/resume*. > > It would be nice to be proven wrong once in a while where someone is > actually assigned a bug fix in preference to adding new features for new > platforms. Well, that team is 50% Chris&me with other people (many from the community ...) rounding things off. That is quite a bit better than a year ago (and yep, we blow up stuff, too) but not great. And it's imo also true that Intel as a company doesn't care one bit once the hw is shipped. My approach here has been to be a royal jerk about test coverage for new features and blocking stuff if a regression isn't tackled in time. People scream all around, but it seems to work and we're imo getting to a "farly decent regression handling" point. I also try to push for enabling features across platforms (if the hw should work the same way) in the name of increased test coverage. That one seems to be less effective (e.g. fbc for hsw only ...). Cheers, Daniel
On Fri, Jul 26, 2013 at 10:12:43AM -0700, Jesse Barnes wrote: > On Fri, 26 Jul 2013 18:08:48 +0100 > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > On Fri, Jul 26, 2013 at 09:59:42AM -0700, Jesse Barnes wrote: > > > 4) review comments should be concrete and actionable, and ideally not > > > leave the author hanging with hints about problems the reviewer > > > has spotted, leaving the author looking for easter eggs > > > > Where am I going to find my fun, if I am not allowed to tell you that > > you missed a zero in a thousand line patch but not tell you where? > > Spoilsport :-p > > You'll just need to take up golf or something. :) Poignant opinion from the guy who bored himself on vacations: I disagree on two grounds: Chris without the occasional easter-egg sprinkling just wouldn't be Chris anymore, at least how I know him. Imo we're a bunch of individuals, quirks and all, not a pile of interchangeable cogs that just churn out code. And yes am as amused as the next guy when I spoil by pants by inadvertedly sitting in one of Chris' easter-eggs, otoh I can't help not grinning when I discover them in time ;-) Which leads to the "where's the fun?" question. I've started hacking on drm/i915 because it's fun (despite the frustration). And the fun is what keeps me slogging through bug reports each morning. So if we ditch that in the name of efficiency that'll affect my productivity a lot (just not in the intended direction) and you'll probably need to look for a new maintainer ... With that out of the way I'm obviously not advocating for unclear review - mail is an occasional rather lossy communication medium and we need to keep that in mind all the time. I'm only against your easter egg comment, since throwing those out with the badwather is imo bad. Cheers, Daniel
On Sun, 4 Aug 2013 22:17:47 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > Imo the "unpredictable upstream" vs. "high quality kernel support in > upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ > I am not willing to compromise on decent quality. I still claim that > upstreaming is a fairly predictable thing (whithin some bounds of how well > some tasks can be estimated up-front without doing some research or > prototyping), and the blocker here is our mediocre project tracking. Well, I definitely disagree here. With our current (and recent past) processes, we've generally ended up with lots of hw support landing well after parts start shipping, and the quality hasn't been high (in terms of user reported bugs) despite all the delay. So while our code might look pretty, the fact is that it's late, and has hard to debug low level bugs (RC6, semaphores, etc). <rant> It's fairly easy to add support for hardware well after it ships, and in a substandard way (e.g. hard power features disabled because we can't figure them out because the hw debug folks have moved on). If we want to keep doing that, fine, but I'd really like us to do better and catch the hard bugs *before* hw ships, and make sure it's solid and complete *before* users get it. But maybe that's just me. Maybe treating our driver like any other RE or "best effort" Linux driver is the right way to go. If so, fine, let's just not change anything. </rant> > My approach here has been to be a royal jerk about test coverage for new > features and blocking stuff if a regression isn't tackled in time. People > scream all around, but it seems to work and we're imo getting to a "farly > decent regression handling" point. I also try to push for enabling > features across platforms (if the hw should work the same way) in the name > of increased test coverage. That one seems to be less effective (e.g. fbc > for hsw only ...). But code that isn't upstream *WON'T BE TESTED* reasonably. So if you're waiting for all tests to be written before going upstream, all you're doing is delaying the bug reports that will inevitably come in, both from new test programs and from general usage. On top of that, if someone is trying to refactor at the same time, things just become a mess with all sorts of regressions introduced that weren't an issue with the original patchset...
On Mon, Aug 5, 2013 at 11:33 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Sun, 4 Aug 2013 22:17:47 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: >> Imo the "unpredictable upstream" vs. "high quality kernel support in >> upstream" is a false dichotomy. Afaics the "unpredictability" is _because_ >> I am not willing to compromise on decent quality. I still claim that >> upstreaming is a fairly predictable thing (whithin some bounds of how well >> some tasks can be estimated up-front without doing some research or >> prototyping), and the blocker here is our mediocre project tracking. > > Well, I definitely disagree here. With our current (and recent past) > processes, we've generally ended up with lots of hw support landing > well after parts start shipping, and the quality hasn't been high (in > terms of user reported bugs) despite all the delay. So while our code > might look pretty, the fact is that it's late, and has hard to debug > low level bugs (RC6, semaphores, etc). > > <rant> > It's fairly easy to add support for hardware well after it ships, and > in a substandard way (e.g. hard power features disabled because we > can't figure them out because the hw debug folks have moved on). If we > want to keep doing that, fine, but I'd really like us to do better and > catch the hard bugs *before* hw ships, and make sure it's solid and > complete *before* users get it. But maybe that's just me. Maybe > treating our driver like any other RE or "best effort" Linux driver is > the right way to go. If so, fine, let's just not change anything. > </rant> The only thing I read here, both in the paragraph above and in the rant is that we suck. I agree. My opinion is that this is because we've started late, had too few resources and didn't seriously estimate how much work is actually involved to enable something for real. The only reason I could distill from the above two paragraphs among the ranting for way we are so much late is "So while our code might look pretty, it's late and buggy". That's imo a farily shallow stab at preceived bikesheds, but not a useful angle to improve process. Now I agree that I uphold a fairly high quality standard for upstreaming, but not an unreasonable one: - drm/i915 transformed from the undisputed shittiest driver in the kernel to one that mostly just works, while picking up development pace. So I don't think I'm fully wrong on insisting on this level of quality. - we do ship the driver essentially continously, which means we can implement features only by small refactoring steps. That clearly involves more work than just stitching something together for a product. I welcome discussing whether I impose t0o high standards, but that needs to be supplied with examples and solid reasons. "It's just too hard" without more context isn't one, since yes, the work we pull off here actually is hard. Also note that Chris&me still bear the brute of fixing the random fallout all over (it's getting better). So if any proposed changes involves me blowing through even more time to track down issues I'm strongly not in favour. Same holds for Chris often-heard comment that a patch needs an improved commit message or a comment somewhere. Yes it's annoying that you need to resend it (this often bugs me myself) just to paint the bikeshed a bit different. But imo Chris is pretty much throughout spot-on with his requests and a high-quality git history has, in my experience at least, been extremely valueable to track down the really ugly issues and legalese around all the established precendence. >> My approach here has been to be a royal jerk about test coverage for new >> features and blocking stuff if a regression isn't tackled in time. People >> scream all around, but it seems to work and we're imo getting to a "farly >> decent regression handling" point. I also try to push for enabling >> features across platforms (if the hw should work the same way) in the name >> of increased test coverage. That one seems to be less effective (e.g. fbc >> for hsw only ...). > > But code that isn't upstream *WON'T BE TESTED* reasonably. So if > you're waiting for all tests to be written before going upstream, all > you're doing is delaying the bug reports that will inevitably come in, > both from new test programs and from general usage. On top of that, if > someone is trying to refactor at the same time, things just become a > mess with all sorts of regressions introduced that weren't an issue > with the original patchset... QA on my trees and the igt testcoverage I demand for new features is to catch regressions once something is merged. We've managed to break code in less than a day since it's merged on multiple occasions, so this is very real and just part of the quality standard I impose. Furthermore I don't want that a new feature regresses overall stability of our driver. And since that quality is increasing rather decently I ask for more testcases to exercise cornercases to make sure they're all covered. This is very much orthogonal to doing review and just one more puzzle to ensure we don't go back to the neat old days of shipping half-baked crap. Note that nowadays QA is catching a lot of the regressions even before the patches land in Dave's tree (sometimes there's the occasional brown paper bag event though, but in each such case I analysis the failure mode and work to prevent it in the future). And imo that's squarely due to much improved test coverage and the rigid test coverage requirements for new feautures I impose. And of course the overall improve QA process flow with much quicker regression turnaround times also greatly helps here. Now I agree (and I think I've mentioned this a bunch of times in this thread already) that this leads to a pain for developers. I see two main issues, both are (slowly) improving: - Testing patch series for regressions before merging. QA just set up the developer patch test system, and despite that it's still rather limited Ben seems to be fairly happy with where it's going. So I think we're on track to improve this and avoid the need for developers to have a private lab like Chris and I essentially have. - Rebase hell due to ongoing other work. Thus far I've only tried to help here by rechecking/delaying refactoring patches while big features are pending. I think we need to try new approaches here and imo better planing should help. E.g. the initial modeset refactor was way too big and a monolithic junk that I've just wrestled in by exorting r-b tags from you. In contrast the pipe config rework was about equally big, but at any given time only about 30-50 patches where outstanding (in extreme cases), and mutliple people contributed different parts of the overall beast. Of course that means that occasional, for really big stuff, we need to plan to write a first proof of concept as a landmark where we need to go to, which pretty much will be thrown away completely. One meta-comment on top of the actual discussion: I really appreciate critique and I've grown a good maintainer-skin to also deal with really harsh critique. But I prefer less ranting and more concrete examples where I've botched the job (there are plentiful to pick from imo) and concrete suggestion for how to improve our overall process. I think these process woes are painful for everyone and due to our fast growth we're constantly pushing into new levels of ugly, but imo the way to go forward is by small (sometimes positively tiny), but continous adjustements and improvements. I think we both agree where we'd like to be, but at least for me in the day-to-day fight in the trenches the rosy picture 200 miles away doesn't really help. Maybe I'm too delusional and sarcastic that way ;-) Cheers, Daniel
On Tue, 6 Aug 2013 00:19:33 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > The only thing I read here, both in the paragraph above and in the > rant is that we suck. I agree. My opinion is that this is because > we've started late, had too few resources and didn't seriously > estimate how much work is actually involved to enable something for > real. No, it's more than that, we suck in very specific ways: 1) large (and sometimes even small) features take waaay too long to land upstream, taking valuable developer time away from other things like bug fixing, regressions, etc 2) hw support lands late, which makes it harder to get debug traction with tough bugs (e.g. RC6) > > The only reason I could distill from the above two paragraphs among > the ranting for way we are so much late is "So while our code might > look pretty, it's late and buggy". That's imo a farily shallow stab at > preceived bikesheds, but not a useful angle to improve process. No, I suggested improvements to our process earlier, and it sounded like you mostly agreed, though seemed to deny point that we spin for too long on things (point #1 above). > Now I agree that I uphold a fairly high quality standard for > upstreaming, but not an unreasonable one: > - drm/i915 transformed from the undisputed shittiest driver in the > kernel to one that mostly just works, while picking up development > pace. So I don't think I'm fully wrong on insisting on this level of > quality. > - we do ship the driver essentially continously, which means we can > implement features only by small refactoring steps. That clearly > involves more work than just stitching something together for a > product. <sarcasm> You're way off base here. We should ship a shitty driver and just land everything without review or testing. That way we can go really fast. Your quality standards are too high (in that they exist at all). </sarcasm> More seriously, quality should be measured by the end result in terms of bugs and how users actually use our stuff. I'm not sure if that's what you mean by a "high quality standard". Sometimes it seems you care more about refactoring things ad-infinitum than tested code. > Also note that Chris&me still bear the brute of fixing the random > fallout all over (it's getting better). So if any proposed changes > involves me blowing through even more time to track down issues I'm > strongly not in favour. Same holds for Chris often-heard comment that > a patch needs an improved commit message or a comment somewhere. Yes > it's annoying that you need to resend it (this often bugs me myself) > just to paint the bikeshed a bit different. But imo Chris is pretty > much throughout spot-on with his requests and a high-quality git > history has, in my experience at least, been extremely valueable to > track down the really ugly issues and legalese around all the > established precendence. Again, no one is suggesting that we have shitty changelogs or that we add comments. Not sure why you brought that up. > - Rebase hell due to ongoing other work. Thus far I've only tried to > help here by rechecking/delaying refactoring patches while big > features are pending. I think we need to try new approaches here and > imo better planing should help. E.g. the initial modeset refactor was > way too big and a monolithic junk that I've just wrestled in by > exorting r-b tags from you. In contrast the pipe config rework was > about equally big, but at any given time only about 30-50 patches > where outstanding (in extreme cases), and mutliple people contributed > different parts of the overall beast. Of course that means that > occasional, for really big stuff, we need to plan to write a first > proof of concept as a landmark where we need to go to, which pretty > much will be thrown away completely. This is the real issue. We don't have enough people to burn on single features for 6 months each so they can be rewritten 3 times until they look how you would have done it. If we keep doing that, you may as well write all of it, and we'll be stuck in my <rant> from the previous message. That's why I suggested the two reviewed-by tags ought to be sufficient as a merge criteria. Sure, there may be room for refactoring, but if things are understandable by other developers and well tested, why block them? > One meta-comment on top of the actual discussion: I really appreciate > critique and I've grown a good maintainer-skin to also deal with > really harsh critique. But I prefer less ranting and more concrete > examples where I've botched the job (there are plentiful to pick from > imo) and concrete suggestion for how to improve our overall process. I've suggested some already, but they've fallen on deaf ears afaict. I don't know what more I can do to convince you that you acting as a review/refactor bottleneck actively undermines the goals I think we share. But I'm done with this thread. Maybe others want to comment on things they might think improve the situation.
Like I've said in my previous mail I expect such discussions to be hard and I also think stopping now and giving up is the wrong approach. So another round. On Tue, Aug 6, 2013 at 1:34 AM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Tue, 6 Aug 2013 00:19:33 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: >> The only thing I read here, both in the paragraph above and in the >> rant is that we suck. I agree. My opinion is that this is because >> we've started late, had too few resources and didn't seriously >> estimate how much work is actually involved to enable something for >> real. > > No, it's more than that, we suck in very specific ways: > 1) large (and sometimes even small) features take waaay too long to > land upstream, taking valuable developer time away from other > things like bug fixing, regressions, etc > 2) hw support lands late, which makes it harder to get debug traction > with tough bugs (e.g. RC6) > >> >> The only reason I could distill from the above two paragraphs among >> the ranting for way we are so much late is "So while our code might >> look pretty, it's late and buggy". That's imo a farily shallow stab at >> preceived bikesheds, but not a useful angle to improve process. > > No, I suggested improvements to our process earlier, and it sounded > like you mostly agreed, though seemed to deny point that we spin for > too long on things (point #1 above). I'll cover your process suggestions below, since some of your clarifications below shine a different light onto them. But overall I agree, even that we seem to spin sometimes awfully long. >> Now I agree that I uphold a fairly high quality standard for >> upstreaming, but not an unreasonable one: >> - drm/i915 transformed from the undisputed shittiest driver in the >> kernel to one that mostly just works, while picking up development >> pace. So I don't think I'm fully wrong on insisting on this level of >> quality. >> - we do ship the driver essentially continously, which means we can >> implement features only by small refactoring steps. That clearly >> involves more work than just stitching something together for a >> product. > > <sarcasm> > You're way off base here. We should ship a shitty driver and just land > everything without review or testing. That way we can go really fast. > Your quality standards are too high (in that they exist at all). > </sarcasm> > > More seriously, quality should be measured by the end result in terms > of bugs and how users actually use our stuff. I'm not sure if that's > what you mean by a "high quality standard". Sometimes it seems you > care more about refactoring things ad-infinitum than tested code. I often throw in a refactoring suggestion when people work on a feature, that's right. Often it is also a crappy idea, but imo for long-term maintainance a neat&tidy codebase is really important. So I'll just throw them out and see what sticks with people. I realize that pretty much all of the quality standard discussion here is really fluffy, but like I explained I get to bear a large part of the "keep it going" workload. And as long as that's the case I frankly think my standards carry more weight. Furthermore in the cases where other people from our team chip in with bugfixing that's mostly in cases where a self-check or testcase clearly puts the blame on them. So if that is the only way to volunteer people I'll keep asking for those things (and delay patches indefinitely like e.g. your fastboot stuff). And like I've said I'm open to discuss those requirements, but I freely admit that I have a rather solid ground resolve on this topic. >> Also note that Chris&me still bear the brute of fixing the random >> fallout all over (it's getting better). So if any proposed changes >> involves me blowing through even more time to track down issues I'm >> strongly not in favour. Same holds for Chris often-heard comment that >> a patch needs an improved commit message or a comment somewhere. Yes >> it's annoying that you need to resend it (this often bugs me myself) >> just to paint the bikeshed a bit different. But imo Chris is pretty >> much throughout spot-on with his requests and a high-quality git >> history has, in my experience at least, been extremely valueable to >> track down the really ugly issues and legalese around all the >> established precendence. > > Again, no one is suggesting that we have shitty changelogs or that we > add comments. Not sure why you brought that up. I added it since I've just read through some of the patches on the android internal branch yesterday and a lot of those patches fall through on the "good enough commit message" criterion (mostly by failing to explain why the patch is needed). I've figured that's relevant since on internal irc you've said even pushing fixes to upstream is a PITA since they require 2-3 rounds to get in. To keep things concrete one such example is Kamal's recent rc6 fix where I've asked for a different approach and he sounded rather pissed that I don't just take his patch as-is. But after I've explained my reasoning he seemed to agree, at least he sent out a revised version. And the changes have all been what I guess you'd call bikesheds, since it was just shuffling the code logic around a bit and pimping the commit message. I claim that this is worth it and I think your stance is that we shouldn't delay patches like this. Or is this a bad example for a patch which you think was unduly delayed? Please bring another one up in this case, I really think process discussions are easier with concrete examples. >> - Rebase hell due to ongoing other work. Thus far I've only tried to >> help here by rechecking/delaying refactoring patches while big >> features are pending. I think we need to try new approaches here and >> imo better planing should help. E.g. the initial modeset refactor was >> way too big and a monolithic junk that I've just wrestled in by >> exorting r-b tags from you. In contrast the pipe config rework was >> about equally big, but at any given time only about 30-50 patches >> where outstanding (in extreme cases), and mutliple people contributed >> different parts of the overall beast. Of course that means that >> occasional, for really big stuff, we need to plan to write a first >> proof of concept as a landmark where we need to go to, which pretty >> much will be thrown away completely. > > This is the real issue. We don't have enough people to burn on > single features for 6 months each so they can be rewritten 3 times until > they look how you would have done it. If we keep doing that, you > may as well write all of it, and we'll be stuck in my <rant> from > the previous message. That's why I suggested the two reviewed-by tags > ought to be sufficient as a merge criteria. Sure, there may be room > for refactoring, but if things are understandable by other developers > and well tested, why block them? I'd like to see an example here for something that I blocked, really. One I could think up is the ips feature from Paulo where I've asked to convert it over to the pipe config tracking. But I asked for that specifically so that one of our giant long-term feature goals (atomic modeset) doesn't move further away, so I think for our long-term aims this request was justified. Otherwise I have a hard time coming up with features that had r-b tags from one of the domain expert you've listed (i.e. where understood well) and I blocked them. It's true that I often spot something small when applying a patch, but I also often fix it up while applying (mostly adding notes to the commit message) or asking for a quick follow-up fixup patch. >> One meta-comment on top of the actual discussion: I really appreciate >> critique and I've grown a good maintainer-skin to also deal with >> really harsh critique. But I prefer less ranting and more concrete >> examples where I've botched the job (there are plentiful to pick from >> imo) and concrete suggestion for how to improve our overall process. > > I've suggested some already, but they've fallen on deaf ears afaict. Your above clarification that the 2 r-b tags (one from the domain expert) should overrule my concern imo makes your original proposal a bit different - my impression was that you've asked for 2 r-b tags, period. Which would be more than what we currently have, and since we have a hard time doing even that would imo I think asking for 2 r-b tags is completely unrealistic. One prime example is Ville's watermark patches, which have been ready (he only did a very few v2 versions for bikesheds) since over a month ago. But stuck since no one bothered to review them. So your suggestions (points 1) thru 4) in your original mail in this thread) haven't fallen on deaf ears. Specifically wrt review from domain experts I'm ok with just an informal ack and letting someone else do the detailed review. That way the 2nd function of reviewing of diffusing knowledge in our distributed team works better when I pick non-domain-experts. > I don't know what more I can do to convince you that you acting as a > review/refactor bottleneck actively undermines the goals I think we > share. I disagree that I'm a bottleneck. Just yesterday I've merged roughly 50 patches because they where all nicely reviewed. And like I've said some of those patches have been stuck for a month in no-one-bothers-to-review-them limbo land. If we drag out another example and look at the ppgtt stuff from Ben which I've asked to be reworked quite a bit. Now one mistake I've done is to be way too optimistic about how much time this will take when hashing out a merge plan with Ben. I've committed the mistake of trying to fit the work that I think needs to be done into the available time Ben has and so done the same wishful thinking planning I complain about all the time. Next time around I'll try to make an honest plan first and then try to fit it into the time we have instead of the other way round. But I really think the rework was required since with the original patch series I was often left with the nagging feeling that I just don't understand what's going on, and whether I'd really be able to track down a regression if it bisected to one of the patches. So I couldn't slap an honset r-b tag onto it. The new series is imo great and a joy to review. So again please bring up an example where I've failed and we can look at it and figure out what needs to change to improve the process. Imo those little patches and adjustements to our process are the way forward. At least that approach worked really well for beating our kernel QA process into shape. And yes, it's tedious and results will take time to show up. > But I'm done with this thread. Maybe others want to comment on things > they might think improve the situation. I'm not letting you off the hook that easily ;-) Cheers, Daniel
Hi A few direct responses and my 2 cents at the end. This is all my humble opinion, feel free to disagree or ignore it :) 2013/8/6 Daniel Vetter <daniel@ffwll.ch>: > > I often throw in a refactoring suggestion when people work on a > feature, that's right. Often it is also a crappy idea, but imo for > long-term maintainance a neat&tidy codebase is really important. So > I'll just throw them out and see what sticks with people. > The problem is that if you throw and it doesn't stick, then people feel you won't merge it. So they kinda feel they have to do it all the time. Another thing is that sometimes the refactoring is just plain bikeshedding, and that leads to demotivated workers. People write things on their way, but then they are forced to do it in another way, which is also correct, but just different, and wastes a lot of time. And I'm not talking specifically about Daniel's suggestions, everybody does this kind of bikeshedding (well, I'm sure I do). If someone gives a bikeshed to a patch, Daniel will see there's an unattended review comment and will not merge the patch at all, so basically a random reviewer can easily block someone else's patch. I guess we all should try to give less bikeshedding, including me. > > One prime example is Ville's watermark patches, which have been ready > (he only did a very few v2 versions for bikesheds) since over a month > ago. But stuck since no one bothered to review them. Actually I subscribed myself to review (on review board) and purposely waited until he was back from vacation before I would start the review. I also did early 0-day testing on real hardware, which is IMHO way much more useful than just reviewing. Something that happened many times for me in the past: I reviewed a patch, thought it was correct, then decided to boot the patch before sending the R-B email and found a bug. And my 2 cents: Daniel and Jesse are based on different premises, which means they will basically discuss forever until they realize that. In an exaggerated view, Daniel's premises: - Merging patches with bugs is unacceptable - Colorary: users should never have to report bugs/regressions - Delaying patch merging due to refactoring or review comments will always make it better In the same exaggerated view, Jesse's premises: - Actual user/developer testing is more valuable than review and refactoring - Colorary: merging code with bugs is acceptable, we want the bug reports - Endless code churn due to review/refactoring may actually introduce bugs not present in the first version Please tell me if I'm wrong. From my point of view, this is all about tradeoffs and you two stand on different positions in these tradeoffs. Example: - Time time you save by not doing all the refactoring/bikeshedding can be spent doing bug fixing or reviewing/testing someone else's patches. - But the question is: which one is more worth it? An hour refactoring/rebasing so the code behaves exactly like $reviewer wants, or an hour staring at bugzilla or reviewing/testing patches? - From my point of view, it seems Daniel assumes people will always spend 0 time fixing bugs, that's why he requests people so much refactoring: the tradeoff slider is completely at one side. But that's kind of a vicious/virtuous cycle: the more he increases his "quality standards", the more we'll spend time on the refactorings, so we'll spend even less time on bugzilla", so Daniel will increase the standards even more due to even less time spent on bugzilla, and so on. One thing which we didn't discuss explicitly right now and IMHO is important is how people *feel* about all this. It seems to me that the current amount of reworking required is making some people (e.g., Jesse, Ben) demotivated and unhappy. While this is not really a measurable thing, I'm sure it negatively affects the rate we improve our code base and fix our bugs. If we bikeshed a feature to the point where the author gets fed up with it and just wants it to get merged, there's a high chance that future bugs discovered on this feature won't be solved that quickly due the stressful experience the author had with the feature. And sometimes the unavoidable "I'll just implement whatever review comments I get because I'm so tired about this series and now I just want to get it merged" attitude is a very nice way to introduce bugs. And one more thing. IMHO this discussion should all be on how we deal with the people on our team, who get paid to write this code. When external people contribute patches to us, IMHO we should give them big thanks, send emails with many smileys, and hold all our spotted bikesheds to separate patches that we'll send later. Too high quality standards doesn't seem to be a good way to encourage people who don't dominate our code base. My possible suggestions: - We already have drm-intel-next-queued as a barrier to protect against bugs in merged patches (it's a barrier to drm-intel-next, which external people should be using). Even though I do not spend that much time on bugzilla bugs, I do rebase on dinq/nightly every day and try to make sure all the regressions I spot are fixed, and I count this as "bug fixing time". What if we resist our OCDs and urge to request reworks, then merge patches to dinq more often? To compensate for this, if anybody reports a single problem in a patch or series present on dinq, it gets immediately reverted (which means dinq will either do lots of rebasing or contain many many reverts). And we try to keep drm-intel-next away from all the dinq madness. Does that sound maintainable? - Another idea I already gave a few times is to accept features more easily, but leave them disabled by default until all the required reworks are there. Daniel rejected this idea because he feels people won't do the reworks and will leave the feature disabled by default forever. My counter-argument: 99% of the features we do are somehow tracked by PMs, we should make sure the PMs know features are still disabled, and perhaps open sub-tasks on the feature tracking systems to document that the feature is not yet completed since it's not enabled by default. In other words: this problem is too hard, it's about tradeoffs and there's no perfect solution that will please everybody. My just 2 cents, I hope to not have offended anybody :( Cheers, Paulo
On Tue, Aug 6, 2013 at 4:50 PM, Paulo Zanoni <przanoni@gmail.com> wrote: > A few direct responses and my 2 cents at the end. This is all my > humble opinion, feel free to disagree or ignore it :) I think you make some excellent points, so thanks a lot for joining the discussion. > 2013/8/6 Daniel Vetter <daniel@ffwll.ch>: >> >> I often throw in a refactoring suggestion when people work on a >> feature, that's right. Often it is also a crappy idea, but imo for >> long-term maintainance a neat&tidy codebase is really important. So >> I'll just throw them out and see what sticks with people. >> > > The problem is that if you throw and it doesn't stick, then people > feel you won't merge it. So they kinda feel they have to do it all the > time. > > Another thing is that sometimes the refactoring is just plain > bikeshedding, and that leads to demotivated workers. People write > things on their way, but then they are forced to do it in another way, > which is also correct, but just different, and wastes a lot of time. > And I'm not talking specifically about Daniel's suggestions, everybody > does this kind of bikeshedding (well, I'm sure I do). If someone gives > a bikeshed to a patch, Daniel will see there's an unattended review > comment and will not merge the patch at all, so basically a random > reviewer can easily block someone else's patch. I guess we all should > try to give less bikeshedding, including me. Yeah, that happens. With all the stuff going on I relly can't keep track of everything, so if it looks like the patch author and the reviewer are still going back&forth I just wait. And like I've explained in private once I don't like stepping in as the maintainer when this happens since I'm not the topic expert by far, so my assessment will be about as good as a coin-toss. Of course if the question centers around integration issues with the overall codebase I'll happily chime in. I think the only way to reduce time wasted in such stuck discussions is to admit that the best solution isn't clear and that adding a fixme comment somewhere to look at the issue again for the next platform (bug, regression, feature, ...) that touches the same area. Or maybe reconsider once everything has landed and it's clear what then end-result really looks like. >> One prime example is Ville's watermark patches, which have been ready >> (he only did a very few v2 versions for bikesheds) since over a month >> ago. But stuck since no one bothered to review them. > > Actually I subscribed myself to review (on review board) and purposely > waited until he was back from vacation before I would start the > review. I also did early 0-day testing on real hardware, which is IMHO > way much more useful than just reviewing. Something that happened many > times for me in the past: I reviewed a patch, thought it was correct, > then decided to boot the patch before sending the R-B email and found > a bug. Imo review shouldn't require you to apply the patches and test them. Of course if it helps you to convince yourself the patch is good I'm fine with that approach. But myself if I have doubts I prefer to check whether a testcase/selfcheck exists to exercise that corner case (and so will prevent this from also ever breaking again). Testing itself should be done by the developer (or bug reporter). Hopefully the developer patch test system that QA is now rolling out will help a lot in that regard. > And my 2 cents: > > Daniel and Jesse are based on different premises, which means they > will basically discuss forever until they realize that. > > In an exaggerated view, Daniel's premises: > - Merging patches with bugs is unacceptable > - Colorary: users should never have to report bugs/regressions > - Delaying patch merging due to refactoring or review comments will > always make it better > > In the same exaggerated view, Jesse's premises: > - Actual user/developer testing is more valuable than review and refactoring > - Colorary: merging code with bugs is acceptable, we want the bug reports > - Endless code churn due to review/refactoring may actually introduce > bugs not present in the first version > > Please tell me if I'm wrong. At least from my pov I think this is a very accurate description of our different assumptions and how that shapes how we perceive these process issues. > From my point of view, this is all about tradeoffs and you two stand > on different positions in these tradeoffs. Example: > - Time time you save by not doing all the refactoring/bikeshedding can > be spent doing bug fixing or reviewing/testing someone else's patches. > - But the question is: which one is more worth it? An hour > refactoring/rebasing so the code behaves exactly like $reviewer wants, > or an hour staring at bugzilla or reviewing/testing patches? > - From my point of view, it seems Daniel assumes people will always > spend 0 time fixing bugs, that's why he requests people so much > refactoring: the tradeoff slider is completely at one side. But that's > kind of a vicious/virtuous cycle: the more he increases his "quality > standards", the more we'll spend time on the refactorings, so we'll > spend even less time on bugzilla", so Daniel will increase the > standards even more due to even less time spent on bugzilla, and so > on. tbh I haven't considered that I might cause a negative feedback cycle here. One thing that seems to work (at least for me) is when we have good testcase. With QA's much improved regression reporting I can then directly assign a bug to the patch auther of the offending commit. That seems to help a lot in distributing the regression handling work. But more tests aren't a magic solution since they also take a lot of time to write. And in a few areas our test coverage gaps are still so big that relying on tests only for good quality and much less on clean&clear code which is easy to review isn't really a workable approach. But I'd be willing to trade off more tests for less bikeshed in review since imo the two parts are at least partial substitutes. Thus far though writing tests seems to often come as an afterthough and not as the first thing, so I guess this doesn't work too well with our current team. Personally I don't like writing testcases too much, even though it's fun to blow up the kernel ;-) And it often helps a _lot_ with understanding the exact nature of a bug/issue, at least for me. Another approach could be if developers try to proactively work a bit on issues in they're area and take active ownership, I'm much more inclined to just merge patches in this case. Examples are how Jani wrestles around with the backlight code or how you constantly hunt down unclaimed register issues. Unfortunately that requires that people follow the bugspam and m-l mail flood, which is a major time drain :( > One thing which we didn't discuss explicitly right now and IMHO is > important is how people *feel* about all this. It seems to me that the > current amount of reworking required is making some people (e.g., > Jesse, Ben) demotivated and unhappy. While this is not really a > measurable thing, I'm sure it negatively affects the rate we improve > our code base and fix our bugs. If we bikeshed a feature to the point > where the author gets fed up with it and just wants it to get merged, > there's a high chance that future bugs discovered on this feature > won't be solved that quickly due the stressful experience the author > had with the feature. And sometimes the unavoidable "I'll just > implement whatever review comments I get because I'm so tired about > this series and now I just want to get it merged" attitude is a very > nice way to introduce bugs. Yep, people are the most important thing, technical issues can usually be solved much easier. Maybe we need to look for different approaches that suit people better (everyone's a bit different), like the idea above to emphasis tests more instead of code cleanliness and consistency. E.g. for your current pc8+ stuff I've somewhat decided that I'm not going to drop bikesheds, but just make sure the testcase looks good. Well throw a few ideas around while reading the patches, but those are just ideas ... again a case I guess where you can mistake my suggestions as requirements :( I need to work on making such idea-throwing clearer. Otherwise I'm running a bit low on ideas how we could change the patch polishing for upstream to better suit people and prevent fatalistic "this isn't really my work anymore" resingation. Ideas? > And one more thing. IMHO this discussion should all be on how we deal > with the people on our team, who get paid to write this code. When > external people contribute patches to us, IMHO we should give them big > thanks, send emails with many smileys, and hold all our spotted > bikesheds to separate patches that we'll send later. Too high quality > standards doesn't seem to be a good way to encourage people who don't > dominate our code base. I disagree. External contributions should follow the same standards as our own code. And just because we're paid to do this doesn't mean I won't be really happy about a tricky bugfix or a cool feature. Afaic remember the only non-intel feature that was merged that imo didn't live up to my standards was the initial i915 prime support from Dave. And I've clearly stated that I won't merge the patch through my tree and listed the reasons why I think it's not ready. > My possible suggestions: > > - We already have drm-intel-next-queued as a barrier to protect > against bugs in merged patches (it's a barrier to drm-intel-next, > which external people should be using). Even though I do not spend > that much time on bugzilla bugs, I do rebase on dinq/nightly every day > and try to make sure all the regressions I spot are fixed, and I count > this as "bug fixing time". What if we resist our OCDs and urge to > request reworks, then merge patches to dinq more often? To compensate > for this, if anybody reports a single problem in a patch or series > present on dinq, it gets immediately reverted (which means dinq will > either do lots of rebasing or contain many many reverts). And we try > to keep drm-intel-next away from all the dinq madness. Does that sound > maintainable? I occasionally botch a revert/merge/rebase and since it wouldn't scale when I ask people to cross check my tree in detail every time (or people just assume I didn't botch it) those slip out. So I prefer if I don't have to maintain more volatile trees. I'm also not terribly in favour of merging stuff early and hoping for reworks since often the attention moves immediately to the next thing. E.g. VECS support was merged after a long delay when finally some basic tests popped up. But then a slight change from Mika to better exercise some seqno wrap/gpu reset corner cases showed that semaphores don't work with VECS. QA dutifully reported this bug and Chris analysis the gpu hang state. Ever since then this was ignored. So I somewhat agree with Dave here, at least sometimes ... I'm also not sure that an immediate revert rule is the right approach. Often an issue is just minor (e.g. the modeset state checker trips up), dropping the patch right away might be the wrong approach. Of course if something doesn't get fixed quickly that's not great, either. > - Another idea I already gave a few times is to accept features more > easily, but leave them disabled by default until all the required > reworks are there. Daniel rejected this idea because he feels people > won't do the reworks and will leave the feature disabled by default > forever. My counter-argument: 99% of the features we do are somehow > tracked by PMs, we should make sure the PMs know features are still > disabled, and perhaps open sub-tasks on the feature tracking systems > to document that the feature is not yet completed since it's not > enabled by default. I'm not sure how much that would help. If something is disabled by default it won't getting beaten on by QA. And Jesse is right that we just need that coverage, but to discover corner case bugs but also to ensure a feature doesn't regress. If we merge something disabled by default I fear it'll bitrot as quickly as an unmerged patch series. But we leave in the delusion that it all still works. So I'm not sure it's a good approach, but with psr we kinda have this as a real-world experiment running. Let's see how it goes ... > In other words: this problem is too hard, it's about tradeoffs and > there's no perfect solution that will please everybody. Yeah, I think your approach of clearly stating this as a tradeoff issue cleared up things a lot for me. I think we need to actively hunt for opportunities and new ideas. I've added a few of my own above, but I think it's clear that there's no silver bullet. One idea I'm pondering is whether a much more detailed breakdown of a task/feature/... and how to get the test coverage and all the parts merged could help. At least from my pov a big part of the frustration seems to stem from the fact that the upstreaming process is highly variable, and like I've said a few times I think we can do much better. At least once we've tried this a few times and have some experience. But again this is not for free but involves quite some work. And I guess I need to be highly involved or even do large parts of that break-down to make sure nothing gets missed, and I kinda don't want to sign up for that work ;-) > My just 2 cents, I hope to not have offended anybody :( Not at all, and I think your input has been very valuable to the discussion. Thanks a lot, Daniel
> > In the same exaggerated view, Jesse's premises: > - Actual user/developer testing is more valuable than review and refactoring > - Colorary: merging code with bugs is acceptable, we want the bug reports > - Endless code churn due to review/refactoring may actually introduce > bugs not present in the first version > > Please tell me if I'm wrong. > > From my point of view, this is all about tradeoffs and you two stand > on different positions in these tradeoffs. Example: > - Time time you save by not doing all the refactoring/bikeshedding can > be spent doing bug fixing or reviewing/testing someone else's patches. > - But the question is: which one is more worth it? An hour > refactoring/rebasing so the code behaves exactly like $reviewer wants, > or an hour staring at bugzilla or reviewing/testing patches? > - From my point of view, it seems Daniel assumes people will always > spend 0 time fixing bugs, that's why he requests people so much > refactoring: the tradeoff slider is completely at one side. But that's > kind of a vicious/virtuous cycle: the more he increases his "quality > standards", the more we'll spend time on the refactorings, so we'll > spend even less time on bugzilla", so Daniel will increase the > standards even more due to even less time spent on bugzilla, and so > on. Here is the thing, before Daniel started making people write tests and bikeshedding, people spent 0 time on bugs, I can dig up countless times now I've had RHEL regressions that I've had to stop merging code to get anyone to look at. So Jesse likes to think that people will have more time to look at bugzilla if they aren't refactoring patches, but generally I find people will just get moved onto the next task the second the code is merged by Daniel, and will fight against taking any responsibility for code that is already merged unless hit with a big stick. This is just ingrained in how people work, doing new shiny stuff is always more fun than spending 4 days or weeks to send a one liner patch, so really if people thinking we just need to merge most stuff faster is the solution they are delusional, and I'll gladly stop pulling until they stop. I've spent 2-3 weeks on single bugs in the graphics stack before and I'm sure I will again, but the incentive to go hunting for them generally comes from someone important reporting the bug, not from a misc bug report in bugzilla from someone who isn't a monetary concern. So Jesse if you really believe the team will focus on bugs 2-3 months after the code is merged and drop their priority for merging whatever cool feature they are on now, then maybe I'd agree, but so far history has shown this never happens. Dave.
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index be69807..f8e590f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -92,6 +92,7 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { + struct i915_vma *vma; seq_printf(m, "%pK: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s", &obj->base, get_pin_flag(obj), @@ -111,9 +112,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " (pinned x %d)", obj->pin_count); if (obj->fence_reg != I915_FENCE_REG_NONE) seq_printf(m, " (fence: %d)", obj->fence_reg); - if (i915_gem_obj_ggtt_bound(obj)) - seq_printf(m, " (gtt offset: %08lx, size: %08x)", - i915_gem_obj_ggtt_offset(obj), (unsigned int)i915_gem_obj_ggtt_size(obj)); + list_for_each_entry(vma, &obj->vma_list, vma_link) { + if (!i915_is_ggtt(vma->vm)) + seq_puts(m, " (pp"); + else + seq_puts(m, " (g"); + seq_printf(m, "gtt offset: %08lx, size: %08lx)", + i915_gem_obj_offset(obj, vma->vm), + i915_gem_obj_size(obj, vma->vm)); + } if (obj->stolen) seq_printf(m, " (stolen: %08lx)", obj->stolen->start); if (obj->pin_mappable || obj->fault_mappable) { @@ -175,6 +182,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data) return 0; } +/* FIXME: Support multiple VM? */ #define count_objects(list, member) do { \ list_for_each_entry(obj, list, member) { \ size += i915_gem_obj_ggtt_size(obj); \ @@ -1781,18 +1789,21 @@ i915_drop_caches_set(void *data, u64 val) if (val & DROP_BOUND) { list_for_each_entry_safe(obj, next, &vm->inactive_list, - mm_list) - if (obj->pin_count == 0) { - ret = i915_gem_object_unbind(obj); - if (ret) - goto unlock; - } + mm_list) { + if (obj->pin_count) + continue; + + ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base); + if (ret) + goto unlock; + } } if (val & DROP_UNBOUND) { list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) { + /* FIXME: Do this for all vms? */ ret = i915_gem_object_put_pages(obj); if (ret) goto unlock; diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 1449d06..4650519 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1499,10 +1499,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) i915_dump_device_info(dev_priv); - INIT_LIST_HEAD(&dev_priv->vm_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.global_link); - list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list); - if (i915_get_bridge_dev(dev)) { ret = -EIO; goto free_priv; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8b3167e..681cb41 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1379,52 +1379,6 @@ struct drm_i915_gem_object { #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base) -/* This is a temporary define to help transition us to real VMAs. If you see - * this, you're either reviewing code, or bisecting it. */ -static inline struct i915_vma * -__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj) -{ - if (list_empty(&obj->vma_list)) - return NULL; - return list_first_entry(&obj->vma_list, struct i915_vma, vma_link); -} - -/* Whether or not this object is currently mapped by the translation tables */ -static inline bool -i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o) -{ - struct i915_vma *vma = __i915_gem_obj_to_vma(o); - if (vma == NULL) - return false; - return drm_mm_node_allocated(&vma->node); -} - -/* Offset of the first PTE pointing to this object */ -static inline unsigned long -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.start; -} - -/* The size used in the translation tables may be larger than the actual size of - * the object on GEN2/GEN3 because of the way tiling is handled. See - * i915_gem_get_gtt_size() for more details. - */ -static inline unsigned long -i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o) -{ - BUG_ON(list_empty(&o->vma_list)); - return __i915_gem_obj_to_vma(o)->node.size; -} - -static inline void -i915_gem_obj_ggtt_set_color(struct drm_i915_gem_object *o, - enum i915_cache_level color) -{ - __i915_gem_obj_to_vma(o)->node.color = color; -} - /** * Request queue structure. * @@ -1736,11 +1690,13 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj, void i915_gem_vma_destroy(struct i915_vma *vma); int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking); void i915_gem_object_unpin(struct drm_i915_gem_object *obj); -int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj); +int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); void i915_gem_release_mmap(struct drm_i915_gem_object *obj); void i915_gem_lastclose(struct drm_device *dev); @@ -1770,6 +1726,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, struct intel_ring_buffer *to); void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring); int i915_gem_dumb_create(struct drm_file *file_priv, @@ -1876,6 +1833,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, int tiling_mode, bool fenced); int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level); struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, @@ -1886,6 +1844,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, void i915_gem_restore_fences(struct drm_device *dev); +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o); +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm); +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color); +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm); +/* Some GGTT VM helpers */ +#define obj_to_ggtt(obj) \ + (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base) +static inline bool i915_is_ggtt(struct i915_address_space *vm) +{ + struct i915_address_space *ggtt = + &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base; + return vm == ggtt; +} + +static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_bound(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_offset(obj, obj_to_ggtt(obj)); +} + +static inline unsigned long +i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj) +{ + return i915_gem_obj_size(obj, obj_to_ggtt(obj)); +} + +static inline int __must_check +i915_gem_ggtt_pin(struct drm_i915_gem_object *obj, + uint32_t alignment, + bool map_and_fenceable, + bool nonblocking) +{ + return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment, + map_and_fenceable, nonblocking); +} +#undef obj_to_ggtt + /* i915_gem_context.c */ void i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); @@ -1922,6 +1930,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, void i915_gem_restore_gtt_mappings(struct drm_device *dev); int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj); +/* FIXME: this is never okay with full PPGTT */ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level); void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj); @@ -1938,7 +1947,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev) /* i915_gem_evict.c */ -int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, +int __must_check i915_gem_evict_something(struct drm_device *dev, + struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2283765..0111554 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -38,10 +38,12 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); -static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, - unsigned alignment, - bool map_and_fenceable, - bool nonblocking); +static __must_check int +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + unsigned alignment, + bool map_and_fenceable, + bool nonblocking); static int i915_gem_phys_pwrite(struct drm_device *dev, struct drm_i915_gem_object *obj, struct drm_i915_gem_pwrite *args, @@ -120,7 +122,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev) static inline bool i915_gem_object_is_inactive(struct drm_i915_gem_object *obj) { - return i915_gem_obj_ggtt_bound(obj) && !obj->active; + return i915_gem_obj_bound_any(obj) && !obj->active; } int @@ -406,7 +408,7 @@ i915_gem_shmem_pread(struct drm_device *dev, * anyway again before the next pread happens. */ if (obj->cache_level == I915_CACHE_NONE) needs_clflush = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, false); if (ret) return ret; @@ -578,7 +580,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, char __user *user_data; int page_offset, page_length, ret; - ret = i915_gem_object_pin(obj, 0, true, true); + ret = i915_gem_ggtt_pin(obj, 0, true, true); if (ret) goto out; @@ -723,7 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, * right away and we therefore have to clflush anyway. */ if (obj->cache_level == I915_CACHE_NONE) needs_clflush_after = 1; - if (i915_gem_obj_ggtt_bound(obj)) { + if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) return ret; @@ -1332,7 +1334,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } /* Now bind it into the GTT if needed */ - ret = i915_gem_object_pin(obj, 0, true, false); + ret = i915_gem_ggtt_pin(obj, 0, true, false); if (ret) goto unlock; @@ -1654,11 +1656,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj) if (obj->pages == NULL) return 0; - BUG_ON(i915_gem_obj_ggtt_bound(obj)); - if (obj->pages_pin_count) return -EBUSY; + BUG_ON(i915_gem_obj_bound_any(obj)); + /* ->put_pages might need to allocate memory for the bit17 swizzle * array, hence protect them from being reaped by removing them from gtt * lists early. */ @@ -1678,7 +1680,6 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, bool purgeable_only) { struct drm_i915_gem_object *obj, *next; - struct i915_address_space *vm = &dev_priv->gtt.base; long count = 0; list_for_each_entry_safe(obj, next, @@ -1692,14 +1693,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target, } } - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) { - if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) && - i915_gem_object_unbind(obj) == 0 && - i915_gem_object_put_pages(obj) == 0) { + list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list, + global_list) { + struct i915_vma *vma, *v; + + if (!i915_gem_object_is_purgeable(obj) && purgeable_only) + continue; + + list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) + if (i915_gem_object_unbind(obj, vma->vm)) + break; + + if (!i915_gem_object_put_pages(obj)) count += obj->base.size >> PAGE_SHIFT; - if (count >= target) - return count; - } + + if (count >= target) + return count; } return count; @@ -1859,11 +1868,11 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_device *dev = obj->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 seqno = intel_ring_get_seqno(ring); BUG_ON(ring == NULL); @@ -1900,12 +1909,9 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, } static void -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) +i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; - BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS); BUG_ON(!obj->active); @@ -2105,10 +2111,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) spin_unlock(&file_priv->mm.lock); } -static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { - if (acthd >= i915_gem_obj_ggtt_offset(obj) && - acthd < i915_gem_obj_ggtt_offset(obj) + obj->base.size) + if (acthd >= i915_gem_obj_offset(obj, vm) && + acthd < i915_gem_obj_offset(obj, vm) + obj->base.size) return true; return false; @@ -2131,6 +2138,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked, return false; } +static struct i915_address_space * +request_to_vm(struct drm_i915_gem_request *request) +{ + struct drm_i915_private *dev_priv = request->ring->dev->dev_private; + struct i915_address_space *vm; + + vm = &dev_priv->gtt.base; + + return vm; +} + static bool i915_request_guilty(struct drm_i915_gem_request *request, const u32 acthd, bool *inside) { @@ -2138,9 +2156,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request, * pointing inside the ring, matches the batch_obj address range. * However this is extremely unlikely. */ - if (request->batch_obj) { - if (i915_head_inside_object(acthd, request->batch_obj)) { + if (i915_head_inside_object(acthd, request->batch_obj, + request_to_vm(request))) { *inside = true; return true; } @@ -2160,17 +2178,21 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, { struct i915_ctx_hang_stats *hs = NULL; bool inside, guilty; + unsigned long offset = 0; /* Innocent until proven guilty */ guilty = false; + if (request->batch_obj) + offset = i915_gem_obj_offset(request->batch_obj, + request_to_vm(request)); + if (ring->hangcheck.action != wait && i915_request_guilty(request, acthd, &inside)) { DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n", ring->name, inside ? "inside" : "flushing", - request->batch_obj ? - i915_gem_obj_ggtt_offset(request->batch_obj) : 0, + offset, request->ctx ? request->ctx->id : 0, acthd); @@ -2227,13 +2249,15 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, } while (!list_empty(&ring->active_list)) { + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, struct drm_i915_gem_object, ring_list); - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } } @@ -2261,7 +2285,7 @@ void i915_gem_restore_fences(struct drm_device *dev) void i915_gem_reset(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; struct intel_ring_buffer *ring; int i; @@ -2272,8 +2296,9 @@ void i915_gem_reset(struct drm_device *dev) /* Move everything out of the GPU domains to ensure we do any * necessary invalidation upon reuse. */ - list_for_each_entry(obj, &vm->inactive_list, mm_list) - obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + list_for_each_entry(obj, &vm->inactive_list, mm_list) + obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS; i915_gem_restore_fences(dev); } @@ -2318,6 +2343,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) * by the ringbuffer to the flushing/inactive lists as appropriate. */ while (!list_empty(&ring->active_list)) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct i915_address_space *vm; struct drm_i915_gem_object *obj; obj = list_first_entry(&ring->active_list, @@ -2327,7 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring) if (!i915_seqno_passed(seqno, obj->last_read_seqno)) break; - i915_gem_object_move_to_inactive(obj); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + i915_gem_object_move_to_inactive(obj, vm); } if (unlikely(ring->trace_irq_seqno && @@ -2573,13 +2601,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) * Unbinds an object from the GTT aperture. */ int -i915_gem_object_unbind(struct drm_i915_gem_object *obj) +i915_gem_object_unbind(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = obj->base.dev->dev_private; struct i915_vma *vma; int ret; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) return 0; if (obj->pin_count) @@ -2602,7 +2631,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) if (ret) return ret; - trace_i915_gem_object_unbind(obj); + trace_i915_gem_object_unbind(obj, vm); if (obj->has_global_gtt_mapping) i915_gem_gtt_unbind_object(obj); @@ -2617,7 +2646,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj) /* Avoid an unnecessary call to unbind on rebind. */ obj->map_and_fenceable = true; - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); list_del(&vma->vma_link); drm_mm_remove_node(&vma->node); i915_gem_vma_destroy(vma); @@ -2764,6 +2793,7 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, "object 0x%08lx not 512K or pot-size 0x%08x aligned\n", i915_gem_obj_ggtt_offset(obj), size); + pitch_val = obj->stride / 128; pitch_val = ffs(pitch_val) - 1; @@ -3049,24 +3079,26 @@ static void i915_gem_verify_gtt(struct drm_device *dev) * Finds free space in the GTT aperture and binds the object there. */ static int -i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, - unsigned alignment, - bool map_and_fenceable, - bool nonblocking) +i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + unsigned alignment, + bool map_and_fenceable, + bool nonblocking) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; u32 size, fence_size, fence_alignment, unfenced_alignment; bool mappable, fenceable; - size_t gtt_max = map_and_fenceable ? - dev_priv->gtt.mappable_end : dev_priv->gtt.base.total; + size_t gtt_max = + map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total; struct i915_vma *vma; int ret; if (WARN_ON(!list_empty(&obj->vma_list))) return -EBUSY; + BUG_ON(!i915_is_ggtt(vm)); + fence_size = i915_gem_get_gtt_size(dev, obj->base.size, obj->tiling_mode); @@ -3105,19 +3137,21 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, i915_gem_object_pin_pages(obj); - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); + /* For now we only ever use 1 vma per object */ + WARN_ON(!list_empty(&obj->vma_list)); + + vma = i915_gem_vma_create(obj, vm); if (IS_ERR(vma)) { i915_gem_object_unpin_pages(obj); return PTR_ERR(vma); } search_free: - ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm, - &vma->node, + ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, size, alignment, obj->cache_level, 0, gtt_max); if (ret) { - ret = i915_gem_evict_something(dev, size, alignment, + ret = i915_gem_evict_something(dev, vm, size, alignment, obj->cache_level, map_and_fenceable, nonblocking); @@ -3138,18 +3172,25 @@ search_free: list_move_tail(&obj->global_list, &dev_priv->mm.bound_list); list_add_tail(&obj->mm_list, &vm->inactive_list); - list_add(&vma->vma_link, &obj->vma_list); + + /* Keep GGTT vmas first to make debug easier */ + if (i915_is_ggtt(vm)) + list_add(&vma->vma_link, &obj->vma_list); + else + list_add_tail(&vma->vma_link, &obj->vma_list); fenceable = + i915_is_ggtt(vm) && i915_gem_obj_ggtt_size(obj) == fence_size && (i915_gem_obj_ggtt_offset(obj) & (fence_alignment - 1)) == 0; - mappable = i915_gem_obj_ggtt_offset(obj) + obj->base.size <= - dev_priv->gtt.mappable_end; + mappable = + i915_is_ggtt(vm) && + vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end; obj->map_and_fenceable = mappable && fenceable; - trace_i915_gem_object_bind(obj, map_and_fenceable); + trace_i915_gem_object_bind(obj, vm, map_and_fenceable); i915_gem_verify_gtt(dev); return 0; @@ -3253,7 +3294,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) int ret; /* Not valid to be called on unbound objects. */ - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return -EINVAL; if (obj->base.write_domain == I915_GEM_DOMAIN_GTT) @@ -3299,11 +3340,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) } int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, enum i915_cache_level cache_level) { struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); int ret; if (obj->cache_level == cache_level) @@ -3315,12 +3357,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, } if (vma && !i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) { - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } - if (i915_gem_obj_ggtt_bound(obj)) { + list_for_each_entry(vma, &obj->vma_list, vma_link) { ret = i915_gem_object_finish_gpu(obj); if (ret) return ret; @@ -3343,7 +3385,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt, obj, cache_level); - i915_gem_obj_ggtt_set_color(obj, cache_level); + i915_gem_obj_set_color(obj, vma->vm, cache_level); } if (cache_level == I915_CACHE_NONE) { @@ -3403,6 +3445,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_gem_caching *args = data; + struct drm_i915_private *dev_priv; struct drm_i915_gem_object *obj; enum i915_cache_level level; int ret; @@ -3427,8 +3470,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, ret = -ENOENT; goto unlock; } + dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_set_cache_level(obj, level); + /* FIXME: Add interface for specific VM? */ + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level); drm_gem_object_unreference(&obj->base); unlock: @@ -3446,6 +3491,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, struct intel_ring_buffer *pipelined) { + struct drm_i915_private *dev_priv = obj->base.dev->dev_private; u32 old_read_domains, old_write_domain; int ret; @@ -3464,7 +3510,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * of uncaching, which would allow us to flush all the LLC-cached data * with that bit in the PTE to main memory with just one PIPE_CONTROL. */ - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_NONE); if (ret) return ret; @@ -3472,7 +3519,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * (e.g. libkms for the bootup splash), we have to ensure that we * always use map_and_fenceable for all scanout buffers. */ - ret = i915_gem_object_pin(obj, alignment, true, false); + ret = i915_gem_ggtt_pin(obj, alignment, true, false); if (ret) return ret; @@ -3615,6 +3662,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) int i915_gem_object_pin(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, uint32_t alignment, bool map_and_fenceable, bool nonblocking) @@ -3624,28 +3672,31 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT)) return -EBUSY; - if (i915_gem_obj_ggtt_bound(obj)) { - if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) || + WARN_ON(map_and_fenceable && !i915_is_ggtt(vm)); + + if (i915_gem_obj_bound(obj, vm)) { + if ((alignment && + i915_gem_obj_offset(obj, vm) & (alignment - 1)) || (map_and_fenceable && !obj->map_and_fenceable)) { WARN(obj->pin_count, "bo is already pinned with incorrect alignment:" " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d," " obj->map_and_fenceable=%d\n", - i915_gem_obj_ggtt_offset(obj), alignment, + i915_gem_obj_offset(obj, vm), alignment, map_and_fenceable, obj->map_and_fenceable); - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); if (ret) return ret; } } - if (!i915_gem_obj_ggtt_bound(obj)) { + if (!i915_gem_obj_bound(obj, vm)) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; - ret = i915_gem_object_bind_to_gtt(obj, alignment, - map_and_fenceable, - nonblocking); + ret = i915_gem_object_bind_to_vm(obj, vm, alignment, + map_and_fenceable, + nonblocking); if (ret) return ret; @@ -3666,7 +3717,7 @@ void i915_gem_object_unpin(struct drm_i915_gem_object *obj) { BUG_ON(obj->pin_count == 0); - BUG_ON(!i915_gem_obj_ggtt_bound(obj)); + BUG_ON(!i915_gem_obj_bound_any(obj)); if (--obj->pin_count == 0) obj->pin_mappable = false; @@ -3704,7 +3755,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data, } if (obj->user_pin_count == 0) { - ret = i915_gem_object_pin(obj, args->alignment, true, false); + ret = i915_gem_ggtt_pin(obj, args->alignment, true, false); if (ret) goto out; } @@ -3937,6 +3988,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); struct drm_device *dev = obj->base.dev; drm_i915_private_t *dev_priv = dev->dev_private; + struct i915_vma *vma, *next; trace_i915_gem_object_destroy(obj); @@ -3944,15 +3996,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) i915_gem_detach_phys_object(dev, obj); obj->pin_count = 0; - if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) { - bool was_interruptible; + /* NB: 0 or 1 elements */ + WARN_ON(!list_empty(&obj->vma_list) && + !list_is_singular(&obj->vma_list)); + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) { + int ret = i915_gem_object_unbind(obj, vma->vm); + if (WARN_ON(ret == -ERESTARTSYS)) { + bool was_interruptible; - was_interruptible = dev_priv->mm.interruptible; - dev_priv->mm.interruptible = false; + was_interruptible = dev_priv->mm.interruptible; + dev_priv->mm.interruptible = false; - WARN_ON(i915_gem_object_unbind(obj)); + WARN_ON(i915_gem_object_unbind(obj, vma->vm)); - dev_priv->mm.interruptible = was_interruptible; + dev_priv->mm.interruptible = was_interruptible; + } } /* Stolen objects don't hold a ref, but do hold pin count. Fix that up @@ -4319,6 +4377,16 @@ init_ring_lists(struct intel_ring_buffer *ring) INIT_LIST_HEAD(&ring->request_list); } +static void i915_init_vm(struct drm_i915_private *dev_priv, + struct i915_address_space *vm) +{ + vm->dev = dev_priv->dev; + INIT_LIST_HEAD(&vm->active_list); + INIT_LIST_HEAD(&vm->inactive_list); + INIT_LIST_HEAD(&vm->global_link); + list_add(&vm->global_link, &dev_priv->vm_list); +} + void i915_gem_load(struct drm_device *dev) { @@ -4331,8 +4399,9 @@ i915_gem_load(struct drm_device *dev) SLAB_HWCACHE_ALIGN, NULL); - INIT_LIST_HEAD(&dev_priv->gtt.base.active_list); - INIT_LIST_HEAD(&dev_priv->gtt.base.inactive_list); + INIT_LIST_HEAD(&dev_priv->vm_list); + i915_init_vm(dev_priv, &dev_priv->gtt.base); + INIT_LIST_HEAD(&dev_priv->mm.unbound_list); INIT_LIST_HEAD(&dev_priv->mm.bound_list); INIT_LIST_HEAD(&dev_priv->mm.fence_list); @@ -4603,9 +4672,8 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) struct drm_i915_private, mm.inactive_shrinker); struct drm_device *dev = dev_priv->dev; - struct i915_address_space *vm = &dev_priv->gtt.base; struct drm_i915_gem_object *obj; - int nr_to_scan = sc->nr_to_scan; + int nr_to_scan; bool unlock = true; int cnt; @@ -4619,6 +4687,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) unlock = false; } + nr_to_scan = sc->nr_to_scan; if (nr_to_scan) { nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan); if (nr_to_scan > 0) @@ -4632,11 +4701,109 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc) list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) if (obj->pages_pin_count == 0) cnt += obj->base.size >> PAGE_SHIFT; - list_for_each_entry(obj, &vm->inactive_list, mm_list) + + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { + if (obj->active) + continue; + + i915_gem_object_flush_gtt_write_domain(obj); + i915_gem_object_flush_cpu_write_domain(obj); + /* FIXME: Can't assume global gtt */ + i915_gem_object_move_to_inactive(obj, &dev_priv->gtt.base); + if (obj->pin_count == 0 && obj->pages_pin_count == 0) cnt += obj->base.size >> PAGE_SHIFT; + } if (unlock) mutex_unlock(&dev->struct_mutex); return cnt; } + +/* All the new VM stuff */ +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) + return vma->node.start; + + } + return -1; +} + +bool i915_gem_obj_bound(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + + list_for_each_entry(vma, &o->vma_list, vma_link) + if (vma->vm == vm) + return true; + + return false; +} + +bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_address_space *vm; + + list_for_each_entry(vm, &dev_priv->vm_list, global_link) + if (i915_gem_obj_bound(o, vm)) + return true; + + return false; +} + +unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o, + struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = o->base.dev->dev_private; + struct i915_vma *vma; + + if (vm == &dev_priv->mm.aliasing_ppgtt->base) + vm = &dev_priv->gtt.base; + + BUG_ON(list_empty(&o->vma_list)); + + list_for_each_entry(vma, &o->vma_list, vma_link) + if (vma->vm == vm) + return vma->node.size; + + return 0; +} + +void i915_gem_obj_set_color(struct drm_i915_gem_object *o, + struct i915_address_space *vm, + enum i915_cache_level color) +{ + struct i915_vma *vma; + BUG_ON(list_empty(&o->vma_list)); + list_for_each_entry(vma, &o->vma_list, vma_link) { + if (vma->vm == vm) { + vma->node.color = color; + return; + } + } + + WARN(1, "Couldn't set color for VM %p\n", vm); +} + +struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, + struct i915_address_space *vm) +{ + struct i915_vma *vma; + list_for_each_entry(vma, &obj->vma_list, vma_link) + if (vma->vm == vm) + return vma; + + return NULL; +} diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 2470206..873577d 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -155,6 +155,7 @@ create_hw_context(struct drm_device *dev, if (INTEL_INFO(dev)->gen >= 7) { ret = i915_gem_object_set_cache_level(ctx->obj, + &dev_priv->gtt.base, I915_CACHE_LLC_MLC); /* Failure shouldn't ever happen this early */ if (WARN_ON(ret)) @@ -214,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) * default context. */ dev_priv->ring[RCS].default_context = ctx; - ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false); if (ret) { DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret); goto err_destroy; @@ -391,6 +392,7 @@ mi_set_context(struct intel_ring_buffer *ring, static int do_switch(struct i915_hw_context *to) { struct intel_ring_buffer *ring = to->ring; + struct drm_i915_private *dev_priv = ring->dev->dev_private; struct i915_hw_context *from = ring->last_context; u32 hw_flags = 0; int ret; @@ -400,7 +402,7 @@ static int do_switch(struct i915_hw_context *to) if (from == to) return 0; - ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false); + ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false); if (ret) return ret; @@ -437,7 +439,8 @@ static int do_switch(struct i915_hw_context *to) */ if (from != NULL) { from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; - i915_gem_object_move_to_active(from->obj, ring); + i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base, + ring); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the * whole damn pipeline, we don't need to explicitly mark the * object dirty. The only exception is that the context must be diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index df61f33..32efdc0 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -32,24 +32,21 @@ #include "i915_trace.h" static bool -mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind) +mark_free(struct i915_vma *vma, struct list_head *unwind) { - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); - - if (obj->pin_count) + if (vma->obj->pin_count) return false; - list_add(&obj->exec_list, unwind); + list_add(&vma->obj->exec_list, unwind); return drm_mm_scan_add_block(&vma->node); } int -i915_gem_evict_something(struct drm_device *dev, int min_size, - unsigned alignment, unsigned cache_level, +i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm, + int min_size, unsigned alignment, unsigned cache_level, bool mappable, bool nonblocking) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; struct list_head eviction_list, unwind_list; struct i915_vma *vma; struct drm_i915_gem_object *obj; @@ -81,16 +78,18 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, */ INIT_LIST_HEAD(&unwind_list); - if (mappable) + if (mappable) { + BUG_ON(!i915_is_ggtt(vm)); drm_mm_init_scan_with_range(&vm->mm, min_size, alignment, cache_level, 0, dev_priv->gtt.mappable_end); - else + } else drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level); /* First see if there is a large enough contiguous idle region... */ list_for_each_entry(obj, &vm->inactive_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -99,7 +98,8 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, /* Now merge in the soon-to-be-expired objects... */ list_for_each_entry(obj, &vm->active_list, mm_list) { - if (mark_free(obj, &unwind_list)) + struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm); + if (mark_free(vma, &unwind_list)) goto found; } @@ -109,7 +109,7 @@ none: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); ret = drm_mm_scan_remove_block(&vma->node); BUG_ON(ret); @@ -130,7 +130,7 @@ found: obj = list_first_entry(&unwind_list, struct drm_i915_gem_object, exec_list); - vma = __i915_gem_obj_to_vma(obj); + vma = i915_gem_obj_to_vma(obj, vm); if (drm_mm_scan_remove_block(&vma->node)) { list_move(&obj->exec_list, &eviction_list); drm_gem_object_reference(&obj->base); @@ -145,7 +145,7 @@ found: struct drm_i915_gem_object, exec_list); if (ret == 0) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); list_del_init(&obj->exec_list); drm_gem_object_unreference(&obj->base); @@ -158,13 +158,18 @@ int i915_gem_evict_everything(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *vm; struct drm_i915_gem_object *obj, *next; - bool lists_empty; + bool lists_empty = true; int ret; - lists_empty = (list_empty(&vm->inactive_list) && - list_empty(&vm->active_list)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + lists_empty = (list_empty(&vm->inactive_list) && + list_empty(&vm->active_list)); + if (!lists_empty) + lists_empty = false; + } + if (lists_empty) return -ENOSPC; @@ -181,9 +186,11 @@ i915_gem_evict_everything(struct drm_device *dev) i915_gem_retire_requests(dev); /* Having flushed everything, unbind() should never raise an error */ - list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) - if (obj->pin_count == 0) - WARN_ON(i915_gem_object_unbind(obj)); + list_for_each_entry(vm, &dev_priv->vm_list, global_link) { + list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list) + if (obj->pin_count == 0) + WARN_ON(i915_gem_object_unbind(obj, vm)); + } return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 1734825..819d8d8 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle) } static void -eb_destroy(struct eb_objects *eb) +eb_destroy(struct eb_objects *eb, struct i915_address_space *vm) { while (!list_empty(&eb->objects)) { struct drm_i915_gem_object *obj; @@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *reloc) + struct drm_i915_gem_relocation_entry *reloc, + struct i915_address_space *vm) { struct drm_device *dev = obj->base.dev; struct drm_gem_object *target_obj; @@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, - struct eb_objects *eb) + struct eb_objects *eb, + struct i915_address_space *vm) { #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry)) struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)]; @@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, do { u64 offset = r->presumed_offset; - ret = i915_gem_execbuffer_relocate_entry(obj, eb, r); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, r, + vm); if (ret) return ret; @@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj, static int i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, struct eb_objects *eb, - struct drm_i915_gem_relocation_entry *relocs) + struct drm_i915_gem_relocation_entry *relocs, + struct i915_address_space *vm) { const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; int i, ret; for (i = 0; i < entry->relocation_count; i++) { - ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]); + ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i], + vm); if (ret) return ret; } @@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj, } static int -i915_gem_execbuffer_relocate(struct eb_objects *eb) +i915_gem_execbuffer_relocate(struct eb_objects *eb, + struct i915_address_space *vm) { struct drm_i915_gem_object *obj; int ret = 0; @@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb) */ pagefault_disable(); list_for_each_entry(obj, &eb->objects, exec_list) { - ret = i915_gem_execbuffer_relocate_object(obj, eb); + ret = i915_gem_execbuffer_relocate_object(obj, eb, vm); if (ret) break; } @@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, struct intel_ring_buffer *ring, + struct i915_address_space *vm, bool *need_reloc) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; @@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); - ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false); + ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable, + false); if (ret) return ret; @@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj, obj->has_aliasing_ppgtt_mapping = 1; } - if (entry->offset != i915_gem_obj_ggtt_offset(obj)) { - entry->offset = i915_gem_obj_ggtt_offset(obj); + if (entry->offset != i915_gem_obj_offset(obj, vm)) { + entry->offset = i915_gem_obj_offset(obj, vm); *need_reloc = true; } @@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) { struct drm_i915_gem_exec_object2 *entry; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound_any(obj)) return; entry = obj->exec_entry; @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj) static int i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, struct list_head *objects, + struct i915_address_space *vm, bool *need_relocs) { struct drm_i915_gem_object *obj; @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, list_for_each_entry(obj, objects, exec_list) { struct drm_i915_gem_exec_object2 *entry = obj->exec_entry; bool need_fence, need_mappable; + u32 obj_offset; - if (!i915_gem_obj_ggtt_bound(obj)) + if (!i915_gem_obj_bound(obj, vm)) continue; + obj_offset = i915_gem_obj_offset(obj, vm); need_fence = has_fenced_gpu_access && entry->flags & EXEC_OBJECT_NEEDS_FENCE && obj->tiling_mode != I915_TILING_NONE; need_mappable = need_fence || need_reloc_mappable(obj); + BUG_ON((need_mappable || need_fence) && + !i915_is_ggtt(vm)); + if ((entry->alignment && - i915_gem_obj_ggtt_offset(obj) & (entry->alignment - 1)) || + obj_offset & (entry->alignment - 1)) || (need_mappable && !obj->map_and_fenceable)) - ret = i915_gem_object_unbind(obj); + ret = i915_gem_object_unbind(obj, vm); else - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } /* Bind fresh objects */ list_for_each_entry(obj, objects, exec_list) { - if (i915_gem_obj_ggtt_bound(obj)) + if (i915_gem_obj_bound(obj, vm)) continue; - ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs); + ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs); if (ret) goto err; } @@ -578,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, struct drm_file *file, struct intel_ring_buffer *ring, struct eb_objects *eb, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { struct drm_i915_gem_relocation_entry *reloc; struct drm_i915_gem_object *obj; @@ -662,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, goto err; need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; list_for_each_entry(obj, &eb->objects, exec_list) { int offset = obj->exec_entry - exec; ret = i915_gem_execbuffer_relocate_object_slow(obj, eb, - reloc + reloc_offset[offset]); + reloc + reloc_offset[offset], + vm); if (ret) goto err; } @@ -770,6 +786,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, static void i915_gem_execbuffer_move_to_active(struct list_head *objects, + struct i915_address_space *vm, struct intel_ring_buffer *ring) { struct drm_i915_gem_object *obj; @@ -784,7 +801,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects, obj->base.read_domains = obj->base.pending_read_domains; obj->fenced_gpu_access = obj->pending_fenced_gpu_access; - i915_gem_object_move_to_active(obj, ring); + i915_gem_object_move_to_active(obj, vm, ring); if (obj->base.write_domain) { obj->dirty = 1; obj->last_write_seqno = intel_ring_get_seqno(ring); @@ -838,7 +855,8 @@ static int i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_file *file, struct drm_i915_gem_execbuffer2 *args, - struct drm_i915_gem_exec_object2 *exec) + struct drm_i915_gem_exec_object2 *exec, + struct i915_address_space *vm) { drm_i915_private_t *dev_priv = dev->dev_private; struct eb_objects *eb; @@ -1000,17 +1018,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, /* Move the objects en-masse into the GTT, evicting if necessary. */ need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; - ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs); + ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs); if (ret) goto err; /* The objects are in their final locations, apply the relocations. */ if (need_relocs) - ret = i915_gem_execbuffer_relocate(eb); + ret = i915_gem_execbuffer_relocate(eb, vm); if (ret) { if (ret == -EFAULT) { ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring, - eb, exec); + eb, exec, vm); BUG_ON(!mutex_is_locked(&dev->struct_mutex)); } if (ret) @@ -1061,7 +1079,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - exec_start = i915_gem_obj_ggtt_offset(batch_obj) + args->batch_start_offset; + exec_start = i915_gem_obj_offset(batch_obj, vm) + + args->batch_start_offset; exec_len = args->batch_len; if (cliprects) { for (i = 0; i < args->num_cliprects; i++) { @@ -1086,11 +1105,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); - i915_gem_execbuffer_move_to_active(&eb->objects, ring); + i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); err: - eb_destroy(eb); + eb_destroy(eb, vm); mutex_unlock(&dev->struct_mutex); @@ -1107,6 +1126,7 @@ int i915_gem_execbuffer(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer *args = data; struct drm_i915_gem_execbuffer2 exec2; struct drm_i915_gem_exec_object *exec_list = NULL; @@ -1162,7 +1182,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data, exec2.flags = I915_EXEC_RENDER; i915_execbuffer2_set_context_id(exec2, 0); - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets back to the user's exec list. */ for (i = 0; i < args->buffer_count; i++) @@ -1188,6 +1209,7 @@ int i915_gem_execbuffer2(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_execbuffer2 *args = data; struct drm_i915_gem_exec_object2 *exec2_list = NULL; int ret; @@ -1218,7 +1240,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, return -EFAULT; } - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list); + ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list, + &dev_priv->gtt.base); if (!ret) { /* Copy the new buffer offsets back to the user's exec list. */ ret = copy_to_user(to_user_ptr(args->buffers_ptr), diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 3b639a9..44f3464 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -390,6 +390,8 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev) ppgtt->base.total); } + /* i915_init_vm(dev_priv, &ppgtt->base) */ + return ret; } @@ -409,17 +411,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) { - ppgtt->base.insert_entries(&ppgtt->base, obj->pages, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - cache_level); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->insert_entries(vm, obj->pages, + obj_offset >> PAGE_SHIFT, + cache_level); } void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt, struct drm_i915_gem_object *obj) { - ppgtt->base.clear_range(&ppgtt->base, - i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT, - obj->base.size >> PAGE_SHIFT); + struct i915_address_space *vm = &ppgtt->base; + unsigned long obj_offset = i915_gem_obj_offset(obj, vm); + + vm->clear_range(vm, obj_offset >> PAGE_SHIFT, + obj->base.size >> PAGE_SHIFT); } extern int intel_iommu_gfx_mapped; @@ -470,6 +477,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) dev_priv->gtt.base.start / PAGE_SIZE, dev_priv->gtt.base.total / PAGE_SIZE); + if (dev_priv->mm.aliasing_ppgtt) + gen6_write_pdes(dev_priv->mm.aliasing_ppgtt); + list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { i915_gem_clflush_object(obj); i915_gem_gtt_bind_object(obj, obj->cache_level); @@ -648,7 +658,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, * aperture. One page should be enough to keep any prefetching inside * of the aperture. */ - drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_i915_private *dev_priv = dev->dev_private; + struct i915_address_space *ggtt_vm = &dev_priv->gtt.base; struct drm_mm_node *entry; struct drm_i915_gem_object *obj; unsigned long hole_start, hole_end; @@ -656,19 +667,19 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, BUG_ON(mappable_end > end); /* Subtract the guard page ... */ - drm_mm_init(&dev_priv->gtt.base.mm, start, end - start - PAGE_SIZE); + drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE); if (!HAS_LLC(dev)) dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust; /* Mark any preallocated objects as occupied */ list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - struct i915_vma *vma = __i915_gem_obj_to_vma(obj); + struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm); int ret; DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n", i915_gem_obj_ggtt_offset(obj), obj->base.size); WARN_ON(i915_gem_obj_ggtt_bound(obj)); - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); + ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node); if (ret) DRM_DEBUG_KMS("Reservation failed\n"); obj->has_global_gtt_mapping = 1; @@ -679,19 +690,15 @@ void i915_gem_setup_global_gtt(struct drm_device *dev, dev_priv->gtt.base.total = end - start; /* Clear any non-preallocated blocks */ - drm_mm_for_each_hole(entry, &dev_priv->gtt.base.mm, - hole_start, hole_end) { + drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) { const unsigned long count = (hole_end - hole_start) / PAGE_SIZE; DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", hole_start, hole_end); - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, - hole_start / PAGE_SIZE, - count); + ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count); } /* And finally clear the reserved guard page */ - dev_priv->gtt.base.clear_range(&dev_priv->gtt.base, - end / PAGE_SIZE - 1, 1); + ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1); } static bool diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c index 27ffb4c..000ffbd 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c @@ -351,7 +351,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, u32 size) { struct drm_i915_private *dev_priv = dev->dev_private; - struct i915_address_space *vm = &dev_priv->gtt.base; + struct i915_address_space *ggtt = &dev_priv->gtt.base; struct drm_i915_gem_object *obj; struct drm_mm_node *stolen; struct i915_vma *vma; @@ -394,7 +394,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, if (gtt_offset == I915_GTT_OFFSET_NONE) return obj; - vma = i915_gem_vma_create(obj, &dev_priv->gtt.base); + vma = i915_gem_vma_create(obj, ggtt); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto err_out; @@ -407,8 +407,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, */ vma->node.start = gtt_offset; vma->node.size = size; - if (drm_mm_initialized(&dev_priv->gtt.base.mm)) { - ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node); + if (drm_mm_initialized(&ggtt->mm)) { + ret = drm_mm_reserve_node(&ggtt->mm, &vma->node); if (ret) { DRM_DEBUG_KMS("failed to allocate stolen GTT space\n"); i915_gem_vma_destroy(vma); @@ -419,7 +419,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, obj->has_global_gtt_mapping = 1; list_add_tail(&obj->global_list, &dev_priv->mm.bound_list); - list_add_tail(&obj->mm_list, &vm->inactive_list); + list_add_tail(&obj->mm_list, &ggtt->inactive_list); return obj; diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index 92a8d27..808ca2a 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -360,17 +360,19 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, obj->map_and_fenceable = !i915_gem_obj_ggtt_bound(obj) || - (i915_gem_obj_ggtt_offset(obj) + obj->base.size <= dev_priv->gtt.mappable_end && + (i915_gem_obj_ggtt_offset(obj) + + obj->base.size <= dev_priv->gtt.mappable_end && i915_gem_object_fence_ok(obj, args->tiling_mode)); /* Rebind if we need a change of alignment */ if (!obj->map_and_fenceable) { - u32 unfenced_alignment = + struct i915_address_space *ggtt = &dev_priv->gtt.base; + u32 unfenced_align = i915_gem_get_gtt_alignment(dev, obj->base.size, args->tiling_mode, false); - if (i915_gem_obj_ggtt_offset(obj) & (unfenced_alignment - 1)) - ret = i915_gem_object_unbind(obj); + if (i915_gem_obj_ggtt_offset(obj) & (unfenced_align - 1)) + ret = i915_gem_object_unbind(obj, ggtt); } if (ret == 0) { diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 7d283b5..3f019d3 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create, ); TRACE_EVENT(i915_gem_object_bind, - TP_PROTO(struct drm_i915_gem_object *obj, bool mappable), - TP_ARGS(obj, mappable), + TP_PROTO(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, bool mappable), + TP_ARGS(obj, vm, mappable), TP_STRUCT__entry( __field(struct drm_i915_gem_object *, obj) + __field(struct i915_address_space *, vm) __field(u32, offset) __field(u32, size) __field(bool, mappable) @@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind, TP_fast_assign( __entry->obj = obj; - __entry->offset = i915_gem_obj_ggtt_offset(obj); - __entry->size = i915_gem_obj_ggtt_size(obj); + __entry->offset = i915_gem_obj_offset(obj, vm); + __entry->size = i915_gem_obj_size(obj, vm); __entry->mappable = mappable; ), @@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind, ); TRACE_EVENT(i915_gem_object_unbind, - TP_PROTO(struct drm_i915_gem_object *obj), - TP_ARGS(obj), + TP_PROTO(struct drm_i915_gem_object *obj, + struct i915_address_space *vm), + TP_ARGS(obj, vm), TP_STRUCT__entry( __field(struct drm_i915_gem_object *, obj) + __field(struct i915_address_space *, vm) __field(u32, offset) __field(u32, size) ), TP_fast_assign( __entry->obj = obj; - __entry->offset = i915_gem_obj_ggtt_offset(obj); - __entry->size = i915_gem_obj_ggtt_size(obj); + __entry->offset = i915_gem_obj_offset(obj, vm); + __entry->size = i915_gem_obj_size(obj, vm); ), TP_printk("obj=%p, offset=%08x size=%x", diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c index f3c97e0..b69cc63 100644 --- a/drivers/gpu/drm/i915/intel_fb.c +++ b/drivers/gpu/drm/i915/intel_fb.c @@ -170,7 +170,6 @@ static int intelfb_create(struct drm_fb_helper *helper, fb->width, fb->height, i915_gem_obj_ggtt_offset(obj), obj); - mutex_unlock(&dev->struct_mutex); vga_switcheroo_client_fb_set(dev->pdev, info); return 0; diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 2abb53e..22ccb7e 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -1350,7 +1350,7 @@ void intel_setup_overlay(struct drm_device *dev) } overlay->flip_addr = reg_bo->phys_obj->handle->busaddr; } else { - ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false); + ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false); if (ret) { DRM_ERROR("failed to pin overlay register bo\n"); goto out_free_bo; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 008e0e0..0fb081c 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2860,7 +2860,7 @@ intel_alloc_context_page(struct drm_device *dev) return NULL; } - ret = i915_gem_object_pin(ctx, 4096, true, false); + ret = i915_gem_ggtt_pin(ctx, 4096, true, false); if (ret) { DRM_ERROR("failed to pin power context: %d\n", ret); goto err_unref; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 8527ea0..88130a3 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -481,6 +481,7 @@ out: static int init_pipe_control(struct intel_ring_buffer *ring) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; struct pipe_control *pc; struct drm_i915_gem_object *obj; int ret; @@ -499,9 +500,10 @@ init_pipe_control(struct intel_ring_buffer *ring) goto err; } - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_LLC); - ret = i915_gem_object_pin(obj, 4096, true, false); + ret = i915_gem_ggtt_pin(obj, 4096, true, false); if (ret) goto err_unref; @@ -1212,6 +1214,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring) static int init_status_page(struct intel_ring_buffer *ring) { struct drm_device *dev = ring->dev; + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_object *obj; int ret; @@ -1222,9 +1225,10 @@ static int init_status_page(struct intel_ring_buffer *ring) goto err; } - i915_gem_object_set_cache_level(obj, I915_CACHE_LLC); + i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, + I915_CACHE_LLC); - ret = i915_gem_object_pin(obj, 4096, true, false); + ret = i915_gem_ggtt_pin(obj, 4096, true, false); if (ret != 0) { goto err_unref; } @@ -1307,7 +1311,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, ring->obj = obj; - ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false); + ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false); if (ret) goto err_unref; @@ -1828,7 +1832,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev) return -ENOMEM; } - ret = i915_gem_object_pin(obj, 0, true, false); + ret = i915_gem_ggtt_pin(obj, 0, true, false); if (ret != 0) { drm_gem_object_unreference(&obj->base); DRM_ERROR("Failed to ping batch bo\n");
This patch was formerly known as: "drm/i915: Create VMAs (part 3) - plumbing" This patch adds a VM argument, bind/unbind, and the object offset/size/color getters/setters. It preserves the old ggtt helper functions because things still need, and will continue to need them. Some code will still need to be ported over after this. v2: Fix purge to pick an object and unbind all vmas This was doable because of the global bound list change. v3: With the commit to actually pin/unpin pages in place, there is no longer a need to check if unbind succeeded before calling put_pages(). Make put_pages only BUG() after checking pin count. v4: Rebased on top of the new hangcheck work by Mika plumbed eb_destroy also Many checkpatch related fixes v5: Very large rebase v6: Change BUG_ON to WARN_ON (Daniel) Rename vm to ggtt in preallocate stolen, since it is always ggtt when dealing with stolen memory. (Daniel) list_for_each will short-circuit already (Daniel) remove superflous space (Daniel) Use per object list of vmas (Daniel) Make obj_bound_any() use obj_bound for each vm (Ben) s/bind_to_gtt/bind_to_vm/ (Ben) Fixed up the inactive shrinker. As Daniel noticed the code could potentially count the same object multiple times. While it's not possible in the current case, since 1 object can only ever be bound into 1 address space thus far - we may as well try to get something more future proof in place now. With a prep patch before this to switch over to using the bound list + inactive check, we're now able to carry that forward for every address space an object is bound into. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_debugfs.c | 29 ++- drivers/gpu/drm/i915/i915_dma.c | 4 - drivers/gpu/drm/i915/i915_drv.h | 107 +++++---- drivers/gpu/drm/i915/i915_gem.c | 337 +++++++++++++++++++++-------- drivers/gpu/drm/i915/i915_gem_context.c | 9 +- drivers/gpu/drm/i915/i915_gem_evict.c | 51 +++-- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 85 +++++--- drivers/gpu/drm/i915/i915_gem_gtt.c | 41 ++-- drivers/gpu/drm/i915/i915_gem_stolen.c | 10 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 10 +- drivers/gpu/drm/i915/i915_trace.h | 20 +- drivers/gpu/drm/i915/intel_fb.c | 1 - drivers/gpu/drm/i915/intel_overlay.c | 2 +- drivers/gpu/drm/i915/intel_pm.c | 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 16 +- 15 files changed, 479 insertions(+), 245 deletions(-)