Message ID | 1379196199-11295-4-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
---|---
State | New, archived

On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> +			       enum i915_cache_level cache_level,
> +			       u32 flags)
> +{
> +	struct drm_device *dev = vma->vm->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *obj = vma->obj;
> +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> +
> +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> +	 * the global, just use aliasing */
> +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> +		/* If the object is unbound, or we're change the cache bits */
> +		if (!obj->has_global_gtt_mapping ||
> +		    (cache_level != obj->cache_level)) {
> +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> +						 cache_level);
> +			obj->has_global_gtt_mapping = 1;
> +		}
> +	}
> +
> +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> +	 * aliasing, but the user requested global. */

Why? As a proponent of full-ppgtt I thought you would be envisioning a
future where the aliasing_ppgtt was used far less (i.e. never), and the
ggtt would only continue to be used for the truly global entries such as
scanouts, contexts, pdes, execlists etc.

> +	if (dev_priv->mm.aliasing_ppgtt &&
> +	    (!obj->has_aliasing_ppgtt_mapping ||
> +	     (cache_level != obj->cache_level))) {
> +		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
> +					  vma->obj->pages, entry, cache_level);
> +		vma->obj->has_aliasing_ppgtt_mapping = 1;
> +	}
> +}

On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > +			       enum i915_cache_level cache_level,
> > +			       u32 flags)
> > +{
> > +	struct drm_device *dev = vma->vm->dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_gem_object *obj = vma->obj;
> > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > +
> > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > +	 * the global, just use aliasing */
> > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > +		/* If the object is unbound, or we're change the cache bits */
> > +		if (!obj->has_global_gtt_mapping ||
> > +		    (cache_level != obj->cache_level)) {
> > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > +						 cache_level);
> > +			obj->has_global_gtt_mapping = 1;
> > +		}
> > +	}
> > +
> > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > +	 * aliasing, but the user requested global. */
>
> Why? As a proponent of full-ppgtt I thought you would be envisioning a
> future where the aliasing_ppgtt was used far less (i.e. never), and the
> ggtt would only continue to be used for the truly global entries such as
> scanouts, contexts, pdes, execlists etc.
>

Firstly, I've still yet to expose the grand plan at this point in the
series, so I am not really certain if you're just complaining for the
fun of it, or what. I'd like to make everything functionally the same,
just with VMA support.

Secondly, I was under the impression that for Sandybridge we had to have
all global mappings in the aliasing to support PIPE_CONTROL, or some
command like that. It's a bit mixed up in my head atm, and I'm too lazy
to look at the exact reason.

Finally, see firstly. I'll try to rip it out later on if it's possible.

[snip]

> --
> Chris Wilson, Intel Open Source Technology Centre

On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 flags)
> > > +{
> > > +	struct drm_device *dev = vma->vm->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > +	 * the global, just use aliasing */
> > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > +		/* If the object is unbound, or we're change the cache bits */
> > > +		if (!obj->has_global_gtt_mapping ||
> > > +		    (cache_level != obj->cache_level)) {
> > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > +						 cache_level);
> > > +			obj->has_global_gtt_mapping = 1;
> > > +		}
> > > +	}
> > > +
> > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > +	 * aliasing, but the user requested global. */
> >
> > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > ggtt would only continue to be used for the truly global entries such as
> > scanouts, contexts, pdes, execlists etc.
> >
>
> Firstly, I've still yet to expose the grand plan at this point in the
> series, so I am not really certain if you're just complaining for the
> fun of it, or what. I'd like to make everything functionally the same,
> just with VMA support.
>
> Secondly, I was under the impression that for Sandybridge we had to have
> all global mappings in the aliasing to support PIPE_CONTROL, or some
> command like that. It's a bit mixed up in my head atm, and I'm too lazy
> to look at the exact reason.
>
> Finally, see firstly. I'll try to rip it out later on if it's possible.

Ben's right afaik, we need this kludge to keep snb happy. And we need
ppgtt to make kernel cs scanning possible, which we seem to need for
geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
...
-Daniel

On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > +			       enum i915_cache_level cache_level,
> > > +			       u32 flags)
> > > +{
> > > +	struct drm_device *dev = vma->vm->dev;
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > +
> > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > +	 * the global, just use aliasing */
> > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > +		/* If the object is unbound, or we're change the cache bits */
> > > +		if (!obj->has_global_gtt_mapping ||
> > > +		    (cache_level != obj->cache_level)) {
> > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > +						 cache_level);
> > > +			obj->has_global_gtt_mapping = 1;
> > > +		}
> > > +	}
> > > +
> > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > +	 * aliasing, but the user requested global. */
> >
> > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > ggtt would only continue to be used for the truly global entries such as
> > scanouts, contexts, pdes, execlists etc.
> >
>
> Firstly, I've still yet to expose the grand plan at this point in the
> series, so I am not really certain if you're just complaining for the
> fun of it, or what. I'd like to make everything functionally the same,
> just with VMA support.

I'm complaining because the comment is awful: telling me what the code
is doing but not why. It doesn't seem obvious that if the user
explicitly wanted a global mapping and that the object is not already
in aliasing ppgtt that it is likely to be used in the aliasing ppgtt in
the near future.

> Secondly, I was under the impression that for Sandybridge we had to have
> all global mappings in the aliasing to support PIPE_CONTROL, or some
> command like that. It's a bit mixed up in my head atm, and I'm too lazy
> to look at the exact reason.

It does, but if we never enable full-ppgtt for SNB we don't have to
worry about full-ppgtt being unusable for OpenGL (at least not without
a 1:1 ppgtt to global mapping of all oq objects).
-Chris

On Tue, Sep 17, 2013 at 12:05:46AM +0200, Daniel Vetter wrote:
> On Mon, Sep 16, 2013 at 11:23:43AM -0700, Ben Widawsky wrote:
> > On Mon, Sep 16, 2013 at 10:25:28AM +0100, Chris Wilson wrote:
> > > On Sat, Sep 14, 2013 at 03:03:17PM -0700, Ben Widawsky wrote:
> > > > +static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> > > > +			       enum i915_cache_level cache_level,
> > > > +			       u32 flags)
> > > > +{
> > > > +	struct drm_device *dev = vma->vm->dev;
> > > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > +	struct drm_i915_gem_object *obj = vma->obj;
> > > > +	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > > > +
> > > > +	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
> > > > +	 * the global, just use aliasing */
> > > > +	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
> > > > +		/* If the object is unbound, or we're change the cache bits */
> > > > +		if (!obj->has_global_gtt_mapping ||
> > > > +		    (cache_level != obj->cache_level)) {
> > > > +			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
> > > > +						 cache_level);
> > > > +			obj->has_global_gtt_mapping = 1;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* If put the mapping in the aliasing PPGTT as well as Global if we have
> > > > +	 * aliasing, but the user requested global. */
> > >
> > > Why? As a proponent of full-ppgtt I thought you would be envisioning a
> > > future where the aliasing_ppgtt was used far less (i.e. never), and the
> > > ggtt would only continue to be used for the truly global entries such as
> > > scanouts, contexts, pdes, execlists etc.
> > >
> >
> > Firstly, I've still yet to expose the grand plan at this point in the
> > series, so I am not really certain if you're just complaining for the
> > fun of it, or what. I'd like to make everything functionally the same,
> > just with VMA support.
> >
> > Secondly, I was under the impression that for Sandybridge we had to have
> > all global mappings in the aliasing to support PIPE_CONTROL, or some
> > command like that. It's a bit mixed up in my head atm, and I'm too lazy
> > to look at the exact reason.
> >
> > Finally, see firstly. I'll try to rip it out later on if it's possible.
>
> Ben's right afaik, we need this kludge to keep snb happy. And we need
> ppgtt to make kernel cs scanning possible, which we seem to need for
> geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
> ...

It shouldn't be a kludge here, but in execbuffer where we detect the SNB
w/a and so should pass the flag down to the bind and make sure we have any
other fixup required (see the extra details required for full-ppgtt).
-Chris

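A rough sketch, not from the series itself, of the execbuffer-side flag passing Chris describes above: the caller decides that the SNB workaround needs a global GTT entry and requests it via the bind flags, rather than gen6_ggtt_bind_vma() second-guessing. The helper name and the workaround predicate are invented for illustration; only GLOBAL_BIND and the bind_vma hook come from the patch below.

/* Hypothetical sketch only: execbuffer asks for a global mapping when the
 * SNB pipe-control workaround applies. exec_needs_snb_wa is an assumed
 * predicate, not a real flag or function in the series.
 */
static void exec_bind_for_submission(struct i915_vma *vma,
				     enum i915_cache_level cache_level,
				     bool exec_needs_snb_wa)
{
	u32 flags = 0;

	if (exec_needs_snb_wa)
		flags |= GLOBAL_BIND;	/* force a global GTT entry */

	vma->vm->bind_vma(vma, cache_level, flags);
}
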
On Tue, Sep 17, 2013 at 12:18 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> Ben's right afaik, we need this kludge to keep snb happy. And we need
>> ppgtt to make kernel cs scanning possible, which we seem to need for
>> geometry shaders or some other gl3.2/3 feature. So not much choice I'd say
>> ...
>
> It shouldn't be a kludge here, but in execbuffer where we detect the SNB
> w/a and so should pass the flag down to the bind and make sure we have any
> other fixup required (see the extra details required for full-ppgtt).

Yeah, I've written my response without actually reading the code really,
only the discussion, and gotten all confused. So I'll hide in shame a bit
and let you two duke this out ;-)
-Daniel

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 427c537..686a66c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,6 +465,36 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+/**
+ * A VMA represents a GEM BO that is bound into an address space. Therefore, a
+ * VMA's presence cannot be guaranteed before binding, or after unbinding the
+ * object into/from the address space.
+ *
+ * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
+ * will always be <= an objects lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct drm_mm_node node;
+	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+
+	/** This vma's place in the batchbuffer or on the eviction list */
+	struct list_head exec_list;
+
+	/**
+	 * Used for performing relocations during execbuffer insertion.
+	 */
+	struct hlist_node exec_node;
+	unsigned long exec_handle;
+	struct drm_i915_gem_exec_object2 *exec_entry;
+
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -503,9 +533,18 @@ struct i915_address_space {
 	/* FIXME: Need a more generic return type */
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
+
+	/** Unmap an object from an address space. This usually consists of
+	 * setting the valid PTE entries to a reserved scratch page. */
+	void (*unbind_vma)(struct i915_vma *vma);
 	void (*clear_range)(struct i915_address_space *vm,
 			    unsigned int first_entry,
 			    unsigned int num_entries);
+	/* Map an object into an address space with the given cache flags. */
+#define GLOBAL_BIND (1<<0)
+	void (*bind_vma)(struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       unsigned int first_entry,
@@ -552,36 +591,6 @@ struct i915_hw_ppgtt {
 	int (*enable)(struct drm_device *dev);
 };
 
-/**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
- *
- * To make things as simple as possible (ie. no refcounting), a VMA's lifetime
- * will always be <= an objects lifetime. So object refcounting should cover us.
- */
-struct i915_vma {
-	struct drm_mm_node node;
-	struct drm_i915_gem_object *obj;
-	struct i915_address_space *vm;
-
-	/** This object's place on the active/inactive lists */
-	struct list_head mm_list;
-
-	struct list_head vma_link; /* Link in the object's VMA list */
-
-	/** This vma's place in the batchbuffer or on the eviction list */
-	struct list_head exec_list;
-
-	/**
-	 * Used for performing relocations during execbuffer insertion.
-	 */
-	struct hlist_node exec_node;
-	unsigned long exec_handle;
-	struct drm_i915_gem_exec_object2 *exec_entry;
-
-};
-
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 212f6d8..2a71a29 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -57,6 +57,11 @@
 #define HSW_WB_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0xb)
 #define HSW_WT_ELLC_LLC_AGE0		HSW_CACHEABILITY_CONTROL(0x6)
 
+static void gen6_ppgtt_bind_vma(struct i915_vma *vma,
+				enum i915_cache_level cache_level,
+				u32 flags);
+static void gen6_ppgtt_unbind_vma(struct i915_vma *vma);
+
 static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
@@ -332,7 +337,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->base.unbind_vma = NULL;
 	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.bind_vma = NULL;
 	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
 	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
@@ -439,6 +446,18 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 				   cache_level);
 }
 
+static void __always_unused
+gen6_ppgtt_bind_vma(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	WARN_ON(flags);
+
+	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+}
+
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
@@ -447,6 +466,14 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 				obj->base.size >> PAGE_SHIFT);
 }
 
+static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	gen6_ppgtt_clear_range(vma->vm, entry,
+			       vma->obj->base.size >> PAGE_SHIFT);
+}
+
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
@@ -592,6 +619,19 @@ static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 
 }
 
+static void i915_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 unused)
+{
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
+		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
+	vma->obj->has_global_gtt_mapping = 1;
+}
+
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
@@ -599,6 +639,47 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
+static void i915_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	const unsigned int first = vma->node.start >> PAGE_SHIFT;
+	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
+
+	BUG_ON(!i915_is_ggtt(vma->vm));
+	vma->obj->has_global_gtt_mapping = 0;
+	intel_gtt_clear_range(first, size);
+}
+
+static void gen6_ggtt_bind_vma(struct i915_vma *vma,
+			       enum i915_cache_level cache_level,
+			       u32 flags)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	/* If there is an aliasing PPGTT, and the user didn't explicitly ask for
+	 * the global, just use aliasing */
+	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
+		/* If the object is unbound, or we're change the cache bits */
+		if (!obj->has_global_gtt_mapping ||
+		    (cache_level != obj->cache_level)) {
+			gen6_ggtt_insert_entries(vma->vm, obj->pages, entry,
+						 cache_level);
+			obj->has_global_gtt_mapping = 1;
+		}
+	}
+
+	/* If put the mapping in the aliasing PPGTT as well as Global if we have
+	 * aliasing, but the user requested global. */
+	if (dev_priv->mm.aliasing_ppgtt &&
+	    (!obj->has_aliasing_ppgtt_mapping ||
+	     (cache_level != obj->cache_level))) {
+		gen6_ppgtt_insert_entries(&dev_priv->mm.aliasing_ppgtt->base,
+					  vma->obj->pages, entry, cache_level);
+		vma->obj->has_aliasing_ppgtt_mapping = 1;
+	}
+}
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
@@ -627,6 +708,27 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	obj->has_global_gtt_mapping = 0;
 }
 
+static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
+{
+	struct drm_device *dev = vma->vm->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj = vma->obj;
+	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
+
+	if (obj->has_global_gtt_mapping) {
+		gen6_ggtt_clear_range(vma->vm, entry,
+				      vma->obj->base.size >> PAGE_SHIFT);
+		obj->has_global_gtt_mapping = 0;
+	}
+
+	if (obj->has_aliasing_ppgtt_mapping) {
+		gen6_ppgtt_clear_range(&dev_priv->mm.aliasing_ppgtt->base,
+				       entry,
+				       obj->base.size >> PAGE_SHIFT);
+		obj->has_aliasing_ppgtt_mapping = 0;
+	}
+}
+
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -860,7 +962,9 @@ static int gen6_gmch_probe(struct drm_device *dev,
 		DRM_ERROR("Scratch setup failed\n");
 
 	dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = gen6_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = gen6_ggtt_bind_vma;
 
 	return ret;
 }
@@ -892,7 +996,9 @@ static int i915_gmch_probe(struct drm_device *dev,
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
 
 	dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
+	dev_priv->gtt.base.unbind_vma = i915_ggtt_unbind_vma;
 	dev_priv->gtt.base.insert_entries = i915_ggtt_insert_entries;
+	dev_priv->gtt.base.bind_vma = i915_ggtt_bind_vma;
 
 	return 0;
 }
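
A rough usage sketch, not part of the patch: once bind_vma/unbind_vma are installed on an address space, later patches in the series could funnel the existing i915_gem_gtt_bind_object()/i915_ppgtt_bind_object() callers through the new hooks along these lines. The wrapper names below are invented; only bind_vma, unbind_vma and GLOBAL_BIND come from the patch.

/* Illustrative only: drive the per-VM hooks added by this patch.
 * The wrappers are hypothetical helpers, not functions from the series.
 */
static void vma_bind(struct i915_vma *vma,
		     enum i915_cache_level cache_level,
		     bool global)
{
	/* Bind into whichever VM the vma belongs to; request a global
	 * GTT entry only when the caller actually needs one. */
	vma->vm->bind_vma(vma, cache_level, global ? GLOBAL_BIND : 0);
}

static void vma_unbind(struct i915_vma *vma)
{
	/* Point the PTEs for this range back at the scratch page. */
	vma->vm->unbind_vma(vma);
}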