Message ID: 1341833679-11614-2-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State: New, archived
On Mon, Jul 09, 2012 at 12:34:37PM +0100, Chris Wilson wrote:
> In order to support snoopable memory on non-LLC architectures (so that
> we can bind vgem objects into the i915 GATT for example), we have to
> avoid the prefetcher on the GPU from crossing memory domains and so
> prevent allocation of a snoopable PTE immediately following an uncached
> PTE. To do that, we need to extend the range allocator with support for
> tracking and segregating different node colours.
>
> This will be used by i915 to segregate memory domains within the GTT.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Ben Skeggs <bskeggs@redhat.com>
> Cc: Jerome Glisse <jglisse@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Two little bikesheds:
- Do we really need 64 bits of colour? Especially since we have quite a
  few bits of space left ...
- I think we could add a new insert_color helper that always takes a
  range (we can select the right range in the driver). That way this
  patch wouldn't need to touch the drivers, and we could take the
  opportunity to embed the gtt_space mm_node into our gem object ...
Besides that this looks good and I like it, but I think I've mentioned
that way back when this patch first popped up ;-)
-Daniel

>
> Conflicts:
>
>         drivers/gpu/drm/i915/i915_gem_stolen.c
> ---
>  drivers/gpu/drm/drm_gem.c                  |   2 +-
>  drivers/gpu/drm/drm_mm.c                   | 151 +++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem.c            |   6 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |   9 +-
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |   5 +-
>  drivers/gpu/drm/nouveau/nouveau_notifier.c |   4 +-
>  drivers/gpu/drm/nouveau/nouveau_object.c   |   2 +-
>  drivers/gpu/drm/nouveau/nv04_instmem.c     |   2 +-
>  drivers/gpu/drm/nouveau/nv20_fb.c          |   2 +-
>  drivers/gpu/drm/nouveau/nv50_vram.c        |   2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c               |   2 +-
>  drivers/gpu/drm/ttm/ttm_bo_manager.c       |   4 +-
>  include/drm/drm_mm.h                       |  38 +++++--
>  13 files changed, 143 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index d58e69d..961ccd8 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -354,7 +354,7 @@ drm_gem_create_mmap_offset(struct drm_gem_object *obj)
>
>          /* Get a DRM GEM mmap offset allocated... */
>          list->file_offset_node = drm_mm_search_free(&mm->offset_manager,
> -                        obj->size / PAGE_SIZE, 0, 0);
> +                        obj->size / PAGE_SIZE, 0, 0, false);
>
>          if (!list->file_offset_node) {
>                  DRM_ERROR("failed to allocate offset for bo %d\n", obj->name);
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 961fb54..0311dba 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -118,45 +118,53 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
>
>  static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>                                   struct drm_mm_node *node,
> -                                 unsigned long size, unsigned alignment)
> +                                 unsigned long size, unsigned alignment,
> +                                 unsigned long color)
>  {
>          struct drm_mm *mm = hole_node->mm;
> -        unsigned long tmp = 0, wasted = 0;
>          unsigned long hole_start = drm_mm_hole_node_start(hole_node);
>          unsigned long hole_end = drm_mm_hole_node_end(hole_node);
> +        unsigned long adj_start = hole_start;
> +        unsigned long adj_end = hole_end;
>
>          BUG_ON(!hole_node->hole_follows || node->allocated);
>
> -        if (alignment)
> -                tmp = hole_start % alignment;
> +        if (mm->color_adjust)
> +                mm->color_adjust(hole_node, color, &adj_start, &adj_end);
>
> -        if (!tmp) {
> +        if (alignment) {
> +                unsigned tmp = adj_start % alignment;
> +                if (tmp)
> +                        adj_start += alignment - tmp;
> +        }
> +
> +        if (adj_start == hole_start) {
>                  hole_node->hole_follows = 0;
> -                list_del_init(&hole_node->hole_stack);
> -        } else
> -                wasted = alignment - tmp;
> +                list_del(&hole_node->hole_stack);
> +        }
>
> -        node->start = hole_start + wasted;
> +        node->start = adj_start;
>          node->size = size;
>          node->mm = mm;
> +        node->color = color;
>          node->allocated = 1;
>
>          INIT_LIST_HEAD(&node->hole_stack);
>          list_add(&node->node_list, &hole_node->node_list);
>
> -        BUG_ON(node->start + node->size > hole_end);
> +        BUG_ON(node->start + node->size > adj_end);
>
> +        node->hole_follows = 0;
>          if (node->start + node->size < hole_end) {
>                  list_add(&node->hole_stack, &mm->hole_stack);
>                  node->hole_follows = 1;
> -        } else {
> -                node->hole_follows = 0;
>          }
>  }
>
>  struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
>                                               unsigned long size,
>                                               unsigned alignment,
> +                                             unsigned long color,
>                                               int atomic)
>  {
>          struct drm_mm_node *node;
> @@ -165,7 +173,7 @@ struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
>          if (unlikely(node == NULL))
>                  return NULL;
>
> -        drm_mm_insert_helper(hole_node, node, size, alignment);
> +        drm_mm_insert_helper(hole_node, node, size, alignment, color);
>
>          return node;
>  }
> @@ -181,11 +189,11 @@ int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node,
>  {
>          struct drm_mm_node *hole_node;
>
> -        hole_node = drm_mm_search_free(mm, size, alignment, 0);
> +        hole_node = drm_mm_search_free(mm, size, alignment, 0, false);
>          if (!hole_node)
>                  return -ENOSPC;
>
> -        drm_mm_insert_helper(hole_node, node, size, alignment);
> +        drm_mm_insert_helper(hole_node, node, size, alignment, 0);
>
>          return 0;
>  }
> @@ -194,50 +202,57 @@ EXPORT_SYMBOL(drm_mm_insert_node);
>  static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
>                                         struct drm_mm_node *node,
>                                         unsigned long size, unsigned alignment,
> +                                       unsigned long color,
>                                         unsigned long start, unsigned long end)
>  {
>          struct drm_mm *mm = hole_node->mm;
> -        unsigned long tmp = 0, wasted = 0;
>          unsigned long hole_start = drm_mm_hole_node_start(hole_node);
>          unsigned long hole_end = drm_mm_hole_node_end(hole_node);
> +        unsigned long adj_start = hole_start;
> +        unsigned long adj_end = hole_end;
>
>          BUG_ON(!hole_node->hole_follows || node->allocated);
>
> -        if (hole_start < start)
> -                wasted += start - hole_start;
> -        if (alignment)
> -                tmp = (hole_start + wasted) % alignment;
> +        if (mm->color_adjust)
> +                mm->color_adjust(hole_node, color, &adj_start, &adj_end);
>
> -        if (tmp)
> -                wasted += alignment - tmp;
> +        if (adj_start < start)
> +                adj_start = start;
>
> -        if (!wasted) {
> +        if (alignment) {
> +                unsigned tmp = adj_start % alignment;
> +                if (tmp)
> +                        adj_start += alignment - tmp;
> +        }
> +
> +        if (adj_start == hole_start) {
>                  hole_node->hole_follows = 0;
> -                list_del_init(&hole_node->hole_stack);
> +                list_del(&hole_node->hole_stack);
>          }
>
> -        node->start = hole_start + wasted;
> +        node->start = adj_start;
>          node->size = size;
>          node->mm = mm;
> +        node->color = color;
>          node->allocated = 1;
>
>          INIT_LIST_HEAD(&node->hole_stack);
>          list_add(&node->node_list, &hole_node->node_list);
>
> -        BUG_ON(node->start + node->size > hole_end);
> +        BUG_ON(node->start + node->size > adj_end);
>          BUG_ON(node->start + node->size > end);
>
> +        node->hole_follows = 0;
>          if (node->start + node->size < hole_end) {
>                  list_add(&node->hole_stack, &mm->hole_stack);
>                  node->hole_follows = 1;
> -        } else {
> -                node->hole_follows = 0;
>          }
>  }
>
>  struct drm_mm_node *drm_mm_get_block_range_generic(struct drm_mm_node *hole_node,
>                                                     unsigned long size,
>                                                     unsigned alignment,
> +                                                   unsigned long color,
>                                                     unsigned long start,
>                                                     unsigned long end,
>                                                     int atomic)
> @@ -248,7 +263,7 @@ struct drm_mm_node *drm_mm_get_block_range_generic(struct drm_mm_node *hole_node
>          if (unlikely(node == NULL))
>                  return NULL;
>
> -        drm_mm_insert_helper_range(hole_node, node, size, alignment,
> +        drm_mm_insert_helper_range(hole_node, node, size, alignment, color,
>                                     start, end);
>
>          return node;
> @@ -266,12 +281,12 @@ int drm_mm_insert_node_in_range(struct drm_mm *mm, struct drm_mm_node *node,
>  {
>          struct drm_mm_node *hole_node;
>
> -        hole_node = drm_mm_search_free_in_range(mm, size, alignment,
> -                                                start, end, 0);
> +        hole_node = drm_mm_search_free_in_range(mm, size, alignment, 0,
> +                                                start, end, false);
>          if (!hole_node)
>                  return -ENOSPC;
>
> -        drm_mm_insert_helper_range(hole_node, node, size, alignment,
> +        drm_mm_insert_helper_range(hole_node, node, size, alignment, 0,
>                                     start, end);
>
>          return 0;
> @@ -336,27 +351,23 @@ EXPORT_SYMBOL(drm_mm_put_block);
>  static int check_free_hole(unsigned long start, unsigned long end,
>                             unsigned long size, unsigned alignment)
>  {
> -        unsigned wasted = 0;
> -
>          if (end - start < size)
>                  return 0;
>
>          if (alignment) {
>                  unsigned tmp = start % alignment;
>                  if (tmp)
> -                        wasted = alignment - tmp;
> -        }
> -
> -        if (end >= start + size + wasted) {
> -                return 1;
> +                        start += alignment - tmp;
>          }
>
> -        return 0;
> +        return end >= start + size;
>  }
>
>  struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm,
>                                         unsigned long size,
> -                                       unsigned alignment, int best_match)
> +                                       unsigned alignment,
> +                                       unsigned long color,
> +                                       bool best_match)
>  {
>          struct drm_mm_node *entry;
>          struct drm_mm_node *best;
> @@ -368,10 +379,17 @@ struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm,
>          best_size = ~0UL;
>
>          list_for_each_entry(entry, &mm->hole_stack, hole_stack) {
> +                unsigned long adj_start = drm_mm_hole_node_start(entry);
> +                unsigned long adj_end = drm_mm_hole_node_end(entry);
> +
> +                if (mm->color_adjust) {
> +                        mm->color_adjust(entry, color, &adj_start, &adj_end);
> +                        if (adj_end <= adj_start)
> +                                continue;
> +                }
> +
>                  BUG_ON(!entry->hole_follows);
> -                if (!check_free_hole(drm_mm_hole_node_start(entry),
> -                                     drm_mm_hole_node_end(entry),
> -                                     size, alignment))
> +                if (!check_free_hole(adj_start, adj_end, size, alignment))
>                          continue;
>
>                  if (!best_match)
> @@ -390,9 +408,10 @@ EXPORT_SYMBOL(drm_mm_search_free);
>  struct drm_mm_node *drm_mm_search_free_in_range(const struct drm_mm *mm,
>                                                  unsigned long size,
>                                                  unsigned alignment,
> +                                                unsigned long color,
>                                                  unsigned long start,
>                                                  unsigned long end,
> -                                                int best_match)
> +                                                bool best_match)
>  {
>          struct drm_mm_node *entry;
>          struct drm_mm_node *best;
> @@ -410,6 +429,13 @@ struct drm_mm_node *drm_mm_search_free_in_range(const struct drm_mm *mm,
>                          end : drm_mm_hole_node_end(entry);
>
>                  BUG_ON(!entry->hole_follows);
> +
> +                if (mm->color_adjust) {
> +                        mm->color_adjust(entry, color, &adj_start, &adj_end);
> +                        if (adj_end <= adj_start)
> +                                continue;
> +                }
> +
>                  if (!check_free_hole(adj_start, adj_end, size, alignment))
>                          continue;
>
> @@ -437,6 +463,7 @@ void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new)
>          new->mm = old->mm;
>          new->start = old->start;
>          new->size = old->size;
> +        new->color = old->color;
>
>          old->allocated = 0;
>          new->allocated = 1;
> @@ -452,9 +479,12 @@ EXPORT_SYMBOL(drm_mm_replace_node);
>   * Warning: As long as the scan list is non-empty, no other operations than
>   * adding/removing nodes to/from the scan list are allowed.
>   */
> -void drm_mm_init_scan(struct drm_mm *mm, unsigned long size,
> -                      unsigned alignment)
> +void drm_mm_init_scan(struct drm_mm *mm,
> +                      unsigned long size,
> +                      unsigned alignment,
> +                      unsigned long color)
>  {
> +        mm->scan_color = color;
>          mm->scan_alignment = alignment;
>          mm->scan_size = size;
>          mm->scanned_blocks = 0;
> @@ -474,11 +504,14 @@ EXPORT_SYMBOL(drm_mm_init_scan);
>   * Warning: As long as the scan list is non-empty, no other operations than
>   * adding/removing nodes to/from the scan list are allowed.
>   */
> -void drm_mm_init_scan_with_range(struct drm_mm *mm, unsigned long size,
> +void drm_mm_init_scan_with_range(struct drm_mm *mm,
> +                                 unsigned long size,
>                                   unsigned alignment,
> +                                 unsigned long color,
>                                   unsigned long start,
>                                   unsigned long end)
>  {
> +        mm->scan_color = color;
>          mm->scan_alignment = alignment;
>          mm->scan_size = size;
>          mm->scanned_blocks = 0;
> @@ -522,17 +555,21 @@ int drm_mm_scan_add_block(struct drm_mm_node *node)
>
>          hole_start = drm_mm_hole_node_start(prev_node);
>          hole_end = drm_mm_hole_node_end(prev_node);
> +
> +        adj_start = hole_start;
> +        adj_end = hole_end;
> +
> +        if (mm->color_adjust)
> +                mm->color_adjust(prev_node, mm->scan_color, &adj_start, &adj_end);
> +
>          if (mm->scan_check_range) {
> -                adj_start = hole_start < mm->scan_start ?
> -                        mm->scan_start : hole_start;
> -                adj_end = hole_end > mm->scan_end ?
> -                        mm->scan_end : hole_end;
> -        } else {
> -                adj_start = hole_start;
> -                adj_end = hole_end;
> +                if (adj_start < mm->scan_start)
> +                        adj_start = mm->scan_start;
> +                if (adj_end > mm->scan_end)
> +                        adj_end = mm->scan_end;
>          }
>
> -        if (check_free_hole(adj_start , adj_end,
> +        if (check_free_hole(adj_start, adj_end,
>                              mm->scan_size, mm->scan_alignment)) {
>                  mm->scan_hit_start = hole_start;
>                  mm->scan_hit_size = hole_end;
> @@ -616,6 +653,8 @@ int drm_mm_init(struct drm_mm * mm, unsigned long start, unsigned long size)
>          mm->head_node.size = start - mm->head_node.start;
>          list_add_tail(&mm->head_node.hole_stack, &mm->hole_stack);
>
> +        mm->color_adjust = NULL;
> +
>          return 0;
>  }
>  EXPORT_SYMBOL(drm_mm_init);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index db438f0..cad56dd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2756,18 +2756,18 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>                  free_space =
>                          drm_mm_search_free_in_range(&dev_priv->mm.gtt_space,
>                                                      size, alignment, 0,
> -                                                    dev_priv->mm.gtt_mappable_end,
> +                                                    0, dev_priv->mm.gtt_mappable_end,
>                                                      0);
>          else
>                  free_space = drm_mm_search_free(&dev_priv->mm.gtt_space,
> -                                                size, alignment, 0);
> +                                                size, alignment, 0, 0);
>
>          if (free_space != NULL) {
>                  if (map_and_fenceable)
>                          obj->gtt_space =
>                                  drm_mm_get_block_range_generic(free_space,
>                                                                 size, alignment, 0,
> -                                                               dev_priv->mm.gtt_mappable_end,
> +                                                               0, dev_priv->mm.gtt_mappable_end,
>                                                                 0);
>                  else
>                          obj->gtt_space =
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index ae7c24e..eba0308 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -78,11 +78,12 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>
>          INIT_LIST_HEAD(&unwind_list);
>          if (mappable)
> -                drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space, min_size,
> -                                            alignment, 0,
> -                                            dev_priv->mm.gtt_mappable_end);
> +                drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space,
> +                                            min_size, alignment, 0,
> +                                            0, dev_priv->mm.gtt_mappable_end);
>          else
> -                drm_mm_init_scan(&dev_priv->mm.gtt_space, min_size, alignment);
> +                drm_mm_init_scan(&dev_priv->mm.gtt_space,
> +                                 min_size, alignment, 0);
>
>          /* First see if there is a large enough contiguous idle region... */
>          list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index ada2e90..dba13cf 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -111,7 +111,8 @@ static void i915_setup_compression(struct drm_device *dev, int size)
>          /* Just in case the BIOS is doing something questionable. */
>          intel_disable_fbc(dev);
>
> -        compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
> +        compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
> +                                           size, 4096, 0, 0);
>          if (compressed_fb)
>                  compressed_fb = drm_mm_get_block(compressed_fb, size, 4096);
>          if (!compressed_fb)
> @@ -123,7 +124,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
>
>          if (!(IS_GM45(dev) || HAS_PCH_SPLIT(dev))) {
>                  compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen,
> -                                                    4096, 4096, 0);
> +                                                    4096, 4096, 0, 0);
>                  if (compressed_llb)
>                          compressed_llb = drm_mm_get_block(compressed_llb,
>                                                            4096, 4096);
> diff --git a/drivers/gpu/drm/nouveau/nouveau_notifier.c b/drivers/gpu/drm/nouveau/nouveau_notifier.c
> index 2ef883c..65c64b1 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_notifier.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_notifier.c
> @@ -118,10 +118,10 @@ nouveau_notifier_alloc(struct nouveau_channel *chan, uint32_t handle,
>          uint64_t offset;
>          int target, ret;
>
> -        mem = drm_mm_search_free_in_range(&chan->notifier_heap, size, 0,
> +        mem = drm_mm_search_free_in_range(&chan->notifier_heap, size, 0, 0,
>                                            start, end, 0);
>          if (mem)
> -                mem = drm_mm_get_block_range(mem, size, 0, start, end);
> +                mem = drm_mm_get_block_range(mem, size, 0, 0, start, end);
>          if (!mem) {
>                  NV_ERROR(dev, "Channel %d notifier block full\n", chan->id);
>                  return -ENOMEM;
> diff --git a/drivers/gpu/drm/nouveau/nouveau_object.c b/drivers/gpu/drm/nouveau/nouveau_object.c
> index b190cc0..15d5d97 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_object.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_object.c
> @@ -163,7 +163,7 @@ nouveau_gpuobj_new(struct drm_device *dev, struct nouveau_channel *chan,
>          spin_unlock(&dev_priv->ramin_lock);
>
>          if (!(flags & NVOBJ_FLAG_VM) && chan) {
> -                ramin = drm_mm_search_free(&chan->ramin_heap, size, align, 0);
> +                ramin = drm_mm_search_free(&chan->ramin_heap, size, align, 0, 0);
>                  if (ramin)
>                          ramin = drm_mm_get_block(ramin, size, align);
>                  if (!ramin) {
> diff --git a/drivers/gpu/drm/nouveau/nv04_instmem.c b/drivers/gpu/drm/nouveau/nv04_instmem.c
> index ef7a934..ce57bcd 100644
> --- a/drivers/gpu/drm/nouveau/nv04_instmem.c
> +++ b/drivers/gpu/drm/nouveau/nv04_instmem.c
> @@ -149,7 +149,7 @@ nv04_instmem_get(struct nouveau_gpuobj *gpuobj, struct nouveau_channel *chan,
>                  return -ENOMEM;
>
>          spin_lock(&dev_priv->ramin_lock);
> -        ramin = drm_mm_search_free(&dev_priv->ramin_heap, size, align, 0);
> +        ramin = drm_mm_search_free(&dev_priv->ramin_heap, size, align, 0, 0);
>          if (ramin == NULL) {
>                  spin_unlock(&dev_priv->ramin_lock);
>                  return -ENOMEM;
> diff --git a/drivers/gpu/drm/nouveau/nv20_fb.c b/drivers/gpu/drm/nouveau/nv20_fb.c
> index 19bd640..754f47f 100644
> --- a/drivers/gpu/drm/nouveau/nv20_fb.c
> +++ b/drivers/gpu/drm/nouveau/nv20_fb.c
> @@ -16,7 +16,7 @@ nv20_fb_alloc_tag(struct drm_device *dev, uint32_t size)
>                  return NULL;
>
>          spin_lock(&dev_priv->tile.lock);
> -        mem = drm_mm_search_free(&pfb->tag_heap, size, 0, 0);
> +        mem = drm_mm_search_free(&pfb->tag_heap, size, 0, 0, 0);
>          if (mem)
>                  mem = drm_mm_get_block_atomic(mem, size, 0);
>          spin_unlock(&dev_priv->tile.lock);
> diff --git a/drivers/gpu/drm/nouveau/nv50_vram.c b/drivers/gpu/drm/nouveau/nv50_vram.c
> index 9ed9ae39..6c8ea3f 100644
> --- a/drivers/gpu/drm/nouveau/nv50_vram.c
> +++ b/drivers/gpu/drm/nouveau/nv50_vram.c
> @@ -105,7 +105,7 @@ nv50_vram_new(struct drm_device *dev, u64 size, u32 align, u32 size_nc,
>                  struct nouveau_fb_engine *pfb = &dev_priv->engine.fb;
>                  int n = (size >> 4) * comp;
>
> -                mem->tag = drm_mm_search_free(&pfb->tag_heap, n, 0, 0);
> +                mem->tag = drm_mm_search_free(&pfb->tag_heap, n, 0, 0, 0);
>                  if (mem->tag)
>                          mem->tag = drm_mm_get_block(mem->tag, n, 0);
>          }
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 36f4b28..76ee39f 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -1686,7 +1686,7 @@ retry_pre_get:
>
>          write_lock(&bdev->vm_lock);
>          bo->vm_node = drm_mm_search_free(&bdev->addr_space_mm,
> -                                         bo->mem.num_pages, 0, 0);
> +                                         bo->mem.num_pages, 0, 0, 0);
>
>          if (unlikely(bo->vm_node == NULL)) {
>                  ret = -ENOMEM;
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c
> index 038e947..b426b29 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c
> @@ -68,14 +68,14 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man,
>
>          spin_lock(&rman->lock);
>          node = drm_mm_search_free_in_range(mm,
> -                                           mem->num_pages, mem->page_alignment,
> +                                           mem->num_pages, mem->page_alignment, 0,
>                                             placement->fpfn, lpfn, 1);
>          if (unlikely(node == NULL)) {
>                  spin_unlock(&rman->lock);
>                  return 0;
>          }
>          node = drm_mm_get_block_atomic_range(node, mem->num_pages,
> -                                             mem->page_alignment,
> +                                             mem->page_alignment, 0,
>                                               placement->fpfn,
>                                               lpfn);
>          spin_unlock(&rman->lock);
> diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
> index 564b14a..04a9554 100644
> --- a/include/drm/drm_mm.h
> +++ b/include/drm/drm_mm.h
> @@ -50,6 +50,7 @@ struct drm_mm_node {
>          unsigned scanned_next_free : 1;
>          unsigned scanned_preceeds_hole : 1;
>          unsigned allocated : 1;
> +        unsigned long color;
>          unsigned long start;
>          unsigned long size;
>          struct drm_mm *mm;
> @@ -66,6 +67,7 @@ struct drm_mm {
>          spinlock_t unused_lock;
>          unsigned int scan_check_range : 1;
>          unsigned scan_alignment;
> +        unsigned long scan_color;
>          unsigned long scan_size;
>          unsigned long scan_hit_start;
>          unsigned scan_hit_size;
> @@ -73,6 +75,9 @@ struct drm_mm {
>          unsigned long scan_start;
>          unsigned long scan_end;
>          struct drm_mm_node *prev_scanned_node;
> +
> +        void (*color_adjust)(struct drm_mm_node *node, unsigned long color,
> +                             unsigned long *start, unsigned long *end);
>  };
>
>  static inline bool drm_mm_node_allocated(struct drm_mm_node *node)
> @@ -100,11 +105,13 @@ static inline bool drm_mm_initialized(struct drm_mm *mm)
>  extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node,
>                                                      unsigned long size,
>                                                      unsigned alignment,
> +                                                    unsigned long color,
>                                                      int atomic);
>  extern struct drm_mm_node *drm_mm_get_block_range_generic(
>                                                  struct drm_mm_node *node,
>                                                  unsigned long size,
>                                                  unsigned alignment,
> +                                                unsigned long color,
>                                                  unsigned long start,
>                                                  unsigned long end,
>                                                  int atomic);
> @@ -112,32 +119,34 @@ static inline struct drm_mm_node *drm_mm_get_block(struct drm_mm_node *parent,
>                                                     unsigned long size,
>                                                     unsigned alignment)
>  {
> -        return drm_mm_get_block_generic(parent, size, alignment, 0);
> +        return drm_mm_get_block_generic(parent, size, alignment, 0, 0);
>  }
>  static inline struct drm_mm_node *drm_mm_get_block_atomic(struct drm_mm_node *parent,
>                                                            unsigned long size,
>                                                            unsigned alignment)
>  {
> -        return drm_mm_get_block_generic(parent, size, alignment, 1);
> +        return drm_mm_get_block_generic(parent, size, alignment, 0, 1);
>  }
>  static inline struct drm_mm_node *drm_mm_get_block_range(
>                                                  struct drm_mm_node *parent,
>                                                  unsigned long size,
>                                                  unsigned alignment,
> +                                                unsigned long color,
>                                                  unsigned long start,
>                                                  unsigned long end)
>  {
> -        return drm_mm_get_block_range_generic(parent, size, alignment,
> -                                              start, end, 0);
> +        return drm_mm_get_block_range_generic(parent, size, alignment, color,
> +                                              start, end, 0);
>  }
>  static inline struct drm_mm_node *drm_mm_get_block_atomic_range(
>                                                  struct drm_mm_node *parent,
>                                                  unsigned long size,
>                                                  unsigned alignment,
> +                                                unsigned long color,
>                                                  unsigned long start,
>                                                  unsigned long end)
>  {
> -        return drm_mm_get_block_range_generic(parent, size, alignment,
> +        return drm_mm_get_block_range_generic(parent, size, alignment, color,
>                                                start, end, 1);
>  }
>  extern int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node,
> @@ -152,15 +161,18 @@ extern void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new
>  extern struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm,
>                                                unsigned long size,
>                                                unsigned alignment,
> -                                              int best_match);
> +                                              unsigned long color,
> +                                              bool best_match);
>  extern struct drm_mm_node *drm_mm_search_free_in_range(
>                                                  const struct drm_mm *mm,
>                                                  unsigned long size,
>                                                  unsigned alignment,
> +                                                unsigned long color,
>                                                  unsigned long start,
>                                                  unsigned long end,
> -                                                int best_match);
> -extern int drm_mm_init(struct drm_mm *mm, unsigned long start,
> +                                                bool best_match);
> +extern int drm_mm_init(struct drm_mm *mm,
> +                       unsigned long start,
>                         unsigned long size);
>  extern void drm_mm_takedown(struct drm_mm *mm);
>  extern int drm_mm_clean(struct drm_mm *mm);
> @@ -171,10 +183,14 @@ static inline struct drm_mm *drm_get_mm(struct drm_mm_node *block)
>          return block->mm;
>  }
>
> -void drm_mm_init_scan(struct drm_mm *mm, unsigned long size,
> -                      unsigned alignment);
> -void drm_mm_init_scan_with_range(struct drm_mm *mm, unsigned long size,
> +void drm_mm_init_scan(struct drm_mm *mm,
> +                      unsigned long size,
> +                      unsigned alignment,
> +                      unsigned long color);
> +void drm_mm_init_scan_with_range(struct drm_mm *mm,
> +                                 unsigned long size,
>                                   unsigned alignment,
> +                                 unsigned long color,
>                                   unsigned long start,
>                                   unsigned long end);
>  int drm_mm_scan_add_block(struct drm_mm_node *node);
> --
> 1.7.10
>
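To make the color_adjust contract that the patch introduces concrete: the callback may only *shrink* the hole [start, end) it is handed, based on the colours of the neighbouring nodes, and the caller then re-checks size and alignment against the adjusted range. Below is a self-contained userspace sketch of that contract — the struct, the one-guard-page policy, and the PAGE constant are invented for illustration; only check_free_hole() mirrors the patch:

```c
#include <assert.h>

#define PAGE 4096UL

/* Toy stand-in for drm_mm_node; colour 0 = uncached, 1 = snooped. */
struct node {
        unsigned long start, size, color;
};

/* Sketch of a color_adjust callback: it may only shrink the hole
 * [*start, *end) that follows `prev`.  This hypothetical policy keeps
 * one guard page whenever the incoming colour differs from the
 * neighbouring node, so GPU prefetch never crosses a cache domain. */
void color_adjust(const struct node *prev, unsigned long color,
                  unsigned long *start, unsigned long *end)
{
        if (prev && prev->color != color && *start < *end)
                *start += PAGE;
}

/* Mirror of check_free_hole() from the patch: does an allocation of
 * `size` with `alignment` still fit into the colour-adjusted hole? */
int check_free_hole(unsigned long start, unsigned long end,
                    unsigned long size, unsigned alignment)
{
        if (end - start < size)
                return 0;

        if (alignment) {
                unsigned tmp = start % alignment;
                if (tmp)
                        start += alignment - tmp;
        }

        return end >= start + size;
}
```

A caller iterating the hole list would run color_adjust() first and skip the hole entirely if it collapses (adj_end <= adj_start), exactly as drm_mm_search_free() does in the patch.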
On Tue, 10 Jul 2012 11:21:57 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Mon, Jul 09, 2012 at 12:34:37PM +0100, Chris Wilson wrote:
> > In order to support snoopable memory on non-LLC architectures (so that
> > we can bind vgem objects into the i915 GATT for example), we have to
> > avoid the prefetcher on the GPU from crossing memory domains and so
> > prevent allocation of a snoopable PTE immediately following an uncached
> > PTE. To do that, we need to extend the range allocator with support for
> > tracking and segregating different node colours.
> >
> > This will be used by i915 to segregate memory domains within the GTT.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Ben Skeggs <bskeggs@redhat.com>
> > Cc: Jerome Glisse <jglisse@redhat.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Two little bikesheds:
> - Do we really need 64 bits of colour? Especially since we have quite a
>   few bits of space left ...

It was following the convention that we passed around an argument large
enough to stuff a pointer into if we ever needed to make a far more
complex decision.

> - I think we could add a new insert_color helper that always takes a
>   range (we can select the right range in the driver). That way this
>   patch wouldn't need to touch the drivers, and we could take the
>   opportunity to embed the gtt_space mm_node into our gem object ...

I was just a bit more wary of adding yet another helper since they
quickly get just as confusing as the extra arguments they replace. :)
-Chris
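Chris's point above — the colour argument is an unsigned long precisely so a driver could round-trip a pointer through it if the adjust callback ever needed richer state — can be sketched in a few lines (struct domain_info is made up for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-domain state a driver might want to hang off a node. */
struct domain_info {
        int snooped;
};

/* Pack a pointer into the allocator's unsigned long colour ... */
unsigned long color_from_info(struct domain_info *info)
{
        return (unsigned long)(uintptr_t)info;
}

/* ... and recover it inside the colour-adjust callback. */
struct domain_info *info_from_color(unsigned long color)
{
        return (struct domain_info *)(uintptr_t)color;
}
```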
On Tue, Jul 10, 2012 at 10:29:09AM +0100, Chris Wilson wrote:
> On Tue, 10 Jul 2012 11:21:57 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Mon, Jul 09, 2012 at 12:34:37PM +0100, Chris Wilson wrote:
> > > In order to support snoopable memory on non-LLC architectures (so that
> > > we can bind vgem objects into the i915 GATT for example), we have to
> > > avoid the prefetcher on the GPU from crossing memory domains and so
> > > prevent allocation of a snoopable PTE immediately following an uncached
> > > PTE. To do that, we need to extend the range allocator with support for
> > > tracking and segregating different node colours.
> > >
> > > This will be used by i915 to segregate memory domains within the GTT.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dave Airlie <airlied@redhat.com>
> > > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > Cc: Ben Skeggs <bskeggs@redhat.com>
> > > Cc: Jerome Glisse <jglisse@redhat.com>
> > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > Two little bikesheds:
> > - Do we really need 64 bits of colour? Especially since we have quite a
> >   few bits of space left ...
>
> It was following the convention that we passed around an argument large
> enough to stuff a pointer into if we ever needed to make a far more
> complex decision.

I think the right thing to do in that case would be to embed the gtt_space
and do an upcast ;-)

> > - I think we could add a new insert_color helper that always takes a
> >   range (we can select the right range in the driver). That way this
> >   patch wouldn't need to touch the drivers, and we could take the
> >   opportunity to embed the gtt_space mm_node into our gem object ...
>
> I was just a bit more wary of adding yet another helper since they
> quickly get just as confusing as the extra arguments they replace. :)

Oh, I guess you mean a different helper than I do. I think we should add a
new drm_mm_insert_node_colour function that takes a pre-allocated mm_node,
colour and range and goes hole-hunting. That way we'd avoid changing any of
the existing drivers (which will most likely never care about colouring).
And I wouldn't have to convert over the drm_mm functions that deal with
pre-allocated drm_mm_node structs when I get around to resurrecting the
embedded gtt_space patch.

I agree that shovelling all the alignment constraints into a new helper
would be a bit of overkill.
-Daniel
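The "embed the gtt_space and do an upcast" idea from this exchange boils down to a container_of-style recovery of the owning object from its embedded node. A minimal sketch, with simplified structs and an invented gem_object — not the eventual i915 layout:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the allocator's node. */
struct drm_mm_node {
        unsigned long start, size, color;
};

/* The proposal: the node lives *inside* the GEM object instead of
 * being a separately allocated, pointed-to drm_mm_node. */
struct gem_object {
        int pin_count;
        struct drm_mm_node gtt_space;   /* embedded, not a pointer */
};

/* container_of-style upcast: from the embedded node back to its owner. */
struct gem_object *to_gem_object(struct drm_mm_node *node)
{
        return (struct gem_object *)((char *)node -
                                     offsetof(struct gem_object, gtt_space));
}
```

With this layout an insert_node-style call can fill &obj->gtt_space directly, and code walking allocated nodes gets the object back with to_gem_object() at no extra allocation.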
@@ EXPORT_SYMBOL(drm_mm_replace_node); * Warning: As long as the scan list is non-empty, no other operations than * adding/removing nodes to/from the scan list are allowed. */ -void drm_mm_init_scan(struct drm_mm *mm, unsigned long size, - unsigned alignment) +void drm_mm_init_scan(struct drm_mm *mm, + unsigned long size, + unsigned alignment, + unsigned long color) { + mm->scan_color = color; mm->scan_alignment = alignment; mm->scan_size = size; mm->scanned_blocks = 0; @@ -474,11 +504,14 @@ EXPORT_SYMBOL(drm_mm_init_scan); * Warning: As long as the scan list is non-empty, no other operations than * adding/removing nodes to/from the scan list are allowed. */ -void drm_mm_init_scan_with_range(struct drm_mm *mm, unsigned long size, +void drm_mm_init_scan_with_range(struct drm_mm *mm, + unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end) { + mm->scan_color = color; mm->scan_alignment = alignment; mm->scan_size = size; mm->scanned_blocks = 0; @@ -522,17 +555,21 @@ int drm_mm_scan_add_block(struct drm_mm_node *node) hole_start = drm_mm_hole_node_start(prev_node); hole_end = drm_mm_hole_node_end(prev_node); + + adj_start = hole_start; + adj_end = hole_end; + + if (mm->color_adjust) + mm->color_adjust(prev_node, mm->scan_color, &adj_start, &adj_end); + if (mm->scan_check_range) { - adj_start = hole_start < mm->scan_start ? - mm->scan_start : hole_start; - adj_end = hole_end > mm->scan_end ? 
- mm->scan_end : hole_end; - } else { - adj_start = hole_start; - adj_end = hole_end; + if (adj_start < mm->scan_start) + adj_start = mm->scan_start; + if (adj_end > mm->scan_end) + adj_end = mm->scan_end; } - if (check_free_hole(adj_start , adj_end, + if (check_free_hole(adj_start, adj_end, mm->scan_size, mm->scan_alignment)) { mm->scan_hit_start = hole_start; mm->scan_hit_size = hole_end; @@ -616,6 +653,8 @@ int drm_mm_init(struct drm_mm * mm, unsigned long start, unsigned long size) mm->head_node.size = start - mm->head_node.start; list_add_tail(&mm->head_node.hole_stack, &mm->hole_stack); + mm->color_adjust = NULL; + return 0; } EXPORT_SYMBOL(drm_mm_init); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index db438f0..cad56dd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2756,18 +2756,18 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, free_space = drm_mm_search_free_in_range(&dev_priv->mm.gtt_space, size, alignment, 0, - dev_priv->mm.gtt_mappable_end, + 0, dev_priv->mm.gtt_mappable_end, 0); else free_space = drm_mm_search_free(&dev_priv->mm.gtt_space, - size, alignment, 0); + size, alignment, 0, 0); if (free_space != NULL) { if (map_and_fenceable) obj->gtt_space = drm_mm_get_block_range_generic(free_space, size, alignment, 0, - dev_priv->mm.gtt_mappable_end, + 0, dev_priv->mm.gtt_mappable_end, 0); else obj->gtt_space = diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index ae7c24e..eba0308 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -78,11 +78,12 @@ i915_gem_evict_something(struct drm_device *dev, int min_size, INIT_LIST_HEAD(&unwind_list); if (mappable) - drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space, min_size, - alignment, 0, - dev_priv->mm.gtt_mappable_end); + drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space, + min_size, alignment, 0, + 0, 
dev_priv->mm.gtt_mappable_end); else - drm_mm_init_scan(&dev_priv->mm.gtt_space, min_size, alignment); + drm_mm_init_scan(&dev_priv->mm.gtt_space, + min_size, alignment, 0); /* First see if there is a large enough contiguous idle region... */ list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) { diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c index ada2e90..dba13cf 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c @@ -111,7 +111,8 @@ static void i915_setup_compression(struct drm_device *dev, int size) /* Just in case the BIOS is doing something questionable. */ intel_disable_fbc(dev); - compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0); + compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen, + size, 4096, 0, 0); if (compressed_fb) compressed_fb = drm_mm_get_block(compressed_fb, size, 4096); if (!compressed_fb) @@ -123,7 +124,7 @@ static void i915_setup_compression(struct drm_device *dev, int size) if (!(IS_GM45(dev) || HAS_PCH_SPLIT(dev))) { compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen, - 4096, 4096, 0); + 4096, 4096, 0, 0); if (compressed_llb) compressed_llb = drm_mm_get_block(compressed_llb, 4096, 4096); diff --git a/drivers/gpu/drm/nouveau/nouveau_notifier.c b/drivers/gpu/drm/nouveau/nouveau_notifier.c index 2ef883c..65c64b1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_notifier.c +++ b/drivers/gpu/drm/nouveau/nouveau_notifier.c @@ -118,10 +118,10 @@ nouveau_notifier_alloc(struct nouveau_channel *chan, uint32_t handle, uint64_t offset; int target, ret; - mem = drm_mm_search_free_in_range(&chan->notifier_heap, size, 0, + mem = drm_mm_search_free_in_range(&chan->notifier_heap, size, 0, 0, start, end, 0); if (mem) - mem = drm_mm_get_block_range(mem, size, 0, start, end); + mem = drm_mm_get_block_range(mem, size, 0, 0, start, end); if (!mem) { NV_ERROR(dev, "Channel %d notifier block full\n", chan->id); return -ENOMEM; diff --git 
a/drivers/gpu/drm/nouveau/nouveau_object.c b/drivers/gpu/drm/nouveau/nouveau_object.c index b190cc0..15d5d97 100644 --- a/drivers/gpu/drm/nouveau/nouveau_object.c +++ b/drivers/gpu/drm/nouveau/nouveau_object.c @@ -163,7 +163,7 @@ nouveau_gpuobj_new(struct drm_device *dev, struct nouveau_channel *chan, spin_unlock(&dev_priv->ramin_lock); if (!(flags & NVOBJ_FLAG_VM) && chan) { - ramin = drm_mm_search_free(&chan->ramin_heap, size, align, 0); + ramin = drm_mm_search_free(&chan->ramin_heap, size, align, 0, 0); if (ramin) ramin = drm_mm_get_block(ramin, size, align); if (!ramin) { diff --git a/drivers/gpu/drm/nouveau/nv04_instmem.c b/drivers/gpu/drm/nouveau/nv04_instmem.c index ef7a934..ce57bcd 100644 --- a/drivers/gpu/drm/nouveau/nv04_instmem.c +++ b/drivers/gpu/drm/nouveau/nv04_instmem.c @@ -149,7 +149,7 @@ nv04_instmem_get(struct nouveau_gpuobj *gpuobj, struct nouveau_channel *chan, return -ENOMEM; spin_lock(&dev_priv->ramin_lock); - ramin = drm_mm_search_free(&dev_priv->ramin_heap, size, align, 0); + ramin = drm_mm_search_free(&dev_priv->ramin_heap, size, align, 0, 0); if (ramin == NULL) { spin_unlock(&dev_priv->ramin_lock); return -ENOMEM; diff --git a/drivers/gpu/drm/nouveau/nv20_fb.c b/drivers/gpu/drm/nouveau/nv20_fb.c index 19bd640..754f47f 100644 --- a/drivers/gpu/drm/nouveau/nv20_fb.c +++ b/drivers/gpu/drm/nouveau/nv20_fb.c @@ -16,7 +16,7 @@ nv20_fb_alloc_tag(struct drm_device *dev, uint32_t size) return NULL; spin_lock(&dev_priv->tile.lock); - mem = drm_mm_search_free(&pfb->tag_heap, size, 0, 0); + mem = drm_mm_search_free(&pfb->tag_heap, size, 0, 0, 0); if (mem) mem = drm_mm_get_block_atomic(mem, size, 0); spin_unlock(&dev_priv->tile.lock); diff --git a/drivers/gpu/drm/nouveau/nv50_vram.c b/drivers/gpu/drm/nouveau/nv50_vram.c index 9ed9ae39..6c8ea3f 100644 --- a/drivers/gpu/drm/nouveau/nv50_vram.c +++ b/drivers/gpu/drm/nouveau/nv50_vram.c @@ -105,7 +105,7 @@ nv50_vram_new(struct drm_device *dev, u64 size, u32 align, u32 size_nc, struct nouveau_fb_engine *pfb 
= &dev_priv->engine.fb; int n = (size >> 4) * comp; - mem->tag = drm_mm_search_free(&pfb->tag_heap, n, 0, 0); + mem->tag = drm_mm_search_free(&pfb->tag_heap, n, 0, 0, 0); if (mem->tag) mem->tag = drm_mm_get_block(mem->tag, n, 0); } diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 36f4b28..76ee39f 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1686,7 +1686,7 @@ retry_pre_get: write_lock(&bdev->vm_lock); bo->vm_node = drm_mm_search_free(&bdev->addr_space_mm, - bo->mem.num_pages, 0, 0); + bo->mem.num_pages, 0, 0, 0); if (unlikely(bo->vm_node == NULL)) { ret = -ENOMEM; diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index 038e947..b426b29 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -68,14 +68,14 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, spin_lock(&rman->lock); node = drm_mm_search_free_in_range(mm, - mem->num_pages, mem->page_alignment, + mem->num_pages, mem->page_alignment, 0, placement->fpfn, lpfn, 1); if (unlikely(node == NULL)) { spin_unlock(&rman->lock); return 0; } node = drm_mm_get_block_atomic_range(node, mem->num_pages, - mem->page_alignment, + mem->page_alignment, 0, placement->fpfn, lpfn); spin_unlock(&rman->lock); diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h index 564b14a..04a9554 100644 --- a/include/drm/drm_mm.h +++ b/include/drm/drm_mm.h @@ -50,6 +50,7 @@ struct drm_mm_node { unsigned scanned_next_free : 1; unsigned scanned_preceeds_hole : 1; unsigned allocated : 1; + unsigned long color; unsigned long start; unsigned long size; struct drm_mm *mm; @@ -66,6 +67,7 @@ struct drm_mm { spinlock_t unused_lock; unsigned int scan_check_range : 1; unsigned scan_alignment; + unsigned long scan_color; unsigned long scan_size; unsigned long scan_hit_start; unsigned scan_hit_size; @@ -73,6 +75,9 @@ struct drm_mm { unsigned long scan_start; unsigned long scan_end; struct 
drm_mm_node *prev_scanned_node; + + void (*color_adjust)(struct drm_mm_node *node, unsigned long color, + unsigned long *start, unsigned long *end); }; static inline bool drm_mm_node_allocated(struct drm_mm_node *node) @@ -100,11 +105,13 @@ static inline bool drm_mm_initialized(struct drm_mm *mm) extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node, unsigned long size, unsigned alignment, + unsigned long color, int atomic); extern struct drm_mm_node *drm_mm_get_block_range_generic( struct drm_mm_node *node, unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end, int atomic); @@ -112,32 +119,34 @@ static inline struct drm_mm_node *drm_mm_get_block(struct drm_mm_node *parent, unsigned long size, unsigned alignment) { - return drm_mm_get_block_generic(parent, size, alignment, 0); + return drm_mm_get_block_generic(parent, size, alignment, 0, 0); } static inline struct drm_mm_node *drm_mm_get_block_atomic(struct drm_mm_node *parent, unsigned long size, unsigned alignment) { - return drm_mm_get_block_generic(parent, size, alignment, 1); + return drm_mm_get_block_generic(parent, size, alignment, 0, 1); } static inline struct drm_mm_node *drm_mm_get_block_range( struct drm_mm_node *parent, unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end) { - return drm_mm_get_block_range_generic(parent, size, alignment, - start, end, 0); + return drm_mm_get_block_range_generic(parent, size, alignment, color, + start, end, 0); } static inline struct drm_mm_node *drm_mm_get_block_atomic_range( struct drm_mm_node *parent, unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end) { - return drm_mm_get_block_range_generic(parent, size, alignment, + return drm_mm_get_block_range_generic(parent, size, alignment, color, start, end, 1); } extern int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node, @@ -152,15 +161,18 
@@ extern void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new extern struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm, unsigned long size, unsigned alignment, - int best_match); + unsigned long color, + bool best_match); extern struct drm_mm_node *drm_mm_search_free_in_range( const struct drm_mm *mm, unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end, - int best_match); -extern int drm_mm_init(struct drm_mm *mm, unsigned long start, + bool best_match); +extern int drm_mm_init(struct drm_mm *mm, + unsigned long start, unsigned long size); extern void drm_mm_takedown(struct drm_mm *mm); extern int drm_mm_clean(struct drm_mm *mm); @@ -171,10 +183,14 @@ static inline struct drm_mm *drm_get_mm(struct drm_mm_node *block) return block->mm; } -void drm_mm_init_scan(struct drm_mm *mm, unsigned long size, - unsigned alignment); -void drm_mm_init_scan_with_range(struct drm_mm *mm, unsigned long size, +void drm_mm_init_scan(struct drm_mm *mm, + unsigned long size, + unsigned alignment, + unsigned long color); +void drm_mm_init_scan_with_range(struct drm_mm *mm, + unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end); int drm_mm_scan_add_block(struct drm_mm_node *node);
In order to support snoopable memory on non-LLC architectures (so that
we can bind vgem objects into the i915 GATT for example), we have to
prevent the GPU's prefetcher from crossing memory domains and so
prevent allocation of a snoopable PTE immediately following an uncached
PTE. To do that, we need to extend the range allocator with support for
tracking and segregating different node colours.

This will be used by i915 to segregate memory domains within the GTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Conflicts:

	drivers/gpu/drm/i915/i915_gem_stolen.c
---
 drivers/gpu/drm/drm_gem.c                  |   2 +-
 drivers/gpu/drm/drm_mm.c                   | 151 +++++++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem.c            |   6 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |   9 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c     |   5 +-
 drivers/gpu/drm/nouveau/nouveau_notifier.c |   4 +-
 drivers/gpu/drm/nouveau/nouveau_object.c   |   2 +-
 drivers/gpu/drm/nouveau/nv04_instmem.c     |   2 +-
 drivers/gpu/drm/nouveau/nv20_fb.c          |   2 +-
 drivers/gpu/drm/nouveau/nv50_vram.c        |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c               |   2 +-
 drivers/gpu/drm/ttm/ttm_bo_manager.c       |   4 +-
 include/drm/drm_mm.h                       |  38 +++++--
 13 files changed, 143 insertions(+), 86 deletions(-)