From patchwork Wed Apr 2 17:03:57 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Lauri Kasanen X-Patchwork-Id: 3929701 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 0EBCE9F382 for ; Wed, 2 Apr 2014 17:03:38 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id BF0E1201D5 for ; Wed, 2 Apr 2014 17:03:36 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 1438E201F9 for ; Wed, 2 Apr 2014 17:03:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 597786E6D9; Wed, 2 Apr 2014 10:03:33 -0700 (PDT) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mout.gmx.net (mout.gmx.net [212.227.15.18]) by gabe.freedesktop.org (Postfix) with ESMTP id 0F26E6E6D9 for ; Wed, 2 Apr 2014 10:03:32 -0700 (PDT) Received: from Valinor ([84.248.177.169]) by mail.gmx.com (mrgmx003) with ESMTPA (Nemesis) id 0Lpbqo-1WzndR1ua6-00fOZF; Wed, 02 Apr 2014 19:03:27 +0200 Date: Wed, 2 Apr 2014 20:03:57 +0300 From: Lauri Kasanen To: dri-devel@lists.freedesktop.org Subject: [PATCH 1/2] drm: Add support for two-ended allocation, v3 Message-Id: <20140402200357.2c6ec6ed.cand@gmx.com> X-Mailer: Sylpheed 3.1.4 (GTK+ 2.18.6; x86_64-unknown-linux-gnu) Mime-Version: 1.0 X-Provags-ID: V03:K0:y3JltUAx/3qw2NnCkUSpHF6S0g55myx/KLZz+Cx1Xvl7Mqz8DW6 JZGbQMMbARG3dnaM0owHIGDAT65To0CDo1miD3WgiN6NNPZXfHxaJmzUvcRkGKStGy1vYBi z5KMFpWz0sfv94k0MwX1J3GjLQKDE6iEQ7UuBZ4UTEL+69IyZsuXgPnRMMloi6LiNnWU04F ZR287xhSw2Te4bMAx1yKw== Cc: jglisse@redhat.com, Thomas Hellstrom X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Clients like i915 need to segregate cache domains within the GTT which can lead to small amounts of fragmentation. By allocating the uncached buffers from the bottom and the cacheable buffers from the top, we can reduce the amount of wasted space and also optimize allocation of the mappable portion of the GTT to only those buffers that require CPU access through the GTT. For other drivers, allocating small bos from one end and large ones from the other helps improve the quality of fragmentation. Based on drm_mm work by Chris Wilson. v3: Changed to use a TTM placement flag v2: Updated kerneldoc Cc: Chris Wilson Cc: Ben Widawsky Cc: Christian König Signed-off-by: Lauri Kasanen --- drivers/gpu/drm/drm_mm.c | 66 ++++++++++++++++++++++++++---------- drivers/gpu/drm/i915/i915_gem.c | 3 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +- drivers/gpu/drm/ttm/ttm_bo_manager.c | 11 ++++-- include/drm/drm_mm.h | 32 ++++++++++++++--- include/drm/ttm/ttm_placement.h | 3 ++ 6 files changed, 92 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index a2d45b74..8f64be4 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -82,6 +82,10 @@ * this to implement guard pages between incompatible caching domains in the * graphics TT. * + * Two behaviors are supported for searching and allocating: bottom-up and top-down. + * The default is bottom-up. Top-down allocation can be used if the memory area + * has different restrictions, or just to reduce fragmentation. + * * Finally iteration helpers to walk all nodes and all holes are provided as are * some basic allocator dumpers for debugging. */ @@ -102,7 +106,8 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ static void drm_mm_insert_helper(struct drm_mm_node *hole_node, struct drm_mm_node *node, unsigned long size, unsigned alignment, - unsigned long color) + unsigned long color, + enum drm_mm_allocator_flags flags) { struct drm_mm *mm = hole_node->mm; unsigned long hole_start = drm_mm_hole_node_start(hole_node); @@ -115,12 +120,22 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node, if (mm->color_adjust) mm->color_adjust(hole_node, color, &adj_start, &adj_end); + if (flags & DRM_MM_CREATE_TOP) + adj_start = adj_end - size; + if (alignment) { unsigned tmp = adj_start % alignment; - if (tmp) - adj_start += alignment - tmp; + if (tmp) { + if (flags & DRM_MM_CREATE_TOP) + adj_start -= tmp; + else + adj_start += alignment - tmp; + } } + BUG_ON(adj_start < hole_start); + BUG_ON(adj_end > hole_end); + if (adj_start == hole_start) { hole_node->hole_follows = 0; list_del(&hole_node->hole_stack); @@ -205,7 +220,8 @@ EXPORT_SYMBOL(drm_mm_reserve_node); * @size: size of the allocation * @alignment: alignment of the allocation * @color: opaque tag value to use for this node - * @flags: flags to fine-tune the allocation + * @sflags: flags to fine-tune the allocation search + * @aflags: flags to fine-tune the allocation behavior * * The preallocated node must be cleared to 0. * @@ -215,16 +231,17 @@ EXPORT_SYMBOL(drm_mm_reserve_node); int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node, unsigned long size, unsigned alignment, unsigned long color, - enum drm_mm_search_flags flags) + enum drm_mm_search_flags sflags, + enum drm_mm_allocator_flags aflags) { struct drm_mm_node *hole_node; hole_node = drm_mm_search_free_generic(mm, size, alignment, - color, flags); + color, sflags); if (!hole_node) return -ENOSPC; - drm_mm_insert_helper(hole_node, node, size, alignment, color); + drm_mm_insert_helper(hole_node, node, size, alignment, color, aflags); return 0; } EXPORT_SYMBOL(drm_mm_insert_node_generic); @@ -233,7 +250,8 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node, struct drm_mm_node *node, unsigned long size, unsigned alignment, unsigned long color, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, + enum drm_mm_allocator_flags flags) { struct drm_mm *mm = hole_node->mm; unsigned long hole_start = drm_mm_hole_node_start(hole_node); @@ -248,13 +266,20 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node, if (adj_end > end) adj_end = end; + if (flags & DRM_MM_CREATE_TOP) + adj_start = adj_end - size; + if (mm->color_adjust) mm->color_adjust(hole_node, color, &adj_start, &adj_end); if (alignment) { unsigned tmp = adj_start % alignment; - if (tmp) - adj_start += alignment - tmp; + if (tmp) { + if (flags & DRM_MM_CREATE_TOP) + adj_start -= tmp; + else + adj_start += alignment - tmp; + } } if (adj_start == hole_start) { @@ -271,6 +296,8 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node, INIT_LIST_HEAD(&node->hole_stack); list_add(&node->node_list, &hole_node->node_list); + BUG_ON(node->start < start); + BUG_ON(node->start < adj_start); BUG_ON(node->start + node->size > adj_end); BUG_ON(node->start + node->size > end); @@ -290,7 +317,8 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node, * @color: opaque tag value to use for this node * @start: start of the allowed range for this node * @end: end of the allowed range for this node - * @flags: flags to fine-tune the allocation + * @sflags: flags to fine-tune the allocation search + * @aflags: flags to fine-tune the allocation behavior * * The preallocated node must be cleared to 0. * @@ -298,21 +326,23 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node, * 0 on success, -ENOSPC if there's no suitable hole. */ int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, struct drm_mm_node *node, - unsigned long size, unsigned alignment, unsigned long color, + unsigned long size, unsigned alignment, + unsigned long color, unsigned long start, unsigned long end, - enum drm_mm_search_flags flags) + enum drm_mm_search_flags sflags, + enum drm_mm_allocator_flags aflags) { struct drm_mm_node *hole_node; hole_node = drm_mm_search_free_in_range_generic(mm, size, alignment, color, - start, end, flags); + start, end, sflags); if (!hole_node) return -ENOSPC; drm_mm_insert_helper_range(hole_node, node, size, alignment, color, - start, end); + start, end, aflags); return 0; } EXPORT_SYMBOL(drm_mm_insert_node_in_range_generic); @@ -391,7 +421,8 @@ static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, best = NULL; best_size = ~0UL; - drm_mm_for_each_hole(entry, mm, adj_start, adj_end) { + __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, + flags & DRM_MM_SEARCH_BELOW) { if (mm->color_adjust) { mm->color_adjust(entry, color, &adj_start, &adj_end); if (adj_end <= adj_start) @@ -432,7 +463,8 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ best = NULL; best_size = ~0UL; - drm_mm_for_each_hole(entry, mm, adj_start, adj_end) { + __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, + flags & DRM_MM_SEARCH_BELOW) { if (adj_start < start) adj_start = start; if (adj_end > end) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9c52f68..489bc33 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3270,7 +3270,8 @@ search_free: ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node, size, alignment, obj->cache_level, 0, gtt_max, - DRM_MM_SEARCH_DEFAULT); + DRM_MM_SEARCH_DEFAULT, + DRM_MM_CREATE_DEFAULT); if (ret) { ret = i915_gem_evict_something(dev, vm, size, alignment, obj->cache_level, flags); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 63a6dc7..dd72d35 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1072,7 +1072,8 @@ alloc: &ppgtt->node, GEN6_PD_SIZE, GEN6_PD_ALIGN, 0, 0, dev_priv->gtt.base.total, - DRM_MM_SEARCH_DEFAULT); + DRM_MM_SEARCH_DEFAULT, + DRM_MM_CREATE_DEFAULT); if (ret == -ENOSPC && !retried) { ret = i915_gem_evict_something(dev, &dev_priv->gtt.base, GEN6_PD_SIZE, GEN6_PD_ALIGN, diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index c58eba33..bd850c9 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -55,6 +55,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv; struct drm_mm *mm = &rman->mm; struct drm_mm_node *node = NULL; + enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT; unsigned long lpfn; int ret; @@ -66,11 +67,15 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, if (!node) return -ENOMEM; + if (bo->mem.placement & TTM_PL_FLAG_TOPDOWN) + aflags = DRM_MM_CREATE_TOP; + spin_lock(&rman->lock); - ret = drm_mm_insert_node_in_range(mm, node, mem->num_pages, - mem->page_alignment, + ret = drm_mm_insert_node_in_range_generic(mm, node, mem->num_pages, + mem->page_alignment, 0, placement->fpfn, lpfn, - DRM_MM_SEARCH_BEST); + DRM_MM_SEARCH_BEST, + aflags); spin_unlock(&rman->lock); if (unlikely(ret)) { diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h index 8b6981a..a24addf 100644 --- a/include/drm/drm_mm.h +++ b/include/drm/drm_mm.h @@ -47,8 +47,17 @@ enum drm_mm_search_flags { DRM_MM_SEARCH_DEFAULT = 0, DRM_MM_SEARCH_BEST = 1 << 0, + DRM_MM_SEARCH_BELOW = 1 << 1, }; +enum drm_mm_allocator_flags { + DRM_MM_CREATE_DEFAULT = 0, + DRM_MM_CREATE_TOP = 1 << 0, +}; + +#define DRM_MM_BOTTOMUP DRM_MM_SEARCH_DEFAULT, DRM_MM_CREATE_DEFAULT +#define DRM_MM_TOPDOWN DRM_MM_SEARCH_BELOW, DRM_MM_CREATE_TOP + struct drm_mm_node { struct list_head node_list; struct list_head hole_stack; @@ -186,6 +195,9 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node) * Implementation Note: * We need to inline list_for_each_entry in order to be able to set hole_start * and hole_end on each iteration while keeping the macro sane. + * + * The __drm_mm_for_each_hole version is similar, but with added support for + * going backwards. */ #define drm_mm_for_each_hole(entry, mm, hole_start, hole_end) \ for (entry = list_entry((mm)->hole_stack.next, struct drm_mm_node, hole_stack); \ @@ -195,6 +207,14 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node) 1 : 0; \ entry = list_entry(entry->hole_stack.next, struct drm_mm_node, hole_stack)) +#define __drm_mm_for_each_hole(entry, mm, hole_start, hole_end, backwards) \ + for (entry = list_entry((backwards) ? (mm)->hole_stack.prev : (mm)->hole_stack.next, struct drm_mm_node, hole_stack); \ + &entry->hole_stack != &(mm)->hole_stack ? \ + hole_start = drm_mm_hole_node_start(entry), \ + hole_end = drm_mm_hole_node_end(entry), \ + 1 : 0; \ + entry = list_entry((backwards) ? entry->hole_stack.prev : entry->hole_stack.next, struct drm_mm_node, hole_stack)) + /* * Basic range manager support (drm_mm.c) */ @@ -205,7 +225,8 @@ int drm_mm_insert_node_generic(struct drm_mm *mm, unsigned long size, unsigned alignment, unsigned long color, - enum drm_mm_search_flags flags); + enum drm_mm_search_flags sflags, + enum drm_mm_allocator_flags aflags); /** * drm_mm_insert_node - search for space and insert @node * @mm: drm_mm to allocate from @@ -228,7 +249,8 @@ static inline int drm_mm_insert_node(struct drm_mm *mm, unsigned alignment, enum drm_mm_search_flags flags) { - return drm_mm_insert_node_generic(mm, node, size, alignment, 0, flags); + return drm_mm_insert_node_generic(mm, node, size, alignment, 0, flags, + DRM_MM_CREATE_DEFAULT); } int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, @@ -238,7 +260,8 @@ int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, unsigned long color, unsigned long start, unsigned long end, - enum drm_mm_search_flags flags); + enum drm_mm_search_flags sflags, + enum drm_mm_allocator_flags aflags); /** * drm_mm_insert_node_in_range - ranged search for space and insert @node * @mm: drm_mm to allocate from @@ -266,7 +289,8 @@ static inline int drm_mm_insert_node_in_range(struct drm_mm *mm, enum drm_mm_search_flags flags) { return drm_mm_insert_node_in_range_generic(mm, node, size, alignment, - 0, start, end, flags); + 0, start, end, flags, + DRM_MM_CREATE_DEFAULT); } void drm_mm_remove_node(struct drm_mm_node *node); diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h index c84ff15..8ed44f9 100644 --- a/include/drm/ttm/ttm_placement.h +++ b/include/drm/ttm/ttm_placement.h @@ -65,6 +65,8 @@ * reference the buffer. * TTM_PL_FLAG_NO_EVICT means that the buffer may never * be evicted to make room for other buffers. + * TTM_PL_FLAG_TOPDOWN requests to be placed from the + * top of the memory area, instead of the bottom. */ #define TTM_PL_FLAG_CACHED (1 << 16) @@ -72,6 +74,7 @@ #define TTM_PL_FLAG_WC (1 << 18) #define TTM_PL_FLAG_SHARED (1 << 20) #define TTM_PL_FLAG_NO_EVICT (1 << 21) +#define TTM_PL_FLAG_TOPDOWN (1 << 22) #define TTM_PL_MASK_CACHING (TTM_PL_FLAG_CACHED | \ TTM_PL_FLAG_UNCACHED | \