[22/48] drm/i915: Use drm_mm for PPGTT PDEs

Message ID 1386367941-7131-70-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived

Commit Message

Ben Widawsky Dec. 6, 2013, 10:11 p.m. UTC
From: Ben Widawsky <ben@bwidawsk.net>

When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.

Since we intend to support full PPGTT, which means more than 1, and
they can be created and destroyed ad hoc, we need to use the proper
allocation techniques we already have.

The first step here is to make the existing single PPGTT use the
allocator.

The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this patch, since we only have the aliasing PPGTT (and
indeed the PDEs should still be hidden from the shrinker). For the
future, we are allocating from top to bottom to avoid using the
precious "gtt space". The GGTT space at that point should only be used
for scanout, HW contexts, ringbuffers, HWSP, PDEs, and a couple of
other small buffers (potentially) used by the kernel. Everything else
should be mapped into a PPGTT. To put the consumption in more tangible
terms, it takes approximately 4 sets of PDEs to equal one 1920x1080
framebuffer (with no fancy stride or alignment constraints). 3/4 of the
total [average] GGTT can be used for PDEs, and hopefully we never touch
the 1/4 that the framebuffer needs.
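
To spell out that back-of-envelope comparison (a rough calculation,
assuming a 4 byte per pixel 1920x1080 scanout buffer and 4KiB pages;
GEN6_PD_SIZE is the value used in the patch below):

    GEN6_PD_SIZE = 512 PDEs * 4KiB of GGTT space each = 2MiB per PPGTT
    1920 * 1080 * 4B ~= 7.9MiB ~= 4 * 2MiB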

The astute and persistent observer might ask about the page tables,
which are also pinned for the address space. This waste is unfortunate:
we use 2MB of memory per address space. We leave wrapping the PDEs in a
real GEM object as a TODO.

v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)

v3: Use Chris' top down allocator

v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)

v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)

v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region

v7: Undo v3, so we can make the drm patch last in the series

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

squash: drm/i915: allow PPGTT to use mappable
---
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c | 56 ++++++++++++++++++++-----------------
 2 files changed, 32 insertions(+), 25 deletions(-)

Comments

Chris Wilson March 20, 2014, 11:10 a.m. UTC | #1
On Fri, Dec 06, 2013 at 02:11:55PM -0800, Ben Widawsky wrote:
>  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
> +#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> +#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
>  	struct drm_device *dev = ppgtt->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	unsigned first_pd_entry_in_global_pt;
> -	int i;
> -	int ret = -ENOMEM;
> +	int i, ret;
>  
> -	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
> -	 * entries. For aliasing ppgtt support we just steal them at the end for
> -	 * now. */
> -	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> +	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
> +	 * allocator works in address space sizes, so it's multiplied by page
> +	 * size. We allocate at the top of the GTT to avoid fragmentation.
> +	 */
> +	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
> +	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> +						  &ppgtt->node, GEN6_PD_SIZE,
> +						  GEN6_PD_ALIGN, 0,
> +						  0, dev_priv->gtt.base.total,
> +						  DRM_MM_SEARCH_DEFAULT);
This could use the simpler drm_mm_insert_node_generic().
-Chris
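
For reference, a minimal sketch of the simpler call Chris has in mind,
assuming the six-argument drm_mm_insert_node_generic() of that kernel
era (mm, node, size, alignment, color, search flags); since the range
in the patch is simply [0, total], the plain insert is equivalent:

	/* Sketch of Chris' suggestion: no explicit range needed when
	 * the range is the whole address space anyway. */
	ret = drm_mm_insert_node_generic(&dev_priv->gtt.base.mm,
					 &ppgtt->node, GEN6_PD_SIZE,
					 GEN6_PD_ALIGN, 0,
					 DRM_MM_SEARCH_DEFAULT);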
Ben Widawsky March 24, 2014, 7:36 p.m. UTC | #2
On Thu, Mar 20, 2014 at 11:10:13AM +0000, Chris Wilson wrote:
> On Fri, Dec 06, 2013 at 02:11:55PM -0800, Ben Widawsky wrote:
> >  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  {
> > +#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> > +#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
> >  	struct drm_device *dev = ppgtt->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	unsigned first_pd_entry_in_global_pt;
> > -	int i;
> > -	int ret = -ENOMEM;
> > +	int i, ret;
> >  
> > -	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
> > -	 * entries. For aliasing ppgtt support we just steal them at the end for
> > -	 * now. */
> > -	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> > +	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
> > +	 * allocator works in address space sizes, so it's multiplied by page
> > +	 * size. We allocate at the top of the GTT to avoid fragmentation.
> > +	 */
> > +	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
> > +	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > +						  &ppgtt->node, GEN6_PD_SIZE,
> > +						  GEN6_PD_ALIGN, 0,
> > +						  0, dev_priv->gtt.base.total,
> > +						  DRM_MM_SEARCH_DEFAULT);
> This could use the simpler drm_mm_insert_node_generic().
> -Chris
> 

Not with my [simple] workaround to not use offset 0, which Daniel
reverted. I think he has some hope that we'll actually be able to figure
out why we can't use offset 0 instead of just using the workaround.
Chris Wilson March 24, 2014, 7:45 p.m. UTC | #3
On Mon, Mar 24, 2014 at 12:36:23PM -0700, Ben Widawsky wrote:
> On Thu, Mar 20, 2014 at 11:10:13AM +0000, Chris Wilson wrote:
> > On Fri, Dec 06, 2013 at 02:11:55PM -0800, Ben Widawsky wrote:
> > >  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  {
> > > +#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> > > +#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
> > >  	struct drm_device *dev = ppgtt->base.dev;
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	unsigned first_pd_entry_in_global_pt;
> > > -	int i;
> > > -	int ret = -ENOMEM;
> > > +	int i, ret;
> > >  
> > > -	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
> > > -	 * entries. For aliasing ppgtt support we just steal them at the end for
> > > -	 * now. */
> > > -	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> > > +	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
> > > +	 * allocator works in address space sizes, so it's multiplied by page
> > > +	 * size. We allocate at the top of the GTT to avoid fragmentation.
> > > +	 */
> > > +	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
> > > +	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > > +						  &ppgtt->node, GEN6_PD_SIZE,
> > > +						  GEN6_PD_ALIGN, 0,
> > > +						  0, dev_priv->gtt.base.total,
> > > +						  DRM_MM_SEARCH_DEFAULT);
> > This could use the simpler drm_mm_insert_node_generic().
> > -Chris
> > 
> 
> Not with my [simple] workaround to not use offset 0, which Daniel
> reverted. I think he has some hope that we'll actually be able to figure
> out why we can't use offset 0 instead of just using the workaround.

You can simply reduce the drm_mm range...
-Chris
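
One way to read "reduce the drm_mm range" is to keep offset 0 out of
the allocator when the GGTT address space is set up, rather than
special-casing it at every insertion site. A hypothetical sketch (not
part of this series), assuming the drm_mm_init(mm, start, size)
interface:

	/* Hypothetical: start the GGTT allocator one page in, so it
	 * can never hand out offset 0 to any node, PDEs included. */
	drm_mm_init(&dev_priv->gtt.base.mm, PAGE_SIZE,
		    dev_priv->gtt.base.total - PAGE_SIZE);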
Ben Widawsky March 24, 2014, 8:02 p.m. UTC | #4
On Mon, Mar 24, 2014 at 07:45:56PM +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 12:36:23PM -0700, Ben Widawsky wrote:
> > On Thu, Mar 20, 2014 at 11:10:13AM +0000, Chris Wilson wrote:
> > > On Fri, Dec 06, 2013 at 02:11:55PM -0800, Ben Widawsky wrote:
> > > >  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > > >  {
> > > > +#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> > > > +#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
> > > >  	struct drm_device *dev = ppgtt->base.dev;
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	unsigned first_pd_entry_in_global_pt;
> > > > -	int i;
> > > > -	int ret = -ENOMEM;
> > > > +	int i, ret;
> > > >  
> > > > -	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
> > > > -	 * entries. For aliasing ppgtt support we just steal them at the end for
> > > > -	 * now. */
> > > > -	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> > > > +	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
> > > > +	 * allocator works in address space sizes, so it's multiplied by page
> > > > +	 * size. We allocate at the top of the GTT to avoid fragmentation.
> > > > +	 */
> > > > +	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
> > > > +	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
> > > > +						  &ppgtt->node, GEN6_PD_SIZE,
> > > > +						  GEN6_PD_ALIGN, 0,
> > > > +						  0, dev_priv->gtt.base.total,
> > > > +						  DRM_MM_SEARCH_DEFAULT);
> > > This could use the simpler drm_mm_insert_node_generic().
> > > -Chris
> > > 
> > 
> > Not with my [simple] workaround to not use offset 0, which Daniel
> > reverted. I think he has some hope that we'll actually be able to figure
> > out why we can't use offset 0 instead of just using the workaround.
> 
> You can simply reduce the drm_mm range...
> -Chris
> 

Yeah, that's a better solution. Patches welcome?

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 571a0a7..c86f3fd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -654,6 +654,7 @@  struct i915_gtt {
 
 struct i915_hw_ppgtt {
 	struct i915_address_space base;
+	struct drm_mm_node node;
 	unsigned num_pd_entries;
 	union {
 		struct page **pt_pages;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7561efc..4c0b865 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -620,6 +620,7 @@  static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	int i;
 
 	drm_mm_takedown(&ppgtt->base.mm);
+	drm_mm_remove_node(&ppgtt->node);
 
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
@@ -637,16 +638,27 @@  static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
+#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
+#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned first_pd_entry_in_global_pt;
-	int i;
-	int ret = -ENOMEM;
+	int i, ret;
 
-	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
-	 * entries. For aliasing ppgtt support we just steal them at the end for
-	 * now. */
-	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
+	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
+	 * allocator works in address space sizes, so it's multiplied by page
+	 * size. We allocate at the top of the GTT to avoid fragmentation.
+	 */
+	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
+	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
+						  &ppgtt->node, GEN6_PD_SIZE,
+						  GEN6_PD_ALIGN, 0,
+						  0, dev_priv->gtt.base.total,
+						  DRM_MM_SEARCH_DEFAULT);
+	if (ret)
+		return ret;
+
+	if (ppgtt->node.start < dev_priv->gtt.mappable_end)
+		DRM_DEBUG("Forced to use aperture for PDEs\n");
 
 	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
@@ -659,8 +671,10 @@  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
 	ppgtt->pt_pages = kcalloc(ppgtt->num_pd_entries, sizeof(struct page *),
 				  GFP_KERNEL);
-	if (!ppgtt->pt_pages)
+	if (!ppgtt->pt_pages) {
+		drm_mm_remove_node(&ppgtt->node);
 		return -ENOMEM;
+	}
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
@@ -690,7 +704,11 @@  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.clear_range(&ppgtt->base, 0,
 				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES, true);
 
-	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
+	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
+			 ppgtt->node.size >> 20,
+			 ppgtt->node.start / PAGE_SIZE);
+	ppgtt->pd_offset =
+		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
 
 	return 0;
 
@@ -707,6 +725,7 @@  err_pt_alloc:
 			__free_page(ppgtt->pt_pages[i]);
 	}
 	kfree(ppgtt->pt_pages);
+	drm_mm_remove_node(&ppgtt->node);
 
 	return ret;
 }
@@ -1251,27 +1270,14 @@  void i915_gem_init_global_gtt(struct drm_device *dev)
 	gtt_size = dev_priv->gtt.base.total;
 	mappable_size = dev_priv->gtt.mappable_end;
 
+	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
 	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
 		int ret;
 
-		if (INTEL_INFO(dev)->gen <= 7) {
-			/* PPGTT pdes are stolen from global gtt ptes, so shrink the
-			 * aperture accordingly when using aliasing ppgtt. */
-			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
-		}
-
-		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
-
 		ret = i915_gem_init_aliasing_ppgtt(dev);
-		if (!ret)
-			return;
-
-		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
-		drm_mm_takedown(&dev_priv->gtt.base.mm);
-		if (INTEL_INFO(dev)->gen < 8)
-			gtt_size += GEN6_PPGTT_PD_ENTRIES*PAGE_SIZE;
+		if (ret)
+			DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
 	}
-	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
 }
 
 static int setup_scratch_page(struct drm_device *dev)