Patchwork drm: i915: ensure objects are allocated below 4GB on PAE

login
register
mail settings
Submitter kyle mcmartin
Date May 26, 2009, 4:27 p.m.
Message ID <20090526162717.GC14808@bombadil.infradead.org>
Download mbox | patch
Permalink /patch/26047/
State New, archived
Headers show

Comments

kyle mcmartin - May 26, 2009, 4:27 p.m.
From: Kyle McMartin <kyle@redhat.com>

Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
misery ensues. This patch is based on a patch found on dri-devel by
Shaohua Li, but Keith P. expressed reticence that the changes unfairly
penalized other hardware.

(The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
 by the slab allocator via radix_tree_preload, which will hit a
 WARN_ON.)

Signed-off-by: Kyle McMartin <kyle@redhat.com>
---

We're shipping a variant of this in Fedora 11 to fix a myriad of bugs on
PAE hardware.

cheers, Kyle

---
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Peter Zijlstra - May 26, 2009, 7:17 p.m.
On Tue, 2009-05-26 at 12:27 -0400, Kyle McMartin wrote:
> From: Kyle McMartin <kyle@redhat.com>
> 
> Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
> misery ensues. This patch is based on a patch found on dri-devel by
> Shaohua Li, but Keith P. expressed reticence that the changes unfairly
> penalized other hardware.
> 
> (The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
>  by the slab allocator via radix_tree_preload, which will hit a
>  WARN_ON.)

Why is this, is the gart not PAE friendly?

Seems to me its a grand way of promoting 64bit hard/soft-ware.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Anholt - May 26, 2009, 9:35 p.m.
On Tue, 2009-05-26 at 21:17 +0200, Peter Zijlstra wrote:
> On Tue, 2009-05-26 at 12:27 -0400, Kyle McMartin wrote:
> > From: Kyle McMartin <kyle@redhat.com>
> > 
> > Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
> > misery ensues. This patch is based on a patch found on dri-devel by
> > Shaohua Li, but Keith P. expressed reticence that the changes unfairly
> > penalized other hardware.
> > 
> > (The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
> >  by the slab allocator via radix_tree_preload, which will hit a
> >  WARN_ON.)
> 
> Why is this, is the gart not PAE friendly?
> 
> Seems to me its a grand way of promoting 64bit hard/soft-ware.

No, the GART's fine.  But the APIs required to make the AGP code
PAE-friendly got deprecated, so the patches to fix the AGP code got
NAKed, and Venkatesh  never sent out his patches to undeprecate the APIs
and use them.

It's been like 6 months now, and it's absurd.  I'd like to see this
patch go in so people's graphics can start working again and stop
corrupting system memory.
Peter Zijlstra - May 26, 2009, 9:41 p.m.
On Tue, 2009-05-26 at 14:35 -0700, Eric Anholt wrote:
> On Tue, 2009-05-26 at 21:17 +0200, Peter Zijlstra wrote:
> > On Tue, 2009-05-26 at 12:27 -0400, Kyle McMartin wrote:
> > > From: Kyle McMartin <kyle@redhat.com>
> > > 
> > > Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
> > > misery ensues. This patch is based on a patch found on dri-devel by
> > > Shaohua Li, but Keith P. expressed reticence that the changes unfairly
> > > penalized other hardware.
> > > 
> > > (The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
> > >  by the slab allocator via radix_tree_preload, which will hit a
> > >  WARN_ON.)
> > 
> > Why is this, is the gart not PAE friendly?
> > 
> > Seems to me its a grand way of promoting 64bit hard/soft-ware.
> 
> No, the GART's fine.  But the APIs required to make the AGP code
> PAE-friendly got deprecated, so the patches to fix the AGP code got
> NAKed, and Venkatesh  never sent out his patches to undeprecate the APIs
> and use them.
> 
> It's been like 6 months now, and it's absurd.  I'd like to see this
> patch go in so people's graphics can start working again and stop
> corrupting system memory.

For .30 yes, for .31 we need to resolve that AGP issue, 6 months does
seem excessive to get something like that sorted.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
kyle mcmartin - May 26, 2009, 9:43 p.m.
On Tue, May 26, 2009 at 11:41:25PM +0200, Peter Zijlstra wrote:
> On Tue, 2009-05-26 at 14:35 -0700, Eric Anholt wrote:
> > On Tue, 2009-05-26 at 21:17 +0200, Peter Zijlstra wrote:
> > > On Tue, 2009-05-26 at 12:27 -0400, Kyle McMartin wrote:
> > > > From: Kyle McMartin <kyle@redhat.com>
> > > > 
> > > > Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
> > > > misery ensues. This patch is based on a patch found on dri-devel by
> > > > Shaohua Li, but Keith P. expressed reticence that the changes unfairly
> > > > penalized other hardware.
> > > > 
> > > > (The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
> > > >  by the slab allocator via radix_tree_preload, which will hit a
> > > >  WARN_ON.)
> > > 
> > > Why is this, is the gart not PAE friendly?
> > > 
> > > Seems to me its a grand way of promoting 64bit hard/soft-ware.
> > 
> > No, the GART's fine.  But the APIs required to make the AGP code
> > PAE-friendly got deprecated, so the patches to fix the AGP code got
> > NAKed, and Venkatesh  never sent out his patches to undeprecate the APIs
> > and use them.
> > 
> > It's been like 6 months now, and it's absurd.  I'd like to see this
> > patch go in so people's graphics can start working again and stop
> > corrupting system memory.
> 
> For .30 yes, for .31 we need to resolve that AGP issue, 6 months does
> seem excessive to get something like that sorted.
> 

Yeah, sorry, I should have explained it in the description better, this
is just a paper-over fix for the problem on >4GB 32-bit machines (which
is why I CC'd stable@.)

Thanks,
	Kyle
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Hugh Dickins - May 26, 2009, 10:55 p.m.
On Tue, 26 May 2009, Kyle McMartin wrote:
> From: Kyle McMartin <kyle@redhat.com>
> 
> Ensure we allocate GEM objects below 4GB on PAE machines, otherwise
> misery ensues. This patch is based on a patch found on dri-devel by
> Shaohua Li, but Keith P. expressed reticence that the changes unfairly
> penalized other hardware.
> 
> (The mm/shmem.c hunk is necessary to ensure the DMA32 flag isn't used
>  by the slab allocator via radix_tree_preload, which will hit a
>  WARN_ON.)
> 
> Signed-off-by: Kyle McMartin <kyle@redhat.com>

I'm confused: I thought GFP_DMA32 only applies on x86_64:
my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
Does this patch perhaps depend on another, to enable DMA32 on 32-bit
PAE, or am I just in a muddle?

Regarding the mm/shmem.c hunk:
> -		error = radix_tree_preload(gfp & ~__GFP_HIGHMEM);
> +		error = radix_tree_preload(gfp & ~(__GFP_HIGHMEM|__GFP_DMA32));

Yes, that would make sense.  I dislike it, but my dislike is no
reason to hold you up: what it ought to say is
		error = radix_tree_preload(gfp & GFP_RECLAIM_MASK);
and similarly the several other (gfp & ~__GFP_HIGHMEM)s to be found
in nearby files: it's just an accident of history that nobody has
hit this issue with __GFP_DMA or __GFP_DMA32 before.  I intended
to change these months ago, my slowness is no reason to delay you.

Hugh

> ---
> 
> We're shipping a variant of this in Fedora 11 to fix a myriad of bugs on
> PAE hardware.
> 
> cheers, Kyle
> 
> ---
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 4984aa8..ae52edc 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -142,6 +142,9 @@ drm_gem_object_alloc(struct drm_device *dev, size_t size)
>  		return NULL;
>  	}
>  
> +	if (dev->gem_flags)
> +		mapping_set_gfp_mask(obj->filp->f_mapping, dev->gem_flags);
> +
>  	kref_init(&obj->refcount);
>  	kref_init(&obj->handlecount);
>  	obj->size = size;
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 53d5445..c89ae3d 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1153,12 +1153,12 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	}
>  
>  #ifdef CONFIG_HIGHMEM64G
> -	/* don't enable GEM on PAE - needs agp + set_memory_* interface fixes */
> -	dev_priv->has_gem = 0;
> -#else
> +	/* avoid allocating buffers above 4GB on PAE */
> +	dev->gem_flags = GFP_USER | GFP_DMA32;
> +#endif
> +
>  	/* enable GEM by default */
>  	dev_priv->has_gem = 1;
> -#endif
>  
>  	dev->driver->get_vblank_counter = i915_get_vblank_counter;
>  	if (IS_GM45(dev))
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index c8c4221..3744c1f 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -1019,6 +1019,7 @@ struct drm_device {
>  	uint32_t gtt_total;
>  	uint32_t invalidate_domains;    /* domains pending invalidation */
>  	uint32_t flush_domains;         /* domains pending flush */
> +	gfp_t gem_flags;		/* object allocation flags */
>  	/*@} */
>  
>  };
> diff --git a/mm/shmem.c b/mm/shmem.c
> index b25f95c..e615887 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1241,7 +1241,7 @@ repeat:
>  		 * Try to preload while we can wait, to not make a habit of
>  		 * draining atomic reserves; but don't latch on to this cpu.
>  		 */
> -		error = radix_tree_preload(gfp & ~__GFP_HIGHMEM);
> +		error = radix_tree_preload(gfp & ~(__GFP_HIGHMEM|__GFP_DMA32));
>  		if (error)
>  			goto failed;
>  		radix_tree_preload_end();
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
kyle mcmartin - May 27, 2009, 12:18 a.m.
On Tue, May 26, 2009 at 11:55:50PM +0100, Hugh Dickins wrote:
> I'm confused: I thought GFP_DMA32 only applies on x86_64:
> my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
> Does this patch perhaps depend on another, to enable DMA32 on 32-bit
> PAE, or am I just in a muddle?
> 

No, you're exactly right, I'm just a muppet and missed the obvious.
Looks like the "correct" fix is the fact that the allocation is thus
filled out with GFP_USER, therefore, from ZONE_NORMAL, and below
max_low_pfn.

Looks like we'll need some additional thinking to get true ZONE_DMA32 on
i386... ugh, I'll look into it tonight.

regards, Kyle

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Shaohua Li - May 27, 2009, 12:42 a.m.
On Wed, May 27, 2009 at 08:18:40AM +0800, Kyle McMartin wrote:
> On Tue, May 26, 2009 at 11:55:50PM +0100, Hugh Dickins wrote:
> > I'm confused: I thought GFP_DMA32 only applies on x86_64:
> > my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
> > Does this patch perhaps depend on another, to enable DMA32 on 32-bit
> > PAE, or am I just in a muddle?
> > 
> 
> No, you're exactly right, I'm just a muppet and missed the obvious.
> Looks like the "correct" fix is the fact that the allocation is thus
> filled out with GFP_USER, therefore, from ZONE_NORMAL, and below
> max_low_pfn.
> 
> Looks like we'll need some additional thinking to get true ZONE_DMA32 on
> i386... ugh, I'll look into it tonight.
For i386, GFP_USER is enough. But 945G GART can only map to physical page < 4G,
so for x64, we need GFP_DMA32. This is the reason I add extra GFP_DMA32.

Thanks,
Shaohua
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Anholt - May 27, 2009, 5:40 p.m.
On Wed, 2009-05-27 at 08:42 +0800, Shaohua Li wrote:
> On Wed, May 27, 2009 at 08:18:40AM +0800, Kyle McMartin wrote:
> > On Tue, May 26, 2009 at 11:55:50PM +0100, Hugh Dickins wrote:
> > > I'm confused: I thought GFP_DMA32 only applies on x86_64:
> > > my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
> > > Does this patch perhaps depend on another, to enable DMA32 on 32-bit
> > > PAE, or am I just in a muddle?
> > > 
> > 
> > No, you're exactly right, I'm just a muppet and missed the obvious.
> > Looks like the "correct" fix is the fact that the allocation is thus
> > filled out with GFP_USER, therefore, from ZONE_NORMAL, and below
> > max_low_pfn.
> > 
> > Looks like we'll need some additional thinking to get true ZONE_DMA32 on
> > i386... ugh, I'll look into it tonight.
> For i386, GFP_USER is enough. But 945G GART can only map to physical page < 4G,
> so for x64, we need GFP_DMA32. This is the reason I add extra GFP_DMA32.

Those 945Gs don't have memory located above 4G, from my reading of the
chipset specs.
kyle mcmartin - May 27, 2009, 6:23 p.m.
On Wed, May 27, 2009 at 10:40:12AM -0700, Eric Anholt wrote:
> On Wed, 2009-05-27 at 08:42 +0800, Shaohua Li wrote:
> > On Wed, May 27, 2009 at 08:18:40AM +0800, Kyle McMartin wrote:
> > > On Tue, May 26, 2009 at 11:55:50PM +0100, Hugh Dickins wrote:
> > > > I'm confused: I thought GFP_DMA32 only applies on x86_64:
> > > > my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
> > > > Does this patch perhaps depend on another, to enable DMA32 on 32-bit
> > > > PAE, or am I just in a muddle?
> > > > 
> > > 
> > > No, you're exactly right, I'm just a muppet and missed the obvious.
> > > Looks like the "correct" fix is the fact that the allocation is thus
> > > filled out with GFP_USER, therefore, from ZONE_NORMAL, and below
> > > max_low_pfn.
> > > 
> > > Looks like we'll need some additional thinking to get true ZONE_DMA32 on
> > > i386... ugh, I'll look into it tonight.
> > For i386, GFP_USER is enough. But 945G GART can only map to physical page < 4G,
> > so for x64, we need GFP_DMA32. This is the reason I add extra GFP_DMA32.
> 
> Those 945Gs don't have memory located above 4G, from my reading of the
> chipset specs.
> 

Indeed, I thought they were clipped to 3G memory, max. (I assume we mean
945 in the strictest sense, and not "son of 945" like the G33 series?)

(But yeah, the ifdef in my patch should be nuked strictly, I guess.)

cheers, Kyle
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Shaohua Li - May 31, 2009, 12:49 a.m.
On Thu, May 28, 2009 at 01:40:12AM +0800, Eric Anholt wrote:
> On Wed, 2009-05-27 at 08:42 +0800, Shaohua Li wrote:
> > On Wed, May 27, 2009 at 08:18:40AM +0800, Kyle McMartin wrote:
> > > On Tue, May 26, 2009 at 11:55:50PM +0100, Hugh Dickins wrote:
> > > > I'm confused: I thought GFP_DMA32 only applies on x86_64:
> > > > my 32-bit PAE machine with (slightly!) > 4GB shows no ZONE_DMA32.
> > > > Does this patch perhaps depend on another, to enable DMA32 on 32-bit
> > > > PAE, or am I just in a muddle?
> > > > 
> > > 
> > > No, you're exactly right, I'm just a muppet and missed the obvious.
> > > Looks like the "correct" fix is the fact that the allocation is thus
> > > filled out with GFP_USER, therefore, from ZONE_NORMAL, and below
> > > max_low_pfn.
> > > 
> > > Looks like we'll need some additional thinking to get true ZONE_DMA32 on
> > > i386... ugh, I'll look into it tonight.
> > For i386, GFP_USER is enough. But 945G GART can only map to physical page < 4G,
> > so for x64, we need GFP_DMA32. This is the reason I add extra GFP_DMA32.
> 
> Those 945Gs don't have memory located above 4G, from my reading of the
> chipset specs.
ok, then GFP_DMA32 can be removed from the patch.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Patch

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 4984aa8..ae52edc 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -142,6 +142,9 @@  drm_gem_object_alloc(struct drm_device *dev, size_t size)
 		return NULL;
 	}
 
+	if (dev->gem_flags)
+		mapping_set_gfp_mask(obj->filp->f_mapping, dev->gem_flags);
+
 	kref_init(&obj->refcount);
 	kref_init(&obj->handlecount);
 	obj->size = size;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 53d5445..c89ae3d 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1153,12 +1153,12 @@  int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	}
 
 #ifdef CONFIG_HIGHMEM64G
-	/* don't enable GEM on PAE - needs agp + set_memory_* interface fixes */
-	dev_priv->has_gem = 0;
-#else
+	/* avoid allocating buffers above 4GB on PAE */
+	dev->gem_flags = GFP_USER | GFP_DMA32;
+#endif
+
 	/* enable GEM by default */
 	dev_priv->has_gem = 1;
-#endif
 
 	dev->driver->get_vblank_counter = i915_get_vblank_counter;
 	if (IS_GM45(dev))
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index c8c4221..3744c1f 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1019,6 +1019,7 @@  struct drm_device {
 	uint32_t gtt_total;
 	uint32_t invalidate_domains;    /* domains pending invalidation */
 	uint32_t flush_domains;         /* domains pending flush */
+	gfp_t gem_flags;		/* object allocation flags */
 	/*@} */
 
 };
diff --git a/mm/shmem.c b/mm/shmem.c
index b25f95c..e615887 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1241,7 +1241,7 @@  repeat:
 		 * Try to preload while we can wait, to not make a habit of
 		 * draining atomic reserves; but don't latch on to this cpu.
 		 */
-		error = radix_tree_preload(gfp & ~__GFP_HIGHMEM);
+		error = radix_tree_preload(gfp & ~(__GFP_HIGHMEM|__GFP_DMA32));
 		if (error)
 			goto failed;
 		radix_tree_preload_end();