[v2,for,-fixes] drm/i915: Disable stolen memory when DMAR is active

Message ID 1395147544-16984-1-git-send-email-jani.nikula@intel.com (mailing list archive)
State New, archived

Commit Message

Jani Nikula March 18, 2014, 12:59 p.m. UTC
From: Chris Wilson <chris@chris-wilson.co.uk>

We have reports of heavy screen corruption if we try to use the stolen
memory reserved by the BIOS whilst the DMA-Remapper is active. This
quirk may be only specific to a few machines or BIOSes, but first let's
apply the big hammer and always disable use of stolen memory when DMAR
is active.

v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

---

Daniel, is this the color you want?

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c |    7 +++++++
 1 file changed, 7 insertions(+)

Comments

Daniel Vetter March 18, 2014, 4:48 p.m. UTC | #1
On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> We have reports of heavy screen corruption if we try to use the stolen
> memory reserved by the BIOS whilst the DMA-Remapper is active. This
> quirk may be only specific to a few machines or BIOSes, but first lets
> apply the big hammer and always disable use of stolen memory when DMAR
> is active.
> 
> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> 
> ---
> 
> Daniel, is this the color you want?

Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index d58b4e287e32..28d24caa49f3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int bios_reserved = 0;
>  
> +#ifdef CONFIG_INTEL_IOMMU
> +	if (intel_iommu_gfx_mapped) {
> +		DRM_INFO("DMAR active, disabling use of stolen memory\n");
> +		return 0;
> +	}
> +#endif
> +
>  	if (dev_priv->gtt.stolen_size == 0)
>  		return 0;
>  
> -- 
> 1.7.9.5
>
Daniel Vetter March 18, 2014, 4:50 p.m. UTC | #2
On Tue, Mar 18, 2014 at 05:48:28PM +0100, Daniel Vetter wrote:
> On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > We have reports of heavy screen corruption if we try to use the stolen
> > memory reserved by the BIOS whilst the DMA-Remapper is active. This
> > quirk may be only specific to a few machines or BIOSes, but first lets
> > apply the big hammer and always disable use of stolen memory when DMAR
> > is active.
> > 
> > v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> > 
> > ---
> > 
> > Daniel, is this the color you want?
> 
> Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Correction, cc: stable is missing.
-Daniel

> > 
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_stolen.c |    7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index d58b4e287e32..28d24caa49f3 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	int bios_reserved = 0;
> >  
> > +#ifdef CONFIG_INTEL_IOMMU
> > +	if (intel_iommu_gfx_mapped) {
> > +		DRM_INFO("DMAR active, disabling use of stolen memory\n");
> > +		return 0;
> > +	}
> > +#endif
> > +
> >  	if (dev_priv->gtt.stolen_size == 0)
> >  		return 0;
> >  
> > -- 
> > 1.7.9.5
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Jani Nikula March 19, 2014, 9:07 a.m. UTC | #3
On Tue, 18 Mar 2014, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Mar 18, 2014 at 05:48:28PM +0100, Daniel Vetter wrote:
>> On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
>> > From: Chris Wilson <chris@chris-wilson.co.uk>
>> > 
>> > We have reports of heavy screen corruption if we try to use the stolen
>> > memory reserved by the BIOS whilst the DMA-Remapper is active. This
>> > quirk may be only specific to a few machines or BIOSes, but first lets
>> > apply the big hammer and always disable use of stolen memory when DMAR
>> > is active.
>> > 
>> > v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
>> > 
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> > 
>> > ---
>> > 
>> > Daniel, is this the color you want?
>> 
>> Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Correction, cc: stable is missing.

Pushed to -fixes, thanks for the patch (original by Chris) and review.

Jani.


> -Daniel
>
>> > 
>> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem_stolen.c |    7 +++++++
>> >  1 file changed, 7 insertions(+)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> > index d58b4e287e32..28d24caa49f3 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> > @@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
>> >  	struct drm_i915_private *dev_priv = dev->dev_private;
>> >  	int bios_reserved = 0;
>> >  
>> > +#ifdef CONFIG_INTEL_IOMMU
>> > +	if (intel_iommu_gfx_mapped) {
>> > +		DRM_INFO("DMAR active, disabling use of stolen memory\n");
>> > +		return 0;
>> > +	}
>> > +#endif
>> > +
>> >  	if (dev_priv->gtt.stolen_size == 0)
>> >  		return 0;
>> >  
>> > -- 
>> > 1.7.9.5
>> > 
>> 
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
David Woodhouse March 19, 2014, 8:51 p.m. UTC | #4
On Tue, 2014-03-18 at 14:59 +0200, Jani Nikula wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> We have reports of heavy screen corruption if we try to use the stolen
> memory reserved by the BIOS whilst the DMA-Remapper is active. This
> quirk may be only specific to a few machines or BIOSes, but first lets
> apply the big hammer and always disable use of stolen memory when DMAR
> is active.
> 
> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.

Perhaps this (and all similar workarounds) should be predicated on
i915_preliminary_hw_support? When people are using the Linux kernel for
chipset validation, we sure as hell don't want to silently disable this
stuff and let them think it's working when it's not.
Jani Nikula March 20, 2014, 7:36 a.m. UTC | #5
On Wed, 19 Mar 2014, David Woodhouse <dwmw2@infradead.org> wrote:
> On Tue, 2014-03-18 at 14:59 +0200, Jani Nikula wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>> 
>> We have reports of heavy screen corruption if we try to use the stolen
>> memory reserved by the BIOS whilst the DMA-Remapper is active. This
>> quirk may be only specific to a few machines or BIOSes, but first lets
>> apply the big hammer and always disable use of stolen memory when DMAR
>> is active.
>> 
>> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
>
> Perhaps this (and all similar workarounds) should be predicated on
> i915_preliminary_hw_support? When people are using the Linux kernel for
> chipset validation, we sure as hell don't want to silently disable this
> stuff and let them think it's working when it's not.

Or an additional knob, in case it's really not working and people want
to get other things depending on prelim hw support done.

BR,
Jani.
David Woodhouse March 20, 2014, 7:49 a.m. UTC | #6
On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
> 
> Or an additional knob, in case it's really not working and people want
> to get other things depending on prelim hw support done.

Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
to disable *all* workarounds for silicon bugs. Couple that with a printk
telling the user that workarounds are disabled *and* VT-d is enabled.

That's a nice simple thing for the chipset validation folks to be
looking for. Unless they see that and have no issues with either
framebuffer or X, the chipset hasn't been tested.

That aside, I'm also unhappy with your patch on general principles. As a
rule I'd like to see references to a *specific* published erratum, for
anything we disable. Otherwise we're just admitting that life is too
hard and we *never* bother to test our silicon before we ship it and we
*expect* it to be broken.

If we chase broken hardware to the point where errata are published, we
should hopefully ensure that the problem feeds back to the validation
folks who haven't done their job properly. Every time.

(Pondered making this an internal email, but hey — *you're* the one who
said "our hardware is always broken and we don't even bother to track
individual brokenness". I'm just translating it into English from what's
in your patch :)
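
The 'disable_silicon_workarounds' idea above can be sketched in user-space C. This is only an illustration of the proposed decision logic, not driver code: the struct, its fields and the function name are hypothetical stand-ins, and the option itself does not exist in the patch under review. The first message string is taken from the patch; the second is an assumed wording for the printk David asks for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names come from the i915 driver. */
struct fake_gfx_state {
	bool iommu_gfx_mapped;            /* models intel_iommu_gfx_mapped */
	bool disable_silicon_workarounds; /* the proposed, not yet existing, option */
};

/* Returns true if stolen memory may be used under the proposed policy. */
static bool stolen_memory_allowed(const struct fake_gfx_state *st)
{
	assert(st != NULL);

	/* Without DMAR there is nothing to work around. */
	if (!st->iommu_gfx_mapped)
		return true;

	/* Validation mode: skip the workaround, but say so loudly. */
	if (st->disable_silicon_workarounds) {
		printf("silicon workarounds disabled and VT-d enabled\n");
		return true;
	}

	/* Default: same behaviour as the patch under review. */
	printf("DMAR active, disabling use of stolen memory\n");
	return false;
}
```

With the option unset the behaviour matches the patch; with it set, validation runs exercise stolen memory under VT-d and the log line makes that visible.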
Jani Nikula March 20, 2014, 9:23 a.m. UTC | #7
On Thu, 20 Mar 2014, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
>> 
>> Or an additional knob, in case it's really not working and people want
>> to get other things depending on prelim hw support done.
>
> Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
> to disable *all* workarounds for silicon bugs. Couple that with a printk
> telling the user that workarounds are disabled *and* VT-d is enabled.
>
> That's a nice simple thing for the chipset validation folks to be
> looking for. Unless they see that and have no issues with either
> framebuffer or X, the chipset hasn't been tested.
>
> That aside, I'm also unhappy with your patch on general principles. As a
> rule I'd like to see references to a *specific* published erratum, for
> anything we disable. Otherwise we're just admitting that life is too
> hard and we *never* bother to test our silicon before we ship it and we
> *expect* it to be broken.
>
> If we chase broken hardware to the point where errata are published, we
> should hopefully ensure that the problem feeds back to the validation
> folks who haven't done their job properly. Every time.
>
> (Pondered making this an internal email, but hey — *you're* the one who
> said "our hardware is always broken and we don't even bother to track
> individual brokenness". I'm just translating it into English from what's
> in your patch :)

I'll have to dodge this particular discussion, just because it was
really Chris' patch which I merely repainted with colours requested by
our resident interior designer Daniel. ;)

BR,
Jani.


>
> -- 
> dwmw2
>
Daniel Vetter March 20, 2014, 9:45 a.m. UTC | #8
On Thu, Mar 20, 2014 at 8:49 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
>>
>> Or an additional knob, in case it's really not working and people want
>> to get other things depending on prelim hw support done.
>
> Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
> to disable *all* workarounds for silicon bugs. Couple that with a printk
> telling the user that workarounds are disabled *and* VT-d is enabled.
>
> That's a nice simple thing for the chipset validation folks to be
> looking for. Unless they see that and have no issues with either
> framebuffer or X, the chipset hasn't been tested.
>
> That aside, I'm also unhappy with your patch on general principles. As a
> rule I'd like to see references to a *specific* published erratum, for
> anything we disable. Otherwise we're just admitting that life is too
> hard and we *never* bother to test our silicon before we ship it and we
> *expect* it to be broken.
>
> If we chase broken hardware to the point where errata are published, we
> should hopefully ensure that the problem feeds back to the validation
> folks who haven't done their job properly. Every time.
>
> (Pondered making this an internal email, but hey -- *you're* the one who
> said "our hardware is always broken and we don't even bother to track
> individual brokenness". I'm just translating it into English from what's
> in your patch :)

I'd agree that this would be nice, but my maintainer time is not
endless and when I have users screaming "regression" I do have to do
something. And yeah, with the track record set by some of the earliest
vtd+gfx chips I'm fairly aggressive with just disabling features,
especially when the original bug report is against a recent platform
like ivb (so presumably issues on older ones exist, too).

Now this very likely is some fumble in our code; after all, the bios
managed to set things up. But until I have managers screaming at me
and throwing people ("resources") at the problem, my only concern is
keeping the regressions out the door without disabling other stuff my
managers actually do scream about.
-Daniel
David Woodhouse March 20, 2014, 10:21 a.m. UTC | #9
On Thu, 2014-03-20 at 10:45 +0100, Daniel Vetter wrote:
> I'd agree that this would be nice, but my maintainer time is not
> endless and when I have users screaming "regression" I do have to do
> something. And yeah, with the track record set by some of the earliest
> vtd+gfx chips I'm fairly aggressive with just disabling features,

It's not just the earlier vtd+gfx chips. To my knowledge we've *never*
shipped a chip without egregious hardware bugs with VT-d.

> especially when the original bug report is against a recent platform
> like ivb (so presumably issues on older ones exist, too).

Yes, by all means disable it for current and old chipsets. But please
not newer ones. I want it to show up when the validation folks use Linux
to test the hardware. And if we *do* have to subsequently bump the check
to include the next revision of the hardware, I want someone's head on a
plate.

As I commented in private, this is the first time we've used
intel_iommu_gfx_mapped to disable a feature without a check for specific
hardware revisions. Please don't do that.

> Now this very likely is some fumble in our code, after all the bios
> managed to set things up.

Maybe not; sometimes it's just that Linux does a little bit more with
the hardware and happens to tickle the bug. The superpage screwup I've
recently been chasing, for example, is blatantly a hardware issue but
didn't show up with the BIOS framebuffer. And *does* with the Linux
framebuffer in stolen memory.

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index d58b4e287e32..28d24caa49f3 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -214,6 +214,13 @@  int i915_gem_init_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int bios_reserved = 0;
 
+#ifdef CONFIG_INTEL_IOMMU
+	if (intel_iommu_gfx_mapped) {
+		DRM_INFO("DMAR active, disabling use of stolen memory\n");
+		return 0;
+	}
+#endif
+
 	if (dev_priv->gtt.stolen_size == 0)
 		return 0;
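
As a reading aid, the effect of the hunk above can be modelled in standalone C. This is a simplified sketch, not the driver function: the struct and the stolen_ready flag are hypothetical, and the real i915_gem_init_stolen() does considerably more after these checks. The point it illustrates is that the early return reports success (0) while leaving stolen memory unset up, so later code has to cope with stolen memory being unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the checks look at. */
struct fake_priv {
	bool iommu_gfx_mapped; /* models intel_iommu_gfx_mapped */
	size_t stolen_size;    /* models dev_priv->gtt.stolen_size */
	bool stolen_ready;     /* assumed flag: stolen allocator was set up */
};

/* Mirrors the order of the checks in the patched function. */
static int fake_init_stolen(struct fake_priv *p)
{
	assert(p != NULL);
	p->stolen_ready = false;

	/* New check from the patch: bail out before touching stolen memory. */
	if (p->iommu_gfx_mapped)
		return 0;

	/* Pre-existing check: nothing to set up if the BIOS reserved nothing. */
	if (p->stolen_size == 0)
		return 0;

	p->stolen_ready = true;
	return 0;
}
```

Both early exits return 0, so the caller cannot tell them apart from a successful setup by the return value alone; only the log message from the patch distinguishes the DMAR case.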