Message ID | 1395147544-16984-1-git-send-email-jani.nikula@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> We have reports of heavy screen corruption if we try to use the stolen
> memory reserved by the BIOS whilst the DMA-Remapper is active. This
> quirk may be only specific to a few machines or BIOSes, but first lets
> apply the big hammer and always disable use of stolen memory when DMAR
> is active.
>
> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>
> ---
>
> Daniel, is this the color you want?

Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index d58b4e287e32..28d24caa49f3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int bios_reserved = 0;
>
> +#ifdef CONFIG_INTEL_IOMMU
> +	if (intel_iommu_gfx_mapped) {
> +		DRM_INFO("DMAR active, disabling use of stolen memory\n");
> +		return 0;
> +	}
> +#endif
> +
>  	if (dev_priv->gtt.stolen_size == 0)
>  		return 0;
>
> --
> 1.7.9.5
>
On Tue, Mar 18, 2014 at 05:48:28PM +0100, Daniel Vetter wrote:
> On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > We have reports of heavy screen corruption if we try to use the stolen
> > memory reserved by the BIOS whilst the DMA-Remapper is active. This
> > quirk may be only specific to a few machines or BIOSes, but first lets
> > apply the big hammer and always disable use of stolen memory when DMAR
> > is active.
> >
> > v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >
> > ---
> >
> > Daniel, is this the color you want?
>
> Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Correction, cc: stable is missing.
-Daniel

> >
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index d58b4e287e32..28d24caa49f3 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	int bios_reserved = 0;
> >
> > +#ifdef CONFIG_INTEL_IOMMU
> > +	if (intel_iommu_gfx_mapped) {
> > +		DRM_INFO("DMAR active, disabling use of stolen memory\n");
> > +		return 0;
> > +	}
> > +#endif
> > +
> >  	if (dev_priv->gtt.stolen_size == 0)
> >  		return 0;
> >
> > --
> > 1.7.9.5
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Tue, 18 Mar 2014, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Mar 18, 2014 at 05:48:28PM +0100, Daniel Vetter wrote:
>> On Tue, Mar 18, 2014 at 02:59:04PM +0200, Jani Nikula wrote:
>> > From: Chris Wilson <chris@chris-wilson.co.uk>
>> >
>> > We have reports of heavy screen corruption if we try to use the stolen
>> > memory reserved by the BIOS whilst the DMA-Remapper is active. This
>> > quirk may be only specific to a few machines or BIOSes, but first lets
>> > apply the big hammer and always disable use of stolen memory when DMAR
>> > is active.
>> >
>> > v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> >
>> > ---
>> >
>> > Daniel, is this the color you want?
>>
>> Yeah, colour looks shiny ;-) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Correction, cc: stable is missing.

Pushed to -fixes, thanks for the patch (original by Chris) and review.

Jani.
On Tue, 2014-03-18 at 14:59 +0200, Jani Nikula wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> We have reports of heavy screen corruption if we try to use the stolen
> memory reserved by the BIOS whilst the DMA-Remapper is active. This
> quirk may be only specific to a few machines or BIOSes, but first lets
> apply the big hammer and always disable use of stolen memory when DMAR
> is active.
>
> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.

Perhaps this (and all similar workarounds) should be predicated on
i915_preliminary_hw_support? When people are using the Linux kernel for
chipset validation, we sure as hell don't want to silently disable this
stuff and let them think it's working when it's not.
On Wed, 19 Mar 2014, David Woodhouse <dwmw2@infradead.org> wrote:
> On Tue, 2014-03-18 at 14:59 +0200, Jani Nikula wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> We have reports of heavy screen corruption if we try to use the stolen
>> memory reserved by the BIOS whilst the DMA-Remapper is active. This
>> quirk may be only specific to a few machines or BIOSes, but first lets
>> apply the big hammer and always disable use of stolen memory when DMAR
>> is active.
>>
>> v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.
>
> Perhaps this (and all similar workarounds) should be predicated on
> i915_preliminary_hw_support? When people are using the Linux kernel for
> chipset validation, we sure as hell don't want to silently disable this
> stuff and let them think it's working when it's not.

Or an additional knob, in case it's really not working and people want
to get other things depending on prelim hw support done.

BR,
Jani.
On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
>
> Or an additional knob, in case it's really not working and people want
> to get other things depending on prelim hw support done.

Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
to disable *all* workarounds for silicon bugs. Couple that with a printk
telling the user that workarounds are disabled *and* VT-d is enabled.

That's a nice simple thing for the chipset validation folks to be
looking for. Unless they see that and have no issues with either
framebuffer or X, the chipset hasn't been tested.

That aside, I'm also unhappy with your patch on general principles. As a
rule I'd like to see references to a *specific* published erratum, for
anything we disable. Otherwise we're just admitting that life is too
hard and we *never* bother to test our silicon before we ship it and we
*expect* it to be broken.

If we chase broken hardware to the point where errata are published, we
should hopefully ensure that the problem feeds back to the validation
folks who haven't done their job properly. Every time.

(Pondered making this an internal email, but hey -- *you're* the one who
said "our hardware is always broken and we don't even bother to track
individual brokenness". I'm just translating it into English from what's
in your patch :)
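The 'disable_silicon_workarounds' option proposed above was never posted as code in this thread. As a rough illustration only, the decision it describes might be modelled in plain userspace C as follows. The struct, the function name `stolen_memory_allowed()` and the wording of the first message are assumptions made for this sketch; `iommu_gfx_mapped` stands in for the kernel's `intel_iommu_gfx_mapped`, and `snprintf()` stands in for printk.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical knob named in the thread; not an existing i915 parameter. */
struct workaround_policy {
	bool disable_silicon_workarounds;
};

/*
 * Decide whether stolen memory may be used.
 * iommu_gfx_mapped stands in for the kernel's intel_iommu_gfx_mapped flag.
 * A log line is written to msg (left empty when there is nothing to report).
 * Returns true when stolen memory stays enabled.
 */
bool stolen_memory_allowed(const struct workaround_policy *policy,
			   bool iommu_gfx_mapped, char *msg, size_t len)
{
	if (len > 0)
		msg[0] = '\0';

	/* Graphics is not remapped by DMAR: nothing to work around. */
	if (!iommu_gfx_mapped)
		return true;

	if (policy->disable_silicon_workarounds) {
		/* Validation mode: keep the feature on and say so loudly. */
		snprintf(msg, len, "silicon workarounds disabled and VT-d enabled");
		return true;
	}

	/* Default behaviour of the patch: the big hammer. */
	snprintf(msg, len, "DMAR active, disabling use of stolen memory");
	return false;
}
```

With the knob off, the sketch behaves like the patch under discussion. With it on, stolen memory stays enabled and the message gives validation teams the single line to look for.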
On Thu, 20 Mar 2014, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
>>
>> Or an additional knob, in case it's really not working and people want
>> to get other things depending on prelim hw support done.
>
> Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
> to disable *all* workarounds for silicon bugs. Couple that with a printk
> telling the user that workarounds are disabled *and* VT-d is enabled.
>
> That's a nice simple thing for the chipset validation folks to be
> looking for. Unless they see that and have no issues with either
> framebuffer or X, the chipset hasn't been tested.
>
> That aside, I'm also unhappy with your patch on general principles. As a
> rule I'd like to see references to a *specific* published erratum, for
> anything we disable. Otherwise we're just admitting that life is too
> hard and we *never* bother to test our silicon before we ship it and we
> *expect* it to be broken.
>
> If we chase broken hardware to the point where errata are published, we
> should hopefully ensure that the problem feeds back to the validation
> folks who haven't done their job properly. Every time.
>
> (Pondered making this an internal email, but hey -- *you're* the one who
> said "our hardware is always broken and we don't even bother to track
> individual brokenness". I'm just translating it into English from what's
> in your patch :)

I'll have to dodge this particular discussion, just because it was
really Chris' patch which I merely repainted with colours requested by
our resident interior designer Daniel. ;)

BR,
Jani.

>
> --
> dwmw2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Thu, Mar 20, 2014 at 8:49 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2014-03-20 at 09:36 +0200, Jani Nikula wrote:
>>
>> Or an additional knob, in case it's really not working and people want
>> to get other things depending on prelim hw support done.
>
> Yeah. Perhaps the best answer is a 'disable_silicon_workarounds' option,
> to disable *all* workarounds for silicon bugs. Couple that with a printk
> telling the user that workarounds are disabled *and* VT-d is enabled.
>
> That's a nice simple thing for the chipset validation folks to be
> looking for. Unless they see that and have no issues with either
> framebuffer or X, the chipset hasn't been tested.
>
> That aside, I'm also unhappy with your patch on general principles. As a
> rule I'd like to see references to a *specific* published erratum, for
> anything we disable. Otherwise we're just admitting that life is too
> hard and we *never* bother to test our silicon before we ship it and we
> *expect* it to be broken.
>
> If we chase broken hardware to the point where errata are published, we
> should hopefully ensure that the problem feeds back to the validation
> folks who haven't done their job properly. Every time.
>
> (Pondered making this an internal email, but hey -- *you're* the one who
> said "our hardware is always broken and we don't even bother to track
> individual brokenness". I'm just translating it into English from what's
> in your patch :)

I'd agree that this would be nice, but my maintainer time is not
endless and when I have users screaming "regression" I do have to do
something. And yeah with the track record set of some of the earliest
vtd+gfx chips I'm fairly aggressive with just disabling features,
especially when the original bug report is against a recent platform
like ivb (so presumably issues on olders exist, too).

Now this very likely is some fumble in our code, after all the bios
managed to set things up.

But until I have managers screaming at me and throwing people
("resources") at the problem, my only concern is keeping the
regressions out the door without disabling other stuff my managers
actually do scream around about.
-Daniel
On Thu, 2014-03-20 at 10:45 +0100, Daniel Vetter wrote:
> I'd agree that this would be nice, but my maintainer time is not
> endless and when I have users screaming "regression" I do have to do
> something. And yeah with the track record set of some of the earliest
> vtd+gfx chips I'm fairly aggressive with just disabling features,

It's not just the earlier vtd+gfx chips. To my knowledge we've *never*
shipped a chip without egregious hardware bugs with VT-d.

> especially when the original bug report is against a recent platform
> like ivb (so presumably issues on olders exist, too).

Yes, by all means disable it for current and old chipsets. But please
not newer ones. I want it to show up when the validation folks use Linux
to test the hardware. And if we *do* have to subsequently bump the check
to include the next revision of the hardware, I want someone's head on a
plate.

As I commented in private, this is the first time we've used
intel_iommu_gfx_mapped to disable a feature without a check for specific
hardware revisions. Please don't do that.

> Now this very likely is some fumble in our code, after all the bios
> managed to set things up.

Maybe not; sometimes it's just that Linux does a little bit more with
the hardware and happens to tickle the bug. The superpage screwup I've
recently been chasing, for example, is blatantly a hardware issue but
didn't show up with the BIOS framebuffer. And *does* with the Linux
framebuffer in stolen memory.
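The request above, to tie the quirk to specific hardware revisions instead of applying it whenever intel_iommu_gfx_mapped is set, could be sketched as a small predicate. This is a userspace illustration, not driver code. The function name and the cutoff constant are hypothetical, and the value 7 is chosen only because the thread mentions a report on ivb (Ivy Bridge, gen 7 in i915 terms); the thread does not establish which generations are actually affected.

```c
#include <stdbool.h>

/*
 * Illustrative cutoff only. The bug report mentioned in the thread is
 * against Ivy Bridge (gen 7); the real set of affected parts is unknown.
 */
#define STOLEN_DMAR_QUIRK_MAX_GEN 7

/*
 * Return true when the "no stolen memory under DMAR" quirk should apply:
 * graphics is remapped by the IOMMU and the part is no newer than the
 * cutoff. Newer hardware is left alone so breakage stays visible.
 */
bool stolen_dmar_quirk_applies(int gen, bool iommu_gfx_mapped)
{
	return iommu_gfx_mapped && gen <= STOLEN_DMAR_QUIRK_MAX_GEN;
}
```

Extending the quirk to a later generation then means changing the cutoff explicitly, which is the reviewable step asked for above.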
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index d58b4e287e32..28d24caa49f3 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -214,6 +214,13 @@ int i915_gem_init_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int bios_reserved = 0;
 
+#ifdef CONFIG_INTEL_IOMMU
+	if (intel_iommu_gfx_mapped) {
+		DRM_INFO("DMAR active, disabling use of stolen memory\n");
+		return 0;
+	}
+#endif
+
 	if (dev_priv->gtt.stolen_size == 0)
 		return 0;