Message ID | 1418955618-382-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Dec 18, 2014 at 06:20:18PM -0800, Ben Widawsky wrote: > From: Ben Widawsky <ben@bwidawsk.net> > > The docs specify this needs to be set on HSW GT1 parts. I've implemented it as > such since it should only be needed when using RC6, but it can probably go > anywhere. > > This patch fixes extremely reproducible hangs on our Jenkins setup. > > The interesting failure signature is: > IPEHR: 0x780c0000 (3DSTATE_VF) > INSTDONE_0: 0xffdfbffa (SVG + VS) We see the same bug on IVB, and the bug (verified on hsw gt1) persists with i915.enable_rc6=0. -Chris
On Fri, Dec 19, 2014 at 08:20:04AM +0000, Chris Wilson wrote: > On Thu, Dec 18, 2014 at 06:20:18PM -0800, Ben Widawsky wrote: > > From: Ben Widawsky <ben@bwidawsk.net> > > > > The docs specify this needs to be set on HSW GT1 parts. I've implemented it as > > such since it should only be needed when using RC6, but it can probably go > > anywhere. > > > > This patch fixes extremely reproducible hangs on our Jenkins setup. > > > > The interesting failure signature is: > > IPEHR: 0x780c0000 (3DSTATE_VF) > > INSTDONE_0: 0xffdfbffa (SVG + VS) > > We see the same bug on IVB, and the bug (verified on hsw gt1) persists with i915.enable_rc6=0. > -Chris I would claim it's not the same bug on IVB. The workaround very specifically says HSW GT1 only. The failure signature itself is pretty generic, it just suggests the VS went off and did something stupid, and the VF was unable to fill the URB. (That's just my best guess) I do apologize if I inadvertently made a connection to the IVB bug issue. So, what is your point? I'm willing to concede we don't know all the facts, but the [previous version of this] patch demonstrably fixes hangs on HSW GT1.
On Fri, Dec 19, 2014 at 09:26:01AM -0800, Ben Widawsky wrote: > On Fri, Dec 19, 2014 at 08:20:04AM +0000, Chris Wilson wrote: > > On Thu, Dec 18, 2014 at 06:20:18PM -0800, Ben Widawsky wrote: > > > From: Ben Widawsky <ben@bwidawsk.net> > > > > > > The docs specify this needs to be set on HSW GT1 parts. I've implemented it as > > > such since it should only be needed when using RC6, but it can probably go > > > anywhere. > > > > > > This patch fixes extremely reproducible hangs on our Jenkins setup. > > > > > > The interesting failure signature is: > > > IPEHR: 0x780c0000 (3DSTATE_VF) > > > INSTDONE_0: 0xffdfbffa (SVG + VS) > > > > We see the same bug on IVB, and the bug (verified on hsw gt1) persists with i915.enable_rc6=0. > > -Chris > > I would claim it's not the same bug on IVB. The workaround very specifically > says HSW GT1 only. The failure signature itself is pretty generic, it just > suggests the VS went off and did something stupid, and the VF was unable to fill > the URB. (That's just my best guess) I do apologize if I inadvertently made a > connection to the IVB bug issue. The failure message matches the context hang bug: "dies on last instruction in the context restore" and probably has nothing to do with the VS. > So, what is your point? I'm willing to concede we don't know all the facts, but > the [previous version of this] patch demonstrably fixes hangs on HSW GT1. Just a patch a few days ago that to fix a very ontologically similar bug for ivb+. -Chris
On Fri, Dec 19, 2014 at 06:34:07PM +0000, Chris Wilson wrote: > Just a patch a few days ago that to fix a very ontologically similar bug > for ivb+. Oh boy. I just double checked the error states from those bugs I marked as ivb context restore hangs... So far I appear to have consistently mislabled 0x402/0x406 as ivb and not hsw gt1. /o\ -Chris
On Fri, Dec 19, 2014 at 08:05:30PM +0000, Chris Wilson wrote: > On Fri, Dec 19, 2014 at 06:34:07PM +0000, Chris Wilson wrote: > > Just a patch a few days ago that to fix a very ontologically similar bug > > for ivb+. > > Oh boy. I just double checked the error states from those bugs I marked > as ivb context restore hangs... So far I appear to have consistently > mislabled 0x402/0x406 as ivb and not hsw gt1. /o\ > -Chris I don't know what to do with this patch now. It looks like all we needed was your patch. Merge without the "stable" is what I'm thinking. > > -- > Chris Wilson, Intel Open Source Technology Centre
On Fri, Dec 19, 2014 at 02:14:31PM -0800, Ben Widawsky wrote: > On Fri, Dec 19, 2014 at 08:05:30PM +0000, Chris Wilson wrote: > > On Fri, Dec 19, 2014 at 06:34:07PM +0000, Chris Wilson wrote: > > > Just a patch a few days ago that to fix a very ontologically similar bug > > > for ivb+. > > > > Oh boy. I just double checked the error states from those bugs I marked > > as ivb context restore hangs... So far I appear to have consistently > > mislabled 0x402/0x406 as ivb and not hsw gt1. /o\ > > -Chris > > I don't know what to do with this patch now. It looks like all we needed was > your patch. Merge without the "stable" is what I'm thinking. Makes sense if we really don't need it any more with Chris' patches in -fixes. Can you please volunteer someone to cough up an r-b? Thanks, Daniel
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 921e4c5..f69984d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2282,6 +2282,8 @@ struct drm_i915_cmd_table { (INTEL_DEVID(dev) & 0xf) == 0xe)) #define IS_BDW_GT3(dev) (IS_BROADWELL(dev) && \ (INTEL_DEVID(dev) & 0x00F0) == 0x0020) +#define IS_HSW_GT1(dev) (IS_HASWELL(dev) && \ + (INTEL_DEVID(dev) & 0x00F0) == 0x0) #define IS_HSW_ULT(dev) (IS_HASWELL(dev) && \ (INTEL_DEVID(dev) & 0xFF00) == 0x0A00) #define IS_HSW_GT3(dev) (IS_HASWELL(dev) && \ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 40ca873..f9ff662 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -1218,6 +1218,13 @@ enum punit_power_well { #define RING_DMA_FADD_UDW(base) ((base)+0x60) /* gen8+ */ #define RING_INSTPM(base) ((base)+0xc0) #define RING_MI_MODE(base) ((base)+0x9c) +#define RING_WAIT_FOR_RC6_EXIT(base) ((base)+0xcc) +#define RING_RC6_SEL_WRITE_ADDR_MASK (0x7 << 4) +#define RING_RC6_SEL_WRITE_ADDR_MULTICAST (0x0 << 4) +#define RING_RC6_SEL_WRITE_ADDR_UPPER_LEFT (0x4 << 4) +#define RING_RC6_SEL_WRITE_ADDR_UPPER_RIGHT (0x5 << 4) +#define RING_RC6_SEL_WRITE_ADDR_LOWER_LEFT (0x6 << 4) +#define RING_RC6_SEL_WRITE_ADDR_LOWER_RIGHT (0x7 << 4) #define INSTPS 0x02070 /* 965+ only */ #define INSTDONE1 0x0207c /* 965+ only */ #define ACTHD_I965 0x02074 diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index a3ebaa8..a27003c 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4259,6 +4259,15 @@ static void gen6_enable_rps(struct drm_device *dev) DRM_ERROR("Couldn't fix incorrect rc6 voltage\n"); } + /* HSW GT1: "This field must be always [be] programmed to “100” , this + * is required to address know [sic] HW issue." */ + if (IS_HSW_GT1(dev)) { + for_each_ring(ring, dev_priv, i) { + I915_WRITE(RING_WAIT_FOR_RC6_EXIT(ring->mmio_base), + _MASKED_FIELD(RING_RC6_SEL_WRITE_ADDR_MASK, + RING_RC6_SEL_WRITE_ADDR_UPPER_LEFT)); + } + } gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL); }