Message ID | 1370990967-22892-1-git-send-email-marcheu@chromium.org (mailing list archive) |
---|---|
State | New, archived |
On Tue, Jun 11, 2013 at 03:49:26PM -0700, Stéphane Marchesin wrote:
> It's basically the same deal as the RC6+ issues on ivy bridge
> except this time with RC6 on sandy bridge. Like last time the
> core of the issue is that the timings don't work 100% with our
> voltage regulator. So from time to time, the kernel will print
> a warning message about the GPU not getting out of RC6. In
> particular, I found this fairly easy to reproduce during
> suspend/resume.
>
> Changing the threshold to 150000 instead of 50000 seems to fix
> the issue.
>
> I also measured the idle power usage before/after this patch and
> didn't see a difference on a sandy bridge laptop.
>
> Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>

One magic number for another with no idea what is blowing up - I fear we
are just changing the frequency of the hang. I've pinged a number of snb
rc6 bug reports to see if we get a bite.

FWIW,
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

On Wed, Jun 12, 2013 at 2:41 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, Jun 11, 2013 at 03:49:26PM -0700, Stéphane Marchesin wrote:
>> It's basically the same deal as the RC6+ issues on ivy bridge
>> except this time with RC6 on sandy bridge. Like last time the
>> core of the issue is that the timings don't work 100% with our
>> voltage regulator. So from time to time, the kernel will print
>> a warning message about the GPU not getting out of RC6. In
>> particular, I found this fairly easy to reproduce during
>> suspend/resume.
>>
>> Changing the threshold to 150000 instead of 50000 seems to fix
>> the issue.
>>
>> I also measured the idle power usage before/after this patch and
>> didn't see a difference on a sandy bridge laptop.
>>
>> Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
>
> One magic number for another with no idea what is blowing up - I fear we
> are just changing the frequency of the hang. I've pinged a number of snb
> rc6 bug reports to see if we get a bite.

Yup, if only Intel documented those registers :)

Stéphane

>
> FWIW,
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

On Fri, Jun 14, 2013 at 9:13 PM, Stéphane Marchesin <marcheu@chromium.org> wrote:
> On Wed, Jun 12, 2013 at 2:41 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Tue, Jun 11, 2013 at 03:49:26PM -0700, Stéphane Marchesin wrote:
>>> It's basically the same deal as the RC6+ issues on ivy bridge
>>> except this time with RC6 on sandy bridge. Like last time the
>>> core of the issue is that the timings don't work 100% with our
>>> voltage regulator. So from time to time, the kernel will print
>>> a warning message about the GPU not getting out of RC6. In
>>> particular, I found this fairly easy to reproduce during
>>> suspend/resume.
>>>
>>> Changing the threshold to 150000 instead of 50000 seems to fix
>>> the issue.
>>>
>>> I also measured the idle power usage before/after this patch and
>>> didn't see a difference on a sandy bridge laptop.
>>>
>>> Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
>>
>> One magic number for another with no idea what is blowing up - I fear we
>> are just changing the frequency of the hang. I've pinged a number of snb
>> rc6 bug reports to see if we get a bite.
>
> Yup, if only Intel documented those registers :)

We've spammed rc6 bugs in bugzilla, one reporter says that this patch
breaks rc6 from "sometimes it doesn't work after resume" to "always
broken":

https://bugs.freedesktop.org/show_bug.cgi?id=54089#c63

So I guess I can't merge this :(
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

On Fri, Jun 14, 2013 at 12:32 PM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Fri, Jun 14, 2013 at 9:13 PM, Stéphane Marchesin
> <marcheu@chromium.org> wrote:
>> On Wed, Jun 12, 2013 at 2:41 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>>> On Tue, Jun 11, 2013 at 03:49:26PM -0700, Stéphane Marchesin wrote:
>>>> It's basically the same deal as the RC6+ issues on ivy bridge
>>>> except this time with RC6 on sandy bridge. Like last time the
>>>> core of the issue is that the timings don't work 100% with our
>>>> voltage regulator. So from time to time, the kernel will print
>>>> a warning message about the GPU not getting out of RC6. In
>>>> particular, I found this fairly easy to reproduce during
>>>> suspend/resume.
>>>>
>>>> Changing the threshold to 150000 instead of 50000 seems to fix
>>>> the issue.
>>>>
>>>> I also measured the idle power usage before/after this patch and
>>>> didn't see a difference on a sandy bridge laptop.
>>>>
>>>> Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
>>>
>>> One magic number for another with no idea what is blowing up - I fear we
>>> are just changing the frequency of the hang. I've pinged a number of snb
>>> rc6 bug reports to see if we get a bite.
>>
>> Yup, if only Intel documented those registers :)
>
> We've spammed rc6 bugs in bugzilla, one reporter says that this patch
> breaks rc6 from "sometimes it doesn't work after resume" to "always
> broken":
>
> https://bugs.freedesktop.org/show_bug.cgi?id=54089#c63
>
> So I guess I can't merge this :(

Yeah I was actually going to send an email to withdraw this patch, as
it prevents rc6 from working on some machines here. So I guess I found
out the same thing.

Stéphane

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index aa01128..52fe8f7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2577,7 +2577,7 @@ static void gen6_enable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RC_SLEEP, 0);
 	I915_WRITE(GEN6_RC1e_THRESHOLD, 1000);
-	I915_WRITE(GEN6_RC6_THRESHOLD, 50000);
+	I915_WRITE(GEN6_RC6_THRESHOLD, 150000);
 	I915_WRITE(GEN6_RC6p_THRESHOLD, 150000);
 	I915_WRITE(GEN6_RC6pp_THRESHOLD, 64000); /* unused */

It's basically the same deal as the RC6+ issues on ivy bridge
except this time with RC6 on sandy bridge. Like last time the
core of the issue is that the timings don't work 100% with our
voltage regulator. So from time to time, the kernel will print
a warning message about the GPU not getting out of RC6. In
particular, I found this fairly easy to reproduce during
suspend/resume.

Changing the threshold to 150000 instead of 50000 seems to fix
the issue.

I also measured the idle power usage before/after this patch and
didn't see a difference on a sandy bridge laptop.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
---
 drivers/gpu/drm/i915/intel_pm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
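
For readers who want to compare the two settings outside the driver, here is a minimal user-space sketch of the four threshold writes touched by the hunk in gen6_enable_rps(). It is an illustration under stated assumptions, not i915 code: `struct mock_regs`, `enum mock_reg`, `mock_write()` and `program_rc_thresholds()` are hypothetical names invented for this example, and the array slots stand in for the real register offsets. The numeric values are the ones in the patch. Nothing is assumed about their units, which the thread itself notes are undocumented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GEN6_RC*_THRESHOLD registers in the diff. */
enum mock_reg {
	MOCK_RC1E_THRESHOLD,
	MOCK_RC6_THRESHOLD,
	MOCK_RC6P_THRESHOLD,
	MOCK_RC6PP_THRESHOLD,
	MOCK_REG_COUNT
};

/* A fake register file: one 32-bit slot per mocked register. */
struct mock_regs {
	uint32_t val[MOCK_REG_COUNT];
};

/* Plays the role of I915_WRITE() for this sketch only. */
void mock_write(struct mock_regs *r, enum mock_reg reg, uint32_t v)
{
	r->val[reg] = v;
}

/*
 * Mirrors the threshold writes shown in the hunk. Only the RC6 value is a
 * parameter, so the pre-patch (50000) and post-patch (150000) settings can
 * be compared side by side.
 */
void program_rc_thresholds(struct mock_regs *r, uint32_t rc6_threshold)
{
	assert(r != NULL);
	mock_write(r, MOCK_RC1E_THRESHOLD, 1000);
	mock_write(r, MOCK_RC6_THRESHOLD, rc6_threshold);
	mock_write(r, MOCK_RC6P_THRESHOLD, 150000);
	mock_write(r, MOCK_RC6PP_THRESHOLD, 64000); /* unused */
}
```

Calling `program_rc_thresholds(&regs, 50000)` reproduces the pre-patch values and `program_rc_thresholds(&regs, 150000)` the patched ones. One thing the comparison makes visible is that the patched RC6 threshold equals the RC6p threshold already in the file. Whether that equality relates to the breakage reported in the thread is not established here.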