diff mbox

Linux 3.10-rc7

Message ID 20130625130816.55e6cb80@jbarnes-desktop (mailing list archive)
State New, archived
Headers show

Commit Message

Jesse Barnes June 25, 2013, 8:08 p.m. UTC
On Tue, 25 Jun 2013 19:59:28 +0000
Shuah Khan <shuah.kh@samsung.com> wrote:

> On 06/25/2013 01:52 PM, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 21:37:37 +0200
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> 
> >>
> >> Adding more lists to cc + Jesse since he's the guilty one for the
> >> vt-switchless state restore stuff.
> >
> > Yeah, looks like we don't fetch the PLL state on resume from hibernate,
> > leading to this warning.  The refcount is nonzero, indicating the pll
> > is in use, but the active field is clear, which means we're missing an
> > update somewhere.
> >
> > Shuah, just to confirm, does your resume actually work ok aside from
> > the warning?  I *think* it's harmless in this case, but does indicate a
> > real bug in our state tracking... trying to come up with a patch now.
> >
> > Thanks,
> >
> 
> Resume works just fine. I see it take longer for it to suspend compared 
> to 3.9.7 and then resumes just fine. Suspend taking longer very well 
> could be because of this warn_on. Other than this warn_on I haven't 
> noticed any other problems.

Here's the patch I'm testing now, can you give it a try?

Comments

Shuah Khan June 25, 2013, 8:51 p.m. UTC | #1
On 06/25/2013 02:06 PM, Jesse Barnes wrote:
> On Tue, 25 Jun 2013 19:59:28 +0000
> Shuah Khan <shuah.kh@samsung.com> wrote:
>
>> On 06/25/2013 01:52 PM, Jesse Barnes wrote:
>>> On Tue, 25 Jun 2013 21:37:37 +0200
>>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>
>>
>>>>
>>>> Adding more lists to cc + Jesse since he's the guilty one for the
>>>> vt-switchless state restore stuff.
>>>
>>> Yeah, looks like we don't fetch the PLL state on resume from hibernate,
>>> leading to this warning.  The refcount is nonzero, indicating the pll
>>> is in use, but the active field is clear, which means we're missing an
>>> update somewhere.
>>>
>>> Shuah, just to confirm, does your resume actually work ok aside from
>>> the warning?  I *think* it's harmless in this case, but does indicate a
>>> real bug in our state tracking... trying to come up with a patch now.
>>>
>>> Thanks,
>>>
>>
>> Resume works just fine. I see it take longer for it to suspend compared
>> to 3.9.7 and then resumes just fine. Suspend taking longer very well
>> could be because of this warn_on. Other than this warn_on I haven't
>> noticed any other problems.
>
> Here's the patch I'm testing now, can you give it a try?
>

Jesse,

With this patch warn_on went away. Resume worked. I started seeing:

[   78.733062] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
[   78.733079] mei_me 0000:00:16.0: reset: wrong host start response
[   78.733082] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING

over and over again after resume from reboot mode suspend. dmesg filled 
up with these messages.

I did suspend to disk shutdown mode right away. Resume worked, no 
warn_ons, and no mei_me messages.

-- Shuah

Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research 
America (Silicon Valley) shuah.kh@samsung.com | (970) 672-0658
Winkler, Tomas June 25, 2013, 8:54 p.m. UTC | #2
On Tue, Jun 25, 2013 at 11:51 PM, Shuah Khan <shuah.kh@samsung.com> wrote:

> On 06/25/2013 02:06 PM, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 19:59:28 +0000
> > Shuah Khan <shuah.kh@samsung.com> wrote:
> >
> >> On 06/25/2013 01:52 PM, Jesse Barnes wrote:
> >>> On Tue, 25 Jun 2013 21:37:37 +0200
> >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>
> >>
> >>>>
> >>>> Adding more lists to cc + Jesse since he's the guilty one for the
> >>>> vt-switchless state restore stuff.
> >>>
> >>> Yeah, looks like we don't fetch the PLL state on resume from hibernate,
> >>> leading to this warning.  The refcount is nonzero, indicating the pll
> >>> is in use, but the active field is clear, which means we're missing an
> >>> update somewhere.
> >>>
> >>> Shuah, just to confirm, does your resume actually work ok aside from
> >>> the warning?  I *think* it's harmless in this case, but does indicate a
> >>> real bug in our state tracking... trying to come up with a patch now.
> >>>
> >>> Thanks,
> >>>
> >>
> >> Resume works just fine. I see it take longer for it to suspend compared
> >> to 3.9.7 and then resumes just fine. Suspend taking longer very well
> >> could be because of this warn_on. Other than this warn_on I haven't
> >> noticed any other problems.
> >
> > Here's the patch I'm testing now, can you give it a try?
> >
>
> Jesse,
>
> With this patch warn_on went away. Resume worked. I started seeing:
>
> [   78.733062] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
> [   78.733079] mei_me 0000:00:16.0: reset: wrong host start response
> [   78.733082] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
>
> over and over again after resume from reboot mode suspend. dmesg filled
> up with these messages.
>

Can you please send me the log part when this starts?

Thanks
Tomas
Winkler, Tomas June 25, 2013, 8:57 p.m. UTC | #3
On Tue, Jun 25, 2013 at 11:51 PM, Shuah Khan <shuah.kh@samsung.com> wrote:
>
> On 06/25/2013 02:06 PM, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 19:59:28 +0000
> > Shuah Khan <shuah.kh@samsung.com> wrote:
> >
> >> On 06/25/2013 01:52 PM, Jesse Barnes wrote:
> >>> On Tue, 25 Jun 2013 21:37:37 +0200
> >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>
> >>
> >>>>
> >>>> Adding more lists to cc + Jesse since he's the guilty one for the
> >>>> vt-switchless state restore stuff.
> >>>
> >>> Yeah, looks like we don't fetch the PLL state on resume from
> >>> hibernate,
> >>> leading to this warning.  The refcount is nonzero, indicating the pll
> >>> is in use, but the active field is clear, which means we're missing an
> >>> update somewhere.
> >>>
> >>> Shuah, just to confirm, does your resume actually work ok aside from
> >>> the warning?  I *think* it's harmless in this case, but does indicate
> >>> a
> >>> real bug in our state tracking... trying to come up with a patch now.
> >>>
> >>> Thanks,
> >>>
> >>
> >> Resume works just fine. I see it take longer for it to suspend compared
> >> to 3.9.7 and then resumes just fine. Suspend taking longer very well
> >> could be because of this warn_on. Other than this warn_on I haven't
> >> noticed any other problems.
> >
> > Here's the patch I'm testing now, can you give it a try?
> >
>
> Jesse,
>
> With this patch warn_on went away. Resume worked. I started seeing:
>
> [   78.733062] mei_me 0000:00:16.0: unexpected reset: dev_state =
> RESETTING
> [   78.733079] mei_me 0000:00:16.0: reset: wrong host start response
> [   78.733082] mei_me 0000:00:16.0: unexpected reset: dev_state =
> RESETTING
>
> over and over again after resume from reboot mode suspend. dmesg filled
> up with these messages.

Can you please send me the log part when this starts?

Thanks

> I did suspend to disk shutdown mode right away. Resume worked, no
> warn_ons, and no mei_me messages.
>
> -- Shuah
Jesse Barnes June 25, 2013, 9:09 p.m. UTC | #4
On Tue, 25 Jun 2013 20:51:27 +0000
Shuah Khan <shuah.kh@samsung.com> wrote:

> On 06/25/2013 02:06 PM, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 19:59:28 +0000
> > Shuah Khan <shuah.kh@samsung.com> wrote:
> >
> >> On 06/25/2013 01:52 PM, Jesse Barnes wrote:
> >>> On Tue, 25 Jun 2013 21:37:37 +0200
> >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>
> >>
> >>>>
> >>>> Adding more lists to cc + Jesse since he's the guilty one for the
> >>>> vt-switchless state restore stuff.
> >>>
> >>> Yeah, looks like we don't fetch the PLL state on resume from hibernate,
> >>> leading to this warning.  The refcount is nonzero, indicating the pll
> >>> is in use, but the active field is clear, which means we're missing an
> >>> update somewhere.
> >>>
> >>> Shuah, just to confirm, does your resume actually work ok aside from
> >>> the warning?  I *think* it's harmless in this case, but does indicate a
> >>> real bug in our state tracking... trying to come up with a patch now.
> >>>
> >>> Thanks,
> >>>
> >>
> >> Resume works just fine. I see it take longer for it to suspend compared
> >> to 3.9.7 and then resumes just fine. Suspend taking longer very well
> >> could be because of this warn_on. Other than this warn_on I haven't
> >> noticed any other problems.
> >
> > Here's the patch I'm testing now, can you give it a try?
> >
> 
> Jesse,
> 
> With this patch warn_on went away. Resume worked. I started seeing:
> 
> [   78.733062] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
> [   78.733079] mei_me 0000:00:16.0: reset: wrong host start response
> [   78.733082] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
> 
> over and over again after resume from reboot mode suspend. dmesg filled 
> up with these messages.
> 
> I did suspend to disk shutdown mode right away. Resume worked, no 
> warn_ons, and no mei_me messages.

Ok good, so that means the new stuff we have queued will likely work
too.

I'll leave it up to Daniel and Linus whether we just kill this warning
for now though, or apply the state fetching patch.
Shuah Khan June 25, 2013, 9:11 p.m. UTC | #5
On 06/25/2013 02:57 PM, Tomas Winkler wrote:
> On Tue, Jun 25, 2013 at 11:51 PM, Shuah Khan <shuah.kh@samsung.com> wrote:
>>

>> With this patch warn_on went away. Resume worked. I started seeing:
>>
>> [   78.733062] mei_me 0000:00:16.0: unexpected reset: dev_state =
>> RESETTING
>> [   78.733079] mei_me 0000:00:16.0: reset: wrong host start response
>> [   78.733082] mei_me 0000:00:16.0: unexpected reset: dev_state =
>> RESETTING
>>
>> over and over again after resume from reboot mode suspend. dmesg filled
>> up with these messages.
>
> Can you please send me the log part when this starts?
>
> Thanks
>

It rolled over and I don't have prior messages. I tried reproducing 
twice and didn't see it again. I will try a few more times and see if I 
can get it to happen again.

This is what I could save before dmesg rolled over:

[   78.709014] mei_me 0000:00:16.0: reset: wrong host start response
[   78.709016] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
[   78.709029] mei_me 0000:00:16.0: reset: unexpected enumeration 
response hbm.
[   78.709031] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
[   78.709069] mei_me 0000:00:16.0: reset: wrong host start response

-- Shuah

Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research 
America (Silicon Valley) shuah.kh@samsung.com | (970) 672-0658
Winkler, Tomas June 26, 2013, 10:11 p.m. UTC | #6
> > Can you please send me the log part when this starts?
> >
> > Thanks
> >
> 
> It rolled over and I don't have prior messages. I tried reproducing twice and
> didn't see it again. I will try a few more times and see if I can get it to happen
> again.
> 
> This is what I could save before dmesg rolled over:
> 
> [   78.709014] mei_me 0000:00:16.0: reset: wrong host start response
> [   78.709016] mei_me 0000:00:16.0: unexpected reset: dev_state =
> RESETTING
> [   78.709029] mei_me 0000:00:16.0: reset: unexpected enumeration
> response hbm.
> [   78.709031] mei_me 0000:00:16.0: unexpected reset: dev_state =
> RESETTING
> [   78.709069] mei_me 0000:00:16.0: reset: wrong host start response
> 
So far I was able to positively reproduce it with  3.10-rc5 but not with 3.10-rc7 and above.

There are 3 patches that went in to fix this issue

42f132f mei: me: clear interrupts on the resume path
2753ff5 mei: nfc: fix nfc device freeing
5e85b36 mei: init: Flush scheduled work before resetting the device

Are you sure you have these 3 in?

Thanks
Tomas
Shuah Khan June 26, 2013, 10:24 p.m. UTC | #7
On 06/26/2013 04:12 PM, Winkler, Tomas wrote:
>
>
>>> Can you please send me the log part when this starts?
>>>
>>> Thanks
>>>
>>
>> It rolled over and I don't have prior messages. I tried reproducing twice and
>> didn't see it again. I will try a few more times and see if I can get it to happen
>> again.
>>
>> This is what I could save before dmesg rolled over:
>>
>> [   78.709014] mei_me 0000:00:16.0: reset: wrong host start response
>> [   78.709016] mei_me 0000:00:16.0: unexpected reset: dev_state =
>> RESETTING
>> [   78.709029] mei_me 0000:00:16.0: reset: unexpected enumeration
>> response hbm.
>> [   78.709031] mei_me 0000:00:16.0: unexpected reset: dev_state =
>> RESETTING
>> [   78.709069] mei_me 0000:00:16.0: reset: wrong host start response
>>
> So far I was able to positively reproduce it with  3.10-rc5 but not with 3.10-rc7 and above.
>
> There are 3 patches that went in to fix this issue
>
> 42f132f mei: me: clear interrupts on the resume path
> 2753ff5 mei: nfc: fix nfc device freeing
> 5e85b36 mei: init: Flush scheduled work before resetting the device
>
> Are you sure you have these 3 in?
>

Checked the git log and yes I have all three commits. It appears this 
problem is intermittent and hard to reproduce at least on 3.10-rc7. I 
tried several times yesterday to capture the log and couldn't reproduce.

-- Shuah

Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research 
America (Silicon Valley) shuah.kh@samsung.com | (970) 672-0658
Shuah Khan July 1, 2013, 2:54 p.m. UTC | #8
On 06/26/2013 04:24 PM, Shuah Khan wrote:
> On 06/26/2013 04:12 PM, Winkler, Tomas wrote:
>>
>>

>> 42f132f mei: me: clear interrupts on the resume path
>> 2753ff5 mei: nfc: fix nfc device freeing
>> 5e85b36 mei: init: Flush scheduled work before resetting the device
>>
>> Are you sure you have these 3 in?
>>
>
> Checked the git log and yes I have all three commits. It appears this
> problem is intermittent and hard to reproduce at least on 3.10-rc7. I
> tried several times yesterday to capture the log and couldn't reproduce.
>
> -- Shuah

Tomas,

I saw the mei_me problem again, however couldn't save the logs. I am 
getting into the habit of saving dmesg as soon as system gets resumed to 
catch the dmesg buffer prior to mei getting into this state. There is 
another difference in suspend sequence between 3.9.8 and 3.10-rc6 and rc-7.

When I do echo disk > state,

Screen clears and instead of going into console mode like it does on 
3.9.8, it will get back into graphics mode and show the screen exactly 
the way it was right after echo disk > state command was issued. It 
stays in that state for good 60 seconds or more and then I see the 
suspend complete.

I can start bi-sect of this problem on intel-display scope if you would 
like me to. Please let me know if the bisect scope should be larger.

-- Shuah

Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research 
America (Silicon Valley) shuah.kh@samsung.com | (970) 672-0658
Winkler, Tomas July 4, 2013, 7:26 p.m. UTC | #9
On Mon, Jul 1, 2013 at 5:54 PM, Shuah Khan <shuah.kh@samsung.com> wrote:
> On 06/26/2013 04:24 PM, Shuah Khan wrote:
>> On 06/26/2013 04:12 PM, Winkler, Tomas wrote:
>>>
>>>
>
>>> 42f132f mei: me: clear interrupts on the resume path
>>> 2753ff5 mei: nfc: fix nfc device freeing
>>> 5e85b36 mei: init: Flush scheduled work before resetting the device
>>>
>>> Are you sure you have these 3 in?
>>>
>>
>> Checked the git log and yes I have all three commits. It appears this
>> problem is intermittent and hard to reproduce at least on 3.10-rc7. I
>> tried several times yesterday to capture the log and couldn't reproduce.
>>
>> -- Shuah
>
> Tomas,
>
> I saw the mei_me problem again, however couldn't save the logs. I am
> getting into the habit of saving dmesg as soon as system gets resumed to
> catch the dmesg buffer prior to mei getting into this state. There is
> another difference in suspend sequence between 3.9.8 and 3.10-rc6 and rc-7.
>
> When I do echo disk > state,
>
> Screen clears and instead of going into console mode like it does on
> 3.9.8, it will get back into graphics mode and show the screen exactly
> the way it was right after echo disk > state command was issued. It
> stays in that state for good 60 seconds or more and then I see the
> suspend complete.
>
> I can start bi-sect of this problem on intel-display scope if you would
> like me to. Please let me know if the bisect scope should be larger.
>
> -- Shuah

I got finally an older system where this reproduces consistently, I'm
trying to root cause that now.
As soon I have something to test I will send it out.

Thanks
Tomas
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 56746dc..df0caf0 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9289,6 +9289,49 @@  void i915_redisable_vga(struct drm_device *dev)
 	}
 }
 
+static void ironlake_crtc_pll_get(struct intel_crtc *crtc)
+{
+	struct drm_device *dev = crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	u32 dpll_sel;
+
+	if (HAS_PCH_IBX(dev_priv->dev))
+		crtc->pch_pll = &dev_priv->pch_plls[crtc->pipe];
+
+	if (HAS_PCH_CPT(dev_priv->dev)) {
+		dpll_sel = I915_READ(PCH_DPLL_SEL);
+
+		switch (crtc->pipe) {
+		case PIPE_A:
+			if ((dpll_sel & TRANSA_DPLL_ENABLE) &&
+			    (dpll_sel & TRANSA_DPLLB_SEL))
+				crtc->pch_pll = &dev_priv->pch_plls[1];
+			else if (dpll_sel & TRANSA_DPLL_ENABLE)
+				crtc->pch_pll = &dev_priv->pch_plls[0];
+			break;
+		case PIPE_B:
+			if ((dpll_sel & TRANSB_DPLL_ENABLE) &&
+			    (dpll_sel & TRANSB_DPLLB_SEL))
+				crtc->pch_pll = &dev_priv->pch_plls[1];
+			else if (dpll_sel & TRANSB_DPLL_ENABLE)
+				crtc->pch_pll = &dev_priv->pch_plls[0];
+			break;
+		case PIPE_C:
+			if ((dpll_sel & TRANSC_DPLL_ENABLE) &&
+			    (dpll_sel & TRANSC_DPLLB_SEL))
+				crtc->pch_pll = &dev_priv->pch_plls[1];
+			else if (dpll_sel & TRANSC_DPLL_ENABLE)
+				crtc->pch_pll = &dev_priv->pch_plls[0];
+			break;
+		default:
+			BUG();
+		}
+	}
+
+	crtc->pch_pll->refcount++;
+	crtc->pch_pll->active = 1;
+}
+
 /* Scan out the current hw modeset state, sanitizes it and maps it into the drm
  * and i915 state tracking structures. */
 void intel_modeset_setup_hw_state(struct drm_device *dev,
@@ -9346,6 +9389,9 @@  setup_pipes:
 
 		crtc->base.enabled = crtc->active;
 
+		if (crtc->active && HAS_PCH_SPLIT(dev))
+			ironlake_crtc_pll_get(crtc);
+
 		DRM_DEBUG_KMS("[CRTC:%d] hw state readout: %s\n",
 			      crtc->base.base.id,
 			      crtc->active ? "enabled" : "disabled");