diff mbox

New snd-hda warning spew

Message ID s5hk2kyu5yf.wl-tiwai@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Takashi Iwai March 19, 2016, 10:18 a.m. UTC
On Tue, 15 Mar 2016 18:22:56 +0100,
Takashi Iwai wrote:
> 
> On Tue, 15 Mar 2016 17:02:07 +0100,
> Ville Syrjälä wrote:
> > This other one was seen at least on on SKL:
> > [  124.808525] ------------[ cut here ]------------
> > [  124.808545] WARNING: CPU: 3 PID: 173 at sound/hda/hdac_i915.c:91 snd_hdac_display_power+0xf1/0x110 [snd_hda_core]()
> 
> This is a different one, and it implies that the unbalanced power
> refcount.  Might be related with the recent fix for the recursive
> regmap deadlock.  I'll try later with a SKL machine here, too.
> 
> Didn't you see this before the recent tree, right?  Some good/bad
> commits would be really helpful...

So, we got a full log in bugzilla, and this looks like the on-demand
binding leading to the unbalanced refcount.

Could you try the patch below?  This essentially reverts to the 4.4
state, so a "regression" should be covered, at least.  It's applied on
top of sound.git tree for-linus branch.  (But it should be easily
applicable to older trees, too.)

We need a proper fix later, but now more urgently the CI test
regression has to be papered over.


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: hda - Workaround for unbalanced i915 power refcount by
 concurrent probe

The recent addition of on-demand i915 audio component binding in the
codec driver seems leading to the unbalanced i915 power refcount,
according to Intel CI tests.  Typically, it gets a kernel WARNING
like:
  WARNING: CPU: 3 PID: 173 at sound/hda/hdac_i915.c:91 snd_hdac_display_power+0xf1/0x110 [snd_hda_core]()
  Call Trace:
   [<ffffffff813fef15>] dump_stack+0x67/0x92
   [<ffffffff81078a21>] warn_slowpath_common+0x81/0xc0
   [<ffffffff81078b15>] warn_slowpath_null+0x15/0x20
   [<ffffffffa00f77e1>] snd_hdac_display_power+0xf1/0x110 [snd_hda_core]
   [<ffffffffa015039d>] azx_intel_link_power+0xd/0x10 [snd_hda_intel]
   [<ffffffffa011e32a>] azx_link_power+0x1a/0x30 [snd_hda_codec]
   [<ffffffffa00f21f9>] snd_hdac_link_power+0x29/0x40 [snd_hda_core]
   [<ffffffffa01192a6>] hda_codec_runtime_suspend+0x76/0xa0 [snd_hda_codec]
   .....

The scenario is like below:
- HD-audio driver and i915 driver are probed concurrently at the
  (almost) same time; HDA bus tries to bind with i915, but it fails
  because i915 initialization is still being processed.
- Later on, HD-audio probes the HDMI codec, where it again tries to
  bind with i915.  At this time, it succeeds.
- At finishing the probe of HDA, it decreases the refcount as if it
  were already bound at the bus probe, since the component is bound
  now.  This triggers a kernel WARNING due to the unbalance.

As a workaround, in this patch, we just disable the on-demand i915
component binding in the codec driver.  This essentially reverts back
to the state of 4.4 kernel.

We know that this is no real solution, but it's a minimalistic simple
change that can be applied to 4.5.x kernel as stable.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94566
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.5
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/pci/hda/patch_hdmi.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Imre Deak March 21, 2016, 12:34 p.m. UTC | #1
On la, 2016-03-19 at 11:18 +0100, Takashi Iwai wrote:
> On Tue, 15 Mar 2016 18:22:56 +0100,
> Takashi Iwai wrote:
> > 
> > On Tue, 15 Mar 2016 17:02:07 +0100,
> > Ville Syrjälä wrote:
> > > This other one was seen at least on on SKL:
> > > [  124.808525] ------------[ cut here ]------------
> > > [  124.808545] WARNING: CPU: 3 PID: 173 at
> > > sound/hda/hdac_i915.c:91 snd_hdac_display_power+0xf1/0x110
> > > [snd_hda_core]()
> > 
> > This is a different one, and it implies that the unbalanced power
> > refcount.  Might be related with the recent fix for the recursive
> > regmap deadlock.  I'll try later with a SKL machine here, too.
> > 
> > Didn't you see this before the recent tree, right?  Some good/bad
> > commits would be really helpful...
> 
> So, we got a full log in bugzilla, and this looks like the on-demand
> binding leading to the unbalanced refcount.
> 
> Could you try the patch below?  This essentially reverts to the 4.4
> state, so a "regression" should be covered, at least.  It's applied
> on
> top of sound.git tree for-linus branch.  (But it should be easily
> applicable to older trees, too.)
> 
> We need a proper fix later, but now more urgently the CI test
> regression has to be papered over.

Looking more into this, it seems the root cause for this failure is
that request_module returned pre-maturely (and with 0 return value)
when i915 probing was still not finished. I tracked this issue down to
the following kmod bug, fixed meanwhile upstream:
http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4

Updating to the latest kmod fixed the issue. Things will of course work
only in the general case where both the i915 and the sound driver are
built as modules, to support other configurations we'd need to move the
audio probing to happen at component bind time. 

--Imre

> 
> 
> thanks,
> 
> Takashi
> 
> -- 8< --
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: hda - Workaround for unbalanced i915 power
> refcount by
>  concurrent probe
> 
> The recent addition of on-demand i915 audio component binding in the
> codec driver seems leading to the unbalanced i915 power refcount,
> according to Intel CI tests.  Typically, it gets a kernel WARNING
> like:
>   WARNING: CPU: 3 PID: 173 at sound/hda/hdac_i915.c:91
> snd_hdac_display_power+0xf1/0x110 [snd_hda_core]()
>   Call Trace:
>    [<ffffffff813fef15>] dump_stack+0x67/0x92
>    [<ffffffff81078a21>] warn_slowpath_common+0x81/0xc0
>    [<ffffffff81078b15>] warn_slowpath_null+0x15/0x20
>    [<ffffffffa00f77e1>] snd_hdac_display_power+0xf1/0x110
> [snd_hda_core]
>    [<ffffffffa015039d>] azx_intel_link_power+0xd/0x10 [snd_hda_intel]
>    [<ffffffffa011e32a>] azx_link_power+0x1a/0x30 [snd_hda_codec]
>    [<ffffffffa00f21f9>] snd_hdac_link_power+0x29/0x40 [snd_hda_core]
>    [<ffffffffa01192a6>] hda_codec_runtime_suspend+0x76/0xa0
> [snd_hda_codec]
>    .....
> 
> The scenario is like below:
> - HD-audio driver and i915 driver are probed concurrently at the
>   (almost) same time; HDA bus tries to bind with i915, but it fails
>   because i915 initialization is still being processed.
> - Later on, HD-audio probes the HDMI codec, where it again tries to
>   bind with i915.  At this time, it succeeds.
> - At finishing the probe of HDA, it decreases the refcount as if it
>   were already bound at the bus probe, since the component is bound
>   now.  This triggers a kernel WARNING due to the unbalance.
> 
> As a workaround, in this patch, we just disable the on-demand i915
> component binding in the codec driver.  This essentially reverts back
> to the state of 4.4 kernel.
> 
> We know that this is no real solution, but it's a minimalistic simple
> change that can be applied to 4.5.x kernel as stable.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94566
> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v4.5
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/pci/hda/patch_hdmi.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
> index 56d3575ee6cc..7ae614d27954 100644
> --- a/sound/pci/hda/patch_hdmi.c
> +++ b/sound/pci/hda/patch_hdmi.c
> @@ -2258,9 +2258,15 @@ static int patch_generic_hdmi(struct hda_codec
> *codec)
>  	/* Try to bind with i915 for Intel HSW+ codecs (if not done
> yet) */
>  	if ((codec->core.vendor_id >> 16) == 0x8086 &&
>  	    is_haswell_plus(codec)) {
> +#if 0
> +		/* on-demand binding leads to an unbalanced refcount
> when
> +		 * both i915 and hda drivers are probed
> concurrently;
> +		 * disabled temporarily for now
> +		 */
>  		if (!codec->bus->core.audio_component)
>  			if (!snd_hdac_i915_init(&codec->bus->core))
>  				spec->i915_bound = true;
> +#endif
>  		/* use i915 audio component notifier for hotplug */
>  		if (codec->bus->core.audio_component)
>  			spec->use_acomp_notifier = true;
Takashi Iwai March 21, 2016, 12:55 p.m. UTC | #2
On Mon, 21 Mar 2016 13:34:35 +0100,
Imre Deak wrote:
> 
> On la, 2016-03-19 at 11:18 +0100, Takashi Iwai wrote:
> > On Tue, 15 Mar 2016 18:22:56 +0100,
> > Takashi Iwai wrote:
> > > 
> > > On Tue, 15 Mar 2016 17:02:07 +0100,
> > > Ville Syrjälä wrote:
> > > > This other one was seen at least on on SKL:
> > > > [  124.808525] ------------[ cut here ]------------
> > > > [  124.808545] WARNING: CPU: 3 PID: 173 at
> > > > sound/hda/hdac_i915.c:91 snd_hdac_display_power+0xf1/0x110
> > > > [snd_hda_core]()
> > > 
> > > This is a different one, and it implies that the unbalanced power
> > > refcount.  Might be related with the recent fix for the recursive
> > > regmap deadlock.  I'll try later with a SKL machine here, too.
> > > 
> > > Didn't you see this before the recent tree, right?  Some good/bad
> > > commits would be really helpful...
> > 
> > So, we got a full log in bugzilla, and this looks like the on-demand
> > binding leading to the unbalanced refcount.
> > 
> > Could you try the patch below?  This essentially reverts to the 4.4
> > state, so a "regression" should be covered, at least.  It's applied
> > on
> > top of sound.git tree for-linus branch.  (But it should be easily
> > applicable to older trees, too.)
> > 
> > We need a proper fix later, but now more urgently the CI test
> > regression has to be papered over.
> 
> Looking more into this, it seems the root cause for this failure is
> that request_module returned pre-maturely (and with 0 return value)
> when i915 probing was still not finished. I tracked this issue down to
> the following kmod bug, fixed meanwhile upstream:
> http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4
> 
> Updating to the latest kmod fixed the issue. Things will of course work
> only in the general case where both the i915 and the sound driver are
> built as modules, to support other configurations we'd need to move the
> audio probing to happen at component bind time. 

Ah, good to know that the culprit is not only us! :)

I queued my temporary fix in anyway since the on-demand binding isn't
used any longer for 4.5/4.6.  It'll be re-added once when we take back
the support of eld notifier for old chips (which I'm currently working
on).  And then we'll be heading to the more stable component binding
with your patches to add component stub.


thanks,

Takashi
diff mbox

Patch

diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index 56d3575ee6cc..7ae614d27954 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -2258,9 +2258,15 @@  static int patch_generic_hdmi(struct hda_codec *codec)
 	/* Try to bind with i915 for Intel HSW+ codecs (if not done yet) */
 	if ((codec->core.vendor_id >> 16) == 0x8086 &&
 	    is_haswell_plus(codec)) {
+#if 0
+		/* on-demand binding leads to an unbalanced refcount when
+		 * both i915 and hda drivers are probed concurrently;
+		 * disabled temporarily for now
+		 */
 		if (!codec->bus->core.audio_component)
 			if (!snd_hdac_i915_init(&codec->bus->core))
 				spec->i915_bound = true;
+#endif
 		/* use i915 audio component notifier for hotplug */
 		if (codec->bus->core.audio_component)
 			spec->use_acomp_notifier = true;