[2/2,RFC] ALSA: hda: hdmi - preserve non-MST PCM routing for Intel platforms
diff mbox series

Message ID 20191129143756.23941-2-kai.vehmanen@linux.intel.com
State New
Headers show
Series
  • [1/2] ALSA: hda: hdmi - fix kernel oops caused by invalid PCM idx
Related show

Commit Message

Kai Vehmanen Nov. 29, 2019, 2:37 p.m. UTC
Commit 5398e94fb753 ("ALSA: hda - Add DP-MST support for NVIDIA codecs")
introduced a slight change of behaviour how non-MST monitors are
assigned to PCMs on Intel platforms.

In the drm_audio_component.h interface, the third parameter
to pin_eld_notify() is pipe number. On Intel platforms, this value
is -1 for MST. On other platforms, a non-zero pipe id is used to
signal MST use.

This difference leads to some subtle differences in hdmi_find_pcm_slot()
with regards to how non-MST monitors are assigned to PCMs.
This patch restores the original behaviour on Intel platforms while
keeping the new allocation policy on other platforms.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
---
 sound/pci/hda/patch_hdmi.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Takashi Iwai Nov. 29, 2019, 2:44 p.m. UTC | #1
On Fri, 29 Nov 2019 15:37:56 +0100,
Kai Vehmanen wrote:
> 
> Commit 5398e94fb753 ("ALSA: hda - Add DP-MST support for NVIDIA codecs")
> introduced a slight change of behaviour how non-MST monitors are
> assigned to PCMs on Intel platforms.
> 
> In the drm_audio_component.h interface, the third parameter
> to pin_eld_notify() is pipe number. On Intel platforms, this value
> is -1 for MST. On other platforms, a non-zero pipe id is used to
> signal MST use.
> 
> This difference leads to some subtle differences in hdmi_find_pcm_slot()
> with regards to how non-MST monitors are assigned to PCMs.
> This patch restores the original behaviour on Intel platforms while
> keeping the new allocation policy on other platforms.
> 
> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>

I believe this is the right fix.  I thought of a similar possibility,
but didn't take it seriously whether it really matters.

So applied it now for the next 5.5-rc1 pull request.


thanks,

Takashi
Kai Vehmanen Nov. 29, 2019, 2:47 p.m. UTC | #2
Hi Takashi, Nikil and others,

On Fri, 29 Nov 2019, Kai Vehmanen wrote:
> This difference leads to some subtle differences in hdmi_find_pcm_slot()
> with regards to how non-MST monitors are assigned to PCMs.
> This patch restores the original behaviour on Intel platforms while
> keeping the new allocation policy on other platforms.

hmm, there seems a couple of more issues. The first patch is a clear bug 
that leads to segfault with SOF+patch_hdmi on some platforms (pipe B used
for single monitor HDMI case -> dev_id=1 -> non-existant pcm selected
and eventual kernel oops).

This second patch is however trickier. Nikhil your patch changed the 
default allocation a bit, so the routing might be difference also with 
snd-hda-intel (i.e. not SOF) for existing platforms and this may surprise 
users.

Digging deeper, we seem to have a slight semantics difference in how
intel_pin_eld_notify() and generic_acomp_pin_eld_notify() handle
the third pipe/dev_id parameter.

Any thoughts how to solve? I first I thought making separate functions
for hdmi_find_pcm_slot() and allow platforms to define an alternative
implementation, but in this RFC patch I opted for a simpler quirk in the 
function. This is becoming fairly messy I must say -- the amount of 
code commentary needed is a good indication some simplifaction would
be in order.

PS I did not have time to fully test the RFC patch, so this is just
   for discussion now...

Br, Kai
Takashi Iwai Nov. 29, 2019, 3:08 p.m. UTC | #3
On Fri, 29 Nov 2019 15:47:11 +0100,
Kai Vehmanen wrote:
> 
> Hi Takashi, Nikil and others,
> 
> On Fri, 29 Nov 2019, Kai Vehmanen wrote:
> > This difference leads to some subtle differences in hdmi_find_pcm_slot()
> > with regards to how non-MST monitors are assigned to PCMs.
> > This patch restores the original behaviour on Intel platforms while
> > keeping the new allocation policy on other platforms.
> 
> hmm, there seems a couple of more issues. The first patch is a clear bug 
> that leads to segfault with SOF+patch_hdmi on some platforms (pipe B used
> for single monitor HDMI case -> dev_id=1 -> non-existant pcm selected
> and eventual kernel oops).
> 
> This second patch is however trickier. Nikhil your patch changed the 
> default allocation a bit, so the routing might be difference also with 
> snd-hda-intel (i.e. not SOF) for existing platforms and this may surprise 
> users.

Well, but the allocation itself is dynamic for DP-MST, even on Intel,
so user can't expect the completely persistent index assignment.
That's the reason I took Nikhil's patch (even I suggested to simplify
in that way).

We had a trick to assign the primary index.  It still works, right?
That should be the only concern.

> Digging deeper, we seem to have a slight semantics difference in how
> intel_pin_eld_notify() and generic_acomp_pin_eld_notify() handle
> the third pipe/dev_id parameter.

This is a platform-specific part, and on Intel, the assumption has
been that pipe is equivalent with dev_id.  If this changed, of course,
we must reconsider the whole picture.

For generic_acomp_pin_eld_notify(), it's gfx-driver specific, too.
And currently dev_id = -1 in AMDGPU, so we don't think too much about
the behavior compatibility.

> Any thoughts how to solve? I first I thought making separate functions
> for hdmi_find_pcm_slot() and allow platforms to define an alternative
> implementation, but in this RFC patch I opted for a simpler quirk in the 
> function. This is becoming fairly messy I must say -- the amount of 
> code commentary needed is a good indication some simplifaction would
> be in order.

Yeah, that's a bit messy.  The only expectation is the primary slot
assignment -- i.e. the case only one monitor is connected.  As long as
this behavior is kept, I don't think any big problem with the dynamic
assignment.

> PS I did not have time to fully test the RFC patch, so this is just
>    for discussion now...

Since the assignment should work with your patch somehow, I already
applied it.  Let's do fine tune-up during 5.5 rc cycles, if any.


thanks,

Takashi
Kai Vehmanen Nov. 29, 2019, 4:36 p.m. UTC | #4
Hey,

On Fri, 29 Nov 2019, Takashi Iwai wrote:

> On Fri, 29 Nov 2019 15:47:11 +0100, Kai Vehmanen wrote:
> > This second patch is however trickier. Nikhil your patch changed the 
> > default allocation a bit, so the routing might be difference also with 
> > snd-hda-intel (i.e. not SOF) for existing platforms and this may surprise 
> > users.
> 
> Well, but the allocation itself is dynamic for DP-MST, even on Intel,
> so user can't expect the completely persistent index assignment.
> That's the reason I took Nikhil's patch (even I suggested to simplify
> in that way).
[...]
> We had a trick to assign the primary index.  It still works, right?
> That should be the only concern.
[...]
> This is a platform-specific part, and on Intel, the assumption has
> been that pipe is equivalent with dev_id.  If this changed, of course,
> we must reconsider the whole picture.

hmm, the pipe equivalency should actually still hold. Looking at the code 
more, this could also be a lurking bug on graphics driver that had 
new side-effects with the recent ALSA side changes. E.g. I've received
logs where dev_id=1 is for a single connected HDMI monitor. I need to
investigate more whether this is an expected behaviour or a bug. :)

>> PS I did not have time to fully test the RFC patch, so this is just
>>    for discussion now...
> Since the assignment should work with your patch somehow, I already
> applied it.  Let's do fine tune-up during 5.5 rc cycles, if any.

Ack, ok. My commit message is a bit confusing (the wording about MST is 
not correct) but the actual code restores original behaviour so this 
should be good to apply. I'll continue testing and also dig a bit deeper 
into the bugreports w.r.t. what happens in the problematic non-MST cases. 
Thanks for the quick reviews!

Br, Kai
Nikhil Mahale Dec. 2, 2019, 5:07 a.m. UTC | #5
Thanks Kai, see inline -

On 11/29/19 8:07 PM, Kai Vehmanen wrote:
> Commit 5398e94fb753 ("ALSA: hda - Add DP-MST support for NVIDIA codecs")
> introduced a slight change of behaviour how non-MST monitors are
> assigned to PCMs on Intel platforms.
> 
> In the drm_audio_component.h interface, the third parameter
> to pin_eld_notify() is pipe number. On Intel platforms, this value
> is -1 for MST. On other platforms, a non-zero pipe id is used to
> signal MST use.

Do you mean "on Intel platforms, this value is -1 for non-MST"?

I am looking into functions
intel_audio_codec_enable/intel_audio_codec_disable, they sets
pipe = -1 for non-MST cases, right?

> This difference leads to some subtle differences in hdmi_find_pcm_slot()
> with regards to how non-MST monitors are assigned to PCMs.
> This patch restores the original behaviour on Intel platforms while
> keeping the new allocation policy on other platforms.

What exact change commit 5398e94fb753 ("ALSA: hda - Add DP-MST support
for NVIDIA codecs") made in behaviour on Intel platform? Sorry, it is not
clear to me from your reply on this thread.

For non-MST monitors, pipe = -1 is getting passed
to intel_pin_eld_notify().
check_presence_and_report -> pin_id_to_pin_index changes value of dev_id
from -1 to 0, comment there says "(dev_id == -1) means it is NON-MST pin
return the first virtual pin on this port". If this is the case, non-MST
monitor should get PCM of index 'pin_nid_idx' like it was happening
before commit 5398e94fb753. Isn't it?

Thanks,
Nikhil Mahale

> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> ---
>  sound/pci/hda/patch_hdmi.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
> index c3940c197122..1dd4c92254a4 100644
> --- a/sound/pci/hda/patch_hdmi.c
> +++ b/sound/pci/hda/patch_hdmi.c
> @@ -1353,6 +1353,11 @@ static int hdmi_find_pcm_slot(struct hdmi_spec *spec,
>  		i = spec->num_nids + (per_pin->dev_id - 1);
>  		if (i < spec->pcm_used && !(test_bit(i, &spec->pcm_bitmap)))
>  			return i;
> +
> +		/* keep legacy assignment for dev_id>0 on Intel platforms */
> +		if (spec->intel_hsw_fixup)
> +			if (!test_bit(per_pin->pin_nid_idx, &spec->pcm_bitmap))
> +				return per_pin->pin_nid_idx;
>  	}
>  
>  	/* have a second try; check the area over num_nids */
>
Kai Vehmanen Dec. 2, 2019, 11:18 a.m. UTC | #6
Hey,

On Mon, 2 Dec 2019, Nikhil Mahale wrote:

> On 11/29/19 8:07 PM, Kai Vehmanen wrote:
> > In the drm_audio_component.h interface, the third parameter
> > to pin_eld_notify() is pipe number. On Intel platforms, this value
> > is -1 for MST. On other platforms, a non-zero pipe id is used to
> 
> Do you mean "on Intel platforms, this value is -1 for non-MST"?
[...]
> I am looking into functions
> intel_audio_codec_enable/intel_audio_codec_disable, they sets
> pipe = -1 for non-MST cases, right?

yup, -1 for non-MST, that was just plain wrong in my Friday mail. The 
problem seems to be pipe id is positive in some non-MST cases (like 
https://github.com/thesofproject/linux/issues/1536 ), which is very 
suprising looking at e.g. intel_audio_codec_enable(), but that seems to be 
happening nevertheless.

This would seem like a lurking bug on the i915 graphics driver side. Your 
patch changed the behaviour in these cases on ALSA side (PCM was selected 
based on per_pin->pin_nid_idx, ignoring dev_id), but one can argue whether 
this is worth preserving (or just drop/revert the RFC patch). I'll try to 
get a bit more info on the rootcause and how common case this is.

Br,
Kai
Kai Vehmanen Dec. 3, 2019, 1:48 p.m. UTC | #7
Hi Takashi and Nikhil,

On Mon, 2 Dec 2019, Kai Vehmanen wrote:
> yup, -1 for non-MST, that was just plain wrong in my Friday mail. The 
> problem seems to be pipe id is positive in some non-MST cases (like 
> https://github.com/thesofproject/linux/issues/1536 ), which is very 
> suprising looking at e.g. intel_audio_codec_enable(), but that seems to be 
> happening nevertheless.

ok, got graphics driver log from this case now, and this turns out to be a 
system with an unusual converter setup for HDMI output. I.e. from user 
point of view it's HDMI, but for graphics driver it looks like a DP 
connection (with MST enabled which is not so common but apparently is the 
case). So i915 driver is working as expected and in patch_hdmi we should 
handle it as a normal MST case.

Takashi, considering the commit message is wrong, I think it's better
to revert the "ALSA: hda: hdmi - preserve non-MST PCM routing for Intel 
platforms" patch. Should I send a revert?

It also doesn't fully preseve the old routing as in the case of a single 
DP-MST monitor (with dev_id=1), hdmi_find_pcm_slot() will not pick the 
preferred PCM "per_pin->pin_nid_idx", but will choose "spec->num_nids + 
(per_pin->dev_id - 1)". So seems better to just drop it.

Br, Kai
Takashi Iwai Dec. 3, 2019, 2:05 p.m. UTC | #8
On Tue, 03 Dec 2019 14:48:10 +0100,
Kai Vehmanen wrote:
> 
> Hi Takashi and Nikhil,
> 
> On Mon, 2 Dec 2019, Kai Vehmanen wrote:
> > yup, -1 for non-MST, that was just plain wrong in my Friday mail. The 
> > problem seems to be pipe id is positive in some non-MST cases (like 
> > https://github.com/thesofproject/linux/issues/1536 ), which is very 
> > suprising looking at e.g. intel_audio_codec_enable(), but that seems to be 
> > happening nevertheless.
> 
> ok, got graphics driver log from this case now, and this turns out to be a 
> system with an unusual converter setup for HDMI output. I.e. from user 
> point of view it's HDMI, but for graphics driver it looks like a DP 
> connection (with MST enabled which is not so common but apparently is the 
> case). So i915 driver is working as expected and in patch_hdmi we should 
> handle it as a normal MST case.
> 
> Takashi, considering the commit message is wrong, I think it's better
> to revert the "ALSA: hda: hdmi - preserve non-MST PCM routing for Intel 
> platforms" patch. Should I send a revert?
> 
> It also doesn't fully preseve the old routing as in the case of a single 
> DP-MST monitor (with dev_id=1), hdmi_find_pcm_slot() will not pick the 
> preferred PCM "per_pin->pin_nid_idx", but will choose "spec->num_nids + 
> (per_pin->dev_id - 1)". So seems better to just drop it.

Well, if we want to keep the old behavior for compatibility just to be
sure, how about a patch like below?


Takashi

--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1348,21 +1348,18 @@ static int hdmi_find_pcm_slot(struct hdmi_spec *spec,
 	 * with the legacy static per_pin-pcm assignment that existed in the
 	 * days before DP-MST.
 	 *
+	 * Intel DP-MST prefers this legacy behavior for compatibility, too.
+	 *
 	 * per_pin of m!=0 prefers to get pcm=(num_nids + (m - 1)).
 	 */
 
-	if (per_pin->dev_id == 0) {
+	if (per_pin->dev_id == 0 || spec->intel_hsw_fixup) {
 		if (!test_bit(per_pin->pin_nid_idx, &spec->pcm_bitmap))
 			return per_pin->pin_nid_idx;
 	} else {
 		i = spec->num_nids + (per_pin->dev_id - 1);
 		if (i < spec->pcm_used && !(test_bit(i, &spec->pcm_bitmap)))
 			return i;
-
-		/* keep legacy assignment for dev_id>0 on Intel platforms */
-		if (spec->intel_hsw_fixup)
-			if (!test_bit(per_pin->pin_nid_idx, &spec->pcm_bitmap))
-				return per_pin->pin_nid_idx;
 	}
 
 	/* have a second try; check the area over num_nids */
Kai Vehmanen Dec. 3, 2019, 2:35 p.m. UTC | #9
Hey,

On Tue, 3 Dec 2019, Takashi Iwai wrote:

> Well, if we want to keep the old behavior for compatibility just to be
> sure, how about a patch like below?
[...]
> -	if (per_pin->dev_id == 0) {
> +	if (per_pin->dev_id == 0 || spec->intel_hsw_fixup) {

that would work. spec->intel_hsw_fixup starts to be a bit overloaded, but 
better than branching off the whole pcm selection, so ok for me.

Br, Kai

Patch
diff mbox series

diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index c3940c197122..1dd4c92254a4 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1353,6 +1353,11 @@  static int hdmi_find_pcm_slot(struct hdmi_spec *spec,
 		i = spec->num_nids + (per_pin->dev_id - 1);
 		if (i < spec->pcm_used && !(test_bit(i, &spec->pcm_bitmap)))
 			return i;
+
+		/* keep legacy assignment for dev_id>0 on Intel platforms */
+		if (spec->intel_hsw_fixup)
+			if (!test_bit(per_pin->pin_nid_idx, &spec->pcm_bitmap))
+				return per_pin->pin_nid_idx;
 	}
 
 	/* have a second try; check the area over num_nids */