ASoC: soc-core: Fix null pointer dereference in soc_find_component
diff mbox series

Message ID 1547194442-1487-1-git-send-email-rohitkr@codeaurora.org
State New
Headers show
Series
  • ASoC: soc-core: Fix null pointer dereference in soc_find_component
Related show

Commit Message

Rohit Kumar Jan. 11, 2019, 8:14 a.m. UTC
From: Ajit Pandey <ajitp@codeaurora.org>

soc_find_component() may lead to null pointer exception if both
arguments i.e of_node and name is NULL. Add NULL check before
calling soc_find_component(). Also fix some typos.

Fixes: 8780cf1142a5 ("ASoC: soc-core: defer card probe until all component is added to list")
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Ajit Pandey <ajitp@codeaurora.org>
Signed-off-by: Rohit kumar <rohitkr@codeaurora.org>
---
 sound/soc/soc-core.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Pierre-Louis Bossart Jan. 11, 2019, 3:44 p.m. UTC | #1
On 1/11/19 2:14 AM, Rohit kumar wrote:
> From: Ajit Pandey <ajitp@codeaurora.org>
>
> soc_find_component() may lead to null pointer exception if both
> arguments i.e of_node and name is NULL. Add NULL check before
> calling soc_find_component(). Also fix some typos.

Thanks for the overnight fix. This update fixes the issue on my Skylake 
XPS13 test device (blind testing since I don't understand what the code 
does).

Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

>
> Fixes: 8780cf1142a5 ("ASoC: soc-core: defer card probe until all component is added to list")
> Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> Signed-off-by: Ajit Pandey <ajitp@codeaurora.org>
> Signed-off-by: Rohit kumar <rohitkr@codeaurora.org>
> ---
>   sound/soc/soc-core.c | 15 +++++++++------
>   1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> index 0934b36..df05fb8 100644
> --- a/sound/soc/soc-core.c
> +++ b/sound/soc/soc-core.c
> @@ -1131,11 +1131,13 @@ static int soc_init_dai_link(struct snd_soc_card *card,
>   	}
>   
>   	/*
> -	 * Defer card registartion if platform dai component is not added to
> +	 * Defer card registration if platform dai component is not added to
>   	 * component list.
>   	 */
> -	if (!soc_find_component(link->platform->of_node, link->platform->name))
> -		return -EPROBE_DEFER;
> +	if (link->platform->of_node || link->platform->name)
> +		if (!soc_find_component(link->platform->of_node,
> +					link->platform->name))
> +			return -EPROBE_DEFER;
>   
>   	/*
>   	 * CPU device may be specified by either name or OF node, but
> @@ -1150,11 +1152,12 @@ static int soc_init_dai_link(struct snd_soc_card *card,
>   	}
>   
>   	/*
> -	 * Defer card registartion if cpu dai component is not added to
> +	 * Defer card registration if cpu dai component is not added to
>   	 * component list.
>   	 */
> -	if (!soc_find_component(link->cpu_of_node, link->cpu_name))
> -		return -EPROBE_DEFER;
> +	if (link->cpu_of_node || link->cpu_name)
> +		if (!soc_find_component(link->cpu_of_node, link->cpu_name))
> +			return -EPROBE_DEFER;
>   
>   	/*
>   	 * At least one of CPU DAI name or CPU device name/node must be
Pierre-Louis Bossart Jan. 11, 2019, 9:49 p.m. UTC | #2
> Thanks for the overnight fix. This update fixes the issue on my 
> Skylake XPS13 test device (blind testing since I don't understand what 
> the code does).
>
> Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

I need to take this back, this set of changes (initial+fix) causes an 
error with our HDMI support

[   17.437684] sof-audio sof-audio: created machine bxt-pcm512x
[   17.585279] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1
[   17.585639] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517

Removing your changes restores the functionality

Adding some traces I can see that the the platform name we use doesn't 
seem compatible with your logic. All the Intel boards used a constant 
platform name matching the PCI ID, see e.g. [1], which IIRC is used to 
bind components. Liam, do you recall in more details if this is really 
required?

[1] 
https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/boards/bxt_da7219_max98357a.c#L475

[   18.205812] plb: platform name sof-audio
[   18.206059] plb: cpu_name (null)
[   18.206234] plb: platform name 0000:00:0e.0
[   18.206459] plb: returning -EPROBE_DEFER 1
[   18.206686] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1
[   18.207054] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index cbafbdd02483..ae731212f82b 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1133,11 +1133,15 @@ static int soc_init_dai_link(struct snd_soc_card 
*card,
          * Defer card registration if platform dai component is not 
added to
          * component list.
          */
+       pr_err("plb: platform name %s\n", link->platform->name);
         if (link->platform->of_node || link->platform->name)
                 if (!soc_find_component(link->platform->of_node,
link->platform->name))
-                       return -EPROBE_DEFER;
+                 {
+                         pr_err("plb: returning -EPROBE_DEFER 1\n");
+                         return -EPROBE_DEFER;

+                 }
         /*
          * CPU device may be specified by either name or OF node, but
          * can be left unspecified, and will be matched based on DAI
@@ -1154,9 +1158,14 @@ static int soc_init_dai_link(struct snd_soc_card 
*card,
          * Defer card registration if cpu dai component is not added to
          * component list.
          */
+       pr_err("plb: cpu_name %s\n", link->cpu_name);
         if (link->cpu_of_node || link->cpu_name)
                 if (!soc_find_component(link->cpu_of_node, link->cpu_name))
-                       return -EPROBE_DEFER;
+                 {
+                         pr_err("plb: returning -EPROBE_DEFER 2\n");
+                         return -EPROBE_DEFER;
+
+                 }

         /*
          * At least one of CPU DAI name or CPU device name/node must be
Rohit Kumar Jan. 12, 2019, 6:07 a.m. UTC | #3
On 1/12/2019 3:19 AM, Pierre-Louis Bossart wrote:
>
>> Thanks for the overnight fix. This update fixes the issue on my 
>> Skylake XPS13 test device (blind testing since I don't understand 
>> what the code does).
>>
>> Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>
> I need to take this back, this set of changes (initial+fix) causes an 
> error with our HDMI support
>
> [   17.437684] sof-audio sof-audio: created machine bxt-pcm512x
> [   17.585279] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1
> [   17.585639] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517
>
> Removing your changes restores the functionality
>
Looks like we should revert generic implementation for defering probe 
and move to call from machine driver as done in v1.
https://lore.kernel.org/patchwork/patch/1027560/
https://lore.kernel.org/patchwork/patch/1027561/

@Mark, Do you have suggestion to refine current patch?
> Adding some traces I can see that the the platform name we use doesn't 
> seem compatible with your logic. All the Intel boards used a constant 
> platform name matching the PCI ID, see e.g. [1], which IIRC is used to 
> bind components. Liam, do you recall in more details if this is really 
> required?
>
> [1] 
> https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/boards/bxt_da7219_max98357a.c#L475
I think if it does not set platform name as 0000:00:0e.0, it should take 
snd-soc-dummy as platform name
and that might fix the issue.
snd-soc-dummy does have component associated with it.
https://elixir.bootlin.com/linux/latest/source/sound/soc/soc-utils.c#L281
>
> [   18.205812] plb: platform name sof-audio
> [   18.206059] plb: cpu_name (null)
> [   18.206234] plb: platform name 0000:00:0e.0
> [   18.206459] plb: returning -EPROBE_DEFER 1
> [   18.206686] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1
> [   18.207054] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517
>
> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> index cbafbdd02483..ae731212f82b 100644
> --- a/sound/soc/soc-core.c
> +++ b/sound/soc/soc-core.c
> @@ -1133,11 +1133,15 @@ static int soc_init_dai_link(struct 
> snd_soc_card *card,
>          * Defer card registration if platform dai component is not 
> added to
>          * component list.
>          */
> +       pr_err("plb: platform name %s\n", link->platform->name);
>         if (link->platform->of_node || link->platform->name)
>                 if (!soc_find_component(link->platform->of_node,
> link->platform->name))
> -                       return -EPROBE_DEFER;
> +                 {
> +                         pr_err("plb: returning -EPROBE_DEFER 1\n");
> +                         return -EPROBE_DEFER;
>
> +                 }
>         /*
>          * CPU device may be specified by either name or OF node, but
>          * can be left unspecified, and will be matched based on DAI
> @@ -1154,9 +1158,14 @@ static int soc_init_dai_link(struct 
> snd_soc_card *card,
>          * Defer card registration if cpu dai component is not added to
>          * component list.
>          */
> +       pr_err("plb: cpu_name %s\n", link->cpu_name);
>         if (link->cpu_of_node || link->cpu_name)
>                 if (!soc_find_component(link->cpu_of_node, 
> link->cpu_name))
> -                       return -EPROBE_DEFER;
> +                 {
> +                         pr_err("plb: returning -EPROBE_DEFER 2\n");
> +                         return -EPROBE_DEFER;
> +
> +                 }
>
>         /*
>          * At least one of CPU DAI name or CPU device name/node must be
>

Thanks,
Rohit
Liam Girdwood Jan. 14, 2019, 3:40 p.m. UTC | #4
On Sat, 2019-01-12 at 11:37 +0530, Rohit kumar wrote:
> On 1/12/2019 3:19 AM, Pierre-Louis Bossart wrote:
> > > Thanks for the overnight fix. This update fixes the issue on my 
> > > Skylake XPS13 test device (blind testing since I don't understand 
> > > what the code does).
> > > 
> > > Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > 
> > I need to take this back, this set of changes (initial+fix) causes an 
> > error with our HDMI support
> > 
> > [   17.437684] sof-audio sof-audio: created machine bxt-pcm512x
> > [   17.585279] bxt-pcm512x bxt-pcm512x: ASoC: failed to init link iDisp1
> > [   17.585639] bxt-pcm512x bxt-pcm512x: snd_soc_register_card failed -517
> > 
> > Removing your changes restores the functionality
> > 
> Looks like we should revert generic implementation for defering probe 
> and move to call from machine driver as done in v1.
> https://lore.kernel.org/patchwork/patch/1027560/
> https://lore.kernel.org/patchwork/patch/1027561/
> 
> @Mark, Do you have suggestion to refine current patch?
> > Adding some traces I can see that the the platform name we use doesn't 
> > seem compatible with your logic. All the Intel boards used a constant 
> > platform name matching the PCI ID, see e.g. [1], which IIRC is used to 
> > bind components. Liam, do you recall in more details if this is really 
> > required?

Sorry, I cant quite remember why the PCI ID was used for the platform name, I
think it started with the SKL generation as previous generations used "baytrail-
pcm" and "haswell-pcm" as platform names IIRC. Perhaps Vinod will know.

The platform name is only used by SOF when over writing the "legacy" platform
name e.g. "baytrail-pcm" would become "sof-audio" and this is only used for
binding DAI links (so that all legacy machine drivers can be reused without
modification). 

Liam
Mark Brown Jan. 14, 2019, 11:26 p.m. UTC | #5
On Fri, Jan 11, 2019 at 01:44:02PM +0530, Rohit kumar wrote:

> -	if (!soc_find_component(link->platform->of_node, link->platform->name))
> -		return -EPROBE_DEFER;
> +	if (link->platform->of_node || link->platform->name)
> +		if (!soc_find_component(link->platform->of_node,
> +					link->platform->name))
> +			return -EPROBE_DEFER;

If we need to do this for every user (which we do pretty much it seems)
we should just be doing it inside find_component().
Mark Brown Jan. 15, 2019, 12:06 a.m. UTC | #6
On Fri, Jan 11, 2019 at 03:49:08PM -0600, Pierre-Louis Bossart wrote:

> Adding some traces I can see that the the platform name we use doesn't seem
> compatible with your logic. All the Intel boards used a constant platform
> name matching the PCI ID, see e.g. [1], which IIRC is used to bind
> components. Liam, do you recall in more details if this is really required?

That's telling me that either snd_soc_find_components() isn't finding
components in the same way that we do when we bind things which isn't
good or we're binding links without having fully matched everything on
the link which also isn't good.  

Without a system that shows the issue I can't 100% confirm but I think
it's the former - I'm fairly sure that those machines are relying on the
component name being initialized to fmt_single_name() in
snd_soc_component_initialize().  That is supposed to wind up using
dev_name() (which would be the PCI address for a PCI device) as the
basis of the name.  What I can't currently see is how exactly that gets
bound (or how any of the other links avoid trouble for that matter).  We
could revert and push this into cards but I would rather be confident
that we understand what's going on, I'm not comfortable that we aren't
just pushing the breakage around rather than fixing it.  Can someone
with an x86 system take a look and confirm exactly what's going on with
binding these cards please?
Pierre-Louis Bossart Jan. 15, 2019, 3:08 a.m. UTC | #7
On 1/14/19 6:06 PM, Mark Brown wrote:
> On Fri, Jan 11, 2019 at 03:49:08PM -0600, Pierre-Louis Bossart wrote:
>
>> Adding some traces I can see that the the platform name we use doesn't seem
>> compatible with your logic. All the Intel boards used a constant platform
>> name matching the PCI ID, see e.g. [1], which IIRC is used to bind
>> components. Liam, do you recall in more details if this is really required?
> That's telling me that either snd_soc_find_components() isn't finding
> components in the same way that we do when we bind things which isn't
> good or we're binding links without having fully matched everything on
> the link which also isn't good.
>
> Without a system that shows the issue I can't 100% confirm but I think
> it's the former - I'm fairly sure that those machines are relying on the
> component name being initialized to fmt_single_name() in
> snd_soc_component_initialize().  That is supposed to wind up using
> dev_name() (which would be the PCI address for a PCI device) as the
> basis of the name.  What I can't currently see is how exactly that gets
> bound (or how any of the other links avoid trouble for that matter).  We
> could revert and push this into cards but I would rather be confident
> that we understand what's going on, I'm not comfortable that we aren't
> just pushing the breakage around rather than fixing it.  Can someone
> with an x86 system take a look and confirm exactly what's going on with
> binding these cards please?

I am actually not sure at all why we need the platform_name to be set in 
Intel machine drivers.

I ran a 5mn experiment with SOF and we completely ignore what is set by 
machine drivers, it's overridden by the component name:

             dev_info(card->dev, "info: override FE DAI link %s\n",
                  card->dai_link[i].name);

             /* override platform component */
             if (snd_soc_init_platform(card, dai_link) < 0) {
                 dev_err(card->dev, "init platform error");
                 continue;
             }
             pr_err("plb: platform_name was %s, now %s\n",
                    dai_link->platform->name,
                    component->name);

             dai_link->platform->name = component->name;

existing machine driver (info is incorrect btw, should be BE DAI link)

[   36.628466] bxt-pcm512x bxt-pcm512x: info: override FE DAI link 
SSP5-Codec
[   36.628469] plb: platform_name was sof-audio, now sof-audio
[   36.628772] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp1
[   36.628773] plb: platform_name was 0000:00:0e.0, now sof-audio
[   36.629111] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp2
[   36.629112] plb: platform_name was 0000:00:0e.0, now sof-audio
[   36.629422] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp3
[   36.629423] plb: platform_name was 0000:00:0e.0, now sof-audio

machine driver with all platform_name commented out, no regression at all.

[   15.839652] sof-audio sof-audio: created machine bxt-pcm512x
[   15.846990] bxt-pcm512x bxt-pcm512x: info: override FE DAI link 
SSP5-Codec
[   15.846995] plb: platform_name was snd-soc-dummy, now sof-audio
[   15.847325] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp1
[   15.847326] plb: platform_name was snd-soc-dummy, now sof-audio
[   15.847657] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp2
[   15.847658] plb: platform_name was snd-soc-dummy, now sof-audio
[   15.847974] bxt-pcm512x bxt-pcm512x: info: override FE DAI link iDisp3
[   15.847974] plb: platform_name was snd-soc-dummy, now sof-audio

I checked for a bit longer with the Skylake driver, I also couldn't see 
a difference with or without the platform_name set.

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 0462b3ec977a..0fdf99ec17cd 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -918,7 +918,7 @@ static int soc_bind_dai_link(struct snd_soc_card *card,
                 if (!snd_soc_is_matching_component(dai_link->platform,
                                                    component))
                         continue;
-
+               pr_err("plb: binding with component_name %s \n", 
component->name);
                 snd_soc_rtdcom_add(rtd, component);
         }

@@ -1041,6 +1041,8 @@ static int snd_soc_init_platform(struct 
snd_soc_card *card,
                 if (!platform)
                         return -ENOMEM;

+               pr_err("plb: %s: plaform->name set to %s\n", __func__,
+                      dai_link->platform_name);
                 dai_link->platform      = platform;
                 platform->name          = dai_link->platform_name;
                 platform->of_node       = dai_link->platform_of_node;

[    1.345143] plb: snd_soc_init_platform: plaform->name set to 0000:00:1f.3
[    1.345148] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL Audio
[    1.345151] plb: binding with component_name 0000:00:1f.3
[    1.345153] plb: snd_soc_init_platform: plaform->name set to 0000:00:1f.3
[    1.345154] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI1
[    1.345155] plb: binding with component_name 0000:00:1f.3
[    1.345157] plb: snd_soc_init_platform: plaform->name set to 0000:00:1f.3
[    1.345158] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI2
[    1.345159] plb: binding with component_name 0000:00:1f.3
[    1.345161] plb: snd_soc_init_platform: plaform->name set to 0000:00:1f.3
[    1.345162] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI3
[    1.345163] plb: binding with component_name 0000:00:1f.3

[    1.349607] plb: snd_soc_init_platform: plaform->name set to (null)
[    1.349613] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL Audio
[    1.349617] plb: binding with component_name snd-soc-dummy
[    1.349619] plb: binding with component_name snd-soc-dummy
[    1.349621] plb: snd_soc_init_platform: plaform->name set to (null)
[    1.349623] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI1
[    1.349625] plb: binding with component_name snd-soc-dummy
[    1.349626] plb: binding with component_name snd-soc-dummy
[    1.349628] plb: snd_soc_init_platform: plaform->name set to (null)
[    1.349631] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI2
[    1.349632] plb: binding with component_name snd-soc-dummy
[    1.349633] plb: binding with component_name snd-soc-dummy
[    1.349635] plb: snd_soc_init_platform: plaform->name set to (null)
[    1.349638] skl_hda_dsp_generic skl_hda_dsp_generic: ASoC: binding 
CNL HDMI3
[    1.349639] plb: binding with component_name snd-soc-dummy
[    1.349641] plb: binding with component_name snd-soc-dummy

Audio playback works in both cases.

The funny thing is that the list of components is right in 
/sys/kernel/debug/asoc.

My guess is that the required PCI component is already added when the 
platform_name is used in soc_bind_dai_link() and snd_soc_rtdcom_add() 
does nothing for the back-end, so the dailink platform_name has actually 
zero influence on the outcome.

Another way to look at it is that we already provide the 
dai_link->cpu_dai_name which is enough for soc_bind_dai_link() to find 
the relevant component and as a result the dailink->platform_name seems 
redundant/unnecessary. I am using the conditional here since this part 
of the code (Intel driver included) seems to work by accident rather 
than by design, and we'd need a set of additional tests before removing 
these hard-coded initializations.

Does this help?
Pierre-Louis Bossart Jan. 15, 2019, 7:35 p.m. UTC | #8
On 1/14/19 6:06 PM, Mark Brown wrote:
> On Fri, Jan 11, 2019 at 03:49:08PM -0600, Pierre-Louis Bossart wrote:
>
>> Adding some traces I can see that the the platform name we use doesn't seem
>> compatible with your logic. All the Intel boards used a constant platform
>> name matching the PCI ID, see e.g. [1], which IIRC is used to bind
>> components. Liam, do you recall in more details if this is really required?
> That's telling me that either snd_soc_find_components() isn't finding
> components in the same way that we do when we bind things which isn't
> good or we're binding links without having fully matched everything on
> the link which also isn't good.
>
> Without a system that shows the issue I can't 100% confirm but I think
> it's the former - I'm fairly sure that those machines are relying on the
> component name being initialized to fmt_single_name() in
> snd_soc_component_initialize().  That is supposed to wind up using
> dev_name() (which would be the PCI address for a PCI device) as the
> basis of the name.  What I can't currently see is how exactly that gets
> bound (or how any of the other links avoid trouble for that matter).  We
> could revert and push this into cards but I would rather be confident
> that we understand what's going on, I'm not comfortable that we aren't
> just pushing the breakage around rather than fixing it.  Can someone
> with an x86 system take a look and confirm exactly what's going on with
> binding these cards please?

Beyond the fact that the platform_name seems to be totally useless, 
additional tests show that the patch ('ASoC: soc-core: defer card probe 
until all component is added to list') adds a new restriction which 
contradicts existing error checks.

None of the Intel machine drivers set the dailink "cpu_name" field but 
use the "cpu_dai_name" field instead. This was perfectly legit as 
documented by the code at the end of soc_init_dai_link()

     /*
      * At least one of CPU DAI name or CPU device name/node must be
      * specified
      */
     if (!link->cpu_dai_name &&
         !(link->cpu_name || link->cpu_of_node)) {
         dev_err(card->dev,
             "ASoC: Neither cpu_dai_name nor cpu_name/of_node are set 
for %s\n",
             link->name);
         return -EINVAL;
     }

The code contributed by Qualcomm only checks for cpu_name, which 
prevents the init from completing.

So if we want to be consistent, the new code should be something like:

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index b680c673c553..2791da9417f8 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1154,7 +1154,7 @@ static int soc_init_dai_link(struct snd_soc_card 
*card,
          * Defer card registartion if cpu dai component is not added to
          * component list.
          */
-       if (!soc_find_component(link->cpu_of_node, link->cpu_name))
+       if (!link->cpu_dai_name && 
!soc_find_component(link->cpu_of_node, link->cpu_name))
                 return -EPROBE_DEFER;

         /*

or try to call soc_find_component with both cpu_name or cpu_dai_name, if 
this makes sense?
Mark Brown Jan. 15, 2019, 9:07 p.m. UTC | #9
On Tue, Jan 15, 2019 at 01:35:07PM -0600, Pierre-Louis Bossart wrote:
> On 1/14/19 6:06 PM, Mark Brown wrote:

> > just pushing the breakage around rather than fixing it.  Can someone
> > with an x86 system take a look and confirm exactly what's going on with
> > binding these cards please?

> Beyond the fact that the platform_name seems to be totally useless,
> additional tests show that the patch ('ASoC: soc-core: defer card probe
> until all component is added to list') adds a new restriction which
> contradicts existing error checks.

Yes...  I'd been coming to the conclusion that it was a huge red herring
here.  

> So if we want to be consistent, the new code should be something like:

> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> index b680c673c553..2791da9417f8 100644
> --- a/sound/soc/soc-core.c
> +++ b/sound/soc/soc-core.c
> @@ -1154,7 +1154,7 @@ static int soc_init_dai_link(struct snd_soc_card
> *card,
>          * Defer card registartion if cpu dai component is not added to
>          * component list.
>          */
> -       if (!soc_find_component(link->cpu_of_node, link->cpu_name))
> +       if (!link->cpu_dai_name && !soc_find_component(link->cpu_of_node,
> link->cpu_name))
>                 return -EPROBE_DEFER;
> 
>         /*

> or try to call soc_find_component with both cpu_name or cpu_dai_name, if
> this makes sense?

I think calling _find_component() makes more sense here as it will do
the check it's actually there thing no matter how the link is
identified.  Assuming that does resolve the issue do you want to make a
patch given that you got there first?
Matthias Reichl Jan. 15, 2019, 9:11 p.m. UTC | #10
On Tue, Jan 15, 2019 at 01:35:07PM -0600, Pierre-Louis Bossart wrote:
> 
> On 1/14/19 6:06 PM, Mark Brown wrote:
> > On Fri, Jan 11, 2019 at 03:49:08PM -0600, Pierre-Louis Bossart wrote:
> > 
> > > Adding some traces I can see that the the platform name we use doesn't seem
> > > compatible with your logic. All the Intel boards used a constant platform
> > > name matching the PCI ID, see e.g. [1], which IIRC is used to bind
> > > components. Liam, do you recall in more details if this is really required?
> > That's telling me that either snd_soc_find_components() isn't finding
> > components in the same way that we do when we bind things which isn't
> > good or we're binding links without having fully matched everything on
> > the link which also isn't good.
> > 
> > Without a system that shows the issue I can't 100% confirm but I think
> > it's the former - I'm fairly sure that those machines are relying on the
> > component name being initialized to fmt_single_name() in
> > snd_soc_component_initialize().  That is supposed to wind up using
> > dev_name() (which would be the PCI address for a PCI device) as the
> > basis of the name.  What I can't currently see is how exactly that gets
> > bound (or how any of the other links avoid trouble for that matter).  We
> > could revert and push this into cards but I would rather be confident
> > that we understand what's going on, I'm not comfortable that we aren't
> > just pushing the breakage around rather than fixing it.  Can someone
> > with an x86 system take a look and confirm exactly what's going on with
> > binding these cards please?
> 
> Beyond the fact that the platform_name seems to be totally useless,
> additional tests show that the patch ('ASoC: soc-core: defer card probe
> until all component is added to list') adds a new restriction which
> contradicts existing error checks.
> 
> None of the Intel machine drivers set the dailink "cpu_name" field but use
> the "cpu_dai_name" field instead. This was perfectly legit as documented by
> the code at the end of soc_init_dai_link()

This should be fixed by the patch
"ASoC: core: Don't defer probe on optional, NULL components" which Mark
already applied to his tree. See
http://mailman.alsa-project.org/pipermail/alsa-devel/2019-January/144323.html

Maybe the defer card probe logic needs to be extended to also check if
dai_link_name had already been registered (either cpu or cpu_dai_name
needs to be set), not 100% sure which problem the defer card probe patch
was trying to solve.

so long,

Hias

> 
>     /*
>      * At least one of CPU DAI name or CPU device name/node must be
>      * specified
>      */
>     if (!link->cpu_dai_name &&
>         !(link->cpu_name || link->cpu_of_node)) {
>         dev_err(card->dev,
>             "ASoC: Neither cpu_dai_name nor cpu_name/of_node are set for
> %s\n",
>             link->name);
>         return -EINVAL;
>     }
> 
> The code contributed by Qualcomm only checks for cpu_name, which prevents
> the init from completing.
> 
> So if we want to be consistent, the new code should be something like:
> 
> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
> index b680c673c553..2791da9417f8 100644
> --- a/sound/soc/soc-core.c
> +++ b/sound/soc/soc-core.c
> @@ -1154,7 +1154,7 @@ static int soc_init_dai_link(struct snd_soc_card
> *card,
>          * Defer card registartion if cpu dai component is not added to
>          * component list.
>          */
> -       if (!soc_find_component(link->cpu_of_node, link->cpu_name))
> +       if (!link->cpu_dai_name && !soc_find_component(link->cpu_of_node,
> link->cpu_name))
>                 return -EPROBE_DEFER;
> 
>         /*
> 
> or try to call soc_find_component with both cpu_name or cpu_dai_name, if
> this makes sense?
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Pierre-Louis Bossart Jan. 15, 2019, 9:16 p.m. UTC | #11
>> Beyond the fact that the platform_name seems to be totally useless,
>> additional tests show that the patch ('ASoC: soc-core: defer card probe
>> until all component is added to list') adds a new restriction which
>> contradicts existing error checks.
>>
>> None of the Intel machine drivers set the dailink "cpu_name" field but use
>> the "cpu_dai_name" field instead. This was perfectly legit as documented by
>> the code at the end of soc_init_dai_link()
> This should be fixed by the patch
> "ASoC: core: Don't defer probe on optional, NULL components" which Mark
> already applied to his tree. See
> http://mailman.alsa-project.org/pipermail/alsa-devel/2019-January/144323.html

Ah yes, I missed this patch while I was debugging. Indeed this fixes the 
problem and my devices work again with Mark's for-next branch. Thanks 
Matthias!

>
> Maybe the defer card probe logic needs to be extended to also check if
> dai_link_name had already been registered (either cpu or cpu_dai_name
> needs to be set), not 100% sure which problem the defer card probe patch
> was trying to solve.
same here, I don't get why the deferred probe stuff only deals with one 
of the two options.
Mark Brown Jan. 15, 2019, 9:41 p.m. UTC | #12
On Tue, Jan 15, 2019 at 03:16:57PM -0600, Pierre-Louis Bossart wrote:

> > Maybe the defer card probe logic needs to be extended to also check if
> > dai_link_name had already been registered (either cpu or cpu_dai_name
> > needs to be set), not 100% sure which problem the defer card probe patch
> > was trying to solve.

We were getting cards probing without the platforms being registered
(which in turn meant we were just skipping their init) and had patches
proposed to implement the deferral in the cards.  The deferral stuff is
supposed to making sure that everything is registered when we
instantiate.

> same here, I don't get why the deferred probe stuff only deals with one of
> the two options.

I think it's just an oversight - I think the change you were proposing
to check the cpu_dai_name is a good idea anyway as it makes things more
consistent and work more obviously by intention.  And more generally if
we can simplify the code by removing legacy options that'd be good but
that's a bigger bit of work...
Matthias Reichl Jan. 15, 2019, 9:48 p.m. UTC | #13
On Tue, Jan 15, 2019 at 09:41:38PM +0000, Mark Brown wrote:
> On Tue, Jan 15, 2019 at 03:16:57PM -0600, Pierre-Louis Bossart wrote:
> 
> > > Maybe the defer card probe logic needs to be extended to also check if
> > > dai_link_name had already been registered (either cpu or cpu_dai_name
> > > needs to be set), not 100% sure which problem the defer card probe patch
> > > was trying to solve.
> 
> We were getting cards probing without the platforms being registered
> (which in turn meant we were just skipping their init) and had patches
> proposed to implement the deferral in the cards.  The deferral stuff is
> supposed to making sure that everything is registered when we
> instantiate.

Ah, that makes sense. Thanks a lot for the info!

so long,

Hias

> 
> > same here, I don't get why the deferred probe stuff only deals with one of
> > the two options.
> 
> I think it's just an oversight - I think the change you were proposing
> to check the cpu_dai_name is a good idea anyway as it makes things more
> consistent and work more obviously by intention.  And more generally if
> we can simplify the code by removing legacy options that'd be good but
> that's a bigger bit of work...

Patch
diff mbox series

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 0934b36..df05fb8 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1131,11 +1131,13 @@  static int soc_init_dai_link(struct snd_soc_card *card,
 	}
 
 	/*
-	 * Defer card registartion if platform dai component is not added to
+	 * Defer card registration if platform dai component is not added to
 	 * component list.
 	 */
-	if (!soc_find_component(link->platform->of_node, link->platform->name))
-		return -EPROBE_DEFER;
+	if (link->platform->of_node || link->platform->name)
+		if (!soc_find_component(link->platform->of_node,
+					link->platform->name))
+			return -EPROBE_DEFER;
 
 	/*
 	 * CPU device may be specified by either name or OF node, but
@@ -1150,11 +1152,12 @@  static int soc_init_dai_link(struct snd_soc_card *card,
 	}
 
 	/*
-	 * Defer card registartion if cpu dai component is not added to
+	 * Defer card registration if cpu dai component is not added to
 	 * component list.
 	 */
-	if (!soc_find_component(link->cpu_of_node, link->cpu_name))
-		return -EPROBE_DEFER;
+	if (link->cpu_of_node || link->cpu_name)
+		if (!soc_find_component(link->cpu_of_node, link->cpu_name))
+			return -EPROBE_DEFER;
 
 	/*
 	 * At least one of CPU DAI name or CPU device name/node must be