ALSA: PCM: check if ops are defined before suspending PCM

Message ID 20190208232953.7266-1-pierre-louis.bossart@linux.intel.com
State New

Commit Message

Pierre-Louis Bossart Feb. 8, 2019, 11:29 p.m. UTC
From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>

BE dai links only have internal PCMs, and their substream ops may
not be set. Suspending these PCMs will result in their
ops->trigger() being invoked and cause a kernel oops.
So skip suspending PCMs if their ops are NULL.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/core/pcm_native.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Takashi Iwai Feb. 9, 2019, 9:27 a.m. UTC | #1
On Sat, 09 Feb 2019 00:29:53 +0100,
Pierre-Louis Bossart wrote:
> 
> From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> 
> BE dai links only have internal PCM's and their substream ops may
> not be set. Suspending these PCM's will result in their
>  ops->trigger() being invoked and cause a kernel oops.
> So skip suspending PCM's if their ops are NULL.
> 
> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  sound/core/pcm_native.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 818dff1de545..b6e158ce6650 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
>  			/* FIXME: the open/close code should lock this as well */
>  			if (substream->runtime == NULL)
>  				continue;
> +
> +			/*
> +			 * Skip BE dai link PCM's that are internal and may
> +			 * not have their substream ops set.
> +			 */
> +			if (!substream->ops)
> +				continue;
> +
>  			err = snd_pcm_suspend(substream);
>  			if (err < 0 && err != -EBUSY)
>  				return err;

Basically it's OK and safe to apply this check.  We may need to add
such sanity checks in more places if this really hits.

But I still wonder how this can go through.  Is substream->runtime set
even if substream->ops is NULL?  The substream->runtime is assigned
dynamically at opening a substream via snd_pcm_attach_substream(), so
without opening it, it must be NULL.


thanks,

Takashi
Pierre-Louis Bossart Feb. 11, 2019, 3:41 p.m. UTC | #2
On 2/9/19 3:27 AM, Takashi Iwai wrote:
> On Sat, 09 Feb 2019 00:29:53 +0100,
> Pierre-Louis Bossart wrote:
>> From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
>>
>> BE dai links only have internal PCM's and their substream ops may
>> not be set. Suspending these PCM's will result in their
>>   ops->trigger() being invoked and cause a kernel oops.
>> So skip suspending PCM's if their ops are NULL.
>>
>> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>>   sound/core/pcm_native.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
>> index 818dff1de545..b6e158ce6650 100644
>> --- a/sound/core/pcm_native.c
>> +++ b/sound/core/pcm_native.c
>> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
>>   			/* FIXME: the open/close code should lock this as well */
>>   			if (substream->runtime == NULL)
>>   				continue;
>> +
>> +			/*
>> +			 * Skip BE dai link PCM's that are internal and may
>> +			 * not have their substream ops set.
>> +			 */
>> +			if (!substream->ops)
>> +				continue;
>> +
>>   			err = snd_pcm_suspend(substream);
>>   			if (err < 0 && err != -EBUSY)
>>   				return err;
> Basically it's OK and safe to apply this check.  We may need to add
> such sanity checks in more places if this really hits.
>
> But I still wonder how this can go through.  Is substream->runtime set
> even if substream->ops is NULL?  The substream->runtime is assigned
> dynamically at opening a substream via snd_pcm_attach_substream(), so
> without opening it, it must be NULL.

This error case was exposed when we tried to get rid of 
snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() 
instead to do the work for us.

In the case of back-ends, all initializations are bypassed in
soc_new_pcm() (see the code snippet below), so the ops aren't set
before suspend is called.

The complete thread where we discussed this is at 
https://github.com/thesofproject/linux/pull/582

     if (rtd->dai_link->no_pcm) {
         if (playback)
             pcm->streams[SNDRV_PCM_STREAM_PLAYBACK].substream->private_data = rtd;
         if (capture)
             pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream->private_data = rtd;
         goto out; //<<<< this bypasses all the ops initializations!!
     }

     /* ASoC PCM operations */
     if (rtd->dai_link->dynamic) {
         rtd->ops.open        = dpcm_fe_dai_open;
         rtd->ops.hw_params    = dpcm_fe_dai_hw_params;
         rtd->ops.prepare    = dpcm_fe_dai_prepare;
         rtd->ops.trigger    = dpcm_fe_dai_trigger;
         rtd->ops.hw_free    = dpcm_fe_dai_hw_free;
         rtd->ops.close        = dpcm_fe_dai_close;
         rtd->ops.pointer    = soc_pcm_pointer;
         rtd->ops.ioctl        = soc_pcm_ioctl;
     } else {
         rtd->ops.open        = soc_pcm_open;
         rtd->ops.hw_params    = soc_pcm_hw_params;
         rtd->ops.prepare    = soc_pcm_prepare;
         rtd->ops.trigger    = soc_pcm_trigger;
         rtd->ops.hw_free    = soc_pcm_hw_free;
         rtd->ops.close        = soc_pcm_close;
         rtd->ops.pointer    = soc_pcm_pointer;
         rtd->ops.ioctl        = soc_pcm_ioctl;
     }

     for_each_rtdcom(rtd, rtdcom) {
         const struct snd_pcm_ops *ops = rtdcom->component->driver->ops;

         if (!ops)
             continue;

         if (ops->ack)
             rtd->ops.ack        = soc_rtdcom_ack;
         if (ops->copy_user)
             rtd->ops.copy_user    = soc_rtdcom_copy_user;
         if (ops->copy_kernel)
             rtd->ops.copy_kernel    = soc_rtdcom_copy_kernel;
         if (ops->fill_silence)
             rtd->ops.fill_silence    = soc_rtdcom_fill_silence;
         if (ops->page)
             rtd->ops.page        = soc_rtdcom_page;
         if (ops->mmap)
             rtd->ops.mmap        = soc_rtdcom_mmap;
     }

     if (playback)
         snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, &rtd->ops);

     if (capture)
         snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, &rtd->ops);

     for_each_rtdcom(rtd, rtdcom) {
         component = rtdcom->component;

         if (!component->driver->pcm_new)
             continue;

         ret = component->driver->pcm_new(rtd);
         if (ret < 0) {
             dev_err(component->dev,
                 "ASoC: pcm constructor failed: %d\n",
                 ret);
             return ret;
         }
     }

     pcm->private_free = soc_pcm_private_free;
out:
     dev_info(rtd->card->dev, "%s <-> %s mapping ok\n",
          (rtd->num_codecs > 1) ? "multicodec" : rtd->codec_dai->name,
          cpu_dai->name);
     return ret;
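
The effect of the early "goto out" can be modelled outside the kernel. The following is only a sketch: the sketch_* types and functions are hypothetical stand-ins invented for illustration, not the real ASoC structures, and the only behaviour taken from the snippet above is that a no_pcm link gets its private_data set but never has its ops installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real kernel definitions. */
struct sketch_ops {
	int (*trigger)(int cmd);
};

struct sketch_substream {
	const struct sketch_ops *ops;	/* NULL until ops are installed */
	void *private_data;
};

struct sketch_link {
	bool no_pcm;			/* true for a back-end (BE) link */
};

static int sketch_trigger(int cmd)
{
	(void)cmd;
	return 0;
}

static const struct sketch_ops sketch_default_ops = {
	.trigger = sketch_trigger,
};

/* Models only the control flow of the snippet: BE links return early. */
static void sketch_new_pcm(const struct sketch_link *link,
			   struct sketch_substream *substream, void *rtd)
{
	assert(link != NULL && substream != NULL);

	if (link->no_pcm) {
		substream->private_data = rtd;
		return;			/* ops are never installed */
	}

	substream->ops = &sketch_default_ops;
}

/* Returns true if calling ops->trigger() on this substream would crash. */
static bool sketch_ops_missing(const struct sketch_substream *substream)
{
	return substream->ops == NULL;
}
```

In this model, a substream created through the no_pcm path still has ops == NULL when anything later tries to trigger it, which is the condition the patch checks for.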
Takashi Iwai Feb. 11, 2019, 4:05 p.m. UTC | #3
On Mon, 11 Feb 2019 16:41:31 +0100,
Pierre-Louis Bossart wrote:
> 
> 
> On 2/9/19 3:27 AM, Takashi Iwai wrote:
> > On Sat, 09 Feb 2019 00:29:53 +0100,
> > Pierre-Louis Bossart wrote:
> >> From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> >>
> >> BE dai links only have internal PCM's and their substream ops may
> >> not be set. Suspending these PCM's will result in their
> >>   ops->trigger() being invoked and cause a kernel oops.
> >> So skip suspending PCM's if their ops are NULL.
> >>
> >> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> >> ---
> >>   sound/core/pcm_native.c | 8 ++++++++
> >>   1 file changed, 8 insertions(+)
> >>
> >> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> >> index 818dff1de545..b6e158ce6650 100644
> >> --- a/sound/core/pcm_native.c
> >> +++ b/sound/core/pcm_native.c
> >> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
> >>   			/* FIXME: the open/close code should lock this as well */
> >>   			if (substream->runtime == NULL)
> >>   				continue;
> >> +
> >> +			/*
> >> +			 * Skip BE dai link PCM's that are internal and may
> >> +			 * not have their substream ops set.
> >> +			 */
> >> +			if (!substream->ops)
> >> +				continue;
> >> +
> >>   			err = snd_pcm_suspend(substream);
> >>   			if (err < 0 && err != -EBUSY)
> >>   				return err;
> > Basically it's OK and safe to apply this check.  We may need to add
> > such sanity checks in more places if this really hits.
> >
> > But I still wonder how this can go through.  Is substream->runtime set
> > even if substream->ops is NULL?  The substream->runtime is assigned
> > dynamically at opening a substream via snd_pcm_attach_substream(), so
> > without opening it, it must be NULL.
> 
> This error case was exposed when we tried to get rid of
> snd_pcm_suspend() per your recommendation, and use snd_soc_suspend()
> instead to do the work for us.
> 
> In the case of back-ends, all initializations are bypassed in
> soc_new_pcm() - see below a code snippet - and the ops aren't set
> before suspend is called.
> The complete thread where we discussed this is at
> https://github.com/thesofproject/linux/pull/582

Thanks, I have now taken a look at the code.  It turns out that
another part of the problem is that DPCM does the substream open
handling by itself in soc-pcm.c.  Oh well.  I'm afraid that we have
some hidden bugs there that may easily lead to a crash.  (Fortunately
(or unfortunately), no fuzzing is performed on ASoC because we have no
virtual device driver :)

IMO, some of the DPCM code should be moved up a level, into the ALSA
PCM core.  The current code is still in a rough form of early
plumbing.

Anyway, I have now merged the patch with a few more comments.


Thanks!

Takashi
Pierre-Louis Bossart Feb. 11, 2019, 4:59 p.m. UTC | #4
>>>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
>>>> index 818dff1de545..b6e158ce6650 100644
>>>> --- a/sound/core/pcm_native.c
>>>> +++ b/sound/core/pcm_native.c
>>>> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
>>>>    			/* FIXME: the open/close code should lock this as well */
>>>>    			if (substream->runtime == NULL)
>>>>    				continue;
>>>> +
>>>> +			/*
>>>> +			 * Skip BE dai link PCM's that are internal and may
>>>> +			 * not have their substream ops set.
>>>> +			 */
>>>> +			if (!substream->ops)
>>>> +				continue;
>>>> +
>>>>    			err = snd_pcm_suspend(substream);
>>>>    			if (err < 0 && err != -EBUSY)
>>>>    				return err;
>>> Basically it's OK and safe to apply this check.  We may need to add
>>> such sanity checks in more places if this really hits.
>>>
>>> But I still wonder how this can go through.  Is substream->runtime set
>>> even if substream->ops is NULL?  The substream->runtime is assigned
>>> dynamically at opening a substream via snd_pcm_attach_substream(), so
>>> without opening it, it must be NULL.
>> This error case was exposed when we tried to get rid of
>> snd_pcm_suspend() per your recommendation, and use snd_soc_suspend()
>> instead to do the work for us.
>>
>> In the case of back-ends, all initializations are bypassed in
>> soc_new_pcm() - see below a code snippet - and the ops aren't set
>> before suspend is called.
>> The complete thread where we discussed this is at
>> https://github.com/thesofproject/linux/pull/582
> Thanks, now I took a look at the code.  And, this surfaced that the
> another part of the problem is that DPCM does the substream open
> handling by itself in soc-pcm.c.  Oh well.  I'm afraid that we have
> some hidden bugs there that may lead to a crash easily.  (Fortunately
> (or unfortunately) fuzzer isn't performed on ASoC because we have no
> virtual device driver :)
>
> IMO, some of DPCM code should be raised to the upper level, to ALSA
> PCM core.  The current code is still in a rough form of early
> plumbing.

Can't disagree. We were surprised to hit this issue, given that the SOF
code is far from the first to use DPCM (same with some topology
issues). It's very likely that there are specific initialization flows
that aren't quite right; hopefully we'll fix them one after the other :-)

>
> In anyway, I merged the patch now with a bit more comments.
>
>
> Thanks!
>
> Takashi

Thanks for the review+additional comments, much appreciated.
Mark Brown Feb. 12, 2019, 4:20 p.m. UTC | #5
On Mon, Feb 11, 2019 at 10:59:49AM -0600, Pierre-Louis Bossart wrote:

> Can't disagree, we were surprised to hit this issue knowing that the SOF
> code isn't the first to use DPCM at all (same with some topology issues).
> It's very likely that there are specific initialization flows that aren't
> quite right, hopefully we'll fix them one after the other :-)

I'm fairly sure that DPCM has only been tested in very restricted use
cases, and then mostly at the system level rather than directly.
Obviously long term the component refactoring and digital domains should
improve matters here but that's going to take a while :(
Ranjani Sridharan Feb. 12, 2019, 8:48 p.m. UTC | #6
On Sat, 2019-02-09 at 10:27 +0100, Takashi Iwai wrote:
> On Sat, 09 Feb 2019 00:29:53 +0100,
> Pierre-Louis Bossart wrote:
> > 
> > From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> > 
> > BE dai links only have internal PCM's and their substream ops may
> > not be set. Suspending these PCM's will result in their
> >  ops->trigger() being invoked and cause a kernel oops.
> > So skip suspending PCM's if their ops are NULL.
> > 
> > Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> > Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > ---
> >  sound/core/pcm_native.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> > index 818dff1de545..b6e158ce6650 100644
> > --- a/sound/core/pcm_native.c
> > +++ b/sound/core/pcm_native.c
> > @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
> >  			/* FIXME: the open/close code should lock this as well */
> >  			if (substream->runtime == NULL)
> >  				continue;
> > +
> > +			/*
> > +			 * Skip BE dai link PCM's that are internal and may
> > +			 * not have their substream ops set.
> > +			 */
> > +			if (!substream->ops)
> > +				continue;
> > +
> >  			err = snd_pcm_suspend(substream);
> >  			if (err < 0 && err != -EBUSY)
> >  				return err;
> 
> Basically it's OK and safe to apply this check.  We may need to add
> such sanity checks in more places if this really hits.
> 
> But I still wonder how this can go through.  Is substream->runtime set
> even if substream->ops is NULL?  The substream->runtime is assigned
> dynamically at opening a substream via snd_pcm_attach_substream(), so
> without opening it, it must be NULL.
Hi Takashi,

My guess is that this happens during
dpcm_be_connect(fe, be, stream) in dpcm_add_paths().

The reason this wasn't exposed before is that the FE dai link PCMs
were suspended first. So when it was the BE dai links' turn, the PCM was
already suspended. In the case of SOF, the order of dai links in the
rtd_list is BE dai links first and then the FE dai links.

Thanks,
Ranjani
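
The ordering argument above can be illustrated with a small userspace model. This is a sketch under loudly stated assumptions, not kernel code: it assumes the FE and BE PCMs share one runtime state and that a stream which is already suspended is simply skipped, as the explanation suggests. All order_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum order_state { ORDER_RUNNING, ORDER_SUSPENDED };

/* Assumption: the FE and BE PCMs point at one shared runtime state. */
struct order_runtime {
	enum order_state state;
};

struct order_pcm {
	struct order_runtime *runtime;
	bool has_ops;	/* false models a BE PCM whose ops were never set */
};

/*
 * Walks the PCMs in list order WITHOUT any ops check and returns how many
 * times a missing ops table would have been dereferenced.
 */
static int order_count_bad_triggers(struct order_pcm *const *list, size_t n)
{
	int bad = 0;

	for (size_t i = 0; i < n; i++) {
		struct order_pcm *pcm = list[i];

		assert(pcm != NULL);
		if (pcm->runtime == NULL)
			continue;
		/* Assumption: an already suspended stream is skipped. */
		if (pcm->runtime->state == ORDER_SUSPENDED)
			continue;
		if (!pcm->has_ops)
			bad++;	/* unguarded code would call ops->trigger() */
		pcm->runtime->state = ORDER_SUSPENDED;
	}

	return bad;
}
```

With the FE listed first, the BE is never reached in the running state and the count stays at zero; with the BE listed first, the model records one bad dereference, matching the order reported for SOF.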

> 
> 
> thanks,
> 
> Takashi

Patch

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 818dff1de545..b6e158ce6650 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1506,6 +1506,14 @@  int snd_pcm_suspend_all(struct snd_pcm *pcm)
 			/* FIXME: the open/close code should lock this as well */
 			if (substream->runtime == NULL)
 				continue;
+
+			/*
+			 * Skip BE dai link PCM's that are internal and may
+			 * not have their substream ops set.
+			 */
+			if (!substream->ops)
+				continue;
+
 			err = snd_pcm_suspend(substream);
 			if (err < 0 && err != -EBUSY)
 				return err;
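
For readers who want to exercise the guard outside the kernel, here is a minimal userspace sketch of the patched loop. The guard_* structures are simplified, hypothetical stand-ins and not the real snd_pcm definitions; only the two "continue" checks and the -EBUSY handling mirror the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real kernel definitions. */
struct guard_ops {
	int (*trigger)(int cmd);
};

struct guard_substream {
	void *runtime;			/* non-NULL once "opened" */
	const struct guard_ops *ops;	/* may be NULL for a BE PCM */
};

static int guard_trigger_calls;

static int guard_counting_trigger(int cmd)
{
	(void)cmd;
	guard_trigger_calls++;
	return 0;
}

static const struct guard_ops guard_counting_ops = {
	.trigger = guard_counting_trigger,
};

/* Like the failure mode in the commit message: dereferences ops. */
static int guard_suspend(struct guard_substream *substream)
{
	return substream->ops->trigger(0);
}

static int guard_suspend_all(struct guard_substream *subs, size_t n)
{
	assert(subs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int err;

		if (subs[i].runtime == NULL)
			continue;
		/* The check added by the patch: skip PCMs without ops. */
		if (!subs[i].ops)
			continue;

		err = guard_suspend(&subs[i]);
		if (err < 0 && err != -EBUSY)
			return err;
	}

	return 0;
}
```

Removing the ops check from this model makes guard_suspend() dereference a NULL pointer for the first kind of substream, which is the userspace analogue of the reported oops.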