Message ID | 20190208232953.7266-1-pierre-louis.bossart@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ALSA: PCM: check if ops are defined before suspending PCM | expand |
On Sat, 09 Feb 2019 00:29:53 +0100, Pierre-Louis Bossart wrote: > > From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > > BE dai links only have internal PCM's and their substream ops may > not be set. Suspending these PCM's will result in their > ops->trigger() being invoked and cause a kernel oops. > So skip suspending PCM's if their ops are NULL. > > Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > --- > sound/core/pcm_native.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c > index 818dff1de545..b6e158ce6650 100644 > --- a/sound/core/pcm_native.c > +++ b/sound/core/pcm_native.c > @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) > /* FIXME: the open/close code should lock this as well */ > if (substream->runtime == NULL) > continue; > + > + /* > + * Skip BE dai link PCM's that are internal and may > + * not have their substream ops set. > + */ > + if (!substream->ops) > + continue; > + > err = snd_pcm_suspend(substream); > if (err < 0 && err != -EBUSY) > return err; Basically it's OK and safe to apply this check. We may need to add such sanity checks in more places if this really hits. But I still wonder how this can go through. Is substream->runtime set even if substream->ops is NULL? The substream->runtime is assigned dynamically at opening a substream via snd_pcm_attach_substream(), so without opening it, it must be NULL. thanks, Takashi
On 2/9/19 3:27 AM, Takashi Iwai wrote: > On Sat, 09 Feb 2019 00:29:53 +0100, > Pierre-Louis Bossart wrote: >> From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> >> >> BE dai links only have internal PCM's and their substream ops may >> not be set. Suspending these PCM's will result in their >> ops->trigger() being invoked and cause a kernel oops. >> So skip suspending PCM's if their ops are NULL. >> >> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> >> --- >> sound/core/pcm_native.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c >> index 818dff1de545..b6e158ce6650 100644 >> --- a/sound/core/pcm_native.c >> +++ b/sound/core/pcm_native.c >> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) >> /* FIXME: the open/close code should lock this as well */ >> if (substream->runtime == NULL) >> continue; >> + >> + /* >> + * Skip BE dai link PCM's that are internal and may >> + * not have their substream ops set. >> + */ >> + if (!substream->ops) >> + continue; >> + >> err = snd_pcm_suspend(substream); >> if (err < 0 && err != -EBUSY) >> return err; > Basically it's OK and safe to apply this check. We may need to add > such sanity checks in more places if this really hits. > > But I still wonder how this can go through. Is substream->runtime set > even if substream->ops is NULL? The substream->runtime is assigned > dynamically at opening a substream via snd_pcm_attach_substream(), so > without opening it, it must be NULL. This error case was exposed when we tried to get rid of snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() instead to do the work for us. In the case of back-ends, all initializations are bypassed in soc_new_pcm() - see below a code snippet - and the ops aren't set before suspend is called. The complete thread where we discussed this is at https://github.com/thesofproject/linux/pull/582 if (rtd->dai_link->no_pcm) { if (playback) pcm->streams[SNDRV_PCM_STREAM_PLAYBACK].substream->private_data = rtd; if (capture) pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream->private_data = rtd; goto out; //<<<< this bypasses all the ops initializations!! } /* ASoC PCM operations */ if (rtd->dai_link->dynamic) { rtd->ops.open = dpcm_fe_dai_open; rtd->ops.hw_params = dpcm_fe_dai_hw_params; rtd->ops.prepare = dpcm_fe_dai_prepare; rtd->ops.trigger = dpcm_fe_dai_trigger; rtd->ops.hw_free = dpcm_fe_dai_hw_free; rtd->ops.close = dpcm_fe_dai_close; rtd->ops.pointer = soc_pcm_pointer; rtd->ops.ioctl = soc_pcm_ioctl; } else { rtd->ops.open = soc_pcm_open; rtd->ops.hw_params = soc_pcm_hw_params; rtd->ops.prepare = soc_pcm_prepare; rtd->ops.trigger = soc_pcm_trigger; rtd->ops.hw_free = soc_pcm_hw_free; rtd->ops.close = soc_pcm_close; rtd->ops.pointer = soc_pcm_pointer; rtd->ops.ioctl = soc_pcm_ioctl; } for_each_rtdcom(rtd, rtdcom) { const struct snd_pcm_ops *ops = rtdcom->component->driver->ops; if (!ops) continue; if (ops->ack) rtd->ops.ack = soc_rtdcom_ack; if (ops->copy_user) rtd->ops.copy_user = soc_rtdcom_copy_user; if (ops->copy_kernel) rtd->ops.copy_kernel = soc_rtdcom_copy_kernel; if (ops->fill_silence) rtd->ops.fill_silence = soc_rtdcom_fill_silence; if (ops->page) rtd->ops.page = soc_rtdcom_page; if (ops->mmap) rtd->ops.mmap = soc_rtdcom_mmap; } if (playback) snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, &rtd->ops); if (capture) snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, &rtd->ops); for_each_rtdcom(rtd, rtdcom) { component = rtdcom->component; if (!component->driver->pcm_new) continue; ret = component->driver->pcm_new(rtd); if (ret < 0) { dev_err(component->dev, "ASoC: pcm constructor failed: %d\n", ret); return ret; } } pcm->private_free = soc_pcm_private_free; out: dev_info(rtd->card->dev, "%s <-> %s mapping ok\n", (rtd->num_codecs > 1) ? "multicodec" : rtd->codec_dai->name, cpu_dai->name); return ret;
On Mon, 11 Feb 2019 16:41:31 +0100, Pierre-Louis Bossart wrote: > > > On 2/9/19 3:27 AM, Takashi Iwai wrote: > > On Sat, 09 Feb 2019 00:29:53 +0100, > > Pierre-Louis Bossart wrote: > >> From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > >> > >> BE dai links only have internal PCM's and their substream ops may > >> not be set. Suspending these PCM's will result in their > >> ops->trigger() being invoked and cause a kernel oops. > >> So skip suspending PCM's if their ops are NULL. > >> > >> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > >> --- > >> sound/core/pcm_native.c | 8 ++++++++ > >> 1 file changed, 8 insertions(+) > >> > >> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c > >> index 818dff1de545..b6e158ce6650 100644 > >> --- a/sound/core/pcm_native.c > >> +++ b/sound/core/pcm_native.c > >> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) > >> /* FIXME: the open/close code should lock this as well */ > >> if (substream->runtime == NULL) > >> continue; > >> + > >> + /* > >> + * Skip BE dai link PCM's that are internal and may > >> + * not have their substream ops set. > >> + */ > >> + if (!substream->ops) > >> + continue; > >> + > >> err = snd_pcm_suspend(substream); > >> if (err < 0 && err != -EBUSY) > >> return err; > > Basically it's OK and safe to apply this check. We may need to add > > such sanity checks in more places if this really hits. > > > > But I still wonder how this can go through. Is substream->runtime set > > even if substream->ops is NULL? The substream->runtime is assigned > > dynamically at opening a substream via snd_pcm_attach_substream(), so > > without opening it, it must be NULL. > > This error case was exposed when we tried to get rid of > snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() > instead to do the work for us. > > In the case of back-ends, all initializations are bypassed in > soc_new_pcm() - see below a code snippet - and the ops aren't set > before suspend is called. > The complete thread where we discussed this is at > https://github.com/thesofproject/linux/pull/582 Thanks, now I took a look at the code. And, this surfaced that the another part of the problem is that DPCM does the substream open handling by itself in soc-pcm.c. Oh well. I'm afraid that we have some hidden bugs there that may lead to a crash easily. (Fortunately (or unfortunately) fuzzer isn't performed on ASoC because we have no virtual device driver :) IMO, some of DPCM code should be raised to the upper level, to ALSA PCM core. The current code is still in a rough form of early plumbing. In anyway, I merged the patch now with a bit more comments. Thanks! Takashi
>>>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c >>>> index 818dff1de545..b6e158ce6650 100644 >>>> --- a/sound/core/pcm_native.c >>>> +++ b/sound/core/pcm_native.c >>>> @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) >>>> /* FIXME: the open/close code should lock this as well */ >>>> if (substream->runtime == NULL) >>>> continue; >>>> + >>>> + /* >>>> + * Skip BE dai link PCM's that are internal and may >>>> + * not have their substream ops set. >>>> + */ >>>> + if (!substream->ops) >>>> + continue; >>>> + >>>> err = snd_pcm_suspend(substream); >>>> if (err < 0 && err != -EBUSY) >>>> return err; >>> Basically it's OK and safe to apply this check. We may need to add >>> such sanity checks in more places if this really hits. >>> >>> But I still wonder how this can go through. Is substream->runtime set >>> even if substream->ops is NULL? The substream->runtime is assigned >>> dynamically at opening a substream via snd_pcm_attach_substream(), so >>> without opening it, it must be NULL. >> This error case was exposed when we tried to get rid of >> snd_pcm_suspend() per your recommendation, and use snd_soc_suspend() >> instead to do the work for us. >> >> In the case of back-ends, all initializations are bypassed in >> soc_new_pcm() - see below a code snippet - and the ops aren't set >> before suspend is called. >> The complete thread where we discussed this is at >> https://github.com/thesofproject/linux/pull/582 > Thanks, now I took a look at the code. And, this surfaced that the > another part of the problem is that DPCM does the substream open > handling by itself in soc-pcm.c. Oh well. I'm afraid that we have > some hidden bugs there that may lead to a crash easily. (Fortunately > (or unfortunately) fuzzer isn't performed on ASoC because we have no > virtual device driver :) > > IMO, some of DPCM code should be raised to the upper level, to ALSA > PCM core. The current code is still in a rough form of early > plumbing. Can't disagree, we were surprised to hit this issue knowing that the SOF code isn't the first to use DPCM at all (same with some topology issues). It's very likely that there are specific initialization flows that aren't quite right, hopefully we'll fix them one after the other :-) > > In anyway, I merged the patch now with a bit more comments. > > > Thanks! > > Takashi Thanks for the review+additional comments, much appreciated.
On Mon, Feb 11, 2019 at 10:59:49AM -0600, Pierre-Louis Bossart wrote: > Can't disagree, we were surprised to hit this issue knowing that the SOF > code isn't the first to use DPCM at all (same with some topology issues). > It's very likely that there are specific initialization flows that aren't > quite right, hopefully we'll fix them one after the other :-) I'm fairly sure that DPCM has only been tested in very restricted use cases, and then mostly at the system level rather than directly. Obviously long term the component refactoring and digital domains should improve matters here but that's going to take a while :(
On Sat, 2019-02-09 at 10:27 +0100, Takashi Iwai wrote: > On Sat, 09 Feb 2019 00:29:53 +0100, > Pierre-Louis Bossart wrote: > > > > From: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > > > > BE dai links only have internal PCM's and their substream ops may > > not be set. Suspending these PCM's will result in their > > ops->trigger() being invoked and cause a kernel oops. > > So skip suspending PCM's if their ops are NULL. > > > > Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com > > > > > Signed-off-by: Pierre-Louis Bossart < > > pierre-louis.bossart@linux.intel.com> > > --- > > sound/core/pcm_native.c | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c > > index 818dff1de545..b6e158ce6650 100644 > > --- a/sound/core/pcm_native.c > > +++ b/sound/core/pcm_native.c > > @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) > > /* FIXME: the open/close code should lock this > > as well */ > > if (substream->runtime == NULL) > > continue; > > + > > + /* > > + * Skip BE dai link PCM's that are internal and > > may > > + * not have their substream ops set. > > + */ > > + if (!substream->ops) > > + continue; > > + > > err = snd_pcm_suspend(substream); > > if (err < 0 && err != -EBUSY) > > return err; > > Basically it's OK and safe to apply this check. We may need to add > such sanity checks in more places if this really hits. > > But I still wonder how this can go through. Is substream->runtime > set > even if substream->ops is NULL? The substream->runtime is assigned > dynamically at opening a substream via snd_pcm_attach_substream(), so > without opening it, it must be NULL. Hi Takashi, My guess is that this happens during dpcm_be_connect(fe, be, stream) in dpcm_add_paths(). The reason this wasnt exposed before was that the fe dai link pcm's were suspended first. So when it was BE dai links' turn, the pcm was already suspended. In the case of SOF, the order of dai links in the rtd_list is BE dai links first and then the FE dai links. Thanks, Ranjani > > > thanks, > > Takashi > _______________________________________________ > Alsa-devel mailing list > Alsa-devel@alsa-project.org > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 818dff1de545..b6e158ce6650 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1506,6 +1506,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm) /* FIXME: the open/close code should lock this as well */ if (substream->runtime == NULL) continue; + + /* + * Skip BE dai link PCM's that are internal and may + * not have their substream ops set. + */ + if (!substream->ops) + continue; + err = snd_pcm_suspend(substream); if (err < 0 && err != -EBUSY) return err;