diff mbox

[RFC] ASoC: pcm: Trigger all commands on error

Message ID 1380883997-18236-1-git-send-email-mpa@pengutronix.de (mailing list archive)
State New, archived
Headers show

Commit Message

Markus Pargmann Oct. 4, 2013, 10:53 a.m. UTC
If a start command is triggered and an error occures in the first called
function, e.g. DMA, all other functions are not executed and the error
value is returned immediately.

In case of a start command, the calling function will call undo_start,
calling this trigger function with the stop command. It will call stop
for each function. But only the first function was started previously.
The other functions may fail in the assumption that a stop command
always comes after a start command.

As the API does not specify the behaviour of trigger functionpointers, I
think this should be fixed in the function calling the trigger
functionpointers.

This patch changes the behaviour. The trigger function calls all
functionpointers independent of their returncodes. The first error-code
is returned.

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 sound/soc/soc-pcm.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

Comments

Mark Brown Oct. 7, 2013, 5:27 p.m. UTC | #1
On Fri, Oct 04, 2013 at 12:53:17PM +0200, Markus Pargmann wrote:

> In case of a start command, the calling function will call undo_start,
> calling this trigger function with the stop command. It will call stop
> for each function. But only the first function was started previously.
> The other functions may fail in the assumption that a stop command
> always comes after a start command.

> As the API does not specify the behaviour of trigger functionpointers, I
> think this should be fixed in the function calling the trigger
> functionpointers.

> This patch changes the behaviour. The trigger function calls all
> functionpointers independent of their returncodes. The first error-code
> is returned.

I'm not sure if this will resolve the problem robustly - if something is
going to get confused about this it seems just as likely that the
trigger that failed will get upset because it gets a _STOP after
returning an error from _START.  I think what I'd expect here is that
the unwind on error would unwind only the triggers that it successfully
ran.  Equally well I'd be a bit surprised if a trigger function actually
had a problem with extra _STOPs since they mostly just do register
writes...  did you see this from code inspection or were you resolving a
practical problem in your system?
Markus Pargmann Oct. 8, 2013, 8:29 a.m. UTC | #2
On Mon, Oct 07, 2013 at 06:27:06PM +0100, Mark Brown wrote:
> On Fri, Oct 04, 2013 at 12:53:17PM +0200, Markus Pargmann wrote:
> 
> > In case of a start command, the calling function will call undo_start,
> > calling this trigger function with the stop command. It will call stop
> > for each function. But only the first function was started previously.
> > The other functions may fail in the assumption that a stop command
> > always comes after a start command.
> 
> > As the API does not specify the behaviour of trigger functionpointers, I
> > think this should be fixed in the function calling the trigger
> > functionpointers.
> 
> > This patch changes the behaviour. The trigger function calls all
> > functionpointers independent of their returncodes. The first error-code
> > is returned.
> 
> I'm not sure if this will resolve the problem robustly - if something is
> going to get confused about this it seems just as likely that the
> trigger that failed will get upset because it gets a _STOP after
> returning an error from _START.  I think what I'd expect here is that
> the unwind on error would unwind only the triggers that it successfully
> ran.  Equally well I'd be a bit surprised if a trigger function actually
> had a problem with extra _STOPs since they mostly just do register
> writes...  did you see this from code inspection or were you resolving a
> practical problem in your system?

Yes this is a practical problem on mx28. mxs-saif enables/disables
clocks when starting/stopping. I debugged a problem where the DMA engine
returned an error on channel preparation and clk_disabled in mxs-saif
was called twice.

To make proper error handling and only STOP functions that successfully
started, we have to store the state of each function. The command error
handling is done in pcm_native.c so we don't have any influence on that.

Another possibility is to explicitly allow multiple _STOPs. Then I could
fix the driver to store the clock state. But I would prefer a generic
solution.

Regards,

Markus
Mark Brown Oct. 8, 2013, 9:34 a.m. UTC | #3
On Tue, Oct 08, 2013 at 10:29:33AM +0200, Markus Pargmann wrote:

> To make proper error handling and only STOP functions that successfully
> started, we have to store the state of each function. The command error
> handling is done in pcm_native.c so we don't have any influence on that.

I meant just unwinding the things done in the trigger call.

> Another possibility is to explicitly allow multiple _STOPs. Then I could
> fix the driver to store the clock state. But I would prefer a generic
> solution.

You're going to have to handle this anyway to be robust - what happens
if it's the clock enable that fails?
Markus Pargmann Oct. 8, 2013, 9:48 a.m. UTC | #4
On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:
> On Tue, Oct 08, 2013 at 10:29:33AM +0200, Markus Pargmann wrote:
> 
> > To make proper error handling and only STOP functions that successfully
> > started, we have to store the state of each function. The command error
> > handling is done in pcm_native.c so we don't have any influence on that.
> 
> I meant just unwinding the things done in the trigger call.

You mean unwinding when we detect the error? If we return the
error code from the trigger function, the pcm_native.c will still
trigger a STOP command which is executed on all components.

> 
> > Another possibility is to explicitly allow multiple _STOPs. Then I could
> > fix the driver to store the clock state. But I would prefer a generic
> > solution.
> 
> You're going to have to handle this anyway to be robust - what happens
> if it's the clock enable that fails?

If clk_enable fails, we can return an error code in the mxs-saif trigger
function. Of course the error handling is missing here, but it is not
necessary to store the state of the clock to handle errors properly.

Regards,

Markus
Mark Brown Oct. 8, 2013, 10:44 a.m. UTC | #5
On Tue, Oct 08, 2013 at 11:48:39AM +0200, Markus Pargmann wrote:
> On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:

> > I meant just unwinding the things done in the trigger call.

> You mean unwinding when we detect the error? If we return the
> error code from the trigger function, the pcm_native.c will still
> trigger a STOP command which is executed on all components.

Sure, but it's another way of getting consistency and does seem a bit
more obvious.

> > > Another possibility is to explicitly allow multiple _STOPs. Then I could
> > > fix the driver to store the clock state. But I would prefer a generic
> > > solution.

> > You're going to have to handle this anyway to be robust - what happens
> > if it's the clock enable that fails?

> If clk_enable fails, we can return an error code in the mxs-saif trigger
> function. Of course the error handling is missing here, but it is not
> necessary to store the state of the clock to handle errors properly.

Right, but then _STOP will be called without the clock having been
successfully enabled so you'll have an unbalanced clk_disable().
Markus Pargmann Oct. 9, 2013, 1:05 p.m. UTC | #6
On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:
> On Tue, Oct 08, 2013 at 11:48:39AM +0200, Markus Pargmann wrote:
> > On Tue, Oct 08, 2013 at 10:34:13AM +0100, Mark Brown wrote:
> 
> > > I meant just unwinding the things done in the trigger call.
> 
> > You mean unwinding when we detect the error? If we return the
> > error code from the trigger function, the pcm_native.c will still
> > trigger a STOP command which is executed on all components.
> 
> Sure, but it's another way of getting consistency and does seem a bit
> more obvious.

It is still not consistent as we get an additional STOP after cleanly
unwinding the START:

snd_pcm_do_start()
    soc_pcm_trigger() # START
        codec_dai->driver->ops->trigger() # START
	platform->driver->ops->trigger() # Assuming an error here

	unwind:
        codec_dai->driver->ops->trigger() # STOP

A failed snd_pcm_do_start() is detected, so undo_start is called:

snd_pcm_undo_start()
    soc_pcm_trigger() # STOP
        codec_dai->driver->ops->trigger() # STOP
	platform->driver->ops->trigger() # STOP
	cpu_dai->driver->ops->trigger() # STOP

Now all component trigger functions are called without a matching START
command.

> 
> > > > Another possibility is to explicitly allow multiple _STOPs. Then I could
> > > > fix the driver to store the clock state. But I would prefer a generic
> > > > solution.
> 
> > > You're going to have to handle this anyway to be robust - what happens
> > > if it's the clock enable that fails?
> 
> > If clk_enable fails, we can return an error code in the mxs-saif trigger
> > function. Of course the error handling is missing here, but it is not
> > necessary to store the state of the clock to handle errors properly.
> 
> Right, but then _STOP will be called without the clock having been
> successfully enabled so you'll have an unbalanced clk_disable().

_STOP is only called as long as the error handling is not working
properly. Otherwise STOP should not be called for a failed START.

Regards,

Markus
Mark Brown Oct. 9, 2013, 1:20 p.m. UTC | #7
On Wed, Oct 09, 2013 at 03:05:53PM +0200, Markus Pargmann wrote:
> On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:

> > Right, but then _STOP will be called without the clock having been
> > successfully enabled so you'll have an unbalanced clk_disable().

> _STOP is only called as long as the error handling is not working
> properly. Otherwise STOP should not be called for a failed START.

I'm not sure what you mean here.  How would the error be handled without
calling _STOP?
Markus Pargmann Oct. 9, 2013, 2:12 p.m. UTC | #8
On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:
> On Wed, Oct 09, 2013 at 03:05:53PM +0200, Markus Pargmann wrote:
> > On Tue, Oct 08, 2013 at 11:44:57AM +0100, Mark Brown wrote:
> 
> > > Right, but then _STOP will be called without the clock having been
> > > successfully enabled so you'll have an unbalanced clk_disable().
> 
> > _STOP is only called as long as the error handling is not working
> > properly. Otherwise STOP should not be called for a failed START.
> 
> I'm not sure what you mean here.  How would the error be handled without
> calling _STOP?

The START error should be handled in mxs_saif and there shouldn't be a
_STOP call for mxs_saif_trigger afterwards to handle the failed START.

Regards,

Markus
Mark Brown Oct. 9, 2013, 2:59 p.m. UTC | #9
On Wed, Oct 09, 2013 at 04:12:42PM +0200, Markus Pargmann wrote:
> On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:

> > I'm not sure what you mean here.  How would the error be handled without
> > calling _STOP?

> The START error should be handled in mxs_saif and there shouldn't be a
> _STOP call for mxs_saif_trigger afterwards to handle the failed START.

How would the framework know that the error has been handled - what
tells it that the error returned is for an error that has been handled
as opposed to an error that has not been handled?
Mark Brown Oct. 9, 2013, 6:22 p.m. UTC | #10
On Wed, Oct 09, 2013 at 04:31:16PM +0200, Markus Pargmann wrote:
> Store the state of the components to detect not matching START/STOP
> command pairs. Otherwise a component's trigger function may be called
> twice with the same command which can lead to problems.

> This is a patch which stores the state of platform, cpu_dai and codec_dai to
> call their trigger functions only if they are not already in the correct state.

No, this seems way too complex and fragile.  The original idea was
better though I'm still not convinced it actually fixes the problem you
think it fixes.
Markus Pargmann Oct. 10, 2013, 8:40 a.m. UTC | #11
On Wed, Oct 09, 2013 at 03:59:17PM +0100, Mark Brown wrote:
> On Wed, Oct 09, 2013 at 04:12:42PM +0200, Markus Pargmann wrote:
> > On Wed, Oct 09, 2013 at 02:20:04PM +0100, Mark Brown wrote:
> 
> > > I'm not sure what you mean here.  How would the error be handled without
> > > calling _STOP?
> 
> > The START error should be handled in mxs_saif and there shouldn't be a
> > _STOP call for mxs_saif_trigger afterwards to handle the failed START.
> 
> How would the framework know that the error has been handled - what
> tells it that the error returned is for an error that has been handled
> as opposed to an error that has not been handled?

I finally understand what you mean. Yes, RFCv1 fixes the behavior for
the components without _START failures. But it doesn't fix it for the
failing component.

For mxs-saif, we still have to store if the last START failed to handle
the following STOP correctly which is not very different from handling
double STOPs correctly. I don't have an idea how to fix RFCv1 without
using some state as in RFCv2. Perhaps we should simply fix it in
mxs-saif and other drivers then?

Regards,

Markus
Mark Brown Oct. 10, 2013, 12:35 p.m. UTC | #12
On Thu, Oct 10, 2013 at 10:40:40AM +0200, Markus Pargmann wrote:

> For mxs-saif, we still have to store if the last START failed to handle
> the following STOP correctly which is not very different from handling
> double STOPs correctly. I don't have an idea how to fix RFCv1 without
> using some state as in RFCv2. Perhaps we should simply fix it in
> mxs-saif and other drivers then?

I think we have to - it seems like it's going to be more trouble than
it's worth to try to keep track of this in the core, there seems to be
a bit of an assumption that the triggers can just be called and for
things that aren't refcounting it's probably more robust to run the
trigger extra times rather than adding a state machine that we might get
wrong.
diff mbox

Patch

diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 330c9a6..23fc25b 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -616,26 +616,28 @@  static int soc_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
 	struct snd_soc_platform *platform = rtd->platform;
 	struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
 	struct snd_soc_dai *codec_dai = rtd->codec_dai;
-	int ret;
+	int ret = 0;
+	int err = 0;
 
 	if (codec_dai->driver->ops->trigger) {
 		ret = codec_dai->driver->ops->trigger(substream, cmd, codec_dai);
-		if (ret < 0)
-			return ret;
+		if (!err && ret)
+			err = ret;
 	}
 
 	if (platform->driver->ops && platform->driver->ops->trigger) {
 		ret = platform->driver->ops->trigger(substream, cmd);
-		if (ret < 0)
-			return ret;
+		if (!err && ret)
+			err = ret;
 	}
 
 	if (cpu_dai->driver->ops->trigger) {
 		ret = cpu_dai->driver->ops->trigger(substream, cmd, cpu_dai);
-		if (ret < 0)
-			return ret;
+		if (!err && ret)
+			err = ret;
 	}
-	return 0;
+
+	return err;
 }
 
 static int soc_pcm_bespoke_trigger(struct snd_pcm_substream *substream,