[23/25] ALSA/dummy: Replace tasklet with softirq hrtimer

Message ID 20170901102537.8066-1-o-takashi@sakamocchi.jp (mailing list archive)
State New, archived

Commit Message

Takashi Sakamoto Sept. 1, 2017, 10:25 a.m. UTC
Hi,

On Sep 1 2017 00:36, Takashi Iwai wrote:
> I gave it a try, but it caused a kernel hang, unfortunately.
> 
> The reason is that snd_pcm_period_elapsed() may stop the stream
> (e.g. when reaching the end).  With this patchset, it'll lead to
> the call of hrtimer_cancel() from the hrtimer callback itself, thus it
> stalls.
 
I can reproduce this bug.

> Below is the additional fix over your patch for working around it.
> I believe it should cover most corner cases, and it seems to work fine
> in quick tests, so far.

This patch looks good to me, too. But I have an alternative.

We can use 'hrtimer_callback_running()' to detect whether we are in the
hrtimer callback or not (please read '__run_hrtimer()' in
'kernel/time/hrtimer.c'). Using this helper function in the .stop callback
to skip the cancellation avoids the stall. In this case, after the PCM
substream is stopped, the hrtimer callback should return HRTIMER_NORESTART
to avoid restarting, as in your patch.  Please test the patch in this message.
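
To illustrate the idea outside the kernel, here is a minimal user-space
sketch. It is not the driver code and does not use the real hrtimer API:
every 'model_*' name, the 'stop_in_callback' flag and the 'cancel_calls'
counter are hypothetical stand-ins. 'model_stop()' plays the role of the
.stop callback, and the call to it inside 'model_callback()' stands in for
'snd_pcm_period_elapsed()' stopping the stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real hrtimer API. */
enum model_restart { MODEL_NORESTART, MODEL_RESTART };

struct model_timer {
	struct model_timer *running;	/* points to the timer while its callback runs */
	bool pcm_running;		/* mirrors dpcm->running */
	bool stop_in_callback;		/* simulate the stream stopping in the callback */
	int cancel_calls;		/* how often the blocking cancel was issued */
};

/* Same idea as hrtimer_callback_running(): are we inside this timer's callback? */
static bool model_callback_running(const struct model_timer *t)
{
	return t->running == t;
}

/* Stand-in for hrtimer_cancel(); in the kernel this waits for the callback. */
static void model_cancel(struct model_timer *t)
{
	t->cancel_calls++;
}

/* Stand-in for the .stop callback: skip the cancel when inside the callback. */
static void model_stop(struct model_timer *t)
{
	t->pcm_running = false;
	if (!model_callback_running(t))
		model_cancel(t);
}

static enum model_restart model_callback(struct model_timer *t)
{
	if (!t->pcm_running)
		return MODEL_NORESTART;

	/* Stands in for snd_pcm_period_elapsed(), which may stop the stream. */
	if (t->stop_in_callback)
		model_stop(t);

	/* The cancel was skipped above, so the callback must not re-arm. */
	if (!t->pcm_running)
		return MODEL_NORESTART;

	return MODEL_RESTART;
}

/* Stand-in for the timer core: mark the timer as running around the callback. */
static enum model_restart model_run_callback(struct model_timer *t)
{
	enum model_restart ret;

	t->running = t;
	ret = model_callback(t);
	t->running = NULL;
	return ret;
}

/* Walks through both cases; aborts via assert() if the model misbehaves. */
static void model_demo(void)
{
	struct model_timer t = { .pcm_running = true, .stop_in_callback = true };

	/* Stop inside the callback: no cancel is issued, timer is not re-armed. */
	assert(model_run_callback(&t) == MODEL_NORESTART);
	assert(t.cancel_calls == 0);

	/* Stop from outside the callback: the cancel is issued as before. */
	t.pcm_running = true;
	model_stop(&t);
	assert(t.cancel_calls == 1);
}
```

Calling 'model_demo()' from a test program exercises both paths. The point
of the sketch is only the pairing: once .stop skips the cancel, the
callback itself has to return the no-restart value.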

> ---
> diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c
> index 273d60c42125..b5dd64e3dab1 100644
> --- a/sound/drivers/dummy.c
> +++ b/sound/drivers/dummy.c
> @@ -375,6 +375,7 @@ struct dummy_hrtimer_pcm {
>   	ktime_t base_time;
>   	ktime_t period_time;
>   	atomic_t running;
> +	atomic_t callback_running;
>   	struct hrtimer timer;
>   	struct snd_pcm_substream *substream;
>   };
> @@ -387,8 +388,15 @@ static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
>   	if (!atomic_read(&dpcm->running))
>   		return HRTIMER_NORESTART;
>   
> +	atomic_inc(&dpcm->callback_running);
>   	snd_pcm_period_elapsed(dpcm->substream);
> +	atomic_dec(&dpcm->callback_running);
> +	/* may be flipped during snd_pcm_period_elapsed() */
> +	if (!atomic_read(&dpcm->running))
> +		return HRTIMER_NORESTART;
> +
>   	hrtimer_forward_now(timer, dpcm->period_time);
> +	atomic_dec(&dpcm->callback_running);
>   	return HRTIMER_RESTART;
>   }
>   
> @@ -407,7 +415,9 @@ static int dummy_hrtimer_stop(struct snd_pcm_substream *substream)
>   	struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
>   
>   	atomic_set(&dpcm->running, 0);
> -	hrtimer_cancel(&dpcm->timer);
> +	/* issue hrtimer_cancel() only when called outside the callback */
> +	if (!atomic_read(&dpcm->callback_running))
> +		hrtimer_cancel(&dpcm->timer);
>   	return 0;
>   }
>   
> @@ -462,6 +472,7 @@ static int dummy_hrtimer_create(struct snd_pcm_substream *substream)
>   	dpcm->timer.function = dummy_hrtimer_callback;
>   	dpcm->substream = substream;
>   	atomic_set(&dpcm->running, 0);
> +	atomic_set(&dpcm->callback_running, 0);
>   	return 0;
>   }

From 07d61ba2a1c0e06e914443225e194d99f2d8c58d Mon Sep 17 00:00:00 2001
From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Date: Fri, 1 Sep 2017 19:10:18 +0900
Subject: [PATCH] ALSA: dummy: avoid stall due to a call of hrtimer_cancel() on
 a callback of hrtimer

Calling 'hrtimer_cancel()' from an hrtimer callback causes an endless loop
because 'struct hrtimer_clock_base.running' is not NULL during the callback.
The hrtimer subsystem uses this member to indicate the hrtimer instance
whose callback is running, and there's a helper function,
'hrtimer_callback_running()', to check it.

The ALSA dummy driver uses an hrtimer to emulate a hardware interrupt per
period of the PCM buffer. When XRUN occurs on a PCM substream, 'struct
snd_pcm_ops.stop()' is called within 'snd_pcm_period_elapsed()' to stop
the substream. The current implementation uses 'hrtimer_cancel()' to
wait for cancellation of the hrtimer. However, as described, this causes
an endless loop.

To solve this problem, this commit uses 'hrtimer_callback_running()' to
detect whether the code is running in an hrtimer callback, and skips
cancellation of the hrtimer in that case. Furthermore, in the XRUN case,
the hrtimer callback returns HRTIMER_NORESTART after the call of
'snd_pcm_period_elapsed()' to stop the hrtimer, because the cancellation
is skipped.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
---
 sound/drivers/dummy.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
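
For readers who want to see why the cancellation never finishes, the
following user-space sketch models the retry loop. It rests on my
understanding of 'kernel/time/hrtimer.c', namely that 'hrtimer_cancel()'
keeps retrying 'hrtimer_try_to_cancel()', which reports -1 while the
timer's callback is running. All 'sim_*' names are hypothetical, and the
retry bound exists only so that the sketch terminates; the kernel loop has
no such bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a timer; NOT the kernel's struct hrtimer. */
struct sim_timer {
	struct sim_timer *running;	/* points to the timer while its callback runs */
	int queued;			/* non-zero while the timer is armed */
};

/*
 * Modelled after the behaviour of hrtimer_try_to_cancel():
 *  -1: the callback is currently running, the timer cannot be cancelled now
 *   1: the timer was armed and has been removed
 *   0: the timer was not armed
 */
static int sim_try_to_cancel(struct sim_timer *t)
{
	if (t->running == t)
		return -1;
	if (t->queued) {
		t->queued = 0;
		return 1;
	}
	return 0;
}

/*
 * Retry loop in the style of hrtimer_cancel(), but bounded by max_tries.
 * Returns the last result and stores the number of attempts in *tries.
 */
static int sim_cancel_bounded(struct sim_timer *t, int max_tries, int *tries)
{
	int ret = -1;

	*tries = 0;
	while (*tries < max_tries) {
		(*tries)++;
		ret = sim_try_to_cancel(t);
		if (ret >= 0)
			break;
		/* The kernel would relax the CPU here and try again. */
	}
	return ret;
}

/* Compares a cancel from outside the callback with one from inside it. */
static void sim_demo(void)
{
	struct sim_timer t = { .running = NULL, .queued = 1 };
	int tries;

	/* Outside the callback: the first attempt succeeds. */
	assert(sim_cancel_bounded(&t, 1000, &tries) == 1);
	assert(tries == 1);

	/* Inside the callback: nothing clears 'running', so every attempt fails. */
	t.running = &t;
	assert(sim_cancel_bounded(&t, 1000, &tries) == -1);
	assert(tries == 1000);
}
```

Inside the callback, the only code that could clear 'running' is the
caller of the callback, which cannot proceed until the callback returns.
That is why the unbounded loop stalls.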

Comments

Takashi Iwai Sept. 1, 2017, 11:58 a.m. UTC | #1
On Fri, 01 Sep 2017 12:25:37 +0200,
Takashi Sakamoto wrote:
> 
> Hi,
> 
> On Sep 1 2017 00:36, Takashi Iwai wrote:
> > I gave it a try, but it caused a kernel hang, unfortunately.
> > 
> > The reason is that snd_pcm_period_elapsed() may stop the stream
> > (e.g. when reaching the end).  With this patchset, it'll lead to
> > the call of hrtimer_cancel() from the hrtimer callback itself, thus it
> > stalls.
>  
> I can reproduce this bug.
> 
> > Below is the additional fix over your patch for working around it.
> > I believe it should cover most corner cases, and it seems to work fine
> > in quick tests, so far.
> 
> This patch looks good to me, too. But I have an alternative.
> 
> We can use 'hrtimer_callback_running()' to detect whether we are in the
> hrtimer callback or not (please read '__run_hrtimer()' in
> 'kernel/time/hrtimer.c').

A good point, this is a better choice.

> Using this helper function in the .stop callback to skip the cancellation
> avoids the stall. In this case, after the PCM substream is stopped, the
> hrtimer callback should return HRTIMER_NORESTART to avoid restarting, as
> in your patch.  Please test the patch in this message.
> 
> > ---
> > diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c
> > index 273d60c42125..b5dd64e3dab1 100644
> > --- a/sound/drivers/dummy.c
> > +++ b/sound/drivers/dummy.c
> > @@ -375,6 +375,7 @@ struct dummy_hrtimer_pcm {
> >   	ktime_t base_time;
> >   	ktime_t period_time;
> >   	atomic_t running;
> > +	atomic_t callback_running;
> >   	struct hrtimer timer;
> >   	struct snd_pcm_substream *substream;
> >   };
> > @@ -387,8 +388,15 @@ static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
> >   	if (!atomic_read(&dpcm->running))
> >   		return HRTIMER_NORESTART;
> >   
> > +	atomic_inc(&dpcm->callback_running);
> >   	snd_pcm_period_elapsed(dpcm->substream);
> > +	atomic_dec(&dpcm->callback_running);
> > +	/* may be flipped during snd_pcm_period_elapsed() */
> > +	if (!atomic_read(&dpcm->running))
> > +		return HRTIMER_NORESTART;
> > +
> >   	hrtimer_forward_now(timer, dpcm->period_time);
> > +	atomic_dec(&dpcm->callback_running);
> >   	return HRTIMER_RESTART;
> >   }
> >   
> > @@ -407,7 +415,9 @@ static int dummy_hrtimer_stop(struct snd_pcm_substream *substream)
> >   	struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
> >   
> >   	atomic_set(&dpcm->running, 0);
> > -	hrtimer_cancel(&dpcm->timer);
> > +	/* issue hrtimer_cancel() only when called outside the callback */
> > +	if (!atomic_read(&dpcm->callback_running))
> > +		hrtimer_cancel(&dpcm->timer);
> >   	return 0;
> >   }
> >   
> > @@ -462,6 +472,7 @@ static int dummy_hrtimer_create(struct snd_pcm_substream *substream)
> >   	dpcm->timer.function = dummy_hrtimer_callback;
> >   	dpcm->substream = substream;
> >   	atomic_set(&dpcm->running, 0);
> > +	atomic_set(&dpcm->callback_running, 0);
> >   	return 0;
> >   }
> 
> From 07d61ba2a1c0e06e914443225e194d99f2d8c58d Mon Sep 17 00:00:00 2001
> From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
> Date: Fri, 1 Sep 2017 19:10:18 +0900
> Subject: [PATCH] ALSA: dummy: avoid stall due to a call of hrtimer_cancel() on
>  a callback of hrtimer
> 
> Calling 'hrtimer_cancel()' from an hrtimer callback causes an endless loop
> because 'struct hrtimer_clock_base.running' is not NULL during the callback.
> The hrtimer subsystem uses this member to indicate the hrtimer instance
> whose callback is running, and there's a helper function,
> 'hrtimer_callback_running()', to check it.
> 
> The ALSA dummy driver uses an hrtimer to emulate a hardware interrupt per
> period of the PCM buffer. When XRUN occurs on a PCM substream, 'struct
> snd_pcm_ops.stop()' is called within 'snd_pcm_period_elapsed()' to stop
> the substream. The current implementation uses 'hrtimer_cancel()' to
> wait for cancellation of the hrtimer. However, as described, this causes
> an endless loop.

It's not only about XRUN.  When the stream finishes draining, it
stops gracefully -- that is a perfectly normal operation.

> To solve this problem, this commit uses 'hrtimer_callback_running()' to
> detect whether the code is running in an hrtimer callback, and skips
> cancellation of the hrtimer in that case. Furthermore, in the XRUN case,
> the hrtimer callback returns HRTIMER_NORESTART after the call of
> 'snd_pcm_period_elapsed()' to stop the hrtimer, because the cancellation
> is skipped.
> 
> Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>

It's better to fold the fix into the original patch instead of
introducing a bug and fixing it.


Takashi
Takashi Sakamoto Sept. 2, 2017, 1:19 a.m. UTC | #2
On Sep 1 2017 20:58, Takashi Iwai wrote:
>> From 07d61ba2a1c0e06e914443225e194d99f2d8c58d Mon Sep 17 00:00:00 2001
>> From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
>> Date: Fri, 1 Sep 2017 19:10:18 +0900
>> Subject: [PATCH] ALSA: dummy: avoid stall due to a call of hrtimer_cancel() on
>>   a callback of hrtimer
>>
>> Calling 'hrtimer_cancel()' from an hrtimer callback causes an endless loop
>> because 'struct hrtimer_clock_base.running' is not NULL during the callback.
>> The hrtimer subsystem uses this member to indicate the hrtimer instance
>> whose callback is running, and there's a helper function,
>> 'hrtimer_callback_running()', to check it.
>>
>> The ALSA dummy driver uses an hrtimer to emulate a hardware interrupt per
>> period of the PCM buffer. When XRUN occurs on a PCM substream, 'struct
>> snd_pcm_ops.stop()' is called within 'snd_pcm_period_elapsed()' to stop
>> the substream. The current implementation uses 'hrtimer_cancel()' to
>> wait for cancellation of the hrtimer. However, as described, this causes
>> an endless loop.
> 
> It's not only about XRUN.  When the stream finishes draining, it
> stops gracefully -- that is a perfectly normal operation.

I overlooked it. Thanks for pointing it out.

>> To solve this problem, this commit uses 'hrtimer_callback_running()' to
>> detect whether the code is running in an hrtimer callback, and skips
>> cancellation of the hrtimer in that case. Furthermore, in the XRUN case,
>> the hrtimer callback returns HRTIMER_NORESTART after the call of
>> 'snd_pcm_period_elapsed()' to stop the hrtimer, because the cancellation
>> is skipped.
>>
>> Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
> 
> It's better to fold the fix into the original patch instead of
> introducing a bug and fixing it.

Yep. I'd ask the authors to include this fix.


Well, in the sound subsystem, there are a few drivers which use hrtimer:
  - snd-pcsp
  - snd-sh-dac-audio
  - snd-soc-imx-pcm-fiq

At a quick glance, 'snd-sh-dac-audio' has the same bug, too.
Additionally, 'snd-soc-imx-pcm-fiq' handles its hrtimer loosely with
respect to the state of the PCM substream, and it would gain the same bug
if that were improved. I'll post some patches for them later.


Thanks

Takashi Sakamoto
Takashi Sakamoto Sept. 4, 2017, 12:45 p.m. UTC | #3
Hi,

On Sep 2 2017 10:19, Takashi Sakamoto wrote:
> Well, in the sound subsystem, there are a few drivers which use hrtimer:
>   - snd-pcsp
>   - snd-sh-dac-audio
>   - snd-soc-imx-pcm-fiq
> 
> At a quick glance, 'snd-sh-dac-audio' has the same bug, too.
> Additionally, 'snd-soc-imx-pcm-fiq' handles its hrtimer loosely with
> respect to the state of the PCM substream, and it would gain the same bug
> if that were improved. I'll post some patches for them later.

After reading the code thoroughly, I conclude that there is no need to fix
these two drivers. They have their own protections. The former
(snd-sh-dac-audio) has 'struct snd_sh_dac.empty' and the latter
(snd-soc-imx-pcm-fiq) has 'struct imx_pcm_runtime_data.playing' and
'.capturing', to avoid cancellation of the hrtimer in the hrtimer callback.

These approaches are not necessarily efficient, but they cause no trouble
in practice. I'll leave them as is.


Regards

Takashi Sakamoto

Patch

diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c
index 273d60c42125..9caf754c6135 100644
--- a/sound/drivers/dummy.c
+++ b/sound/drivers/dummy.c
@@ -387,7 +387,11 @@  static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
 	if (!atomic_read(&dpcm->running))
 		return HRTIMER_NORESTART;
 
+	/* In a case of XRUN, this calls .trigger to stop PCM substream. */
 	snd_pcm_period_elapsed(dpcm->substream);
+	if (!atomic_read(&dpcm->running))
+		return HRTIMER_NORESTART;
+
 	hrtimer_forward_now(timer, dpcm->period_time);
 	return HRTIMER_RESTART;
 }
@@ -407,7 +411,8 @@  static int dummy_hrtimer_stop(struct snd_pcm_substream *substream)
 	struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
 
 	atomic_set(&dpcm->running, 0);
-	hrtimer_cancel(&dpcm->timer);
+	if (!hrtimer_callback_running(&dpcm->timer))
+		hrtimer_cancel(&dpcm->timer);
 	return 0;
 }