diff mbox

sound: deadlock involving snd_hrtimer_callback

Message ID s5hpotgqbs5.wl-tiwai@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Takashi Iwai April 23, 2016, 9:02 p.m. UTC
On Sat, 23 Apr 2016 15:40:21 +0200,
Dmitry Vyukov wrote:
> 
> Hi Takashi,
> 
> I've incorporated your hrtimer fixes (but also updated to
> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of
> the following deadlock messages:
> 
> 
> [ INFO: possible circular locking dependency detected ]
> 4.6.0-rc4+ #351 Not tainted
> -------------------------------------------------------
> swapper/0/0 is trying to acquire lock:
>  (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>]
> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
> 
> but task is already holding lock:
>  (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>]
> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
> 
> which lock already depends on the new lock.

Oh crap, my second patch is buggy, it leads to ABBA lock, indeed.
The first patch is still OK, as it just adds a new behavior mode.

Could you replace the second patch with the below one?


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: hrtimer: Use manual start/stop in callback (v2)

With the new SNDRV_TIMER_HW_RET_CTRL flag, hrtimer can manage the
callback behavior more correctly.  Now it gets a return value from
snd_timer_interrupt() whether to reprogram or stop the timer, and it
can choose the right return value.

Thanks to this change, we can revert the long-standing hack to work
around the hrtimer_cancel() call at stopping.  Since the stop is
called only explicitly outside the callback, hrtimer_cancel() can be
called safely now.

This fix essentially reverts the old commit [fcfdebe70759: ALSA:
hrtimer - Fix lock-up] while converting to use the new framework with
SNDRV_TIMER_HW_RET_CTRL.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/hrtimer.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

Comments

Dmitry Vyukov April 24, 2016, 3:16 p.m. UTC | #1
On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Sat, 23 Apr 2016 15:40:21 +0200,
> Dmitry Vyukov wrote:
>>
>> Hi Takashi,
>>
>> I've incorporated your hrtimer fixes (but also updated to
>> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of
>> the following deadlock messages:
>>
>>
>> [ INFO: possible circular locking dependency detected ]
>> 4.6.0-rc4+ #351 Not tainted
>> -------------------------------------------------------
>> swapper/0/0 is trying to acquire lock:
>>  (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>]
>> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
>>
>> but task is already holding lock:
>>  (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>]
>> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
>>
>> which lock already depends on the new lock.
>
> Oh crap, my second patch is buggy, it leads to ABBA lock, indeed.
> The first patch is still OK, as it just adds a new behavior mode.
>
> Could you replace the second patch with the below one?


I've replaced the second path with this one. The deadlocks has gone,
but I've hit these two hangs that look related:

https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/raw/20891f7e87fe9af3967565559d465d296469244b/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/raw/3fc22f24f7bab5941e47bab604f96487b5f1944d/gistfile1.txt
Dmitry Vyukov April 24, 2016, 3:22 p.m. UTC | #2
On Sun, Apr 24, 2016 at 5:16 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> On Sat, 23 Apr 2016 15:40:21 +0200,
>> Dmitry Vyukov wrote:
>>>
>>> Hi Takashi,
>>>
>>> I've incorporated your hrtimer fixes (but also updated to
>>> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of
>>> the following deadlock messages:
>>>
>>>
>>> [ INFO: possible circular locking dependency detected ]
>>> 4.6.0-rc4+ #351 Not tainted
>>> -------------------------------------------------------
>>> swapper/0/0 is trying to acquire lock:
>>>  (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>]
>>> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
>>>
>>> but task is already holding lock:
>>>  (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>]
>>> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
>>>
>>> which lock already depends on the new lock.
>>
>> Oh crap, my second patch is buggy, it leads to ABBA lock, indeed.
>> The first patch is still OK, as it just adds a new behavior mode.
>>
>> Could you replace the second patch with the below one?
>
>
> I've replaced the second path with this one. The deadlocks has gone,
> but I've hit these two hangs that look related:
>
> https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/raw/20891f7e87fe9af3967565559d465d296469244b/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/raw/3fc22f24f7bab5941e47bab604f96487b5f1944d/gistfile1.txt

Another one:

[  509.169244] BUG: spinlock lockup suspected on CPU#2, syz-executor/23652
[  509.169244]  lock: 0xffff880065662038, .magic: dead4ead, .owner:
syz-executor/23642, .owner_cpu: 0
[  509.169244] CPU: 2 PID: 23652 Comm: syz-executor Not tainted 4.6.0-rc4+ #352
[  509.169244] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[  509.169244]  ffffffff87eb25c0 ffff88006d407c58 ffffffff82c8fe0f
ffffffff5f369e98
[  509.169244]  fffffbfff0fd64b8 ffff880065662038 ffff88005f369840
ffff88005e806040
[  509.169244]  ffff880065662040 000000009a9d0630 ffff88006d407c90
ffffffff8147f75d
[  509.169244] Call Trace:
[  509.169244]  <IRQ>  [<ffffffff82c8fe0f>] dump_stack+0x12e/0x18f
[  509.169244]  [<ffffffff8147f75d>] spin_dump+0x14d/0x280
[  509.169244]  [<ffffffff8537ad39>] ? snd_timer_interrupt+0xa9/0xd30
[  509.169244]  [<ffffffff8147fa7d>] do_raw_spin_lock+0x15d/0x2b0
[  509.169244]  [<ffffffff867c04e7>] _raw_spin_lock_irqsave+0xa7/0xd0
[  509.169244]  [<ffffffff8537ad39>] ? snd_timer_interrupt+0xa9/0xd30
[  509.169244]  [<ffffffff8537ad39>] snd_timer_interrupt+0xa9/0xd30
[  509.169244]  [<ffffffff8127c405>] ? kvm_clock_get_cycles+0x25/0x30
[  509.169244]  [<ffffffff814e53ba>] ? ktime_get+0x32a/0x550
[  509.169244]  [<ffffffff853842e7>] ? snd_hrtimer_callback+0xb7/0x180
[  509.169244]  [<ffffffff814cff2c>] ? hrtimer_forward+0x1cc/0x3e0
[  509.169244]  [<ffffffff85384338>] snd_hrtimer_callback+0x108/0x180
[  509.169244]  [<ffffffff814d1241>] __hrtimer_run_queues+0x331/0xe90
[  509.169244]  [<ffffffff85384230>] ? snd_hrtimer_close+0xa0/0xa0
[  509.169244]  [<ffffffff814d0f10>] ? enqueue_hrtimer+0x3d0/0x3d0
[  509.169244]  [<ffffffff814d3c12>] hrtimer_interrupt+0x182/0x430
[  509.169244]  [<ffffffff8125aa62>] local_apic_timer_interrupt+0x72/0xe0
[  509.169244]  [<ffffffff867c31d9>] smp_apic_timer_interrupt+0x79/0xa0
[  509.169244]  [<ffffffff867c152c>] apic_timer_interrupt+0x8c/0xa0
[  509.169244]  <EOI>  [<ffffffff8171902e>] ? unmap_page_range+0x8be/0x19f0
Takashi Iwai April 24, 2016, 4:16 p.m. UTC | #3
On Sun, 24 Apr 2016 17:16:32 +0200,
Dmitry Vyukov wrote:
> 
> On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > On Sat, 23 Apr 2016 15:40:21 +0200,
> > Dmitry Vyukov wrote:
> >>
> >> Hi Takashi,
> >>
> >> I've incorporated your hrtimer fixes (but also updated to
> >> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of
> >> the following deadlock messages:
> >>
> >>
> >> [ INFO: possible circular locking dependency detected ]
> >> 4.6.0-rc4+ #351 Not tainted
> >> -------------------------------------------------------
> >> swapper/0/0 is trying to acquire lock:
> >>  (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>]
> >> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
> >>
> >> but task is already holding lock:
> >>  (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>]
> >> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
> >>
> >> which lock already depends on the new lock.
> >
> > Oh crap, my second patch is buggy, it leads to ABBA lock, indeed.
> > The first patch is still OK, as it just adds a new behavior mode.
> >
> > Could you replace the second patch with the below one?
> 
> 
> I've replaced the second path with this one. The deadlocks has gone,
> but I've hit these two hangs that look related:
> 
> https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/raw/20891f7e87fe9af3967565559d465d296469244b/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/raw/3fc22f24f7bab5941e47bab604f96487b5f1944d/gistfile1.txt

Hmm, so it wasn't a good idea to call hrtimer_cancel() in the
spinlock, in anyway.  Scratch the previous one.

OK, below is the yet revised two patches.  One is the simplified
version of the patch, and another is to call hrtimer_cancel() in a new
timer op without spinlock.  Apply these after the first patch
"ALSA: timer: Allow backend disabling start/stop from handler".


thanks,

Takashi
Dmitry Vyukov April 24, 2016, 5:09 p.m. UTC | #4
On Sun, Apr 24, 2016 at 6:16 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Sun, 24 Apr 2016 17:16:32 +0200,
> Dmitry Vyukov wrote:
>>
>> On Sat, Apr 23, 2016 at 11:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> > On Sat, 23 Apr 2016 15:40:21 +0200,
>> > Dmitry Vyukov wrote:
>> >>
>> >> Hi Takashi,
>> >>
>> >> I've incorporated your hrtimer fixes (but also updated to
>> >> ddce192106e4f984123884f8e878f66ace94b573) and now I am seeing lots of
>> >> the following deadlock messages:
>> >>
>> >>
>> >> [ INFO: possible circular locking dependency detected ]
>> >> 4.6.0-rc4+ #351 Not tainted
>> >> -------------------------------------------------------
>> >> swapper/0/0 is trying to acquire lock:
>> >>  (&(&timer->lock)->rlock){-.-...}, at: [<ffffffff8537a749>]
>> >> snd_timer_interrupt+0xa9/0xd30 sound/core/timer.c:701
>> >>
>> >> but task is already holding lock:
>> >>  (&(&stime->lock)->rlock){-.....}, at: [<ffffffff85383d3f>]
>> >> snd_hrtimer_callback+0x4f/0x2b0 sound/core/hrtimer.c:54
>> >>
>> >> which lock already depends on the new lock.
>> >
>> > Oh crap, my second patch is buggy, it leads to ABBA lock, indeed.
>> > The first patch is still OK, as it just adds a new behavior mode.
>> >
>> > Could you replace the second patch with the below one?
>>
>>
>> I've replaced the second path with this one. The deadlocks has gone,
>> but I've hit these two hangs that look related:
>>
>> https://gist.githubusercontent.com/dvyukov/805718ea249c49d17ae759d1b0160684/raw/20891f7e87fe9af3967565559d465d296469244b/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/7f397ea4aeb9e35596e0c8053cf35a11/raw/3fc22f24f7bab5941e47bab604f96487b5f1944d/gistfile1.txt
>
> Hmm, so it wasn't a good idea to call hrtimer_cancel() in the
> spinlock, in anyway.  Scratch the previous one.
>
> OK, below is the yet revised two patches.  One is the simplified
> version of the patch, and another is to call hrtimer_cancel() in a new
> timer op without spinlock.  Apply these after the first patch
> "ALSA: timer: Allow backend disabling start/stop from handler".

Done. I will let you know if I see any failures.
diff mbox

Patch

diff --git a/sound/core/hrtimer.c b/sound/core/hrtimer.c
index 656d9a9032dc..27e2493c94e4 100644
--- a/sound/core/hrtimer.c
+++ b/sound/core/hrtimer.c
@@ -38,7 +38,6 @@  static unsigned int resolution;
 struct snd_hrtimer {
 	struct snd_timer *timer;
 	struct hrtimer hrt;
-	atomic_t running;
 };
 
 static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt)
@@ -46,14 +45,11 @@  static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt)
 	struct snd_hrtimer *stime = container_of(hrt, struct snd_hrtimer, hrt);
 	struct snd_timer *t = stime->timer;
 	unsigned long oruns;
-
-	if (!atomic_read(&stime->running))
-		return HRTIMER_NORESTART;
+	int ret;
 
 	oruns = hrtimer_forward_now(hrt, ns_to_ktime(t->sticks * resolution));
-	snd_timer_interrupt(stime->timer, t->sticks * oruns);
-
-	if (!atomic_read(&stime->running))
+	ret = snd_timer_interrupt(stime->timer, t->sticks * oruns);
+	if (ret == SNDRV_TIMER_RET_STOP)
 		return HRTIMER_NORESTART;
 	return HRTIMER_RESTART;
 }
@@ -68,7 +64,6 @@  static int snd_hrtimer_open(struct snd_timer *t)
 	hrtimer_init(&stime->hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	stime->timer = t;
 	stime->hrt.function = snd_hrtimer_callback;
-	atomic_set(&stime->running, 0);
 	t->private_data = stime;
 	return 0;
 }
@@ -89,24 +84,22 @@  static int snd_hrtimer_start(struct snd_timer *t)
 {
 	struct snd_hrtimer *stime = t->private_data;
 
-	atomic_set(&stime->running, 0);
-	hrtimer_try_to_cancel(&stime->hrt);
 	hrtimer_start(&stime->hrt, ns_to_ktime(t->sticks * resolution),
 		      HRTIMER_MODE_REL);
-	atomic_set(&stime->running, 1);
 	return 0;
 }
 
 static int snd_hrtimer_stop(struct snd_timer *t)
 {
 	struct snd_hrtimer *stime = t->private_data;
-	atomic_set(&stime->running, 0);
-	hrtimer_try_to_cancel(&stime->hrt);
+
+	hrtimer_cancel(&stime->hrt);
 	return 0;
 }
 
 static struct snd_timer_hardware hrtimer_hw = {
-	.flags =	SNDRV_TIMER_HW_AUTO | SNDRV_TIMER_HW_TASKLET,
+	.flags =	SNDRV_TIMER_HW_AUTO | SNDRV_TIMER_HW_TASKLET |
+			SNDRV_TIMER_HW_RET_CTRL,
 	.open =		snd_hrtimer_open,
 	.close =	snd_hrtimer_close,
 	.start =	snd_hrtimer_start,