diff mbox

sound: divide by 0 in snd_hrtimer_callback (or hang)

Message ID CACT4Y+Yw_05S47dEYLWbwMCJvfB4cb9Mdb3q1qOYLyKfJMK1bA@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dmitry Vyukov Sept. 3, 2016, 12:35 p.m. UTC
Hello,

The following program causes either division error or hangs kernel:

https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt


divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003c9b2280 task.stack: ffff880027280000
RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
include/linux/ktime.h:195
RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
 ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
 ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
Call Trace:
 <IRQ>
 [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
 [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
kernel/time/hrtimer.c:1302
 [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
 [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
arch/x86/kernel/apic/apic.c:933
 [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
arch/x86/kernel/apic/apic.c:957
 [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
arch/x86/entry/entry_64.S:487
 <EOI>
 [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
 [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
 [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
 [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
 [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
 [<     inline     >] zap_pte_range mm/memory.c:1170
 [<     inline     >] zap_pmd_range mm/memory.c:1257
 [<     inline     >] zap_pud_range mm/memory.c:1278
 [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
 [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
 [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
 [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
 [<     inline     >] __mmput kernel/fork.c:770
 [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
 [<     inline     >] exit_mm kernel/exit.c:512
 [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
 [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
 [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
 [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
 [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
arch/x86/entry/common.c:163
 [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
 [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
 [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
 [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
sound/core/hrtimer.c:62
 RSP <ffff88003ed07dd8>
---[ end trace e1a6d1e5a6136d50 ]---


or:

INFO: rcu_sched detected stalls on CPUs/tasks:
0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
(detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
Task dump for CPU 0:
a.out           R  running task    28848 31756   4047 0x0000000a
 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
 ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
Call Trace:
 [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
 [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
 [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
 [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
 [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
 [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
 [<     inline     >] vfs_ioctl fs/ioctl.c:43
 [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
 [<     inline     >] SYSC_ioctl fs/ioctl.c:690
 [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
 [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:208



I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
the following diff in sound/core/timer.c (one is a fix for another
division by 0):

        INIT_LIST_HEAD(&timer->active_list_head);
@@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
*file, char __user *buffer,
                tu->qused--;
                spin_unlock_irq(&tu->qlock);

+               mutex_lock(&tu->ioctl_lock);
                if (tu->tread) {
                        if (copy_to_user(buffer, &tu->tqueue[qhead],
                                         sizeof(struct snd_timer_tread)))
@@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
*file, char __user *buffer,
                                         sizeof(struct snd_timer_read)))
                                err = -EFAULT;
                }
+               mutex_unlock(&tu->ioctl_lock);

                spin_lock_irq(&tu->qlock);
                if (err < 0)

Comments

Takashi Iwai Sept. 6, 2016, 1 p.m. UTC | #1
On Sat, 03 Sep 2016 14:35:15 +0200,
Dmitry Vyukov wrote:
> 
> Hello,
> 
> The following program causes either division error or hangs kernel:
> 
> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt
> 
> 
> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff88003c9b2280 task.stack: ffff880027280000
> RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
> include/linux/ktime.h:195
> RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
> RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
> FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> Stack:
>  0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
>  ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
>  ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
> Call Trace:
>  <IRQ>
>  [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
>  [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
> kernel/time/hrtimer.c:1302
>  [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
>  [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
> arch/x86/kernel/apic/apic.c:933
>  [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
> arch/x86/kernel/apic/apic.c:957
>  [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
> arch/x86/entry/entry_64.S:487
>  <EOI>
>  [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
>  [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
>  [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
>  [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
>  [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
>  [<     inline     >] zap_pte_range mm/memory.c:1170
>  [<     inline     >] zap_pmd_range mm/memory.c:1257
>  [<     inline     >] zap_pud_range mm/memory.c:1278
>  [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
>  [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
>  [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
>  [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
>  [<     inline     >] __mmput kernel/fork.c:770
>  [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
>  [<     inline     >] exit_mm kernel/exit.c:512
>  [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
>  [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
>  [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
>  [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
>  [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
> arch/x86/entry/common.c:163
>  [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
>  [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
>  [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
> RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
> RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
> sound/core/hrtimer.c:62
>  RSP <ffff88003ed07dd8>
> ---[ end trace e1a6d1e5a6136d50 ]---
> 
> 
> or:
> 
> INFO: rcu_sched detected stalls on CPUs/tasks:
> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
> Task dump for CPU 0:
> a.out           R  running task    28848 31756   4047 0x0000000a
>  0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
>  ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
>  0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
> Call Trace:
>  [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
>  [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
>  [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
>  [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
>  [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
>  [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
>  [<     inline     >] vfs_ioctl fs/ioctl.c:43
>  [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
>  [<     inline     >] SYSC_ioctl fs/ioctl.c:690
>  [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
>  [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
> arch/x86/entry/entry_64.S:208
> 
> 
> 
> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
> the following diff in sound/core/timer.c (one is a fix for another
> division by 0):
> 
> diff --git a/sound/core/timer.c b/sound/core/timer.c
> index 9a6157e..3e55c6d 100644
> --- a/sound/core/timer.c
> +++ b/sound/core/timer.c
> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id,
> struct snd_timer_id *tid,
>         timer->tmr_subdevice = tid->subdevice;
>         if (id)
>                 strlcpy(timer->id, id, sizeof(timer->id));
> +       timer->sticks = 1;
>         INIT_LIST_HEAD(&timer->device_list);
>         INIT_LIST_HEAD(&timer->open_list_head);
>         INIT_LIST_HEAD(&timer->active_list_head);
> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
> *file, char __user *buffer,
>                 tu->qused--;
>                 spin_unlock_irq(&tu->qlock);
> 
> +               mutex_lock(&tu->ioctl_lock);
>                 if (tu->tread) {
>                         if (copy_to_user(buffer, &tu->tqueue[qhead],
>                                          sizeof(struct snd_timer_tread)))
> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
> *file, char __user *buffer,
>                                          sizeof(struct snd_timer_read)))
>                                 err = -EFAULT;
>                 }
> +               mutex_unlock(&tu->ioctl_lock);
> 
>                 spin_lock_irq(&tu->qlock);
>                 if (err < 0)
> 

I couldn't trigger the bug any longer after merging for-linus branch
of sound.git tree:
  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git

But I thought it got fixed by sticks=1 in the above.  Maybe it's a
combination, as it contains yet another timer fix by Vegard.

The pull request will be sent in this week for 4.8-rc5.


thanks,

Takashi
Dmitry Vyukov Sept. 6, 2016, 1:09 p.m. UTC | #2
On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Sat, 03 Sep 2016 14:35:15 +0200,
> Dmitry Vyukov wrote:
>>
>> Hello,
>>
>> The following program causes either division error or hangs kernel:
>>
>> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt
>>
>>
>> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> task: ffff88003c9b2280 task.stack: ffff880027280000
>> RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
>> include/linux/ktime.h:195
>> RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
>> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
>> RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
>> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
>> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
>> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
>> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
>> FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
>> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>> Stack:
>>  0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
>>  ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
>>  ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
>> Call Trace:
>>  <IRQ>
>>  [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
>>  [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
>> kernel/time/hrtimer.c:1302
>>  [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
>>  [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
>> arch/x86/kernel/apic/apic.c:933
>>  [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
>> arch/x86/kernel/apic/apic.c:957
>>  [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
>> arch/x86/entry/entry_64.S:487
>>  <EOI>
>>  [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
>>  [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
>>  [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
>>  [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
>>  [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
>>  [<     inline     >] zap_pte_range mm/memory.c:1170
>>  [<     inline     >] zap_pmd_range mm/memory.c:1257
>>  [<     inline     >] zap_pud_range mm/memory.c:1278
>>  [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
>>  [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
>>  [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
>>  [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
>>  [<     inline     >] __mmput kernel/fork.c:770
>>  [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
>>  [<     inline     >] exit_mm kernel/exit.c:512
>>  [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
>>  [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
>>  [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
>>  [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
>>  [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
>> arch/x86/entry/common.c:163
>>  [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
>>  [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
>>  [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
>>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
>> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
>> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
>> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
>> RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
>> RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
>> sound/core/hrtimer.c:62
>>  RSP <ffff88003ed07dd8>
>> ---[ end trace e1a6d1e5a6136d50 ]---
>>
>>
>> or:
>>
>> INFO: rcu_sched detected stalls on CPUs/tasks:
>> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
>> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
>> Task dump for CPU 0:
>> a.out           R  running task    28848 31756   4047 0x0000000a
>>  0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
>>  ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
>>  0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
>> Call Trace:
>>  [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
>>  [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
>>  [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
>>  [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
>>  [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
>>  [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
>>  [<     inline     >] vfs_ioctl fs/ioctl.c:43
>>  [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
>>  [<     inline     >] SYSC_ioctl fs/ioctl.c:690
>>  [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
>>  [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
>> arch/x86/entry/entry_64.S:208
>>
>>
>>
>> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
>> the following diff in sound/core/timer.c (one is a fix for another
>> division by 0):
>>
>> diff --git a/sound/core/timer.c b/sound/core/timer.c
>> index 9a6157e..3e55c6d 100644
>> --- a/sound/core/timer.c
>> +++ b/sound/core/timer.c
>> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id,
>> struct snd_timer_id *tid,
>>         timer->tmr_subdevice = tid->subdevice;
>>         if (id)
>>                 strlcpy(timer->id, id, sizeof(timer->id));
>> +       timer->sticks = 1;
>>         INIT_LIST_HEAD(&timer->device_list);
>>         INIT_LIST_HEAD(&timer->open_list_head);
>>         INIT_LIST_HEAD(&timer->active_list_head);
>> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
>> *file, char __user *buffer,
>>                 tu->qused--;
>>                 spin_unlock_irq(&tu->qlock);
>>
>> +               mutex_lock(&tu->ioctl_lock);
>>                 if (tu->tread) {
>>                         if (copy_to_user(buffer, &tu->tqueue[qhead],
>>                                          sizeof(struct snd_timer_tread)))
>> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
>> *file, char __user *buffer,
>>                                          sizeof(struct snd_timer_read)))
>>                                 err = -EFAULT;
>>                 }
>> +               mutex_unlock(&tu->ioctl_lock);
>>
>>                 spin_lock_irq(&tu->qlock);
>>                 if (err < 0)
>>
>
> I couldn't trigger the bug any longer after merging for-linus branch
> of sound.git tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
>
> But I thought it got fixed by sticks=1 in the above.  Maybe it's a
> combination, as it contains yet another timer fix by Vegard.


You mean that my local changes are a combination of two patches? But I
would not expect that addition of mutex_lock/unlock can affect
division by 0.
Or you mean that there some other fix by Vegard? What is it?




> The pull request will be sent in this week for 4.8-rc5.
Takashi Iwai Sept. 6, 2016, 1:43 p.m. UTC | #3
On Tue, 06 Sep 2016 15:09:25 +0200,
Dmitry Vyukov wrote:
> 
> On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > On Sat, 03 Sep 2016 14:35:15 +0200,
> > Dmitry Vyukov wrote:
> >>
> >> Hello,
> >>
> >> The following program causes either division error or hangs kernel:
> >>
> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt
> >>
> >>
> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> >> Dumping ftrace buffer:
> >>    (ftrace buffer empty)
> >> Modules linked in:
> >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >> task: ffff88003c9b2280 task.stack: ffff880027280000
> >> RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
> >> include/linux/ktime.h:195
> >> RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
> >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
> >> RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
> >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
> >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
> >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
> >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
> >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
> >> FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
> >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> >> Stack:
> >>  0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
> >>  ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
> >>  ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
> >> Call Trace:
> >>  <IRQ>
> >>  [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
> >>  [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
> >> kernel/time/hrtimer.c:1302
> >>  [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
> >>  [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
> >> arch/x86/kernel/apic/apic.c:933
> >>  [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
> >> arch/x86/kernel/apic/apic.c:957
> >>  [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
> >> arch/x86/entry/entry_64.S:487
> >>  <EOI>
> >>  [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
> >>  [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
> >>  [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
> >>  [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
> >>  [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
> >>  [<     inline     >] zap_pte_range mm/memory.c:1170
> >>  [<     inline     >] zap_pmd_range mm/memory.c:1257
> >>  [<     inline     >] zap_pud_range mm/memory.c:1278
> >>  [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
> >>  [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
> >>  [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
> >>  [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
> >>  [<     inline     >] __mmput kernel/fork.c:770
> >>  [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
> >>  [<     inline     >] exit_mm kernel/exit.c:512
> >>  [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
> >>  [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
> >>  [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
> >>  [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
> >>  [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
> >> arch/x86/entry/common.c:163
> >>  [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
> >>  [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
> >>  [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
> >>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
> >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
> >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
> >> RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
> >> RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
> >> sound/core/hrtimer.c:62
> >>  RSP <ffff88003ed07dd8>
> >> ---[ end trace e1a6d1e5a6136d50 ]---
> >>
> >>
> >> or:
> >>
> >> INFO: rcu_sched detected stalls on CPUs/tasks:
> >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
> >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
> >> Task dump for CPU 0:
> >> a.out           R  running task    28848 31756   4047 0x0000000a
> >>  0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
> >>  ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
> >>  0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
> >> Call Trace:
> >>  [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
> >>  [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
> >>  [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
> >>  [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
> >>  [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
> >>  [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
> >>  [<     inline     >] vfs_ioctl fs/ioctl.c:43
> >>  [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
> >>  [<     inline     >] SYSC_ioctl fs/ioctl.c:690
> >>  [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
> >>  [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
> >> arch/x86/entry/entry_64.S:208
> >>
> >>
> >>
> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
> >> the following diff in sound/core/timer.c (one is a fix for another
> >> division by 0):
> >>
> >> diff --git a/sound/core/timer.c b/sound/core/timer.c
> >> index 9a6157e..3e55c6d 100644
> >> --- a/sound/core/timer.c
> >> +++ b/sound/core/timer.c
> >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id,
> >> struct snd_timer_id *tid,
> >>         timer->tmr_subdevice = tid->subdevice;
> >>         if (id)
> >>                 strlcpy(timer->id, id, sizeof(timer->id));
> >> +       timer->sticks = 1;
> >>         INIT_LIST_HEAD(&timer->device_list);
> >>         INIT_LIST_HEAD(&timer->open_list_head);
> >>         INIT_LIST_HEAD(&timer->active_list_head);
> >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
> >> *file, char __user *buffer,
> >>                 tu->qused--;
> >>                 spin_unlock_irq(&tu->qlock);
> >>
> >> +               mutex_lock(&tu->ioctl_lock);
> >>                 if (tu->tread) {
> >>                         if (copy_to_user(buffer, &tu->tqueue[qhead],
> >>                                          sizeof(struct snd_timer_tread)))
> >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
> >> *file, char __user *buffer,
> >>                                          sizeof(struct snd_timer_read)))
> >>                                 err = -EFAULT;
> >>                 }
> >> +               mutex_unlock(&tu->ioctl_lock);
> >>
> >>                 spin_lock_irq(&tu->qlock);
> >>                 if (err < 0)
> >>
> >
> > I couldn't trigger the bug any longer after merging for-linus branch
> > of sound.git tree:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
> >
> > But I thought it got fixed by sticks=1 in the above.  Maybe it's a
> > combination, as it contains yet another timer fix by Vegard.
> 
> 
> You mean that my local changes are a combination of two patches? But I
> would not expect that addition of mutex_lock/unlock can affect
> division by 0.
> Or you mean that there some other fix by Vegard? What is it?

There is another fix
  8ddc05638ee42b18ba4fe99b5fb647fa3ad20456
    ALSA: timer: fix NULL pointer dereference on memory allocation failure
But it's likely irrelevant.

I retested now, and could see the latter stack trace, the rcu tall,
while running the program, although it doesn't lead to a complete
freeze and the system is restored once after the program quits.

Do you really see the zero-division error with sticks=1 fix?  That's
unexpected.


Takashi
Dmitry Vyukov Sept. 6, 2016, 2:06 p.m. UTC | #4
On Tue, Sep 6, 2016 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Tue, 06 Sep 2016 15:09:25 +0200,
> Dmitry Vyukov wrote:
>>
>> On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> > On Sat, 03 Sep 2016 14:35:15 +0200,
>> > Dmitry Vyukov wrote:
>> >>
>> >> Hello,
>> >>
>> >> The following program causes either division error or hangs kernel:
>> >>
>> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt
>> >>
>> >>
>> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
>> >> Dumping ftrace buffer:
>> >>    (ftrace buffer empty)
>> >> Modules linked in:
>> >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
>> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> >> task: ffff88003c9b2280 task.stack: ffff880027280000
>> >> RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
>> >> include/linux/ktime.h:195
>> >> RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
>> >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
>> >> RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
>> >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
>> >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
>> >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
>> >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
>> >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
>> >> FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
>> >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
>> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>> >> Stack:
>> >>  0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
>> >>  ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
>> >>  ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
>> >> Call Trace:
>> >>  <IRQ>
>> >>  [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
>> >>  [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
>> >> kernel/time/hrtimer.c:1302
>> >>  [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
>> >>  [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
>> >> arch/x86/kernel/apic/apic.c:933
>> >>  [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
>> >> arch/x86/kernel/apic/apic.c:957
>> >>  [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
>> >> arch/x86/entry/entry_64.S:487
>> >>  <EOI>
>> >>  [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
>> >>  [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
>> >>  [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
>> >>  [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
>> >>  [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
>> >>  [<     inline     >] zap_pte_range mm/memory.c:1170
>> >>  [<     inline     >] zap_pmd_range mm/memory.c:1257
>> >>  [<     inline     >] zap_pud_range mm/memory.c:1278
>> >>  [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
>> >>  [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
>> >>  [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
>> >>  [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
>> >>  [<     inline     >] __mmput kernel/fork.c:770
>> >>  [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
>> >>  [<     inline     >] exit_mm kernel/exit.c:512
>> >>  [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
>> >>  [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
>> >>  [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
>> >>  [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
>> >>  [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
>> >> arch/x86/entry/common.c:163
>> >>  [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
>> >>  [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
>> >>  [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
>> >>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
>> >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
>> >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
>> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
>> >> RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
>> >> RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
>> >> sound/core/hrtimer.c:62
>> >>  RSP <ffff88003ed07dd8>
>> >> ---[ end trace e1a6d1e5a6136d50 ]---
>> >>
>> >>
>> >> or:
>> >>
>> >> INFO: rcu_sched detected stalls on CPUs/tasks:
>> >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
>> >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
>> >> Task dump for CPU 0:
>> >> a.out           R  running task    28848 31756   4047 0x0000000a
>> >>  0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
>> >>  ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
>> >>  0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
>> >> Call Trace:
>> >>  [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
>> >>  [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
>> >>  [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
>> >>  [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
>> >>  [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
>> >>  [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
>> >>  [<     inline     >] vfs_ioctl fs/ioctl.c:43
>> >>  [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
>> >>  [<     inline     >] SYSC_ioctl fs/ioctl.c:690
>> >>  [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
>> >>  [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
>> >> arch/x86/entry/entry_64.S:208
>> >>
>> >>
>> >>
>> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
>> >> the following diff in sound/core/timer.c (one is a fix for another
>> >> division by 0):
>> >>
>> >> diff --git a/sound/core/timer.c b/sound/core/timer.c
>> >> index 9a6157e..3e55c6d 100644
>> >> --- a/sound/core/timer.c
>> >> +++ b/sound/core/timer.c
>> >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id,
>> >> struct snd_timer_id *tid,
>> >>         timer->tmr_subdevice = tid->subdevice;
>> >>         if (id)
>> >>                 strlcpy(timer->id, id, sizeof(timer->id));
>> >> +       timer->sticks = 1;
>> >>         INIT_LIST_HEAD(&timer->device_list);
>> >>         INIT_LIST_HEAD(&timer->open_list_head);
>> >>         INIT_LIST_HEAD(&timer->active_list_head);
>> >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
>> >> *file, char __user *buffer,
>> >>                 tu->qused--;
>> >>                 spin_unlock_irq(&tu->qlock);
>> >>
>> >> +               mutex_lock(&tu->ioctl_lock);
>> >>                 if (tu->tread) {
>> >>                         if (copy_to_user(buffer, &tu->tqueue[qhead],
>> >>                                          sizeof(struct snd_timer_tread)))
>> >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
>> >> *file, char __user *buffer,
>> >>                                          sizeof(struct snd_timer_read)))
>> >>                                 err = -EFAULT;
>> >>                 }
>> >> +               mutex_unlock(&tu->ioctl_lock);
>> >>
>> >>                 spin_lock_irq(&tu->qlock);
>> >>                 if (err < 0)
>> >>
>> >
>> > I couldn't trigger the bug any longer after merging for-linus branch
>> > of sound.git tree:
>> >   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
>> >
>> > But I thought it got fixed by sticks=1 in the above.  Maybe it's a
>> > combination, as it contains yet another timer fix by Vegard.
>>
>>
>> You mean that my local changes are a combination of two patches? But I
>> would not expect that addition of mutex_lock/unlock can affect
>> division by 0.
>> Or you mean that there some other fix by Vegard? What is it?
>
> There is another fix
>   8ddc05638ee42b18ba4fe99b5fb647fa3ad20456
>     ALSA: timer: fix NULL pointer dereference on memory allocation failure
> But it's likely irrelevant.
>
> I retested now, and could see the latter stack trace, the rcu tall,
> while running the program, although it doesn't lead to a complete
> freeze and the system is restored once after the program quits.
>
> Do you really see the zero-division error with sticks=1 fix?  That's
> unexpected.


Yes, just reproduced it again.
I am on 0693c28cfc8e25f18c25b65a8942c026f1854a3c of linux-next now, so
I already have:
    commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33
    ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE

Compiled the program verbatim and run it using stress utility (http
s://godoc.org/golang.org/x/tools/cmd/stress namely just runs the
program in parallel loop) and instantly get the following crash.

Attaching my config just in case.
I start qemu as:

qemu-system-x86_64 -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic \
-no-reboot -display none -serial stdio -kernel arch/x86/boot/bzImage \
-fsdev local,id=fsdev0,path=/,security_model=none,readonly \
-device virtio-9p-pci,fsdev=fsdev0,mount_tag=/dev/root \
-enable-kvm -pidfile vm_pid -m 2G -numa node,nodeid=0,cpus=0-1 -numa
node,nodeid=1,cpus=2-3 -smp sockets=2,cores=2,threads=1 \
-usb -usbdevice mouse -usbdevice tablet -soundhw all \
-append "console=ttyS0 debug oops=panic panic_on_warn=1 panic=-1
earlyprintk=serial slub_debug=UZ root=/dev/root rootfstype=9p
rootflags=trans=virtio,version=9p2000.L,cache=loose
init=/usr/local/google/home/dvyukov/bin/9p/init.sh"


sound dir is:

-bash-4.3# ls -lt /dev/snd/
total 0
crw------- 1 root 0 116, 20 Sep  6 14:03 hwC4D0
crw------- 1 root 0 116, 19 Sep  6 14:03 pcmC4D0c
crw------- 1 root 0 116, 18 Sep  6 14:03 pcmC4D0p
crw------- 1 root 0 116, 17 Sep  6 14:03 controlC4
crw------- 1 root 0 116, 16 Sep  6 14:03 pcmC3D0p
crw------- 1 root 0 116, 15 Sep  6 14:03 controlC3
crw------- 1 root 0 116, 14 Sep  6 14:03 midiC2D3
crw------- 1 root 0 116, 13 Sep  6 14:03 midiC2D2
crw------- 1 root 0 116, 12 Sep  6 14:03 midiC2D1
crw------- 1 root 0 116, 11 Sep  6 14:03 midiC2D0
crw------- 1 root 0 116, 10 Sep  6 14:03 controlC2
crw------- 1 root 0 116,  9 Sep  6 14:03 pcmC1D1c
crw------- 1 root 0 116,  8 Sep  6 14:03 pcmC1D1p
crw------- 1 root 0 116,  7 Sep  6 14:03 pcmC1D0c
crw------- 1 root 0 116,  6 Sep  6 14:03 pcmC1D0p
crw------- 1 root 0 116,  5 Sep  6 14:03 controlC1
crw------- 1 root 0 116,  4 Sep  6 14:03 pcmC0D0c
crw------- 1 root 0 116,  3 Sep  6 14:03 pcmC0D0p
crw------- 1 root 0 116,  2 Sep  6 14:03 controlC0
crw------- 1 root 0 116,  1 Sep  6 14:03 seq
crw------- 1 root 0 116, 33 Sep  6 14:03 timer



[  134.435009] divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
[  134.435009] Modules linked in:
[  134.435009] CPU: 3 PID: 4965 Comm: a.out Not tainted
4.8.0-rc5-next-20160905+ #14
[  134.435009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[  134.435009] task: ffff880038b6e200 task.stack: ffff880036c48000
[  134.435009] RIP: 0010:[<ffffffff858e147c>]  [<ffffffff858e147c>]
snd_hrtimer_callback+0x1bc/0x3c0
[  134.435009] RSP: 0018:ffff88006d307dd8  EFLAGS: 00010006
[  134.435009] RAX: 00000000002b5b0c RBX: ffff88006bb7d608 RCX: 0000000000000000
[  134.435009] RDX: 0000000000000000 RSI: ffff88006bb7d620 RDI: ffffffff886b69a8
[  134.435009] RBP: ffff88006d307e30 R08: 0000002066910785 R09: ffff88007fff70c8
[  134.435009] R10: ffff88007fff70d8 R11: ffff88007fff70d0 R12: ffff880038ae3d00
[  134.435009] R13: ffff880038ae3db4 R14: ffff880038ae3e30 R15: ffff88006bb7d600
[  134.435009] FS:  00000000009f6940(0000) GS:ffff88006d300000(0000)
knlGS:0000000000000000
[  134.435009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.435009] CR2: 000000002000ffb0 CR3: 000000006a816000 CR4: 00000000000006e0
[  134.435009] Stack:
[  134.435009]  0000000000000000 ffff88006bb7d620 ffff880038ae3db8
00000000002b5b0c
[  134.435009]  ffff88006bb7d660 0000000000000000 ffff88006bb7d608
0000000000000003
[  134.435009]  ffff88006d31b400 dffffc0000000000 0000000000000003
ffff88006d307f28
[  134.447443] Call Trace:
[  134.447443]  <IRQ>
[  134.447443]  [<ffffffff81504425>] __hrtimer_run_queues+0x325/0xe70
[  134.447443]  [<ffffffff858e12c0>] ? snd_hrtimer_close+0x130/0x130
[  134.447443]  [<ffffffff81504100>] ? enqueue_hrtimer+0x3d0/0x3d0
[  134.447443]  [<ffffffff81506ddb>] hrtimer_interrupt+0x18b/0x420
[  134.447443]  [<ffffffff8126ceff>] local_apic_timer_interrupt+0x6f/0xe0
[  134.450051]  [<ffffffff86e1d216>] smp_apic_timer_interrupt+0x76/0xa0
[  134.450051]  [<ffffffff86e1c2cc>] apic_timer_interrupt+0x8c/0xa0
[  134.450051]  <EOI>
[  134.450051]  [<ffffffff86e1a63e>] ? _raw_spin_unlock_irqrestore+0x5e/0xc0
[  134.450051]  [<ffffffff858daded>] snd_timer_start1+0xdd/0x740
[  134.450051]  [<ffffffff858db535>] snd_timer_continue+0x45/0x80
[  134.450051]  [<ffffffff858deca9>] snd_timer_user_ioctl+0xe99/0x2470
[  134.450051]  [<ffffffff858dde10>] ? snd_timer_user_params.isra.22+0xa30/0xa30
[  134.450051]  [<ffffffff8183bca0>] ?
mem_cgroup_count_precharge_pte_range+0x530/0x530
[  134.450051]  [<ffffffff8183c03f>] ? get_mem_cgroup_from_mm+0x39f/0x4a0
[  134.450051]  [<ffffffff817129b4>] ? __lru_cache_add+0x174/0x210
[  134.450051]  [<ffffffff81715f49>] ? lru_cache_add+0xd9/0x1e0
[  134.450051]  [<ffffffff81791a30>] ? handle_mm_fault+0x28a0/0x34e0
[  134.450051]  [<ffffffff86e19c92>] ? _raw_spin_unlock+0x22/0x30
[  134.450051]  [<ffffffff8178fb40>] ? handle_mm_fault+0x9b0/0x34e0
[  134.450051]  [<ffffffff8178f2ca>] ? handle_mm_fault+0x13a/0x34e0
[  134.450051]  [<ffffffff8178f190>] ? __pmd_alloc+0x3f0/0x3f0
[  134.450051]  [<ffffffff818c902d>] ? __fd_install+0x24d/0x590
[  134.450051]  [<ffffffff818c8de0>] ? get_unused_fd_flags+0xd0/0xd0
[  134.450051]  [<ffffffff858dde10>] ? snd_timer_user_params.isra.22+0xa30/0xa30
[  134.450051]  [<ffffffff818a331c>] do_vfs_ioctl+0x18c/0x1080
[  134.450051]  [<ffffffff8129d19c>] ? __do_page_fault+0x2cc/0xbb0
[  134.450051]  [<ffffffff818a3190>] ? ioctl_preallocate+0x210/0x210
[  134.450051]  [<ffffffff8129d349>] ? __do_page_fault+0x479/0xbb0
[  134.450051]  [<ffffffff8148ef0a>] ? up_read+0x1a/0x40
[  134.450051]  [<ffffffff8129d068>] ? __do_page_fault+0x198/0xbb0
[  134.450051]  [<ffffffff82ad1dc9>] ? security_file_ioctl+0x89/0xb0
[  134.450051]  [<ffffffff818a429f>] SyS_ioctl+0x8f/0xc0
[  134.450051]  [<ffffffff86e1a8c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
[  134.450051] Code: 7b ad ce fb 8b 0d e5 f1 6f 05 48 0f af 4d d0 48
85 c9 0f 88 6a 01 00 00 48 89 4d a8 e8 5e ad ce fb 48 8b 45 c0 48 8b
4d a8 48 99 <48> f7 f9 48 01 45 d0 e8 48 ad ce fb 4c 89 fa 48 b8 00 00
00 00
[  134.450051] RIP  [<ffffffff858e147c>] snd_hrtimer_callback+0x1bc/0x3c0
[  134.450051]  RSP <ffff88006d307dd8>
[  134.450051] ---[ end trace 8614aafe29f5fee0 ]---
[  134.450051] Kernel panic - not syncing: Fatal exception in interrupt
[  134.450051] Kernel Offset: disabled
Dmitry Vyukov Sept. 6, 2016, 2:08 p.m. UTC | #5
On Tue, Sep 6, 2016 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Tue, 06 Sep 2016 15:09:25 +0200,
> Dmitry Vyukov wrote:
>>
>> On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> > On Sat, 03 Sep 2016 14:35:15 +0200,
>> > Dmitry Vyukov wrote:
>> >>
>> >> Hello,
>> >>
>> >> The following program causes either division error or hangs kernel:
>> >>
>> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt
>> >>
>> >>
>> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
>> >> Dumping ftrace buffer:
>> >>    (ftrace buffer empty)
>> >> Modules linked in:
>> >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
>> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> >> task: ffff88003c9b2280 task.stack: ffff880027280000
>> >> RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns
>> >> include/linux/ktime.h:195
>> >> RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>]
>> >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
>> >> RSP: 0018:ffff88003ed07dd8  EFLAGS: 00010006
>> >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000
>> >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8
>> >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048
>> >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840
>> >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80
>> >> FS:  0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0
>> >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
>> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>> >> Stack:
>> >>  0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801
>> >>  ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001
>> >>  ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28
>> >> Call Trace:
>> >>  <IRQ>
>> >>  [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
>> >>  [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70
>> >> kernel/time/hrtimer.c:1302
>> >>  [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
>> >>  [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0
>> >> arch/x86/kernel/apic/apic.c:933
>> >>  [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0
>> >> arch/x86/kernel/apic/apic.c:957
>> >>  [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0
>> >> arch/x86/entry/entry_64.S:487
>> >>  <EOI>
>> >>  [<     inline     >] rcu_lock_release include/linux/rcupdate.h:494
>> >>  [<     inline     >] rcu_read_unlock include/linux/rcupdate.h:927
>> >>  [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680
>> >>  [<     inline     >] page_remove_file_rmap mm/rmap.c:1343
>> >>  [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394
>> >>  [<     inline     >] zap_pte_range mm/memory.c:1170
>> >>  [<     inline     >] zap_pmd_range mm/memory.c:1257
>> >>  [<     inline     >] zap_pud_range mm/memory.c:1278
>> >>  [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299
>> >>  [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344
>> >>  [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0
>> >>  [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787
>> >>  [<     inline     >] __mmput kernel/fork.c:770
>> >>  [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790
>> >>  [<     inline     >] exit_mm kernel/exit.c:512
>> >>  [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815
>> >>  [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
>> >>  [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307
>> >>  [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805
>> >>  [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210
>> >> arch/x86/entry/common.c:163
>> >>  [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:198
>> >>  [<     inline     >] syscall_return_slowpath arch/x86/entry/common.c:267
>> >>  [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293
>> >>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
>> >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a
>> >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48>
>> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00
>> >> RIP  [<     inline     >] ktime_divns include/linux/ktime.h:195
>> >> RIP  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0
>> >> sound/core/hrtimer.c:62
>> >>  RSP <ffff88003ed07dd8>
>> >> ---[ end trace e1a6d1e5a6136d50 ]---
>> >>
>> >>
>> >> or:
>> >>
>> >> INFO: rcu_sched detected stalls on CPUs/tasks:
>> >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490
>> >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20)
>> >> Task dump for CPU 0:
>> >> a.out           R  running task    28848 31756   4047 0x0000000a
>> >>  0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430
>> >>  ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000
>> >>  0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000
>> >> Call Trace:
>> >>  [<     inline     >] spin_unlock_irqrestore ./include/linux/spinlock.h:362
>> >>  [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477
>> >>  [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606
>> >>  [<     inline     >] snd_timer_user_continue sound/core/timer.c:1835
>> >>  [<     inline     >] __snd_timer_user_ioctl sound/core/timer.c:1902
>> >>  [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917
>> >>  [<     inline     >] vfs_ioctl fs/ioctl.c:43
>> >>  [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675
>> >>  [<     inline     >] SYSC_ioctl fs/ioctl.c:690
>> >>  [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681
>> >>  [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
>> >> arch/x86/entry/entry_64.S:208
>> >>
>> >>
>> >>
>> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with
>> >> the following diff in sound/core/timer.c (one is a fix for another
>> >> division by 0):
>> >>
>> >> diff --git a/sound/core/timer.c b/sound/core/timer.c
>> >> index 9a6157e..3e55c6d 100644
>> >> --- a/sound/core/timer.c
>> >> +++ b/sound/core/timer.c
>> >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id,
>> >> struct snd_timer_id *tid,
>> >>         timer->tmr_subdevice = tid->subdevice;
>> >>         if (id)
>> >>                 strlcpy(timer->id, id, sizeof(timer->id));
>> >> +       timer->sticks = 1;
>> >>         INIT_LIST_HEAD(&timer->device_list);
>> >>         INIT_LIST_HEAD(&timer->open_list_head);
>> >>         INIT_LIST_HEAD(&timer->active_list_head);
>> >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file
>> >> *file, char __user *buffer,
>> >>                 tu->qused--;
>> >>                 spin_unlock_irq(&tu->qlock);
>> >>
>> >> +               mutex_lock(&tu->ioctl_lock);
>> >>                 if (tu->tread) {
>> >>                         if (copy_to_user(buffer, &tu->tqueue[qhead],
>> >>                                          sizeof(struct snd_timer_tread)))
>> >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file
>> >> *file, char __user *buffer,
>> >>                                          sizeof(struct snd_timer_read)))
>> >>                                 err = -EFAULT;
>> >>                 }
>> >> +               mutex_unlock(&tu->ioctl_lock);
>> >>
>> >>                 spin_lock_irq(&tu->qlock);
>> >>                 if (err < 0)
>> >>
>> >
>> > I couldn't trigger the bug any longer after merging for-linus branch
>> > of sound.git tree:
>> >   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
>> >
>> > But I thought it got fixed by sticks=1 in the above.  Maybe it's a
>> > combination, as it contains yet another timer fix by Vegard.
>>
>>
>> You mean that my local changes are a combination of two patches? But I
>> would not expect that addition of mutex_lock/unlock can affect
>> division by 0.
>> Or you mean that there some other fix by Vegard? What is it?
>
> There is another fix
>   8ddc05638ee42b18ba4fe99b5fb647fa3ad20456
>     ALSA: timer: fix NULL pointer dereference on memory allocation failure
> But it's likely irrelevant.
>
> I retested now, and could see the latter stack trace, the rcu tall,
> while running the program, although it doesn't lead to a complete
> freeze and the system is restored once after the program quits.

Are you sure it get back to normal?
RCU stall is printed when a CPU does not respond to timer interrupts
for 20s. Try to do 'echo -n l > /proc/sysrq-trigger'.


> Do you really see the zero-division error with sticks=1 fix?  That's
> unexpected.
>
>
> Takashi
Takashi Iwai Sept. 6, 2016, 4:10 p.m. UTC | #6
On Tue, 06 Sep 2016 16:06:08 +0200,
Dmitry Vyukov wrote:
> 
> > Do you really see the zero-division error with sticks=1 fix?  That's
> > unexpected.
> 
> Yes, just reproduced it again.
> I am on 0693c28cfc8e25f18c25b65a8942c026f1854a3c of linux-next now, so
> I already have:
>     commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33
>     ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
> 
> Compiled the program verbatim and run it using stress utility (http
> s://godoc.org/golang.org/x/tools/cmd/stress namely just runs the
> program in parallel loop) and instantly get the following crash.

OK, I'll take a deeper look.


thanks,

Takashi
diff mbox

Patch

diff --git a/sound/core/timer.c b/sound/core/timer.c
index 9a6157e..3e55c6d 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -813,6 +813,7 @@  int snd_timer_new(struct snd_card *card, char *id,
struct snd_timer_id *tid,
        timer->tmr_subdevice = tid->subdevice;
        if (id)
                strlcpy(timer->id, id, sizeof(timer->id));
+       timer->sticks = 1;
        INIT_LIST_HEAD(&timer->device_list);
        INIT_LIST_HEAD(&timer->open_list_head);