Message ID | tencent_6537E04AAC74F976B567603CEB377A96FA09@qq.com (mailing list archive) |
---|---|
State | New, archived |
Series | [next] trace/blktrace: fix task hung in blk_trace_ioctl |
Hi,

On 2023/12/02 17:01, Edward Adam Davis wrote:
> The reproducer involves running test programs on multiple processors separately,
> in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> two different paths, triggering an AA deadlock.
>
>     CPU0                            CPU1
>     ---                             ---
>     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
>     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
>
>
> The first path:
> blkdev_ioctl()->
>     blk_trace_ioctl()->
>         mutex_lock(&q->debugfs_mutex)
>
> The second path:
> blkdev_ioctl()->
>     blkdev_common_ioctl()->
>         blk_trace_ioctl()->
>             mutex_lock(&q->debugfs_mutex)

I still don't understand how this AA deadlock is triggered. Is the
'debugfs_mutex' already held before calling blk_trace_ioctl()?

>
> The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> a task has already obtained debugfs_mutex.
>
> Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")
> Reported-and-tested-by: syzbot+ed812ed461471ab17a0c@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  kernel/trace/blktrace.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 54ade89a1ad2..34e5bce42b1e 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -735,7 +735,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
>  	int ret, start = 0;
>  	char b[BDEVNAME_SIZE];
>
> -	mutex_lock(&q->debugfs_mutex);
> +	if (!mutex_trylock(&q->debugfs_mutex))
> +		return -EBUSY;

This is absolutely not a proper fix; a lot of use cases will fail after
this patch.

Thanks,
Kuai

>
>  	switch (cmd) {
>  	case BLKTRACESETUP:
>
On Sat, 2 Dec 2023 17:19:25 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> Hi,
>
> On 2023/12/02 17:01, Edward Adam Davis wrote:
> > The reproducer involves running test programs on multiple processors separately,
> > in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> > two different paths, triggering an AA deadlock.
> >
> >     CPU0                            CPU1
> >     ---                             ---
> >     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> >     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> >
> >
> > The first path:
> > blkdev_ioctl()->
> >     blk_trace_ioctl()->
> >         mutex_lock(&q->debugfs_mutex)
> >
> > The second path:
> > blkdev_ioctl()->
> >     blkdev_common_ioctl()->
> >         blk_trace_ioctl()->
> >             mutex_lock(&q->debugfs_mutex)
> I still don't understand how this AA deadlock is triggered. Is the
> 'debugfs_mutex' already held before calling blk_trace_ioctl()?

Right, I don't see where the mutex is taken twice. You don't need two
paths for an AA lock, you only need one.

> >
> > The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> > a task has already obtained debugfs_mutex.
> >
> > Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")

How does it fix the above? I don't see how the above is even related to this.

-- Steve

> > Reported-and-tested-by: syzbot+ed812ed461471ab17a0c@syzkaller.appspotmail.com
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > ---
> >  kernel/trace/blktrace.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
Hi,

On 2023-12-03 at 06:07:43 +0800, Steven Rostedt wrote:
> On Sat, 2 Dec 2023 17:19:25 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> > Hi,
> >
> > On 2023/12/02 17:01, Edward Adam Davis wrote:
> > > The reproducer involves running test programs on multiple processors separately,
> > > in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
> > > two different paths, triggering an AA deadlock.
> > >
> > >     CPU0                            CPU1
> > >     ---                             ---
> > >     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> > >     mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
> > >
> > >
> > > The first path:
> > > blkdev_ioctl()->
> > >     blk_trace_ioctl()->
> > >         mutex_lock(&q->debugfs_mutex)
> > >
> > > The second path:
> > > blkdev_ioctl()->
> > >     blkdev_common_ioctl()->
> > >         blk_trace_ioctl()->
> > >             mutex_lock(&q->debugfs_mutex)
> > I still don't understand how this AA deadlock is triggered. Is the
> > 'debugfs_mutex' already held before calling blk_trace_ioctl()?
>
> Right, I don't see where the mutex is taken twice. You don't need two
> paths for an AA lock, you only need one.
>
> > >
> > > The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
> > > a task has already obtained debugfs_mutex.
> > >
> > > Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")
>
> How does it fix the above? I don't see how the above is even related to this.

I bisected this issue and the following fix information is more accurate:
"
Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface")
"
All the bisected info is in link: https://github.com/xupengfe/syzkaller_logs/tree/main/231203_140738_blk_trace_ioctl

Acked-by: Pengfei Xu <pengfei.xu@intel.com>

Thanks!

>
> -- Steve
>
> > > Reported-and-tested-by: syzbot+ed812ed461471ab17a0c@syzkaller.appspotmail.com
> > > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > > ---
> > >  kernel/trace/blktrace.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 54ade89a1ad2..34e5bce42b1e 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -735,7 +735,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
 	int ret, start = 0;
 	char b[BDEVNAME_SIZE];
 
-	mutex_lock(&q->debugfs_mutex);
+	if (!mutex_trylock(&q->debugfs_mutex))
+		return -EBUSY;
 
 	switch (cmd) {
 	case BLKTRACESETUP:
The reproducer involves running test programs on multiple processors separately,
in order to enter blkdev_ioctl() and ultimately reach blk_trace_ioctl() through
two different paths, triggering an AA deadlock.

    CPU0                            CPU1
    ---                             ---
    mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)
    mutex_lock(&q->debugfs_mutex)   mutex_lock(&q->debugfs_mutex)


The first path:
blkdev_ioctl()->
    blk_trace_ioctl()->
        mutex_lock(&q->debugfs_mutex)

The second path:
blkdev_ioctl()->
    blkdev_common_ioctl()->
        blk_trace_ioctl()->
            mutex_lock(&q->debugfs_mutex)

The solution I have proposed is to exit blk_trace_ioctl() to avoid AA locks if
a task has already obtained debugfs_mutex.

Fixes: 0d345996e4cb ("x86/kernel: increase kcov coverage under arch/x86/kernel folder")
Reported-and-tested-by: syzbot+ed812ed461471ab17a0c@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 kernel/trace/blktrace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
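
For readers following the trylock discussion, here is a minimal userspace
sketch of what a trylock-and-bail pattern does under ordinary contention. It
is not kernel code: it assumes POSIX pthread mutexes as a stand-in for the
kernel mutex API, and debugfs_mutex, trace_ioctl_trylock() and
contended_result() are hypothetical names made up for illustration. Note the
differing conventions: the kernel's mutex_trylock() returns 1 on success,
while pthread_mutex_trylock() returns 0 on success.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for q->debugfs_mutex (userspace, not kernel code). */
static pthread_mutex_t debugfs_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical analogue of the patched function: give up with -EBUSY
 * instead of sleeping when the mutex is already held by anyone.
 */
static int trace_ioctl_trylock(void)
{
	/* pthread_mutex_trylock() returns 0 on success, EBUSY if locked. */
	if (pthread_mutex_trylock(&debugfs_mutex) != 0)
		return -EBUSY;

	/* ... command handling would happen here ... */

	pthread_mutex_unlock(&debugfs_mutex);
	return 0;
}

/* Thread body: record what a second, unrelated caller observes. */
static void *second_caller(void *arg)
{
	*(int *)arg = trace_ioctl_trylock();
	return NULL;
}

/*
 * Hold the mutex in the calling thread, let another thread issue the
 * call, and return that thread's result.
 */
static int contended_result(void)
{
	pthread_t t;
	int ret = 0;
	int err;

	pthread_mutex_lock(&debugfs_mutex);
	err = pthread_create(&t, NULL, second_caller, &ret);
	assert(err == 0);
	pthread_join(t, NULL);
	pthread_mutex_unlock(&debugfs_mutex);

	return ret;
}
```

In this sketch trace_ioctl_trylock() succeeds when nobody holds the mutex,
but contended_result() reports -EBUSY even though the two threads are
independent and a blocking lock would simply have waited. Under these
assumptions, a trylock cannot tell a self-deadlock apart from normal
contention between callers.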