From patchwork Wed Jul 11 11:13:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tetsuo Handa
X-Patchwork-Id: 10519371
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
    by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id E0F79600CA
    for ; Wed, 11 Jul 2018 11:14:30 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
    by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D8D1D20223
    for ; Wed, 11 Jul 2018 11:14:30 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
    id CCF9428066; Wed, 11 Jul 2018 11:14:30 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
    pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI,
    RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
    by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4359320223
    for ; Wed, 11 Jul 2018 11:14:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1732529AbeGKLSS (ORCPT ); Wed, 11 Jul 2018 07:18:18 -0400
Received: from www262.sakura.ne.jp ([202.181.97.72]:31809 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1732387AbeGKLSS (ORCPT );
    Wed, 11 Jul 2018 07:18:18 -0400
Received: from fsav403.sakura.ne.jp (fsav403.sakura.ne.jp [133.242.250.102])
    by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id w6BBDDDT031165;
    Wed, 11 Jul 2018 20:13:14 +0900 (JST)
    (envelope-from penguin-kernel@I-love.SAKURA.ne.jp)
Received: from www262.sakura.ne.jp (202.181.97.72)
    by fsav403.sakura.ne.jp (F-Secure/fsigk_smtp/530/fsav403.sakura.ne.jp);
    Wed, 11 Jul 2018 20:13:13 +0900 (JST)
X-Virus-Status: clean(F-Secure/fsigk_smtp/530/fsav403.sakura.ne.jp)
Received: from [192.168.1.8] (softbank126074194044.bbtec.net [126.74.194.44])
    (authenticated bits=0)
    by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id w6BBDCsS031115
    (version=TLSv1.2 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Wed, 11 Jul 2018 20:13:13 +0900 (JST)
    (envelope-from penguin-kernel@I-love.SAKURA.ne.jp)
Subject: Re: INFO: task hung in __sb_start_write
To: Dmitry Vyukov
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, syzbot, linux-fsdevel, LKML,
    syzkaller-bugs, Al Viro, Linus Torvalds
References: <000000000000283c37056b4a81a5@google.com>
    <20180611073038.GK12217@hirez.programming.kicks-ass.net>
From: Tetsuo Handa
Message-ID: <51f1de87-3fbe-48f6-0297-9717d9919772@I-love.SAKURA.ne.jp>
Date: Wed, 11 Jul 2018 20:13:16 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
In-Reply-To:
Content-Language: en-US
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 2018/06/19 20:10, Tetsuo Handa wrote:
> On 2018/06/16 4:40, Tetsuo Handa wrote:
>> Hmm, there might be other locations calling percpu_rwsem_release() ?
>
> There are other locations calling percpu_rwsem_release(), but quite few.
>
> include/linux/fs.h:1494:#define __sb_writers_release(sb, lev) \
> include/linux/fs.h-1495-        percpu_rwsem_release(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
>
> fs/btrfs/transaction.c:1821:    __sb_writers_release(fs_info->sb, SB_FREEZE_FS);
> fs/aio.c:1566:                  __sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
> fs/xfs/xfs_aops.c:211:          __sb_writers_release(ioend->io_inode->i_sb, SB_FREEZE_FS);
>
>
>
> I'd like to check what atomic_long_read(&sem->rw_sem.count) says
> when hung task is reported.
>

syzbot reproduced this problem with the patch applied.

  percpu_rw_semaphore(00000000082ac9da)
    ->rw_sem.count=0xfffffffe00000001
    ->rss.gp_state=2
    ->rss.gp_count=1
    ->rss.cb_state=0
    ->rss.gp_type=1
    ->readers_block=1
    ->read_count=0
    ->list_empty(rw_sem.wait_list)=0
    ->writer.task=          (null)

The output says that percpu_down_read() was blocked because somebody
has called percpu_down_write().

  DEFINE_STATIC_PERCPU_RWSEM(sem);

  percpu_down_write(&sem);
  percpu_down_read(&sem);
  percpu_up_read(&sem);
  percpu_up_write(&sem);

The next step is to find who is calling percpu_down_write(). How do we
want to do this? We don't want to annoy normal linux-next.git testers.
Something like the patch below?

---
 include/linux/percpu-rwsem.h | 4 ++++
 lib/Kconfig.debug            | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 79b99d6..26e87c3 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -130,7 +130,9 @@ extern int __percpu_init_rwsem(struct percpu_rw_semaphore *,
 static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
 					bool read, unsigned long ip)
 {
+#ifndef CONFIG_DEBUG_AID_FOR_SYZBOT
 	lock_release(&sem->rw_sem.dep_map, 1, ip);
+#endif
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
 		sem->rw_sem.owner = RWSEM_OWNER_UNKNOWN;
@@ -140,7 +142,9 @@ static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
 static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem,
 					bool read, unsigned long ip)
 {
+#ifndef CONFIG_DEBUG_AID_FOR_SYZBOT
 	lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip);
+#endif
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
 		sem->rw_sem.owner = current;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c731ff9..f0d02e8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1181,6 +1181,12 @@ config DEBUG_LOCK_ALLOC
 	 spin_lock_init()/mutex_init()/etc., or whether there is any lock
 	 held during task exit.
 
+config DEBUG_AID_FOR_SYZBOT
+	bool "Additional debug options for syzbot"
+	default n
+	help
+	  This option is intended for testing by syzbot.
+
 config LOCKDEP
 	bool
 	depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT