From patchwork Tue Jul 19 04:09:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ioannis Angelakopoulos X-Patchwork-Id: 12922093 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 084CEC433EF for ; Tue, 19 Jul 2022 04:12:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235793AbiGSEMs (ORCPT ); Tue, 19 Jul 2022 00:12:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234366AbiGSEMr (ORCPT ); Tue, 19 Jul 2022 00:12:47 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A6BC3AB08 for ; Mon, 18 Jul 2022 21:12:46 -0700 (PDT) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 26IGhxwR030353 for ; Mon, 18 Jul 2022 21:12:45 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=9j9IEWtrl4Tf9dtlhASpJUIKoFMWGsQiJBCZHLWxbMQ=; b=CxmOdBPhYbXrhgzRPmBTzP+OS1nFdIaWrEduAY1Hdx6PJlaO9Q8THDVX/RKZWZD2SyQ1 v74OJ+qiCeRxHGn4m4NJdrLN3SJ05tB079MsBimmRwHxc5JHJXlJWOinDkg/h/mbO3wH qFxtJUwJ1R5wojhuvPXOr9/1FuIixcaK+zs= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3hcs5krq9d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 18 Jul 2022 21:12:45 -0700 Received: from twshared25478.08.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 18 Jul 2022 21:12:44 -0700 Received: by devvm7778.ftw0.facebook.com (Postfix, from userid 558217) id D13AE24A8AD4; Mon, 18 Jul 2022 21:12:33 -0700 (PDT) From: Ioannis Angelakopoulos To: , Subject: [PATCH v2 1/5] btrfs: Add a lockdep model for the num_writers wait event Date: Mon, 18 Jul 2022 21:09:52 -0700 Message-ID: <20220719040954.3964407-2-iangelak@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220719040954.3964407-1-iangelak@fb.com> References: <20220719040954.3964407-1-iangelak@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: mVeYtLF_iOggdt_WduQnnHyIdOY319-8 X-Proofpoint-ORIG-GUID: mVeYtLF_iOggdt_WduQnnHyIdOY319-8 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-18_22,2022-07-18_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Annotate the num_writers wait event in fs/btrfs/transaction.c with lockdep in order to catch deadlocks involving this wait event. Use a read/write lockdep map for the annotation. A thread starting/joining the transaction acquires the map as a reader when it increments cur_trans->num_writers and it acquires the map as a writer before it blocks on the wait event. Signed-off-by: Ioannis Angelakopoulos Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 14 ++++++++++++++ fs/btrfs/disk-io.c | 6 ++++++ fs/btrfs/transaction.c | 37 ++++++++++++++++++++++++++++++++----- 3 files changed, 52 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 202496172059..999868734be7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1095,6 +1095,8 @@ struct btrfs_fs_info { /* Updates are not protected by any lock */ struct btrfs_commit_stats commit_stats; + struct lockdep_map btrfs_trans_num_writers_map; + #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; struct rb_root block_tree; @@ -1175,6 +1177,18 @@ enum { BTRFS_ROOT_UNFINISHED_DROP, }; +#define btrfs_might_wait_for_event(b, lock) \ + do { \ + rwsem_acquire(&b->lock##_map, 0, 0, _THIS_IP_); \ + rwsem_release(&b->lock##_map, _THIS_IP_); \ + } while (0) + +#define btrfs_lockdep_acquire(b, lock) \ + rwsem_acquire_read(&b->lock##_map, 0, 0, _THIS_IP_) + +#define btrfs_lockdep_release(b, lock) \ + rwsem_release(&b->lock##_map, _THIS_IP_) + static inline void btrfs_wake_unfinished_drop(struct btrfs_fs_info *fs_info) { clear_and_wake_up_bit(BTRFS_FS_UNFINISHED_DROPS, &fs_info->flags); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 3fac429cf8a4..01a5a49a3a11 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3046,6 +3046,8 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info) void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) { + static struct lock_class_key btrfs_trans_num_writers_key; + INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC); INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC); INIT_LIST_HEAD(&fs_info->trans_list); @@ -3074,6 +3076,10 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) mutex_init(&fs_info->zoned_data_reloc_io_lock); seqlock_init(&fs_info->profiles_lock); + lockdep_init_map(&fs_info->btrfs_trans_num_writers_map, + "btrfs_trans_num_writers", + &btrfs_trans_num_writers_key, 0); + INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots); INIT_LIST_HEAD(&fs_info->space_info); INIT_LIST_HEAD(&fs_info->tree_mod_seq_list); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 0bec10740ad3..d8287ec890bc 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -313,6 +313,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info, atomic_inc(&cur_trans->num_writers); extwriter_counter_inc(cur_trans, type); spin_unlock(&fs_info->trans_lock); + btrfs_lockdep_acquire(fs_info, btrfs_trans_num_writers); return 0; } spin_unlock(&fs_info->trans_lock); @@ -334,16 +335,20 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info, if (!cur_trans) return -ENOMEM; + btrfs_lockdep_acquire(fs_info, btrfs_trans_num_writers); + spin_lock(&fs_info->trans_lock); if (fs_info->running_transaction) { /* * someone started a transaction after we unlocked. Make sure * to redo the checks above */ + btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); kfree(cur_trans); goto loop; } else if (BTRFS_FS_ERROR(fs_info)) { spin_unlock(&fs_info->trans_lock); + btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); kfree(cur_trans); return -EROFS; } @@ -1022,6 +1027,9 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans, extwriter_counter_dec(cur_trans, trans->type); cond_wake_up(&cur_trans->writer_wait); + + btrfs_lockdep_release(info, btrfs_trans_num_writers); + btrfs_put_transaction(cur_trans); if (current->journal_info == trans) @@ -1994,6 +2002,12 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans, int err) if (cur_trans == fs_info->running_transaction) { cur_trans->state = TRANS_STATE_COMMIT_DOING; spin_unlock(&fs_info->trans_lock); + + /* + * The thread has already released the lockdep map as reader already in + * btrfs_commit_transaction(). + */ + btrfs_might_wait_for_event(fs_info, btrfs_trans_num_writers); wait_event(cur_trans->writer_wait, atomic_read(&cur_trans->num_writers) == 1); @@ -2222,7 +2236,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) btrfs_put_transaction(prev_trans); if (ret) - goto cleanup_transaction; + goto lockdep_release; } else { spin_unlock(&fs_info->trans_lock); } @@ -2236,7 +2250,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) */ if (BTRFS_FS_ERROR(fs_info)) { ret = -EROFS; - goto cleanup_transaction; + goto lockdep_release; } } @@ -2250,19 +2264,21 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) ret = btrfs_start_delalloc_flush(fs_info); if (ret) - goto cleanup_transaction; + goto lockdep_release; ret = btrfs_run_delayed_items(trans); if (ret) - goto cleanup_transaction; + goto lockdep_release; wait_event(cur_trans->writer_wait, extwriter_counter_read(cur_trans) == 0); /* some pending stuffs might be added after the previous flush. */ ret = btrfs_run_delayed_items(trans); - if (ret) + if (ret) { + btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); goto cleanup_transaction; + } btrfs_wait_delalloc_flush(fs_info); @@ -2284,6 +2300,14 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) add_pending_snapshot(trans); cur_trans->state = TRANS_STATE_COMMIT_DOING; spin_unlock(&fs_info->trans_lock); + + /* + * The thread has started/joined the transaction thus it holds the lockdep + * map as a reader. It has to release it before acquiring the lockdep map + * as a writer. + */ + btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); + btrfs_might_wait_for_event(fs_info, btrfs_trans_num_writers); wait_event(cur_trans->writer_wait, atomic_read(&cur_trans->num_writers) == 1); @@ -2515,6 +2539,9 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) cleanup_transaction(trans, ret); return ret; +lockdep_release: + btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); + goto cleanup_transaction; } /* From patchwork Tue Jul 19 04:09:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ioannis Angelakopoulos X-Patchwork-Id: 12922098 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAD9CC43334 for ; Tue, 19 Jul 2022 04:13:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236809AbiGSENu (ORCPT ); Tue, 19 Jul 2022 00:13:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234998AbiGSENs (ORCPT ); Tue, 19 Jul 2022 00:13:48 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76DB53E77F for ; Mon, 18 Jul 2022 21:13:44 -0700 (PDT) Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 26J46s46027259 for ; Mon, 18 Jul 2022 21:13:44 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=Y+3UWzTV0pQ8LQ1QX73zMX49c9ImvBJAB6NmG1HM4x8=; b=QnoKtn1Waibj9CqOxGMAQZ3SqrQ28rRtwCei9MW/zPi+5Xe2hvfgeNAF4GAMRf1zELIW vFAQ3jcX0yv8cq1XXQZfU3GtRXjZqyHtJA2nUvgEeTMTmIIy1k3Z99TNphUAhnAlftyD 5RrFTaSBdZ025OFhmfNG07WfkOjKso3vBNI= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3hd9744k5v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 18 Jul 2022 21:13:43 -0700 Received: from twshared25478.08.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 18 Jul 2022 21:13:42 -0700 Received: by devvm7778.ftw0.facebook.com (Postfix, from userid 558217) id 083D624A8C9C; Mon, 18 Jul 2022 21:13:35 -0700 (PDT) From: Ioannis Angelakopoulos To: , Subject: [PATCH v2 2/5] btrfs: Add a lockdep model for the num_extwriters wait event Date: Mon, 18 Jul 2022 21:09:54 -0700 Message-ID: <20220719040954.3964407-3-iangelak@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220719040954.3964407-1-iangelak@fb.com> References: <20220719040954.3964407-1-iangelak@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: 9zohAz4CshzDtyDUd40GnBO4Nirvfpm4 X-Proofpoint-GUID: 9zohAz4CshzDtyDUd40GnBO4Nirvfpm4 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-18_22,2022-07-18_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Similarly to the num_writers wait event in fs/btrfs/transaction.c add a lockdep annotation for the num_extwriters wait event. Use a read/write lockdep map for the annotation. A thread starting/joining the transaction acquires the map as a reader when it increments cur_trans->num_writers and it acquires the map as a writer before it blocks on the wait event. Signed-off-by: Ioannis Angelakopoulos Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 1 + fs/btrfs/disk-io.c | 4 ++++ fs/btrfs/transaction.c | 13 +++++++++++++ 3 files changed, 18 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 999868734be7..586756f831e5 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1096,6 +1096,7 @@ struct btrfs_fs_info { struct btrfs_commit_stats commit_stats; struct lockdep_map btrfs_trans_num_writers_map; + struct lockdep_map btrfs_trans_num_extwriters_map; #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 01a5a49a3a11..b1193584ba49 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3047,6 +3047,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info) void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) { static struct lock_class_key btrfs_trans_num_writers_key; + static struct lock_class_key btrfs_trans_num_extwriters_key; INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC); INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC); @@ -3079,6 +3080,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) lockdep_init_map(&fs_info->btrfs_trans_num_writers_map, "btrfs_trans_num_writers", &btrfs_trans_num_writers_key, 0); + lockdep_init_map(&fs_info->btrfs_trans_num_extwriters_map, + "btrfs_trans_num_extwriters", + &btrfs_trans_num_extwriters_key, 0); INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots); INIT_LIST_HEAD(&fs_info->space_info); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index d8287ec890bc..c9751a05c029 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -314,6 +314,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info, extwriter_counter_inc(cur_trans, type); spin_unlock(&fs_info->trans_lock); btrfs_lockdep_acquire(fs_info, btrfs_trans_num_writers); + btrfs_lockdep_acquire(fs_info, btrfs_trans_num_extwriters); return 0; } spin_unlock(&fs_info->trans_lock); @@ -336,6 +337,7 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info, return -ENOMEM; btrfs_lockdep_acquire(fs_info, btrfs_trans_num_writers); + btrfs_lockdep_acquire(fs_info, btrfs_trans_num_extwriters); spin_lock(&fs_info->trans_lock); if (fs_info->running_transaction) { @@ -343,11 +345,13 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info, * someone started a transaction after we unlocked. Make sure * to redo the checks above */ + btrfs_lockdep_release(fs_info, btrfs_trans_num_extwriters); btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); kfree(cur_trans); goto loop; } else if (BTRFS_FS_ERROR(fs_info)) { spin_unlock(&fs_info->trans_lock); + btrfs_lockdep_release(fs_info, btrfs_trans_num_extwriters); btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); kfree(cur_trans); return -EROFS; @@ -1028,6 +1032,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans, cond_wake_up(&cur_trans->writer_wait); + btrfs_lockdep_release(info, btrfs_trans_num_extwriters); btrfs_lockdep_release(info, btrfs_trans_num_writers); btrfs_put_transaction(cur_trans); @@ -2270,6 +2275,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) if (ret) goto lockdep_release; + /* + * The thread has started/joined the transaction thus it holds the lockdep + * map as a reader. It has to release it before acquiring the lockdep map + * as a writer. + */ + btrfs_lockdep_release(fs_info, btrfs_trans_num_extwriters); + btrfs_might_wait_for_event(fs_info, btrfs_trans_num_extwriters); wait_event(cur_trans->writer_wait, extwriter_counter_read(cur_trans) == 0); @@ -2540,6 +2552,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) return ret; lockdep_release: + btrfs_lockdep_release(fs_info, btrfs_trans_num_extwriters); btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); goto cleanup_transaction; } From patchwork Tue Jul 19 04:09:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ioannis Angelakopoulos X-Patchwork-Id: 12922099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88A02C43334 for ; Tue, 19 Jul 2022 04:16:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235120AbiGSEQn (ORCPT ); Tue, 19 Jul 2022 00:16:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235610AbiGSEQm (ORCPT ); Tue, 19 Jul 2022 00:16:42 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 526F13AE6D for ; Mon, 18 Jul 2022 21:16:41 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 26IKFm1A024990 for ; Mon, 18 Jul 2022 21:16:40 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=65IS/sjc2SHZYogNDyf0HdEdP3QKXpUo+pPGv+kP64A=; b=UaOwkh/14vlPXdz/ptFM5qtBP6gSqLSSfUwjTB8YKRMWUTANOoSBxBwPupNkROy+NoIY Ylk+yvjEhNoIhxe2ZQhaKfHv8VtsjlO62/aKXBwDnMU5BlKksni0oAwu86DmCx3MW5vJ imx+IxXIrbIi87rFQPySKBd4qIkIqfVoDCw= Received: from maileast.thefacebook.com ([163.114.130.16]) by m0089730.ppops.net (PPS) with ESMTPS id 3hded7a592-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 18 Jul 2022 21:16:40 -0700 Received: from twshared0725.22.frc3.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 18 Jul 2022 21:16:39 -0700 Received: by devvm7778.ftw0.facebook.com (Postfix, from userid 558217) id 3815E24A8FBB; Mon, 18 Jul 2022 21:16:34 -0700 (PDT) From: Ioannis Angelakopoulos To: , Subject: [PATCH v2 3/5] btrfs: Add lockdep models for the transaction states wait events Date: Mon, 18 Jul 2022 21:09:56 -0700 Message-ID: <20220719040954.3964407-4-iangelak@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220719040954.3964407-1-iangelak@fb.com> References: <20220719040954.3964407-1-iangelak@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: HIYZzC53iYkr71_Avyuku3Gal5_c_5sk X-Proofpoint-ORIG-GUID: HIYZzC53iYkr71_Avyuku3Gal5_c_5sk X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-18_22,2022-07-18_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add a lockdep annotation for the transaction states that have wait events; 1) TRANS_STATE_COMMIT_START, 2) TRANS_STATE_UNBLOCKED, 3) TRANS_STATE_SUPER_COMMITTED, and 4) TRANS_STATE_COMPLETED in fs/btrfs/transaction.c. With the exception of the lockdep annotation for TRANS_STATE_COMMIT_START the transaction thread has to acquire the lockdep maps for the transaction states as reader after the lockdep map for num_writers is released so that lockdep does not complain. Signed-off-by: Ioannis Angelakopoulos --- fs/btrfs/ctree.h | 20 +++++++++++++++ fs/btrfs/disk-io.c | 17 +++++++++++++ fs/btrfs/transaction.c | 57 +++++++++++++++++++++++++++++++++++++----- 3 files changed, 88 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 586756f831e5..e6c7cafcd296 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1097,6 +1097,7 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_num_writers_map; struct lockdep_map btrfs_trans_num_extwriters_map; + struct lockdep_map btrfs_state_change_map[4]; #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; @@ -1178,6 +1179,13 @@ enum { BTRFS_ROOT_UNFINISHED_DROP, }; +enum btrfs_lockdep_trans_states { + BTRFS_LOCKDEP_TRANS_COMMIT_START, + BTRFS_LOCKDEP_TRANS_UNBLOCKED, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED, + BTRFS_LOCKDEP_TRANS_COMPLETED, +}; + #define btrfs_might_wait_for_event(b, lock) \ do { \ rwsem_acquire(&b->lock##_map, 0, 0, _THIS_IP_); \ @@ -1190,6 +1198,18 @@ enum { #define btrfs_lockdep_release(b, lock) \ rwsem_release(&b->lock##_map, _THIS_IP_) +#define btrfs_might_wait_for_state(b, i) \ + do { \ + rwsem_acquire(&b->btrfs_state_change_map[i], 0, 0, _THIS_IP_); \ + rwsem_release(&b->btrfs_state_change_map[i], _THIS_IP_); \ + } while (0) + +#define btrfs_trans_state_lockdep_acquire(b, i) \ + rwsem_acquire_read(&b->btrfs_state_change_map[i], 0, 0, _THIS_IP_) + +#define btrfs_trans_state_lockdep_release(b, i) \ + rwsem_release(&b->btrfs_state_change_map[i], _THIS_IP_) + static inline void btrfs_wake_unfinished_drop(struct btrfs_fs_info *fs_info) { clear_and_wake_up_bit(BTRFS_FS_UNFINISHED_DROPS, &fs_info->flags); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b1193584ba49..be5cf86fa992 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3048,6 +3048,10 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) { static struct lock_class_key btrfs_trans_num_writers_key; static struct lock_class_key btrfs_trans_num_extwriters_key; + static struct lock_class_key btrfs_trans_commit_start_key; + static struct lock_class_key btrfs_trans_unblocked_key; + static struct lock_class_key btrfs_trans_sup_committed_key; + static struct lock_class_key btrfs_trans_completed_key; INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC); INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC); @@ -3084,6 +3088,19 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) "btrfs_trans_num_extwriters", &btrfs_trans_num_extwriters_key, 0); + lockdep_init_map(&fs_info->btrfs_state_change_map[0], + "btrfs_trans_commit_start", + &btrfs_trans_commit_start_key, 0); + lockdep_init_map(&fs_info->btrfs_state_change_map[1], + "btrfs_trans_unblocked", + &btrfs_trans_unblocked_key, 0); + lockdep_init_map(&fs_info->btrfs_state_change_map[2], + "btrfs_trans_sup_commited", + &btrfs_trans_sup_committed_key, 0); + lockdep_init_map(&fs_info->btrfs_state_change_map[3], + "btrfs_trans_completed", + &btrfs_trans_completed_key, 0); + INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots); INIT_LIST_HEAD(&fs_info->space_info); INIT_LIST_HEAD(&fs_info->tree_mod_seq_list); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index c9751a05c029..e4efaa27ec17 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -550,6 +550,7 @@ static void wait_current_trans(struct btrfs_fs_info *fs_info) refcount_inc(&cur_trans->use_count); spin_unlock(&fs_info->trans_lock); + btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED); wait_event(fs_info->transaction_wait, cur_trans->state >= TRANS_STATE_UNBLOCKED || TRANS_ABORTED(cur_trans)); @@ -949,6 +950,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid) goto out; /* nothing committing|committed */ } + btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_COMPLETED); wait_for_commit(cur_trans, TRANS_STATE_COMPLETED); btrfs_put_transaction(cur_trans); out: @@ -1980,6 +1982,7 @@ void btrfs_commit_transaction_async(struct btrfs_trans_handle *trans) * Wait for the current transaction commit to start and block * subsequent transaction joins */ + btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_COMMIT_START); wait_event(fs_info->transaction_blocked_wait, cur_trans->state >= TRANS_STATE_COMMIT_START || TRANS_ABORTED(cur_trans)); @@ -2137,14 +2140,16 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) ktime_t interval; ASSERT(refcount_read(&trans->use_count) == 1); + btrfs_trans_state_lockdep_acquire(fs_info, + BTRFS_LOCKDEP_TRANS_COMMIT_START); /* Stop the commit early if ->aborted is set */ if (TRANS_ABORTED(cur_trans)) { ret = cur_trans->aborted; - btrfs_end_transaction(trans); - return ret; + goto lockdep_trans_commit_start_release; } + btrfs_trans_release_metadata(trans); trans->block_rsv = NULL; @@ -2160,8 +2165,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) */ ret = btrfs_run_delayed_refs(trans, 0); if (ret) { - btrfs_end_transaction(trans); - return ret; + goto lockdep_trans_commit_start_release; } } @@ -2192,8 +2196,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) if (run_it) { ret = btrfs_start_dirty_block_groups(trans); if (ret) { - btrfs_end_transaction(trans); - return ret; + goto lockdep_trans_commit_start_release; } } } @@ -2209,7 +2212,17 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) if (trans->in_fsync) want_state = TRANS_STATE_SUPER_COMMITTED; + + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_COMMIT_START); ret = btrfs_end_transaction(trans); + + if (want_state == TRANS_STATE_COMPLETED) + btrfs_might_wait_for_state(fs_info, BTRFS_LOCKDEP_TRANS_COMPLETED); + else + btrfs_might_wait_for_state(fs_info, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED); + wait_for_commit(cur_trans, want_state); if (TRANS_ABORTED(cur_trans)) @@ -2222,6 +2235,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) cur_trans->state = TRANS_STATE_COMMIT_START; wake_up(&fs_info->transaction_blocked_wait); + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_COMMIT_START); if (cur_trans->list.prev != &fs_info->trans_list) { enum btrfs_trans_state want_state = TRANS_STATE_COMPLETED; @@ -2235,6 +2250,12 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) refcount_inc(&prev_trans->use_count); spin_unlock(&fs_info->trans_lock); + if (want_state == TRANS_STATE_COMPLETED) + btrfs_might_wait_for_state(fs_info, + BTRFS_LOCKDEP_TRANS_COMPLETED); + else + btrfs_might_wait_for_state(fs_info, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED); wait_for_commit(prev_trans, want_state); ret = READ_ONCE(prev_trans->aborted); @@ -2323,6 +2344,15 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) wait_event(cur_trans->writer_wait, atomic_read(&cur_trans->num_writers) == 1); + /* + * Make lockdep happy by acquiring the state locks after + * btrfs_trans_num_writers is released. + */ + btrfs_trans_state_lockdep_acquire(fs_info, BTRFS_LOCKDEP_TRANS_COMPLETED); + btrfs_trans_state_lockdep_acquire(fs_info, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED); + btrfs_trans_state_lockdep_acquire(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED); + /* * We've started the commit, clear the flag in case we were triggered to * do an async commit but somebody else started before the transaction @@ -2332,6 +2362,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) if (TRANS_ABORTED(cur_trans)) { ret = cur_trans->aborted; + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_UNBLOCKED); goto scrub_continue; } /* @@ -2466,6 +2498,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) mutex_unlock(&fs_info->reloc_mutex); wake_up(&fs_info->transaction_wait); + btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED); ret = btrfs_write_and_wait_transaction(trans); if (ret) { @@ -2497,6 +2530,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) */ cur_trans->state = TRANS_STATE_SUPER_COMMITTED; wake_up(&cur_trans->commit_wait); + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED); btrfs_finish_extent_commit(trans); @@ -2510,6 +2545,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) */ cur_trans->state = TRANS_STATE_COMPLETED; wake_up(&cur_trans->commit_wait); + btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_COMPLETED); spin_lock(&fs_info->trans_lock); list_del_init(&cur_trans->list); @@ -2538,7 +2574,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) unlock_reloc: mutex_unlock(&fs_info->reloc_mutex); + btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED); scrub_continue: + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_SUPER_COMMITTED); + btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_COMPLETED); btrfs_scrub_continue(fs_info); cleanup_transaction: btrfs_trans_release_metadata(trans); @@ -2555,6 +2595,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) btrfs_lockdep_release(fs_info, btrfs_trans_num_extwriters); btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); goto cleanup_transaction; +lockdep_trans_commit_start_release: + btrfs_trans_state_lockdep_release(fs_info, + BTRFS_LOCKDEP_TRANS_COMMIT_START); + btrfs_end_transaction(trans); + return ret; } /* From patchwork Tue Jul 19 04:09:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ioannis Angelakopoulos X-Patchwork-Id: 12922100 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B149DC43334 for ; Tue, 19 Jul 2022 04:18:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235754AbiGSESP (ORCPT ); Tue, 19 Jul 2022 00:18:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34200 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235236AbiGSESO (ORCPT ); Tue, 19 Jul 2022 00:18:14 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5293832BA6 for ; Mon, 18 Jul 2022 21:18:13 -0700 (PDT) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 26IGi11O023393 for ; Mon, 18 Jul 2022 21:18:12 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=wOSzvKmN32krd94TIAAeVBV2GyyV48Fc1P6zynZasoo=; b=p+FP5SH2BQkvuHs3o5Zae7Zs9IYTPS1j3Wb99M9OiZGoJuMAaBPG2zQpxKOX40I2Q52E g4NQ/sAEHjrkc+VU1I6eoCeLiArWsZ/Nh42WTr+XAtavidulwE2knAenFBtGx6s/ufuy 02VbXrwMea/XM0wAUNT9S3XicWuzPZNK1pI= Received: from maileast.thefacebook.com ([163.114.130.16]) by m0001303.ppops.net (PPS) with ESMTPS id 3hbxbgd69m-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 18 Jul 2022 21:18:12 -0700 Received: from twshared35153.14.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 18 Jul 2022 21:18:09 -0700 Received: by devvm7778.ftw0.facebook.com (Postfix, from userid 558217) id 65CEA24A9238; Mon, 18 Jul 2022 21:18:04 -0700 (PDT) From: Ioannis Angelakopoulos To: , Subject: [PATCH v2 4/5] btrfs: Add a lockdep model for the pending_ordered wait event Date: Mon, 18 Jul 2022 21:09:58 -0700 Message-ID: <20220719040954.3964407-5-iangelak@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220719040954.3964407-1-iangelak@fb.com> References: <20220719040954.3964407-1-iangelak@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: s_5ULZ-YOhgkHxkzGF4d_KlQJuvYOf9k X-Proofpoint-GUID: s_5ULZ-YOhgkHxkzGF4d_KlQJuvYOf9k X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-18_22,2022-07-18_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In contrast to the num_writers and num_extwriters wait events, the condition for the pending_ordered wait event is signaled in a different context from the wait event itself. The condition signaling occurs in btrfs_remove_ordered_extent() in fs/btrfs/ordered-data.c while the wait event is implemented in btrfs_commit_transaction() in fs/btrfs/transaction.c Thus the thread signaling the condition has to acquire the lockdep map as a reader at the start of btrfs_remove_ordered_extent() and release it after it has signaled the condition. In this case some dependencies might be left out due to the placement of the annotation, but it is better than no annotation at all. Signed-off-by: Ioannis Angelakopoulos Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 1 + fs/btrfs/disk-io.c | 4 ++++ fs/btrfs/ordered-data.c | 3 +++ fs/btrfs/transaction.c | 1 + 4 files changed, 9 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index e6c7cafcd296..661d19ac41d1 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1098,6 +1098,7 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_num_writers_map; struct lockdep_map btrfs_trans_num_extwriters_map; struct lockdep_map btrfs_state_change_map[4]; + struct lockdep_map btrfs_trans_pending_ordered_map; #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index be5cf86fa992..e9956cbf653f 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3052,6 +3052,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) static struct lock_class_key btrfs_trans_unblocked_key; static struct lock_class_key btrfs_trans_sup_committed_key; static struct lock_class_key btrfs_trans_completed_key; + static struct lock_class_key btrfs_trans_pending_ordered_key; INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC); INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC); @@ -3087,6 +3088,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) lockdep_init_map(&fs_info->btrfs_trans_num_extwriters_map, "btrfs_trans_num_extwriters", &btrfs_trans_num_extwriters_key, 0); + lockdep_init_map(&fs_info->btrfs_trans_pending_ordered_map, + "btrfs_trans_pending_ordered", + &btrfs_trans_pending_ordered_key, 0); lockdep_init_map(&fs_info->btrfs_state_change_map[0], "btrfs_trans_commit_start", diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 1952ac85222c..2a4cb6db42d1 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -525,6 +525,7 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode, struct rb_node *node; bool pending; + btrfs_lockdep_acquire(fs_info, btrfs_trans_pending_ordered); /* This is paired with btrfs_add_ordered_extent. */ spin_lock(&btrfs_inode->lock); btrfs_mod_outstanding_extents(btrfs_inode, -1); @@ -580,6 +581,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode, } } + btrfs_lockdep_release(fs_info, btrfs_trans_pending_ordered); + spin_lock(&root->ordered_extent_lock); list_del_init(&entry->root_extent_list); root->nr_ordered_extents--; diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index e4efaa27ec17..3522e8f3381c 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -2320,6 +2320,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) * transaction. Otherwise if this transaction commits before the ordered * extents complete we lose logged data after a power failure. */ + btrfs_might_wait_for_event(fs_info, btrfs_trans_pending_ordered); wait_event(cur_trans->pending_wait, atomic_read(&cur_trans->pending_ordered) == 0); From patchwork Tue Jul 19 04:10:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ioannis Angelakopoulos X-Patchwork-Id: 12922108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C439AC43334 for ; Tue, 19 Jul 2022 04:20:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236215AbiGSEUR (ORCPT ); Tue, 19 Jul 2022 00:20:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35232 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235029AbiGSEUQ (ORCPT ); Tue, 19 Jul 2022 00:20:16 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 033DC6475 for ; Mon, 18 Jul 2022 21:20:14 -0700 (PDT) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 26J2tLQX000922 for ; Mon, 18 Jul 2022 21:20:14 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=VdGRbT3hlA2KvsAFWbBCmwbTClxyj2o+swdk0nMdjG8=; b=M+CaWChDS1TmD1+GrRO5JjsDERzaqkrnY4oBx5RqNup4lZLtxere8ifQDm345nwoSiSC YrMb020sHfdNySfhhFbo4S/C6Ecxu/e1LQRKdG8tifdMfeHCej4ehbt4g82LQXcPQR54 R41qrAdNE1mEFM5duq2uUR7EeKLoZUkjiCc= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3hcs5krr0k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 18 Jul 2022 21:20:14 -0700 Received: from twshared22934.08.ash9.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 18 Jul 2022 21:20:12 -0700 Received: by devvm7778.ftw0.facebook.com (Postfix, from userid 558217) id E800E24A95A3; Mon, 18 Jul 2022 21:20:10 -0700 (PDT) From: Ioannis Angelakopoulos To: , Subject: [PATCH v2 5/5] btrfs: Add a lockdep model for the ordered extents wait event Date: Mon, 18 Jul 2022 21:10:00 -0700 Message-ID: <20220719040954.3964407-6-iangelak@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220719040954.3964407-1-iangelak@fb.com> References: <20220719040954.3964407-1-iangelak@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: GjYSnxC1Wk5PDzFvIqhJs2kXaJBTfCfH X-Proofpoint-ORIG-GUID: GjYSnxC1Wk5PDzFvIqhJs2kXaJBTfCfH X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-18_22,2022-07-18_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This wait event is very similar to the pending_ordered wait event in the sense that it occurs in a different context than the condition signaling for the event. The signaling occurs in btrfs_remove_ordered_extent() while the wait event is implemented in btrfs_start_ordered_extent() in fs/btrfs/ordered-data.c However, in this case a thread must not acquire the lockdep map for the ordered extents wait event when the ordered extent is related to a free space inode. That is because lockdep creates dependencies between locks acquired both in execution paths related to normal inodes and paths related to free space inodes, thus leading to false positives. Also to prevent false positives related to free space inodes and normal inodes, the lockdep map class for the inode->mapping->invalidate_lock is reinitialized in load_free_space_cache() in fs/btrfs/free-space-cache.c Signed-off-by: Ioannis Angelakopoulos --- fs/btrfs/ctree.h | 1 + fs/btrfs/disk-io.c | 4 ++++ fs/btrfs/free-space-cache.c | 11 +++++++++++ fs/btrfs/inode.c | 13 +++++++++++++ fs/btrfs/ordered-data.c | 18 ++++++++++++++++++ 5 files changed, 47 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 661d19ac41d1..941987ed4c43 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1099,6 +1099,7 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_num_extwriters_map; struct lockdep_map btrfs_state_change_map[4]; struct lockdep_map btrfs_trans_pending_ordered_map; + struct lockdep_map btrfs_ordered_extent_map; #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e9956cbf653f..8f7a07362efb 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3053,6 +3053,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) static struct lock_class_key btrfs_trans_sup_committed_key; static struct lock_class_key btrfs_trans_completed_key; static struct lock_class_key btrfs_trans_pending_ordered_key; + static struct lock_class_key btrfs_ordered_extent_key; INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC); INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC); @@ -3104,6 +3105,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) lockdep_init_map(&fs_info->btrfs_state_change_map[3], "btrfs_trans_completed", &btrfs_trans_completed_key, 0); + lockdep_init_map(&fs_info->btrfs_ordered_extent_map, + "btrfs_ordered_extent", + &btrfs_ordered_extent_key, 0); INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots); INIT_LIST_HEAD(&fs_info->space_info); diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 996da650ecdc..1e41a01a782b 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -914,6 +914,8 @@ static int copy_free_space_cache(struct btrfs_block_group *block_group, return ret; } +static struct lock_class_key btrfs_free_space_inode_key; + int load_free_space_cache(struct btrfs_block_group *block_group) { struct btrfs_fs_info *fs_info = block_group->fs_info; @@ -924,6 +926,7 @@ int load_free_space_cache(struct btrfs_block_group *block_group) int ret = 0; bool matched; u64 used = block_group->used; + struct address_space *mapping; /* * Because we could potentially discard our loaded free space, we want @@ -983,6 +986,14 @@ int load_free_space_cache(struct btrfs_block_group *block_group) } spin_unlock(&block_group->lock); + /* + * Reinitialize the class of the inode->mapping->invalidate_lock for free + * space inodes to prevent false positives related to the ordered extents + * lockdep map. + */ + mapping = &inode->i_data; + lockdep_set_class(&mapping->invalidate_lock, &btrfs_free_space_inode_key); + ret = __load_free_space_cache(fs_info->tree_root, inode, &tmp_ctl, path, block_group->start); btrfs_free_path(path); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f20740812e5b..36f973ffbd26 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3223,6 +3223,8 @@ int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) clear_bits |= EXTENT_DELALLOC_NEW; freespace_inode = btrfs_is_free_space_inode(inode); + if (!freespace_inode) + btrfs_lockdep_acquire(fs_info, btrfs_ordered_extent); if (test_bit(BTRFS_ORDERED_IOERR, &ordered_extent->flags)) { ret = -EIO; @@ -8952,6 +8954,7 @@ void btrfs_destroy_inode(struct inode *vfs_inode) struct btrfs_ordered_extent *ordered; struct btrfs_inode *inode = BTRFS_I(vfs_inode); struct btrfs_root *root = inode->root; + bool freespace_inode; WARN_ON(!hlist_empty(&vfs_inode->i_dentry)); WARN_ON(vfs_inode->i_data.nrpages); @@ -8973,6 +8976,12 @@ void btrfs_destroy_inode(struct inode *vfs_inode) if (!root) return; + /* + * If this is a free space inode do not take the ordered extents lockdep + * map. + */ + freespace_inode = btrfs_is_free_space_inode(inode); + while (1) { ordered = btrfs_lookup_first_ordered_extent(inode, (u64)-1); if (!ordered) @@ -8981,6 +8990,10 @@ void btrfs_destroy_inode(struct inode *vfs_inode) btrfs_err(root->fs_info, "found ordered extent %llu %llu on inode cleanup", ordered->file_offset, ordered->num_bytes); + + if (!freespace_inode) + btrfs_lockdep_acquire(root->fs_info, btrfs_ordered_extent); + btrfs_remove_ordered_extent(inode, ordered); btrfs_put_ordered_extent(ordered); btrfs_put_ordered_extent(ordered); diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 2a4cb6db42d1..eb24a6d20ff8 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -524,6 +524,13 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode, struct btrfs_fs_info *fs_info = root->fs_info; struct rb_node *node; bool pending; + bool freespace_inode; + + /* + * If this is a free space inode the thread has not acquired the ordered + * extents lockdep map. + */ + freespace_inode = btrfs_is_free_space_inode(btrfs_inode); btrfs_lockdep_acquire(fs_info, btrfs_trans_pending_ordered); /* This is paired with btrfs_add_ordered_extent. */ @@ -597,6 +604,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode, } spin_unlock(&root->ordered_extent_lock); wake_up(&entry->wait); + if (!freespace_inode) + btrfs_lockdep_release(fs_info, btrfs_ordered_extent); } static void btrfs_run_ordered_extent_work(struct btrfs_work *work) @@ -715,9 +724,16 @@ void btrfs_start_ordered_extent(struct btrfs_ordered_extent *entry, int wait) u64 start = entry->file_offset; u64 end = start + entry->num_bytes - 1; struct btrfs_inode *inode = BTRFS_I(entry->inode); + bool freespace_inode; trace_btrfs_ordered_extent_start(inode, entry); + /* + * If this is a free space inode do not take the ordered extents lockdep + * map. + */ + freespace_inode = btrfs_is_free_space_inode(inode); + /* * pages in the range can be dirty, clean or writeback. We * start IO on any dirty ones so the wait doesn't stall waiting @@ -726,6 +742,8 @@ void btrfs_start_ordered_extent(struct btrfs_ordered_extent *entry, int wait) if (!test_bit(BTRFS_ORDERED_DIRECT, &entry->flags)) filemap_fdatawrite_range(inode->vfs_inode.i_mapping, start, end); if (wait) { + if (!freespace_inode) + btrfs_might_wait_for_event(inode->root->fs_info, btrfs_ordered_extent); wait_event(entry->wait, test_bit(BTRFS_ORDERED_COMPLETE, &entry->flags)); }