From patchwork Mon Jan 16 08:02:09 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102729
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 1/8] ext4: Stop searching if PA doesn't satisfy non-extent file
Date: Mon, 16 Jan 2023 13:32:09 +0530
Message-Id: <20230116080216.249195-2-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

If we come across a PA that matches the logical offset but is unable to
satisfy a non-extent file due to its physical start being higher than
that supported by non-extent files, then simply stop searching for
another PA and break out of the loop. This is because, since PAs don't
overlap, we won't be able to find another inode PA which can satisfy
the original request.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5b2ae37a8b80..e0bbca523b4b 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4382,8 +4382,13 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 		/* non-extent files can't have physical blocks past 2^32 */
 		if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) &&
 		    (pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len) >
-		     EXT4_MAX_BLOCK_FILE_PHYS))
-			continue;
+		     EXT4_MAX_BLOCK_FILE_PHYS)) {
+			/*
+			 * Since PAs don't overlap, we won't find any
+			 * other PA to satisfy this.
+			 */
+			break;
+		}

 		/* found preallocated blocks, use them */
 		spin_lock(&pa->pa_lock);
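
[To make the invariant concrete, here is a minimal standalone sketch
(toy C, not ext4 code; all names are made up) of why the search can
stop early: with non-overlapping ranges, at most one PA can contain
the goal block, so finding an unusable match means no other match
exists.]

/* Toy model: inode PAs as non-overlapping [lstart, lstart + len)
 * ranges, searched for a goal logical block. */
#include <stdbool.h>
#include <stdio.h>

struct pa { unsigned int lstart, len; bool usable; };

/* Returns the matching usable PA, or NULL. */
static const struct pa *find_pa(const struct pa *pas, int n, unsigned int goal)
{
	for (int i = 0; i < n; i++) {
		if (goal < pas[i].lstart || goal >= pas[i].lstart + pas[i].len)
			continue;	/* goal not inside this PA */
		/*
		 * At most one PA can contain 'goal' because ranges never
		 * overlap, so if this one can't be used, none can:
		 * break instead of continue.
		 */
		if (!pas[i].usable)
			break;
		return &pas[i];
	}
	return NULL;
}

int main(void)
{
	struct pa pas[] = { {0, 100, true}, {100, 50, false}, {150, 50, true} };
	printf("%p\n", (void *)find_pa(pas, 3, 120)); /* NULL: match found but unusable */
	return 0;
}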

From patchwork Mon Jan 16 08:02:10 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102731
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 2/8] ext4: Refactor code related to freeing PAs
Date: Mon, 16 Jan 2023 13:32:10 +0530
Message-Id: <20230116080216.249195-3-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

This patch makes the following changes:

* Rename ext4_mb_pa_free() to ext4_mb_pa_put_free() to better reflect
  its purpose

* Add a new ext4_mb_pa_free() which only handles freeing

* Refactor ext4_mb_pa_callback() to use ext4_mb_pa_free()

There are no functional changes in this patch.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index e0bbca523b4b..a5f2803aff93 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4530,16 +4530,22 @@ static void ext4_mb_mark_pa_deleted(struct super_block *sb,
 	}
 }

-static void ext4_mb_pa_callback(struct rcu_head *head)
+static inline void ext4_mb_pa_free(struct ext4_prealloc_space *pa)
 {
-	struct ext4_prealloc_space *pa;
-	pa = container_of(head, struct ext4_prealloc_space, u.pa_rcu);
-
+	BUG_ON(!pa);
 	BUG_ON(atomic_read(&pa->pa_count));
 	BUG_ON(pa->pa_deleted == 0);
 	kmem_cache_free(ext4_pspace_cachep, pa);
 }

+static void ext4_mb_pa_callback(struct rcu_head *head)
+{
+	struct ext4_prealloc_space *pa;
+
+	pa = container_of(head, struct ext4_prealloc_space, u.pa_rcu);
+	ext4_mb_pa_free(pa);
+}
+
 /*
  * drops a reference to preallocated space descriptor
  * if this was the last reference and the space is consumed
@@ -5066,14 +5072,20 @@ static int ext4_mb_pa_alloc(struct ext4_allocation_context *ac)
 	return 0;
 }

-static void ext4_mb_pa_free(struct ext4_allocation_context *ac)
+static void ext4_mb_pa_put_free(struct ext4_allocation_context *ac)
 {
 	struct ext4_prealloc_space *pa = ac->ac_pa;

 	BUG_ON(!pa);
 	ac->ac_pa = NULL;
 	WARN_ON(!atomic_dec_and_test(&pa->pa_count));
-	kmem_cache_free(ext4_pspace_cachep, pa);
+	/*
+	 * current function is only called due to an error or due to
+	 * len of found blocks < len of requested blocks hence the PA has not
+	 * been added to grp->bb_prealloc_list. So we don't need to lock it
+	 */
+	pa->pa_deleted = 1;
+	ext4_mb_pa_free(pa);
 }

 #ifdef CONFIG_EXT4_DEBUG
@@ -5616,13 +5628,13 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 		 * So we have to free this pa here itself.
 		 */
 		if (*errp) {
-			ext4_mb_pa_free(ac);
+			ext4_mb_pa_put_free(ac);
 			ext4_discard_allocated_blocks(ac);
 			goto errout;
 		}
 		if (ac->ac_status == AC_STATUS_FOUND &&
 		    ac->ac_o_ex.fe_len >= ac->ac_f_ex.fe_len)
-			ext4_mb_pa_free(ac);
+			ext4_mb_pa_put_free(ac);
 	}
 	if (likely(ac->ac_status == AC_STATUS_FOUND)) {
 		*errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
@@ -5641,7 +5653,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
 		 * If block allocation fails then the pa allocated above
 		 * needs to be freed here itself.
 		 */
-		ext4_mb_pa_free(ac);
+		ext4_mb_pa_put_free(ac);
 		*errp = -ENOSPC;
 	}
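
[As a rough illustration of the new division of labor, here is a toy
userspace sketch, not the kernel code; the names mirror the patch but
everything else is made up: one helper only frees an already-dead
object, while the put-variant drops the last reference, marks
deletion, and delegates the free.]

#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct pa {
	atomic_int count;
	int deleted;
};

/* Only frees; expects the object to already be dead (refcount 0, deleted). */
static void pa_free(struct pa *pa)
{
	assert(pa && atomic_load(&pa->count) == 0 && pa->deleted);
	free(pa);
}

/* Drops the caller's (expected to be last) reference, marks the PA
 * deleted, then frees it via the common helper. */
static void pa_put_free(struct pa *pa)
{
	assert(atomic_fetch_sub(&pa->count, 1) == 1);
	pa->deleted = 1;
	pa_free(pa);
}

int main(void)
{
	struct pa *pa = calloc(1, sizeof(*pa));
	atomic_store(&pa->count, 1);
	pa_put_free(pa);	/* drop the only reference and free */
	return 0;
}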

From patchwork Mon Jan 16 08:02:11 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102732
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 3/8] ext4: Refactor code in ext4_mb_normalize_request() and ext4_mb_use_preallocated()
Date: Mon, 16 Jan 2023 13:32:11 +0530
Message-Id: <20230116080216.249195-4-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

Change some variable names to be more consistent and refactor some of
the code to make it easier to read.

There are no functional changes in this patch.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 97 ++++++++++++++++++++++++-----------------------
 1 file changed, 49 insertions(+), 48 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index a5f2803aff93..f4e699bce99f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3999,7 +3999,8 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
 	loff_t orig_size __maybe_unused;
 	ext4_lblk_t start;
 	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
-	struct ext4_prealloc_space *pa;
+	struct ext4_prealloc_space *tmp_pa;
+	ext4_lblk_t tmp_pa_start, tmp_pa_end;

 	/* do normalize only data requests, metadata requests
 	   do not need preallocation */
@@ -4102,56 +4103,53 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,

 	/* check we don't cross already preallocated blocks */
 	rcu_read_lock();
-	list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
-		ext4_lblk_t pa_end;
-
-		if (pa->pa_deleted)
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+		if (tmp_pa->pa_deleted)
 			continue;
-		spin_lock(&pa->pa_lock);
-		if (pa->pa_deleted) {
-			spin_unlock(&pa->pa_lock);
+		spin_lock(&tmp_pa->pa_lock);
+		if (tmp_pa->pa_deleted) {
+			spin_unlock(&tmp_pa->pa_lock);
 			continue;
 		}

-		pa_end = pa->pa_lstart + EXT4_C2B(EXT4_SB(ac->ac_sb),
-						  pa->pa_len);
+		tmp_pa_start = tmp_pa->pa_lstart;
+		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);

 		/* PA must not overlap original request */
-		BUG_ON(!(ac->ac_o_ex.fe_logical >= pa_end ||
-			ac->ac_o_ex.fe_logical < pa->pa_lstart));
+		BUG_ON(!(ac->ac_o_ex.fe_logical >= tmp_pa_end ||
+			ac->ac_o_ex.fe_logical < tmp_pa_start));

 		/* skip PAs this normalized request doesn't overlap with */
-		if (pa->pa_lstart >= end || pa_end <= start) {
-			spin_unlock(&pa->pa_lock);
+		if (tmp_pa_start >= end || tmp_pa_end <= start) {
+			spin_unlock(&tmp_pa->pa_lock);
 			continue;
 		}
-		BUG_ON(pa->pa_lstart <= start && pa_end >= end);
+		BUG_ON(tmp_pa_start <= start && tmp_pa_end >= end);

 		/* adjust start or end to be adjacent to this pa */
-		if (pa_end <= ac->ac_o_ex.fe_logical) {
-			BUG_ON(pa_end < start);
-			start = pa_end;
-		} else if (pa->pa_lstart > ac->ac_o_ex.fe_logical) {
-			BUG_ON(pa->pa_lstart > end);
-			end = pa->pa_lstart;
+		if (tmp_pa_end <= ac->ac_o_ex.fe_logical) {
+			BUG_ON(tmp_pa_end < start);
+			start = tmp_pa_end;
+		} else if (tmp_pa_start > ac->ac_o_ex.fe_logical) {
+			BUG_ON(tmp_pa_start > end);
+			end = tmp_pa_start;
 		}
-		spin_unlock(&pa->pa_lock);
+		spin_unlock(&tmp_pa->pa_lock);
 	}
 	rcu_read_unlock();
 	size = end - start;

 	/* XXX: extra loop to check we really don't overlap preallocations */
 	rcu_read_lock();
-	list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
-		ext4_lblk_t pa_end;
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+		spin_lock(&tmp_pa->pa_lock);
+		if (tmp_pa->pa_deleted == 0) {
+			tmp_pa_start = tmp_pa->pa_lstart;
+			tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);

-		spin_lock(&pa->pa_lock);
-		if (pa->pa_deleted == 0) {
-			pa_end = pa->pa_lstart + EXT4_C2B(EXT4_SB(ac->ac_sb),
-							  pa->pa_len);
-			BUG_ON(!(start >= pa_end || end <= pa->pa_lstart));
+			BUG_ON(!(start >= tmp_pa_end || end <= tmp_pa_start));
 		}
-		spin_unlock(&pa->pa_lock);
+		spin_unlock(&tmp_pa->pa_lock);
 	}
 	rcu_read_unlock();

@@ -4361,7 +4359,8 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	int order, i;
 	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
 	struct ext4_locality_group *lg;
-	struct ext4_prealloc_space *pa, *cpa = NULL;
+	struct ext4_prealloc_space *tmp_pa, *cpa = NULL;
+	ext4_lblk_t tmp_pa_start, tmp_pa_end;
 	ext4_fsblk_t goal_block;

 	/* only data can be preallocated */
@@ -4370,18 +4369,20 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)

 	/* first, try per-file preallocation */
 	rcu_read_lock();
-	list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {

 		/* all fields in this condition don't change,
 		 * so we can skip locking for them */
-		if (ac->ac_o_ex.fe_logical < pa->pa_lstart ||
-		    ac->ac_o_ex.fe_logical >= (pa->pa_lstart +
-					       EXT4_C2B(sbi, pa->pa_len)))
+		tmp_pa_start = tmp_pa->pa_lstart;
+		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+
+		if (ac->ac_o_ex.fe_logical < tmp_pa_start ||
+		    ac->ac_o_ex.fe_logical >= tmp_pa_end)
 			continue;

 		/* non-extent files can't have physical blocks past 2^32 */
 		if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) &&
-		    (pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len) >
+		    (tmp_pa->pa_pstart + EXT4_C2B(sbi, tmp_pa->pa_len) >
 		     EXT4_MAX_BLOCK_FILE_PHYS)) {
 			/*
 			 * Since PAs don't overlap, we won't find any
@@ -4391,16 +4392,16 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 		}

 		/* found preallocated blocks, use them */
-		spin_lock(&pa->pa_lock);
-		if (pa->pa_deleted == 0 && pa->pa_free) {
-			atomic_inc(&pa->pa_count);
-			ext4_mb_use_inode_pa(ac, pa);
-			spin_unlock(&pa->pa_lock);
+		spin_lock(&tmp_pa->pa_lock);
+		if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free) {
+			atomic_inc(&tmp_pa->pa_count);
+			ext4_mb_use_inode_pa(ac, tmp_pa);
+			spin_unlock(&tmp_pa->pa_lock);
 			ac->ac_criteria = 10;
 			rcu_read_unlock();
 			return true;
 		}
-		spin_unlock(&pa->pa_lock);
+		spin_unlock(&tmp_pa->pa_lock);
 	}
 	rcu_read_unlock();

@@ -4424,16 +4425,16 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	 */
 	for (i = order; i < PREALLOC_TB_SIZE; i++) {
 		rcu_read_lock();
-		list_for_each_entry_rcu(pa, &lg->lg_prealloc_list[i],
+		list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[i],
 					pa_inode_list) {
-			spin_lock(&pa->pa_lock);
-			if (pa->pa_deleted == 0 &&
-			    pa->pa_free >= ac->ac_o_ex.fe_len) {
+			spin_lock(&tmp_pa->pa_lock);
+			if (tmp_pa->pa_deleted == 0 &&
+			    tmp_pa->pa_free >= ac->ac_o_ex.fe_len) {
 				cpa = ext4_mb_check_group_pa(goal_block,
-						pa, cpa);
+						tmp_pa, cpa);
 			}
-			spin_unlock(&pa->pa_lock);
+			spin_unlock(&tmp_pa->pa_lock);
 		}
 		rcu_read_unlock();
 	}
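
[A quick worked example of the tmp_pa_start/tmp_pa_end computation
that this refactor introduces; a toy sketch assuming EXT4_C2B()
converts clusters to blocks by shifting left by s_cluster_bits, with
all types and names below being simplified stand-ins.]

#include <stdio.h>

typedef unsigned int lblk_t;

struct toy_sbi { unsigned int s_cluster_bits; };
struct toy_pa { lblk_t pa_lstart; unsigned int pa_len; /* in clusters */ };

static lblk_t c2b(const struct toy_sbi *sbi, unsigned int clusters)
{
	return clusters << sbi->s_cluster_bits;	/* clusters -> blocks */
}

int main(void)
{
	struct toy_sbi sbi = { .s_cluster_bits = 4 };	/* 16 blocks/cluster */
	struct toy_pa pa = { .pa_lstart = 100, .pa_len = 2 };

	lblk_t pa_start = pa.pa_lstart;
	lblk_t pa_end = pa.pa_lstart + c2b(&sbi, pa.pa_len);

	/* half-open logical range covered by the PA: [100, 132) */
	printf("[%u, %u)\n", pa_start, pa_end);
	return 0;
}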

From patchwork Mon Jan 16 08:02:12 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102737

From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 4/8] ext4: Move overlap assert logic into a separate function
Date: Mon, 16 Jan 2023 13:32:12 +0530
Message-Id: <20230116080216.249195-5-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

Abstract out the logic to double check for overlaps in normalize_pa to
a separate function. Since there have been no reports in the past of
overlaps hitting this BUG_ON(), in the future we can consider calling
this function only under "#ifdef AGGRESSIVE_CHECK".

There are no functional changes in this patch.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 36 ++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f4e699bce99f..1628b008a096 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3984,6 +3984,29 @@ static void ext4_mb_normalize_group_request(struct ext4_allocation_context *ac)
 	mb_debug(sb, "goal %u blocks for locality group\n", ac->ac_g_ex.fe_len);
 }

+static inline void
+ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
+			  ext4_lblk_t start, ext4_lblk_t end)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
+	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
+	struct ext4_prealloc_space *tmp_pa;
+	ext4_lblk_t tmp_pa_start, tmp_pa_end;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+		spin_lock(&tmp_pa->pa_lock);
+		if (tmp_pa->pa_deleted == 0) {
+			tmp_pa_start = tmp_pa->pa_lstart;
+			tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+
+			BUG_ON(!(start >= tmp_pa_end || end <= tmp_pa_start));
+		}
+		spin_unlock(&tmp_pa->pa_lock);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * Normalization means making request better in terms of
  * size and alignment
@@ -4140,18 +4163,7 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
 	size = end - start;

 	/* XXX: extra loop to check we really don't overlap preallocations */
-	rcu_read_lock();
-	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
-		spin_lock(&tmp_pa->pa_lock);
-		if (tmp_pa->pa_deleted == 0) {
-			tmp_pa_start = tmp_pa->pa_lstart;
-			tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
-
-			BUG_ON(!(start >= tmp_pa_end || end <= tmp_pa_start));
-		}
-		spin_unlock(&tmp_pa->pa_lock);
-	}
-	rcu_read_unlock();
+	ext4_mb_pa_assert_overlap(ac, start, end);

 	/*
 	 * In this function "start" and "size" are normalized for better
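
[The extracted check boils down to a disjointness assertion over
half-open ranges: two ranges are disjoint iff one ends before the
other starts. A minimal toy analogue, not ext4 code, with no locking
or RCU:]

#include <assert.h>

struct range { unsigned int start, end; int deleted; };

/* Verify a candidate window [start, end) overlaps no live PA range. */
static void assert_no_overlap(const struct range *pas, int n,
			      unsigned int start, unsigned int end)
{
	for (int i = 0; i < n; i++) {
		if (pas[i].deleted)
			continue;
		/* same condition as the BUG_ON, stated positively */
		assert(start >= pas[i].end || end <= pas[i].start);
	}
}

int main(void)
{
	struct range pas[] = { {0, 50, 0}, {80, 100, 1} };
	assert_no_overlap(pas, 2, 50, 80);	/* disjoint: passes */
	return 0;
}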

From patchwork Mon Jan 16 08:02:13 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102733

From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 5/8] ext4: Abstract out overlap fix/check logic in ext4_mb_normalize_request()
Date: Mon, 16 Jan 2023 13:32:13 +0530
Message-Id: <20230116080216.249195-6-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

Abstract out the logic of fixing PA overlaps in
ext4_mb_normalize_request() to improve readability of the code. This
also makes it easier to make changes to the overlap logic in the
future.

There are no functional changes in this patch.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 110 +++++++++++++++++++++++++++++-----------------
 1 file changed, 69 insertions(+), 41 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 1628b008a096..fdb9d0a8f35d 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4007,6 +4007,74 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
 	rcu_read_unlock();
 }

+/*
+ * Given an allocation context "ac" and a range "start", "end", check
+ * and adjust boundaries if the range overlaps with any of the existing
+ * preallocations stored in the corresponding inode of the allocation context.
+ *
+ * Parameters:
+ *	ac			allocation context
+ *	start			start of the new range
+ *	end			end of the new range
+ */
+static inline void
+ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac,
+			  ext4_lblk_t *start, ext4_lblk_t *end)
+{
+	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
+	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
+	struct ext4_prealloc_space *tmp_pa;
+	ext4_lblk_t new_start, new_end;
+	ext4_lblk_t tmp_pa_start, tmp_pa_end;
+
+	new_start = *start;
+	new_end = *end;
+
+	/* check we don't cross already preallocated blocks */
+	rcu_read_lock();
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+		if (tmp_pa->pa_deleted)
+			continue;
+		spin_lock(&tmp_pa->pa_lock);
+		if (tmp_pa->pa_deleted) {
+			spin_unlock(&tmp_pa->pa_lock);
+			continue;
+		}
+
+		tmp_pa_start = tmp_pa->pa_lstart;
+		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+
+		/* PA must not overlap original request */
+		BUG_ON(!(ac->ac_o_ex.fe_logical >= tmp_pa_end ||
+			ac->ac_o_ex.fe_logical < tmp_pa_start));
+
+		/* skip PAs this normalized request doesn't overlap with */
+		if (tmp_pa_start >= new_end || tmp_pa_end <= new_start) {
+			spin_unlock(&tmp_pa->pa_lock);
+			continue;
+		}
+		BUG_ON(tmp_pa_start <= new_start && tmp_pa_end >= new_end);
+
+		/* adjust start or end to be adjacent to this pa */
+		if (tmp_pa_end <= ac->ac_o_ex.fe_logical) {
+			BUG_ON(tmp_pa_end < new_start);
+			new_start = tmp_pa_end;
+		} else if (tmp_pa_start > ac->ac_o_ex.fe_logical) {
+			BUG_ON(tmp_pa_start > new_end);
+			new_end = tmp_pa_start;
+		}
+		spin_unlock(&tmp_pa->pa_lock);
+	}
+	rcu_read_unlock();
+
+	/* XXX: extra loop to check we really don't overlap preallocations */
+	ext4_mb_pa_assert_overlap(ac, new_start, new_end);
+
+	*start = new_start;
+	*end = new_end;
+	return;
+}
+
 /*
  * Normalization means making request better in terms of
  * size and alignment
@@ -4021,9 +4089,6 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
 	loff_t size, start_off;
 	loff_t orig_size __maybe_unused;
 	ext4_lblk_t start;
-	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
-	struct ext4_prealloc_space *tmp_pa;
-	ext4_lblk_t tmp_pa_start, tmp_pa_end;

 	/* do normalize only data requests, metadata requests
 	   do not need preallocation */
@@ -4124,47 +4189,10 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,

 	end = start + size;

-	/* check we don't cross already preallocated blocks */
-	rcu_read_lock();
-	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
-		if (tmp_pa->pa_deleted)
-			continue;
-		spin_lock(&tmp_pa->pa_lock);
-		if (tmp_pa->pa_deleted) {
-			spin_unlock(&tmp_pa->pa_lock);
-			continue;
-		}
-
-		tmp_pa_start = tmp_pa->pa_lstart;
-		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
-
-		/* PA must not overlap original request */
-		BUG_ON(!(ac->ac_o_ex.fe_logical >= tmp_pa_end ||
-			ac->ac_o_ex.fe_logical < tmp_pa_start));
-
-		/* skip PAs this normalized request doesn't overlap with */
-		if (tmp_pa_start >= end || tmp_pa_end <= start) {
-			spin_unlock(&tmp_pa->pa_lock);
-			continue;
-		}
-		BUG_ON(tmp_pa_start <= start && tmp_pa_end >= end);
+	ext4_mb_pa_adjust_overlap(ac, &start, &end);

-		/* adjust start or end to be adjacent to this pa */
-		if (tmp_pa_end <= ac->ac_o_ex.fe_logical) {
-			BUG_ON(tmp_pa_end < start);
-			start = tmp_pa_end;
-		} else if (tmp_pa_start > ac->ac_o_ex.fe_logical) {
-			BUG_ON(tmp_pa_start > end);
-			end = tmp_pa_start;
-		}
-		spin_unlock(&tmp_pa->pa_lock);
-	}
-	rcu_read_unlock();
 	size = end - start;

-	/* XXX: extra loop to check we really don't overlap preallocations */
-	ext4_mb_pa_assert_overlap(ac, start, end);
-
 	/*
 	 * In this function "start" and "size" are normalized for better
 	 * alignment and length such that we could preallocate more blocks.
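
[To see what the extracted helper computes, here is a toy single-PA
analogue in plain C, not ext4 code; locking, RCU, and the multi-PA
loop are omitted. The candidate window shrinks toward the goal so it
ends up adjacent to, but clear of, the conflicting PA.]

#include <assert.h>
#include <stdio.h>

struct range { unsigned int start, end; };

static void adjust_overlap(unsigned int *start, unsigned int *end,
			   unsigned int goal, struct range pa)
{
	/* the goal block itself is never inside an existing PA */
	assert(goal >= pa.end || goal < pa.start);

	if (pa.start >= *end || pa.end <= *start)
		return;			/* no overlap, nothing to do */

	if (pa.end <= goal)
		*start = pa.end;	/* PA sits below the goal: raise start */
	else if (pa.start > goal)
		*end = pa.start;	/* PA sits above the goal: lower end */
}

int main(void)
{
	unsigned int start = 0, end = 256, goal = 120;
	struct range pa = { 0, 100 };	/* live PA covering [0, 100) */

	adjust_overlap(&start, &end, goal, pa);
	printf("[%u, %u)\n", start, end);	/* -> [100, 256) */
	return 0;
}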

From patchwork Mon Jan 16 08:02:14 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102735

From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, rookxu, Ritesh Harjani
Subject: [PATCH v3 6/8] ext4: Convert pa->pa_inode_list and pa->pa_obj_lock into a union
Date: Mon, 16 Jan 2023 13:32:14 +0530
Message-Id: <20230116080216.249195-7-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

** Splitting pa->pa_inode_list **

Currently, we use the same pa->pa_inode_list to add a pa to either the
inode preallocation list or the locality group preallocation list. For
better clarity, split this list into a union of 2 list_heads and use
either of them based on the type of pa.

** Splitting pa->pa_obj_lock **

Currently, pa->pa_obj_lock is either assigned &ei->i_prealloc_lock for
inode PAs or lg_prealloc_lock for lg PAs, and is then used to lock the
lists containing these PAs. Make the distinction between the 2 PA
types clear by changing this lock to a union of 2 locks. Explicitly
use pa_node_lock.inode_lock for inode PAs and pa_node_lock.lg_lock for
lg PAs.

This patch is required so that the locality group preallocation code
remains unchanged, since upcoming patches make changes to the inode
preallocation code to move from a list to an rbtree based
implementation. This also makes it easier to review the upcoming
patches.

There are no functional changes in this patch.

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 76 +++++++++++++++++++++++++++--------------------
 fs/ext4/mballoc.h | 10 +++++--
 2 files changed, 52 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index fdb9d0a8f35d..53c4efd34d1c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3994,7 +3994,7 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
 	ext4_lblk_t tmp_pa_start, tmp_pa_end;

 	rcu_read_lock();
-	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) {
 		spin_lock(&tmp_pa->pa_lock);
 		if (tmp_pa->pa_deleted == 0) {
 			tmp_pa_start = tmp_pa->pa_lstart;
@@ -4032,7 +4032,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac,

 	/* check we don't cross already preallocated blocks */
 	rcu_read_lock();
-	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) {
 		if (tmp_pa->pa_deleted)
 			continue;
 		spin_lock(&tmp_pa->pa_lock);
@@ -4409,7 +4409,7 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)

 	/* first, try per-file preallocation */
 	rcu_read_lock();
-	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_inode_list) {
+	list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) {

 		/* all fields in this condition don't change,
 		 * so we can skip locking for them */
@@ -4466,7 +4466,7 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	for (i = order; i < PREALLOC_TB_SIZE; i++) {
 		rcu_read_lock();
 		list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[i],
-					pa_inode_list) {
+					pa_node.lg_list) {
 			spin_lock(&tmp_pa->pa_lock);
 			if (tmp_pa->pa_deleted == 0 &&
 			    tmp_pa->pa_free >= ac->ac_o_ex.fe_len) {
@@ -4640,9 +4640,15 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac,
 	list_del(&pa->pa_group_list);
 	ext4_unlock_group(sb, grp);

-	spin_lock(pa->pa_obj_lock);
-	list_del_rcu(&pa->pa_inode_list);
-	spin_unlock(pa->pa_obj_lock);
+	if (pa->pa_type == MB_INODE_PA) {
+		spin_lock(pa->pa_node_lock.inode_lock);
+		list_del_rcu(&pa->pa_node.inode_list);
+		spin_unlock(pa->pa_node_lock.inode_lock);
+	} else {
+		spin_lock(pa->pa_node_lock.lg_lock);
+		list_del_rcu(&pa->pa_node.lg_list);
+		spin_unlock(pa->pa_node_lock.lg_lock);
+	}

 	call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 }
@@ -4710,7 +4716,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	pa->pa_len = ac->ac_b_ex.fe_len;
 	pa->pa_free = pa->pa_len;
 	spin_lock_init(&pa->pa_lock);
-	INIT_LIST_HEAD(&pa->pa_inode_list);
+	INIT_LIST_HEAD(&pa->pa_node.inode_list);
 	INIT_LIST_HEAD(&pa->pa_group_list);
 	pa->pa_deleted = 0;
 	pa->pa_type = MB_INODE_PA;
@@ -4725,14 +4731,14 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	ei = EXT4_I(ac->ac_inode);
 	grp = ext4_get_group_info(sb, ac->ac_b_ex.fe_group);

-	pa->pa_obj_lock = &ei->i_prealloc_lock;
+	pa->pa_node_lock.inode_lock = &ei->i_prealloc_lock;
 	pa->pa_inode = ac->ac_inode;

 	list_add(&pa->pa_group_list, &grp->bb_prealloc_list);

-	spin_lock(pa->pa_obj_lock);
-	list_add_rcu(&pa->pa_inode_list, &ei->i_prealloc_list);
-	spin_unlock(pa->pa_obj_lock);
+	spin_lock(pa->pa_node_lock.inode_lock);
+	list_add_rcu(&pa->pa_node.inode_list, &ei->i_prealloc_list);
+	spin_unlock(pa->pa_node_lock.inode_lock);

 	atomic_inc(&ei->i_prealloc_active);
 }
@@ -4764,7 +4770,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	pa->pa_len = ac->ac_b_ex.fe_len;
 	pa->pa_free = pa->pa_len;
 	spin_lock_init(&pa->pa_lock);
-	INIT_LIST_HEAD(&pa->pa_inode_list);
+	INIT_LIST_HEAD(&pa->pa_node.lg_list);
 	INIT_LIST_HEAD(&pa->pa_group_list);
 	pa->pa_deleted = 0;
 	pa->pa_type = MB_GROUP_PA;
@@ -4780,7 +4786,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 	lg = ac->ac_lg;
 	BUG_ON(lg == NULL);

-	pa->pa_obj_lock = &lg->lg_prealloc_lock;
+	pa->pa_node_lock.lg_lock = &lg->lg_prealloc_lock;
 	pa->pa_inode = NULL;

 	list_add(&pa->pa_group_list, &grp->bb_prealloc_list);
@@ -4956,9 +4962,15 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
 	list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) {

 		/* remove from object (inode or locality group) */
-		spin_lock(pa->pa_obj_lock);
-		list_del_rcu(&pa->pa_inode_list);
-		spin_unlock(pa->pa_obj_lock);
+		if (pa->pa_type == MB_GROUP_PA) {
+			spin_lock(pa->pa_node_lock.lg_lock);
+			list_del_rcu(&pa->pa_node.lg_list);
+			spin_unlock(pa->pa_node_lock.lg_lock);
+		} else {
+			spin_lock(pa->pa_node_lock.inode_lock);
+			list_del_rcu(&pa->pa_node.inode_list);
+			spin_unlock(pa->pa_node_lock.inode_lock);
+		}

 		if (pa->pa_type == MB_GROUP_PA)
 			ext4_mb_release_group_pa(&e4b, pa);
@@ -5021,8 +5033,8 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed)
 	spin_lock(&ei->i_prealloc_lock);
 	while (!list_empty(&ei->i_prealloc_list) && needed) {
 		pa = list_entry(ei->i_prealloc_list.prev,
-				struct ext4_prealloc_space, pa_inode_list);
-		BUG_ON(pa->pa_obj_lock != &ei->i_prealloc_lock);
+				struct ext4_prealloc_space, pa_node.inode_list);
+		BUG_ON(pa->pa_node_lock.inode_lock != &ei->i_prealloc_lock);
 		spin_lock(&pa->pa_lock);
 		if (atomic_read(&pa->pa_count)) {
 			/* this shouldn't happen often - nobody should
@@ -5039,7 +5051,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed)
 		if (pa->pa_deleted == 0) {
 			ext4_mb_mark_pa_deleted(sb, pa);
 			spin_unlock(&pa->pa_lock);
-			list_del_rcu(&pa->pa_inode_list);
+			list_del_rcu(&pa->pa_node.inode_list);
 			list_add(&pa->u.pa_tmp_list, &list);
 			needed--;
 			continue;
@@ -5329,7 +5341,7 @@ ext4_mb_discard_lg_preallocations(struct super_block *sb,
 	spin_lock(&lg->lg_prealloc_lock);
 	list_for_each_entry_rcu(pa, &lg->lg_prealloc_list[order],
-				pa_inode_list,
+				pa_node.lg_list,
 				lockdep_is_held(&lg->lg_prealloc_lock)) {
 		spin_lock(&pa->pa_lock);
 		if (atomic_read(&pa->pa_count)) {
@@ -5352,7 +5364,7 @@ ext4_mb_discard_lg_preallocations(struct super_block *sb,
 		ext4_mb_mark_pa_deleted(sb, pa);
 		spin_unlock(&pa->pa_lock);

-		list_del_rcu(&pa->pa_inode_list);
+		list_del_rcu(&pa->pa_node.lg_list);
 		list_add(&pa->u.pa_tmp_list, &discard_list);

 		total_entries--;
@@ -5413,7 +5425,7 @@ static void ext4_mb_add_n_trim(struct ext4_allocation_context *ac)
 	/* Add the prealloc space to lg */
 	spin_lock(&lg->lg_prealloc_lock);
 	list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order],
-				pa_inode_list,
+				pa_node.lg_list,
 				lockdep_is_held(&lg->lg_prealloc_lock)) {
 		spin_lock(&tmp_pa->pa_lock);
 		if (tmp_pa->pa_deleted) {
@@ -5422,8 +5434,8 @@ static void ext4_mb_add_n_trim(struct ext4_allocation_context *ac)
 		}
 		if (!added && pa->pa_free < tmp_pa->pa_free) {
 			/* Add to the tail of the previous entry */
-			list_add_tail_rcu(&pa->pa_inode_list,
-						&tmp_pa->pa_inode_list);
+			list_add_tail_rcu(&pa->pa_node.lg_list,
+						&tmp_pa->pa_node.lg_list);
 			added = 1;
 			/*
 			 * we want to count the total
@@ -5434,7 +5446,7 @@ static void ext4_mb_add_n_trim(struct ext4_allocation_context *ac)
 		lg_prealloc_count++;
 	}
 	if (!added)
-		list_add_tail_rcu(&pa->pa_inode_list,
+		list_add_tail_rcu(&pa->pa_node.lg_list,
 					&lg->lg_prealloc_list[order]);
 	spin_unlock(&lg->lg_prealloc_lock);

@@ -5490,9 +5502,9 @@ static int ext4_mb_release_context(struct ext4_allocation_context *ac)
 			 * doesn't grow big.
 			 */
 			if (likely(pa->pa_free)) {
-				spin_lock(pa->pa_obj_lock);
-				list_del_rcu(&pa->pa_inode_list);
-				spin_unlock(pa->pa_obj_lock);
+				spin_lock(pa->pa_node_lock.lg_lock);
+				list_del_rcu(&pa->pa_node.lg_list);
+				spin_unlock(pa->pa_node_lock.lg_lock);
 				ext4_mb_add_n_trim(ac);
 			}
 		}
@@ -5502,9 +5514,9 @@ static int ext4_mb_release_context(struct ext4_allocation_context *ac)
 		 * treat per-inode prealloc list as a lru list, then try
 		 * to trim the least recently used PA.
 		 */
-		spin_lock(pa->pa_obj_lock);
-		list_move(&pa->pa_inode_list, &ei->i_prealloc_list);
-		spin_unlock(pa->pa_obj_lock);
+		spin_lock(pa->pa_node_lock.inode_lock);
+		list_move(&pa->pa_node.inode_list, &ei->i_prealloc_list);
+		spin_unlock(pa->pa_node_lock.inode_lock);
 	}

 	ext4_mb_put_pa(ac, ac->ac_sb, pa);
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index dcda2a943cee..398a6688c341 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -114,7 +114,10 @@ struct ext4_free_data {
 };

 struct ext4_prealloc_space {
-	struct list_head	pa_inode_list;
+	union {
+		struct list_head	inode_list; /* for inode PAs */
+		struct list_head	lg_list;	/* for lg PAs */
+	} pa_node;
 	struct list_head	pa_group_list;
 	union {
 		struct list_head pa_tmp_list;
@@ -128,7 +131,10 @@ struct ext4_prealloc_space {
 	ext4_grpblk_t		pa_len;		/* len of preallocated chunk */
 	ext4_grpblk_t		pa_free;	/* how many blocks are free */
 	unsigned short		pa_type;	/* pa type. inode or group */
-	spinlock_t		*pa_obj_lock;
+	union {
+		spinlock_t		*inode_lock; /* locks the inode list holding this PA */
+		spinlock_t		*lg_lock;	/* locks the lg list holding this PA */
+	} pa_node_lock;
 	struct inode		*pa_inode;	/* hack, for history only */
 };
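
[A toy sketch, not ext4 code, of the discriminated-union pattern this
patch introduces: only one union member is valid at a time, selected
by pa_type, and every call site must say which one it means.]

#include <stdio.h>

enum pa_type { MB_INODE_PA, MB_GROUP_PA };

struct list_head { struct list_head *next, *prev; };

struct prealloc {
	union {
		struct list_head inode_list;	/* valid for MB_INODE_PA */
		struct list_head lg_list;	/* valid for MB_GROUP_PA */
	} pa_node;
	enum pa_type pa_type;	/* discriminator: says which member is live */
};

/* Callers switch on pa_type before touching pa_node, much like
 * ext4_mb_put_pa() does after this patch. */
static struct list_head *pa_list_node(struct prealloc *pa)
{
	return pa->pa_type == MB_INODE_PA ? &pa->pa_node.inode_list
					  : &pa->pa_node.lg_list;
}

int main(void)
{
	struct prealloc pa = { .pa_type = MB_GROUP_PA };
	printf("%p\n", (void *)pa_list_node(&pa));	/* lg_list member */
	return 0;
}

[Since both members have the same type here, the union costs nothing
in space; its value is purely in clarity, which is also what makes the
follow-up change of one member's type (list to rbtree node) reviewable.]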
From patchwork Mon Jan 16 08:02:15 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102734
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , rookxu , Ritesh Harjani
Subject: [PATCH v3 7/8] ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list
Date: Mon, 16 Jan 2023 13:32:15 +0530
Message-Id: <20230116080216.249195-8-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

Currently, the kernel uses i_prealloc_list to hold all the inode preallocations. This is known to degrade performance in workloads that perform a large number of sparse writes on a single file. This is mainly because functions like ext4_mb_normalize_request() and ext4_mb_use_preallocated() iterate over this complete list, resulting in slowdowns when a large number of PAs are present.

Patch 27bc446e2 partially fixed this by enforcing a limit of 512 on the inode preallocation list and adding logic to continually trim the list if it grows above the threshold; however, our testing revealed that a hardcoded value is not suitable for all kinds of workloads.

To optimize this, add an rbtree to the inode and hold the inode preallocations in this rbtree. This makes iterating over inode PAs faster and scales much better than a linked list. Additionally, we also had to remove the LRU logic that was added during trimming of the list (in ext4_mb_release_context()) as it would add extra overhead to the rbtree. The discards now happen in lowest-logical-offset-first order.

** Locking notes **

With the introduction of the rbtree to maintain inode PAs, we can't use RCU to walk the tree for searching, since that can result in partial traversals which might miss some nodes (or entire subtrees) while discards happen in parallel (which happens under a lock). Hence this patch converts the ei->i_prealloc_lock spinlock to an rwlock.

Almost all the codepaths that read/modify the PA rbtree are protected by the higher level inode->i_data_sem (except ext4_mb_discard_group_preallocations() and ext4_clear_inode()). IIUC, the only place we need lock protection is when one thread is "searching" the PA rbtree (earlier protected under rcu_read_lock()) and another is "deleting" PAs in ext4_mb_discard_group_preallocations() (which iterates over all the PAs using grp->bb_prealloc_list and deletes PAs from the tree without taking any inode lock (i_data_sem)).
So, this patch converts all rcu_read_lock/unlock() paths for the inode PA list to use read_lock(), and all places where we were using the ei->i_prealloc_lock spinlock now use write_lock().

Note that this makes the fast path (searching for the right PA, e.g. in ext4_mb_use_preallocated() or ext4_mb_normalize_request()) use read_lock() instead of rcu_read_lock/unlock(). The fast path can now also block on the slow discard path (ext4_mb_discard_group_preallocations()), which uses write_lock(). But this is not as bad as it looks, because:

1. The slow path only occurs when the normal allocation has failed, which means we are low on disk space. One can argue this scenario won't be very frequent.

2. ext4_mb_discard_group_preallocations() locks and unlocks the rwlock for deleting every individual PA. This gives the fast path enough opportunity to acquire the read_lock for searching the inode PA tree.

** Design changes around deleted PAs **

In ext4_mb_pa_adjust_overlap(), it is possible for an allocating thread to come across a PA that is marked deleted via ext4_mb_discard_group_preallocations() but not yet removed from the inode PA rbtree. In such a case, we ignore any overlap with this PA node and simply move forward in the rbtree by comparing the logical start of the to-be-inserted PA with that of the deleted PA node.

Although this usually doesn't cause an issue and we can move forward in the tree, in one special case, i.e. if the range of the deleted PA lies completely inside the normalized range, we might need to traverse both subtrees under such a deleted PA. To simplify this special scenario, and also as an optimization, undelete the PA if the allocating thread determines that this PA might be needed in the near future. This results in the following changes:

- ext4_mb_use_preallocated(): Use a deleted PA if the original request lies in it.
- ext4_mb_pa_adjust_overlap(): Undelete a PA if it is deleted but there is an overlap with the normalized range.
- ext4_mb_discard_group_preallocations(): Roll back the delete operation if the allocation path undeletes a PA before it is erased from the inode PA tree.

Since this covers the special case we discussed above, we will always undelete the PA when we encounter the special case, and we can then adjust for overlap and traverse the PA rbtree without any issues.
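To make the interplay above concrete, the two paths can be condensed into the following simplified sketch. This is illustrative only, not the patch itself (see the diff below): pa_search_sketch(), pa_discard_sketch() and pa_overlaps_range() are made-up stand-ins, and bookkeeping such as the i_prealloc_active counter and the group-list rollback is omitted.

/*
 * Fast path: binary walk of the inode PA rbtree under read_lock(),
 * undeleting any still-linked deleted PA whose range overlaps the
 * normalized request. pa_overlaps_range() is a hypothetical helper.
 */
static void pa_search_sketch(struct ext4_inode_info *ei,
			     ext4_lblk_t new_start, ext4_lblk_t new_end)
{
	struct ext4_prealloc_space *tmp_pa = NULL;
	struct rb_node *iter;

	read_lock(&ei->i_prealloc_lock);
	for (iter = ei->i_prealloc_node.rb_node; iter;
	     iter = ext4_mb_pa_rb_next_iter(new_start, tmp_pa->pa_lstart, iter)) {
		tmp_pa = rb_entry(iter, struct ext4_prealloc_space,
				  pa_node.inode_node);
		spin_lock(&tmp_pa->pa_lock);
		if (tmp_pa->pa_deleted &&
		    pa_overlaps_range(tmp_pa, new_start, new_end))
			tmp_pa->pa_deleted = 0; /* undelete: discard must roll back */
		spin_unlock(&tmp_pa->pa_lock);
	}
	read_unlock(&ei->i_prealloc_lock);
}

/*
 * Slow path: discard re-checks pa_deleted under write_lock() before
 * erasing, so a racing undelete turns the delete into a rollback.
 */
static void pa_discard_sketch(struct ext4_inode_info *ei,
			      struct ext4_prealloc_space *pa)
{
	write_lock(pa->pa_node_lock.inode_lock);
	spin_lock(&pa->pa_lock);
	if (pa->pa_deleted == 0) {
		/* Allocation path undeleted this PA: keep it in the tree. */
		spin_unlock(&pa->pa_lock);
		write_unlock(pa->pa_node_lock.inode_lock);
		return;
	}
	spin_unlock(&pa->pa_lock);

	rb_erase(&pa->pa_node.inode_node, &ei->i_prealloc_node);
	write_unlock(pa->pa_node_lock.inode_lock);
}

Note how the write_lock() is taken and dropped per PA on the slow path; this is the property the commit message relies on to keep readers from starving.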
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) --- fs/ext4/ext4.h | 4 +- fs/ext4/mballoc.c | 239 ++++++++++++++++++++++++++++++++++------------ fs/ext4/mballoc.h | 6 +- fs/ext4/super.c | 4 +- 4 files changed, 183 insertions(+), 70 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 140e1eb300d1..fad5f087e4c6 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1120,8 +1120,8 @@ struct ext4_inode_info { /* mballoc */ atomic_t i_prealloc_active; - struct list_head i_prealloc_list; - spinlock_t i_prealloc_lock; + struct rb_root i_prealloc_node; + rwlock_t i_prealloc_lock; /* extents status tree */ struct ext4_es_tree i_es_tree; diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 53c4efd34d1c..85598079b7ce 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -3984,6 +3984,44 @@ static void ext4_mb_normalize_group_request(struct ext4_allocation_context *ac) mb_debug(sb, "goal %u blocks for locality group\n", ac->ac_g_ex.fe_len); } +static void ext4_mb_mark_pa_inuse(struct super_block *sb, + struct ext4_prealloc_space *pa) +{ + struct ext4_inode_info *ei; + + if (pa->pa_deleted == 0) { + ext4_warning( + sb, "pa already inuse, type:%d, pblk:%llu, lblk:%u, len:%d\n", + pa->pa_type, pa->pa_pstart, pa->pa_lstart, pa->pa_len); + return; + } + + pa->pa_deleted = 0; + + if (pa->pa_type == MB_INODE_PA) { + ei = EXT4_I(pa->pa_inode); + atomic_inc(&ei->i_prealloc_active); + } +} + +/* + * This function returns the next element to look at during inode + * PA rbtree walk. We assume that we have held the inode PA rbtree lock + * (ei->i_prealloc_lock) + * + * new_start The start of the range we want to compare + * cur_start The existing start that we are comparing against + * node The node of the rb_tree + */ +static inline struct rb_node* +ext4_mb_pa_rb_next_iter(ext4_lblk_t new_start, ext4_lblk_t cur_start, struct rb_node *node) +{ + if (new_start < cur_start) + return node->rb_left; + else + return node->rb_right; +} + static inline void ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac, ext4_lblk_t start, ext4_lblk_t end) @@ -3992,27 +4030,31 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac, struct ext4_inode_info *ei = EXT4_I(ac->ac_inode); struct ext4_prealloc_space *tmp_pa; ext4_lblk_t tmp_pa_start, tmp_pa_end; + struct rb_node *iter; - rcu_read_lock(); - list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) { - spin_lock(&tmp_pa->pa_lock); - if (tmp_pa->pa_deleted == 0) { - tmp_pa_start = tmp_pa->pa_lstart; - tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + read_lock(&ei->i_prealloc_lock); + iter = ei->i_prealloc_node.rb_node; + while (iter) { + tmp_pa = rb_entry(iter, struct ext4_prealloc_space, + pa_node.inode_node); + tmp_pa_start = tmp_pa->pa_lstart; + tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + spin_lock(&tmp_pa->pa_lock); + if (tmp_pa->pa_deleted == 0) BUG_ON(!(start >= tmp_pa_end || end <= tmp_pa_start)); - } spin_unlock(&tmp_pa->pa_lock); + + iter = ext4_mb_pa_rb_next_iter(start, tmp_pa_start, iter); } - rcu_read_unlock(); + read_unlock(&ei->i_prealloc_lock); } - /* * Given an allocation context "ac" and a range "start", "end", check * and adjust boundaries if the range overlaps with any of the existing * preallocatoins stored in the corresponding inode of the allocation context. 
* - *Parameters: + * Parameters: * ac allocation context * start start of the new range * end end of the new range @@ -4024,6 +4066,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac, struct ext4_inode_info *ei = EXT4_I(ac->ac_inode); struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); struct ext4_prealloc_space *tmp_pa; + struct rb_node *iter; ext4_lblk_t new_start, new_end; ext4_lblk_t tmp_pa_start, tmp_pa_end; @@ -4031,25 +4074,40 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac, new_end = *end; /* check we don't cross already preallocated blocks */ - rcu_read_lock(); - list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) { - if (tmp_pa->pa_deleted) - continue; - spin_lock(&tmp_pa->pa_lock); - if (tmp_pa->pa_deleted) { - spin_unlock(&tmp_pa->pa_lock); - continue; - } + read_lock(&ei->i_prealloc_lock); + for (iter = ei->i_prealloc_node.rb_node; iter; + iter = ext4_mb_pa_rb_next_iter(new_start, tmp_pa_start, iter)) { + int is_overlap; + tmp_pa = rb_entry(iter, struct ext4_prealloc_space, + pa_node.inode_node); tmp_pa_start = tmp_pa->pa_lstart; tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + is_overlap = !(tmp_pa_start >= new_end || tmp_pa_end <= new_start); + + spin_lock(&tmp_pa->pa_lock); + if (tmp_pa->pa_deleted) { + if (is_overlap) { + /* + * Normalized range overlaps with range of this + * deleted PA, which means we might need it in + * the near future. Since the PA is yet to be + * removed from the inode PA tree and freed, let's + * just undelete it. + */ + ext4_mb_mark_pa_inuse(ac->ac_sb, tmp_pa); + } else { + spin_unlock(&tmp_pa->pa_lock); + continue; + } + } /* PA must not overlap original request */ BUG_ON(!(ac->ac_o_ex.fe_logical >= tmp_pa_end || ac->ac_o_ex.fe_logical < tmp_pa_start)); /* skip PAs this normalized request doesn't overlap with */ - if (tmp_pa_start >= new_end || tmp_pa_end <= new_start) { + if (!is_overlap) { spin_unlock(&tmp_pa->pa_lock); continue; } @@ -4065,7 +4123,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac, } spin_unlock(&tmp_pa->pa_lock); } - rcu_read_unlock(); + read_unlock(&ei->i_prealloc_lock); /* XXX: extra loop to check we really don't overlap preallocations */ ext4_mb_pa_assert_overlap(ac, new_start, new_end); @@ -4192,7 +4250,6 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac, ext4_mb_pa_adjust_overlap(ac, &start, &end); size = end - start; - /* * In this function "start" and "size" are normalized for better * alignment and length such that we could preallocate more blocks.
@@ -4401,6 +4458,7 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) struct ext4_locality_group *lg; struct ext4_prealloc_space *tmp_pa, *cpa = NULL; ext4_lblk_t tmp_pa_start, tmp_pa_end; + struct rb_node *iter; ext4_fsblk_t goal_block; /* only data can be preallocated */ @@ -4408,14 +4466,17 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) return false; /* first, try per-file preallocation */ - rcu_read_lock(); - list_for_each_entry_rcu(tmp_pa, &ei->i_prealloc_list, pa_node.inode_list) { + read_lock(&ei->i_prealloc_lock); + for (iter = ei->i_prealloc_node.rb_node; iter; + iter = ext4_mb_pa_rb_next_iter(ac->ac_o_ex.fe_logical, tmp_pa_start, iter)) { + tmp_pa = rb_entry(iter, struct ext4_prealloc_space, pa_node.inode_node); /* all fields in this condition don't change, * so we can skip locking for them */ tmp_pa_start = tmp_pa->pa_lstart; tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + /* original request start doesn't lie in this PA */ if (ac->ac_o_ex.fe_logical < tmp_pa_start || ac->ac_o_ex.fe_logical >= tmp_pa_end) continue; @@ -4433,17 +4494,25 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) /* found preallocated blocks, use them */ spin_lock(&tmp_pa->pa_lock); - if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free) { + if (tmp_pa->pa_free) { + if (tmp_pa->pa_deleted == 1) { + /* + * Since PA is yet to be deleted from inode PA + * rbtree, just undelete it and use it. + */ + ext4_mb_mark_pa_inuse(ac->ac_sb, tmp_pa); + } + atomic_inc(&tmp_pa->pa_count); ext4_mb_use_inode_pa(ac, tmp_pa); spin_unlock(&tmp_pa->pa_lock); ac->ac_criteria = 10; - rcu_read_unlock(); + read_unlock(&ei->i_prealloc_lock); return true; } spin_unlock(&tmp_pa->pa_lock); } - rcu_read_unlock(); + read_unlock(&ei->i_prealloc_lock); /* can we use group allocation? 
*/ if (!(ac->ac_flags & EXT4_MB_HINT_GROUP_ALLOC)) @@ -4596,6 +4665,7 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac, { ext4_group_t grp; ext4_fsblk_t grp_blk; + struct ext4_inode_info *ei = EXT4_I(ac->ac_inode); /* in this short window concurrent discard can set pa_deleted */ spin_lock(&pa->pa_lock); @@ -4641,16 +4711,41 @@ static void ext4_mb_put_pa(struct ext4_allocation_context *ac, ext4_unlock_group(sb, grp); if (pa->pa_type == MB_INODE_PA) { - spin_lock(pa->pa_node_lock.inode_lock); - list_del_rcu(&pa->pa_node.inode_list); - spin_unlock(pa->pa_node_lock.inode_lock); + write_lock(pa->pa_node_lock.inode_lock); + rb_erase(&pa->pa_node.inode_node, &ei->i_prealloc_node); + write_unlock(pa->pa_node_lock.inode_lock); + ext4_mb_pa_free(pa); } else { spin_lock(pa->pa_node_lock.lg_lock); list_del_rcu(&pa->pa_node.lg_list); spin_unlock(pa->pa_node_lock.lg_lock); + call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); } +} - call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); +static void ext4_mb_pa_rb_insert(struct rb_root *root, struct rb_node *new) +{ + struct rb_node **iter = &root->rb_node, *parent = NULL; + struct ext4_prealloc_space *iter_pa, *new_pa; + ext4_lblk_t iter_start, new_start; + + while (*iter) { + iter_pa = rb_entry(*iter, struct ext4_prealloc_space, + pa_node.inode_node); + new_pa = rb_entry(new, struct ext4_prealloc_space, + pa_node.inode_node); + iter_start = iter_pa->pa_lstart; + new_start = new_pa->pa_lstart; + + parent = *iter; + if (new_start < iter_start) + iter = &((*iter)->rb_left); + else + iter = &((*iter)->rb_right); + } + + rb_link_node(new, parent, iter); + rb_insert_color(new, root); } /* @@ -4716,7 +4811,6 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) pa->pa_len = ac->ac_b_ex.fe_len; pa->pa_free = pa->pa_len; spin_lock_init(&pa->pa_lock); - INIT_LIST_HEAD(&pa->pa_node.inode_list); INIT_LIST_HEAD(&pa->pa_group_list); pa->pa_deleted = 0; pa->pa_type = MB_INODE_PA; @@ -4736,9 +4830,9 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) list_add(&pa->pa_group_list, &grp->bb_prealloc_list); - spin_lock(pa->pa_node_lock.inode_lock); - list_add_rcu(&pa->pa_node.inode_list, &ei->i_prealloc_list); - spin_unlock(pa->pa_node_lock.inode_lock); + write_lock(pa->pa_node_lock.inode_lock); + ext4_mb_pa_rb_insert(&ei->i_prealloc_node, &pa->pa_node.inode_node); + write_unlock(pa->pa_node_lock.inode_lock); atomic_inc(&ei->i_prealloc_active); } @@ -4904,6 +4998,7 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, struct ext4_prealloc_space *pa, *tmp; struct list_head list; struct ext4_buddy e4b; + struct ext4_inode_info *ei; int err; int free = 0; @@ -4967,18 +5062,45 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, list_del_rcu(&pa->pa_node.lg_list); spin_unlock(pa->pa_node_lock.lg_lock); } else { - spin_lock(pa->pa_node_lock.inode_lock); - list_del_rcu(&pa->pa_node.inode_list); - spin_unlock(pa->pa_node_lock.inode_lock); + /* + * The allocation path might undelete a PA + * in case it expects it to be used again in the + * near future. In that case, roll back the ongoing + * delete operation and don't remove the PA from the + * inode PA tree. + * + * TODO: See if we need pa_lock since there might be no + * path that contends with it once the rbtree writelock + * is taken.
+ */ + write_lock(pa->pa_node_lock.inode_lock); + spin_lock(&pa->pa_lock); + if (pa->pa_deleted == 0) { + free -= pa->pa_free; + list_add(&pa->pa_group_list, + &grp->bb_prealloc_list); + list_del(&pa->u.pa_tmp_list); + + spin_unlock(&pa->pa_lock); + write_unlock(pa->pa_node_lock.inode_lock); + continue; + } + spin_unlock(&pa->pa_lock); + + ei = EXT4_I(pa->pa_inode); + rb_erase(&pa->pa_node.inode_node, &ei->i_prealloc_node); + write_unlock(pa->pa_node_lock.inode_lock); } - if (pa->pa_type == MB_GROUP_PA) + list_del(&pa->u.pa_tmp_list); + + if (pa->pa_type == MB_GROUP_PA) { ext4_mb_release_group_pa(&e4b, pa); - else + call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); + } else { ext4_mb_release_inode_pa(&e4b, bitmap_bh, pa); - - list_del(&pa->u.pa_tmp_list); - call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); + ext4_mb_pa_free(pa); + } } ext4_unlock_group(sb, group); @@ -5008,6 +5130,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) ext4_group_t group = 0; struct list_head list; struct ext4_buddy e4b; + struct rb_node *iter; int err; if (!S_ISREG(inode->i_mode)) { @@ -5030,17 +5153,18 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) repeat: /* first, collect all pa's in the inode */ - spin_lock(&ei->i_prealloc_lock); - while (!list_empty(&ei->i_prealloc_list) && needed) { - pa = list_entry(ei->i_prealloc_list.prev, - struct ext4_prealloc_space, pa_node.inode_list); + write_lock(&ei->i_prealloc_lock); + for (iter = rb_first(&ei->i_prealloc_node); iter && needed; iter = rb_next(iter)) { + pa = rb_entry(iter, struct ext4_prealloc_space, + pa_node.inode_node); BUG_ON(pa->pa_node_lock.inode_lock != &ei->i_prealloc_lock); + spin_lock(&pa->pa_lock); if (atomic_read(&pa->pa_count)) { /* this shouldn't happen often - nobody should * use preallocation while we're discarding it */ spin_unlock(&pa->pa_lock); - spin_unlock(&ei->i_prealloc_lock); + write_unlock(&ei->i_prealloc_lock); ext4_msg(sb, KERN_ERR, "uh-oh! 
used pa while discarding"); WARN_ON(1); @@ -5051,7 +5175,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) if (pa->pa_deleted == 0) { ext4_mb_mark_pa_deleted(sb, pa); spin_unlock(&pa->pa_lock); - list_del_rcu(&pa->pa_node.inode_list); + rb_erase(&pa->pa_node.inode_node, &ei->i_prealloc_node); list_add(&pa->u.pa_tmp_list, &list); needed--; continue; @@ -5059,7 +5183,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) /* someone is deleting pa right now */ spin_unlock(&pa->pa_lock); - spin_unlock(&ei->i_prealloc_lock); + write_unlock(&ei->i_prealloc_lock); /* we have to wait here because pa_deleted * doesn't mean pa is already unlinked from @@ -5076,7 +5200,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) schedule_timeout_uninterruptible(HZ); goto repeat; } - spin_unlock(&ei->i_prealloc_lock); + write_unlock(&ei->i_prealloc_lock); list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) { BUG_ON(pa->pa_type != MB_INODE_PA); @@ -5108,7 +5232,7 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed) put_bh(bitmap_bh); list_del(&pa->u.pa_tmp_list); - call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); + ext4_mb_pa_free(pa); } } @@ -5482,7 +5606,6 @@ static void ext4_mb_trim_inode_pa(struct inode *inode) static int ext4_mb_release_context(struct ext4_allocation_context *ac) { struct inode *inode = ac->ac_inode; - struct ext4_inode_info *ei = EXT4_I(inode); struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); struct ext4_prealloc_space *pa = ac->ac_pa; if (pa) { @@ -5509,16 +5632,6 @@ static int ext4_mb_release_context(struct ext4_allocation_context *ac) } } - if (pa->pa_type == MB_INODE_PA) { - /* - * treat per-inode prealloc list as a lru list, then try - * to trim the least recently used PA. - */ - spin_lock(pa->pa_node_lock.inode_lock); - list_move(&pa->pa_node.inode_list, &ei->i_prealloc_list); - spin_unlock(pa->pa_node_lock.inode_lock); - } - ext4_mb_put_pa(ac, ac->ac_sb, pa); } if (ac->ac_bitmap_page) diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index 398a6688c341..f8e8ee493867 100644 --- a/fs/ext4/mballoc.h +++ b/fs/ext4/mballoc.h @@ -115,7 +115,7 @@ struct ext4_free_data { struct ext4_prealloc_space { union { - struct list_head inode_list; /* for inode PAs */ + struct rb_node inode_node; /* for inode PA rbtree */ struct list_head lg_list; /* for lg PAs */ } pa_node; struct list_head pa_group_list; @@ -132,10 +132,10 @@ struct ext4_prealloc_space { ext4_grpblk_t pa_free; /* how many blocks are free */ unsigned short pa_type; /* pa type. 
inode or group */ union { - spinlock_t *inode_lock; /* locks the inode list holding this PA */ + rwlock_t *inode_lock; /* locks the rbtree holding this PA */ spinlock_t *lg_lock; /* locks the lg list holding this PA */ } pa_node_lock; - struct inode *pa_inode; /* hack, for history only */ + struct inode *pa_inode; /* used to get the inode during group discard */ }; enum { diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 72ead3b56706..5fb3e401de6b 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1325,9 +1325,9 @@ static struct inode *ext4_alloc_inode(struct super_block *sb) inode_set_iversion(&ei->vfs_inode, 1); ei->i_flags = 0; spin_lock_init(&ei->i_raw_lock); - INIT_LIST_HEAD(&ei->i_prealloc_list); + ei->i_prealloc_node = RB_ROOT; atomic_set(&ei->i_prealloc_active, 0); - spin_lock_init(&ei->i_prealloc_lock); + rwlock_init(&ei->i_prealloc_lock); ext4_es_init_tree(&ei->i_es_tree); rwlock_init(&ei->i_es_lock); INIT_LIST_HEAD(&ei->i_es_list);

From patchwork Mon Jan 16 08:02:16 2023
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13102736
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , rookxu , Ritesh Harjani
Subject: [PATCH v3 8/8] ext4: Remove the logic to trim inode PAs
Date: Mon, 16 Jan 2023 13:32:16 +0530
Message-Id: <20230116080216.249195-9-ojaswin@linux.ibm.com>
In-Reply-To: <20230116080216.249195-1-ojaswin@linux.ibm.com>
References: <20230116080216.249195-1-ojaswin@linux.ibm.com>

Earlier, inode PAs were stored in a linked list. This made it necessary to periodically trim the list down in order to keep it from growing very large, as a long list would severely affect performance during list iteration.

Recent patches changed this list to an rbtree, and since the tree scales up much better, we no longer need the trim functionality, hence remove it.

Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- Documentation/admin-guide/ext4.rst | 3 --- fs/ext4/ext4.h | 1 - fs/ext4/mballoc.c | 20 -------------------- fs/ext4/mballoc.h | 5 ----- fs/ext4/sysfs.c | 2 -- 5 files changed, 31 deletions(-) diff --git a/Documentation/admin-guide/ext4.rst b/Documentation/admin-guide/ext4.rst index 4c559e08d11e..5740d85439ff 100644 --- a/Documentation/admin-guide/ext4.rst +++ b/Documentation/admin-guide/ext4.rst @@ -489,9 +489,6 @@ Files in /sys/fs/ext4/: multiple of this tuning parameter if the stripe size is not set in the ext4 superblock - mb_max_inode_prealloc - The maximum length of per-inode ext4_prealloc_space list. - mb_max_to_scan The maximum number of extents the multiblock allocator will search to find the best extent.
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index fad5f087e4c6..d2869ad7d885 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1612,7 +1612,6 @@ struct ext4_sb_info { unsigned int s_mb_stats; unsigned int s_mb_order2_reqs; unsigned int s_mb_group_prealloc; - unsigned int s_mb_max_inode_prealloc; unsigned int s_max_dir_size_kb; /* where last allocation was done - for stream allocation */ unsigned long s_mb_last_group; diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 85598079b7ce..273a98bcaa0d 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -3419,7 +3419,6 @@ int ext4_mb_init(struct super_block *sb) sbi->s_mb_stats = MB_DEFAULT_STATS; sbi->s_mb_stream_request = MB_DEFAULT_STREAM_THRESHOLD; sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS; - sbi->s_mb_max_inode_prealloc = MB_DEFAULT_MAX_INODE_PREALLOC; /* * The default group preallocation is 512, which for 4k block * sizes translates to 2 megabytes. However for bigalloc file @@ -5583,29 +5582,11 @@ static void ext4_mb_add_n_trim(struct ext4_allocation_context *ac) return ; } -/* - * if per-inode prealloc list is too long, trim some PA - */ -static void ext4_mb_trim_inode_pa(struct inode *inode) -{ - struct ext4_inode_info *ei = EXT4_I(inode); - struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); - int count, delta; - - count = atomic_read(&ei->i_prealloc_active); - delta = (sbi->s_mb_max_inode_prealloc >> 2) + 1; - if (count > sbi->s_mb_max_inode_prealloc + delta) { - count -= sbi->s_mb_max_inode_prealloc; - ext4_discard_preallocations(inode, count); - } -} - /* * release all resource we used in allocation */ static int ext4_mb_release_context(struct ext4_allocation_context *ac) { - struct inode *inode = ac->ac_inode; struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); struct ext4_prealloc_space *pa = ac->ac_pa; if (pa) { @@ -5641,7 +5622,6 @@ static int ext4_mb_release_context(struct ext4_allocation_context *ac) if (ac->ac_flags & EXT4_MB_HINT_GROUP_ALLOC) mutex_unlock(&ac->ac_lg->lg_mutex); ext4_mb_collect_stats(ac); - ext4_mb_trim_inode_pa(inode); return 0; } diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index f8e8ee493867..6d85ee8674a6 100644 --- a/fs/ext4/mballoc.h +++ b/fs/ext4/mballoc.h @@ -73,11 +73,6 @@ */ #define MB_DEFAULT_GROUP_PREALLOC 512 -/* - * maximum length of inode prealloc list - */ -#define MB_DEFAULT_MAX_INODE_PREALLOC 512 - /* * Number of groups to search linearly before performing group scanning * optimization. diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c index d233c24ea342..f0d42cf44c71 100644 --- a/fs/ext4/sysfs.c +++ b/fs/ext4/sysfs.c @@ -214,7 +214,6 @@ EXT4_RW_ATTR_SBI_UI(mb_min_to_scan, s_mb_min_to_scan); EXT4_RW_ATTR_SBI_UI(mb_order2_req, s_mb_order2_reqs); EXT4_RW_ATTR_SBI_UI(mb_stream_req, s_mb_stream_request); EXT4_RW_ATTR_SBI_UI(mb_group_prealloc, s_mb_group_prealloc); -EXT4_RW_ATTR_SBI_UI(mb_max_inode_prealloc, s_mb_max_inode_prealloc); EXT4_RW_ATTR_SBI_UI(mb_max_linear_groups, s_mb_max_linear_groups); EXT4_RW_ATTR_SBI_UI(extent_max_zeroout_kb, s_extent_max_zeroout_kb); EXT4_ATTR(trigger_fs_error, 0200, trigger_test_error); @@ -264,7 +263,6 @@ static struct attribute *ext4_attrs[] = { ATTR_LIST(mb_order2_req), ATTR_LIST(mb_stream_req), ATTR_LIST(mb_group_prealloc), - ATTR_LIST(mb_max_inode_prealloc), ATTR_LIST(mb_max_linear_groups), ATTR_LIST(max_writeback_mb_bump), ATTR_LIST(extent_max_zeroout_kb),