From patchwork Fri Jan 27 12:37:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13118632
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, Ritesh Harjani
Subject: [RFC 01/11] ext4: mballoc: Remove useless setting of ac_criteria
Date: Fri, 27 Jan 2023 18:07:28 +0530
Message-Id: <08aadf4fd475d87020c60792d81276a28d7176c1.1674822311.git.ojaswin@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
X-Mailing-List: linux-fsdevel@vger.kernel.org

Upcoming patches in this series introduce a new criterion for block
allocation. In preparation, remove the useless setting of ac_criteria
in ext4_mb_use_preallocated(). AFAIU, these values may only have been
used to tell whether the blocks came from a preallocation or whether
the regular allocator was called to allocate them. Hence also add debug
prints to ext4_mb_show_ac() that identify which type of block
allocation was done.

Signed-off-by: Ritesh Harjani (IBM)
Signed-off-by: Ojaswin Mujoo
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5b2ae37a8b80..572e79a698d4 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4391,7 +4391,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 			atomic_inc(&pa->pa_count);
 			ext4_mb_use_inode_pa(ac, pa);
 			spin_unlock(&pa->pa_lock);
-			ac->ac_criteria = 10;
 			rcu_read_unlock();
 			return true;
 		}
@@ -4434,7 +4433,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	}
 	if (cpa) {
 		ext4_mb_use_group_pa(ac, cpa);
-		ac->ac_criteria = 20;
 		return true;
 	}
 	return false;
@@ -5131,6 +5129,10 @@ static void ext4_mb_show_ac(struct ext4_allocation_context *ac)
 			(unsigned long)ac->ac_b_ex.fe_logical,
 			(int)ac->ac_criteria);
 	mb_debug(sb, "%u found", ac->ac_found);
+	mb_debug(sb, "used pa: %s, ", ac->ac_pa ? "yes" : "no");
+	if (ac->ac_pa)
+		mb_debug(sb, "pa_type %s\n", ac->ac_pa->pa_type == MB_GROUP_PA ?
+			 "group pa" : "inode pa");
 	ext4_mb_show_pa(sb);
 }
 #else

From patchwork Fri Jan 27 12:37:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13118634
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, Ritesh Harjani
Subject: [RFC 02/11] ext4: Remove unused extern variables declaration
Date: Fri, 27 Jan 2023 18:07:29 +0530
X-Mailer: git-send-email 2.31.1
X-Mailing-List: linux-fsdevel@vger.kernel.org

ext4_mb_stats and ext4_mb_max_to_scan are never used; we use
sbi->s_mb_stats and sbi->s_mb_max_to_scan instead. Hence remove these
extern declarations and update the comment in mballoc.h that still
referred to ext4_mb_stats.

Signed-off-by: Ritesh Harjani (IBM)
Signed-off-by: Ojaswin Mujoo
Reviewed-by: Jan Kara
---
 fs/ext4/ext4.h    | 2 --
 fs/ext4/mballoc.h | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 140e1eb300d1..b8b00457da8d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2903,8 +2903,6 @@ int ext4_fc_record_regions(struct super_block *sb, int ino,
 /* mballoc.c */
 extern const struct seq_operations ext4_mb_seq_groups_ops;
 extern const struct seq_operations ext4_mb_seq_structs_summary_ops;
-extern long ext4_mb_stats;
-extern long ext4_mb_max_to_scan;
 extern int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset);
 extern int ext4_mb_init(struct super_block *);
 extern int ext4_mb_release(struct super_block *);
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index dcda2a943cee..165a17893c81 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -49,7 +49,7 @@
 #define MB_DEFAULT_MIN_TO_SCAN		10
 
 /*
- * with 'ext4_mb_stats' allocator will collect stats that will be
+ * with 's_mb_stats' allocator will collect stats that will be
  * shown at umount. The collecting costs though!
  */
 #define MB_DEFAULT_STATS		0

From patchwork Fri Jan 27 12:37:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13118635
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, Ritesh Harjani
Subject: [RFC 03/11] ext4: mballoc: Fix getting the right group desc in ext4_mb_prefetch_fini
Date: Fri, 27 Jan 2023 18:07:30 +0530
Message-Id: <85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
X-Mailing-List: linux-fsdevel@vger.kernel.org

In ext4_mb_prefetch_fini(), the group descriptor is looked up before
the group number is decremented, while the group info is looked up
again after it. As a result the group descriptor and the group info do
not belong to the same group. This was found during code
review/walkthrough and seems like a bug, so fix it by fetching the
group descriptor only after the group number has been updated.
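For illustration only, here is a minimal userspace sketch of the
ordering problem described above. It is not kernel code: the demo_*
names, the two arrays and the fixed group count are hypothetical
stand-ins for ext4_get_group_desc(), ext4_get_group_info() and
ext4_get_groups_count().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for per-group metadata; not the ext4 structures. */
struct demo_desc { unsigned int group; };
struct demo_info { unsigned int group; };

#define DEMO_NGROUPS 4

static struct demo_desc demo_descs[DEMO_NGROUPS] = { {0}, {1}, {2}, {3} };
static struct demo_info demo_infos[DEMO_NGROUPS] = { {0}, {1}, {2}, {3} };

/*
 * Walk 'nr' groups backwards from 'group', wrapping around at group 0, and
 * count the iterations in which the descriptor and the info belong to
 * different groups.
 *
 * If 'lookup_before_step' is nonzero, the descriptor is fetched before the
 * group number is decremented (the ordering before this patch). Otherwise
 * both lookups happen after the decrement (the ordering after this patch).
 */
static unsigned int demo_count_mismatches(unsigned int group, unsigned int nr,
					  int lookup_before_step)
{
	unsigned int mismatches = 0;

	assert(group < DEMO_NGROUPS);

	while (nr-- > 0) {
		const struct demo_desc *gdp = NULL;
		const struct demo_info *grp;

		if (lookup_before_step)
			gdp = &demo_descs[group];

		if (!group)
			group = DEMO_NGROUPS;
		group--;

		if (!lookup_before_step)
			gdp = &demo_descs[group];
		grp = &demo_infos[group];

		if (gdp->group != grp->group)
			mismatches++;
	}
	return mismatches;
}
```

In this sketch, walking three groups back from group 2 pairs a
descriptor with the info of a different group on every iteration when
the descriptor is fetched first, and never does so when it is fetched
after the decrement.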

Signed-off-by: Ritesh Harjani (IBM)
Signed-off-by: Ojaswin Mujoo
Reviewed-by: Jan Kara
---
 fs/ext4/mballoc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 572e79a698d4..8b22cc07b054 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2569,14 +2569,14 @@ ext4_group_t ext4_mb_prefetch(struct super_block *sb, ext4_group_t group,
 void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
 			   unsigned int nr)
 {
-	while (nr-- > 0) {
-		struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
-								  NULL);
-		struct ext4_group_info *grp = ext4_get_group_info(sb, group);
+	struct ext4_group_desc *gdp;
+	struct ext4_group_info *grp;
 
+	while (nr-- > 0) {
 		if (!group)
 			group = ext4_get_groups_count(sb);
 		group--;
+		gdp = ext4_get_group_desc(sb, group, NULL);
 		grp = ext4_get_group_info(sb, group);
 
 		if (EXT4_MB_GRP_NEED_INIT(grp) &&

From patchwork Fri Jan 27 12:37:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13118636
From: Ojaswin Mujoo
To: linux-ext4@vger.kernel.org, "Theodore Ts'o"
Cc: Ritesh Harjani, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara, Ritesh Harjani
Subject: [RFC 04/11] ext4: Convert mballoc cr (criteria) to enum
Date: Fri, 27 Jan 2023 18:07:31 +0530
Message-Id: <9670431b31aa62e83509fa2802aad364910ee52e.1674822311.git.ojaswin@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
X-Mailing-List: linux-fsdevel@vger.kernel.org

Convert the allocation criteria to an enum so that they are easier to
maintain. This change also makes it easier to insert new criteria in
the future.
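As a rough sketch of the idea (hypothetical demo_* names, not the
mballoc code): once the criteria are symbolic and their count has a
name, a fallback loop and any per-criterion array can be written
against those names instead of bare 0..3 and 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the enum added by this patch. */
enum demo_criteria {
	DEMO_CR0,
	DEMO_CR1,
	DEMO_CR2,
	DEMO_CR3,
};

/* Hypothetical mirror of EXT4_MB_NUM_CRS. */
#define DEMO_NUM_CRS 4

/*
 * Starting at 'start', fall back to higher criteria until one of them can
 * satisfy the request. 'can_satisfy' is an assumed per-criterion result
 * table used only for this demo. Returns the criterion that succeeded, or
 * -1 if none did.
 */
static int demo_first_hit(enum demo_criteria start,
			  const bool can_satisfy[DEMO_NUM_CRS])
{
	enum demo_criteria cr;

	assert(start <= DEMO_CR3);

	for (cr = start; cr < DEMO_NUM_CRS; cr++) {
		if (can_satisfy[cr])
			return (int)cr;
	}
	return -1;
}
```

Adding a criterion in such a scheme means adding one enumerator and
bumping the count, rather than auditing every literal in the loop
bounds and array sizes.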

Signed-off-by: Ojaswin Mujoo
Reviewed-by: Ritesh Harjani (IBM)
---
 fs/ext4/ext4.h    | 14 ++++++--
 fs/ext4/mballoc.c | 88 +++++++++++++++++++++++------------------------
 fs/ext4/mballoc.h | 10 ++++++
 3 files changed, 65 insertions(+), 47 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b8b00457da8d..6037b8e0af86 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -126,6 +126,14 @@ enum SHIFT_DIRECTION {
 	SHIFT_RIGHT,
 };
 
+/*
+ * Number of criteria defined. For each criterion, mballoc has a slightly
+ * different way of finding the required blocks and, usually, the higher the
+ * criterion, the slower the allocation. We start at the lower criteria and
+ * keep falling back to higher ones if we are not able to find any blocks.
+ */
+#define EXT4_MB_NUM_CRS 4
+
 /*
  * Flags used in mballoc's allocation_context flags field.
  *
@@ -1631,9 +1639,9 @@ struct ext4_sb_info {
 	atomic_t s_bal_2orders;	/* 2^order hits */
 	atomic_t s_bal_cr0_bad_suggestions;
 	atomic_t s_bal_cr1_bad_suggestions;
-	atomic64_t s_bal_cX_groups_considered[4];
-	atomic64_t s_bal_cX_hits[4];
-	atomic64_t s_bal_cX_failed[4];		/* cX loop didn't find blocks */
+	atomic64_t s_bal_cX_groups_considered[EXT4_MB_NUM_CRS];
+	atomic64_t s_bal_cX_hits[EXT4_MB_NUM_CRS];
+	atomic64_t s_bal_cX_failed[EXT4_MB_NUM_CRS];		/* cX loop didn't find blocks */
 	atomic_t s_mb_buddies_generated;	/* number of buddies generated */
 	atomic64_t s_mb_generation_time;
 	atomic_t s_mb_lost_chunks;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 8b22cc07b054..323604a2ff45 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -409,7 +409,7 @@ static void ext4_mb_generate_from_freelist(struct super_block *sb, void *bitmap,
 static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac);
 
 static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
-			       ext4_group_t group, int cr);
+			       ext4_group_t group, enum criteria cr);
 
 static int ext4_try_to_trim_range(struct super_block *sb,
 		struct ext4_buddy *e4b, ext4_grpblk_t start,
@@ -857,7 +857,7 @@ mb_update_avg_fragment_size(struct super_block *sb, struct ext4_group_info *grp)
  * cr level needs an update.
  */
 static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac,
-			int *new_cr, ext4_group_t *group, ext4_group_t ngroups)
+			enum criteria *new_cr, ext4_group_t *group, ext4_group_t ngroups)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
 	struct ext4_group_info *iter, *grp;
@@ -882,8 +882,8 @@ static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac,
 		list_for_each_entry(iter, &sbi->s_mb_largest_free_orders[i],
 						bb_largest_free_order_node) {
 			if (sbi->s_mb_stats)
-				atomic64_inc(&sbi->s_bal_cX_groups_considered[0]);
-			if (likely(ext4_mb_good_group(ac, iter->bb_group, 0))) {
+				atomic64_inc(&sbi->s_bal_cX_groups_considered[CR0]);
+			if (likely(ext4_mb_good_group(ac, iter->bb_group, CR0))) {
 				grp = iter;
 				break;
 			}
@@ -895,7 +895,7 @@ static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac,
 
 	if (!grp) {
 		/* Increment cr and search again */
-		*new_cr = 1;
+		*new_cr = CR1;
 	} else {
 		*group = grp->bb_group;
 		ac->ac_flags |= EXT4_MB_CR0_OPTIMIZED;
@@ -907,7 +907,7 @@ static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac,
  * order. Updates *new_cr if cr level needs an update.
  */
 static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac,
-		int *new_cr, ext4_group_t *group, ext4_group_t ngroups)
+		enum criteria *new_cr, ext4_group_t *group, ext4_group_t ngroups)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
 	struct ext4_group_info *grp = NULL, *iter;
@@ -930,8 +930,8 @@ static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac,
 		list_for_each_entry(iter, &sbi->s_mb_avg_fragment_size[i],
 				    bb_avg_fragment_size_node) {
 			if (sbi->s_mb_stats)
-				atomic64_inc(&sbi->s_bal_cX_groups_considered[1]);
-			if (likely(ext4_mb_good_group(ac, iter->bb_group, 1))) {
+				atomic64_inc(&sbi->s_bal_cX_groups_considered[CR1]);
+			if (likely(ext4_mb_good_group(ac, iter->bb_group, CR1))) {
 				grp = iter;
 				break;
 			}
@@ -945,7 +945,7 @@ static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac,
 		*group = grp->bb_group;
 		ac->ac_flags |= EXT4_MB_CR1_OPTIMIZED;
 	} else {
-		*new_cr = 2;
+		*new_cr = CR2;
 	}
 }
 
@@ -953,7 +953,7 @@ static inline int should_optimize_scan(struct ext4_allocation_context *ac)
 {
 	if (unlikely(!test_opt2(ac->ac_sb, MB_OPTIMIZE_SCAN)))
 		return 0;
-	if (ac->ac_criteria >= 2)
+	if (ac->ac_criteria >= CR2)
 		return 0;
 	if (!ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS))
 		return 0;
@@ -998,7 +998,7 @@ next_linear_group(struct ext4_allocation_context *ac, int group, int ngroups)
  * @ngroups   Total number of groups
  */
 static void ext4_mb_choose_next_group(struct ext4_allocation_context *ac,
-		int *new_cr, ext4_group_t *group, ext4_group_t ngroups)
+		enum criteria *new_cr, ext4_group_t *group, ext4_group_t ngroups)
 {
 	*new_cr = ac->ac_criteria;
 
@@ -1007,9 +1007,9 @@ static void ext4_mb_choose_next_group(struct ext4_allocation_context *ac,
 		return;
 	}
 
-	if (*new_cr == 0) {
+	if (*new_cr == CR0) {
 		ext4_mb_choose_next_group_cr0(ac, new_cr, group, ngroups);
-	} else if (*new_cr == 1) {
+	} else if (*new_cr == CR1) {
 		ext4_mb_choose_next_group_cr1(ac, new_cr, group, ngroups);
 	} else {
 		/*
@@ -2378,13 +2378,13 @@ void ext4_mb_scan_aligned(struct ext4_allocation_context *ac,
  * for the allocation or not.
  */
 static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
-				ext4_group_t group, int cr)
+				ext4_group_t group, enum criteria cr)
 {
 	ext4_grpblk_t free, fragments;
 	int flex_size = ext4_flex_bg_size(EXT4_SB(ac->ac_sb));
 	struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
 
-	BUG_ON(cr < 0 || cr >= 4);
+	BUG_ON(cr < CR0 || cr >= EXT4_MB_NUM_CRS);
 
 	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
 		return false;
@@ -2398,7 +2398,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
 		return false;
 
 	switch (cr) {
-	case 0:
+	case CR0:
 		BUG_ON(ac->ac_2order == 0);
 
 		/* Avoid using the first bg of a flexgroup for data files */
@@ -2417,15 +2417,15 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
 			return false;
 
 		return true;
-	case 1:
+	case CR1:
 		if ((free / fragments) >= ac->ac_g_ex.fe_len)
 			return true;
 		break;
-	case 2:
+	case CR2:
 		if (free >= ac->ac_g_ex.fe_len)
 			return true;
 		break;
-	case 3:
+	case CR3:
 		return true;
 	default:
 		BUG();
@@ -2446,7 +2446,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
  * out"!
  */
 static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
-				     ext4_group_t group, int cr)
+				     ext4_group_t group, enum criteria cr)
 {
 	struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
 	struct super_block *sb = ac->ac_sb;
@@ -2464,7 +2464,7 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 	free = grp->bb_free;
 	if (free == 0)
 		goto out;
-	if (cr <= 2 && free < ac->ac_g_ex.fe_len)
+	if (cr <= CR2 && free < ac->ac_g_ex.fe_len)
 		goto out;
 	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
 		goto out;
@@ -2479,7 +2479,7 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 			ext4_get_group_desc(sb, group, NULL);
 		int ret;
 
-		/* cr=0/1 is a very optimistic search to find large
+		/* cr=CR0/CR1 is a very optimistic search to find large
 		 * good chunks almost for free. If buddy data is not
 		 * ready, then this optimization makes no sense. But
 		 * we never skip the first block group in a flex_bg,
@@ -2487,7 +2487,7 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 		 * and we want to make sure we locate metadata blocks
 		 * in the first block group in the flex_bg if possible.
 		 */
-		if (cr < 2 &&
+		if (cr < CR2 &&
 		    (!sbi->s_log_groups_per_flex ||
 		     ((group & ((1 << sbi->s_log_groups_per_flex) - 1)) != 0)) &&
 		    !(ext4_has_group_desc_csum(sb) &&
@@ -2593,7 +2593,7 @@ static noinline_for_stack int
 ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 {
 	ext4_group_t prefetch_grp = 0, ngroups, group, i;
-	int cr = -1, new_cr;
+	enum criteria cr, new_cr;
 	int err = 0, first_err = 0;
 	unsigned int nr = 0, prefetch_ios = 0;
 	struct ext4_sb_info *sbi;
@@ -2651,13 +2651,13 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 	}
 
 	/* Let's just scan groups to find more-less suitable blocks */
-	cr = ac->ac_2order ? 0 : 1;
+	cr = ac->ac_2order ? CR0 : CR1;
 	/*
-	 * cr == 0 try to get exact allocation,
-	 * cr == 3 try to get anything
+	 * cr == CR0 try to get exact allocation,
+	 * cr == CR3 try to get anything
 	 */
 repeat:
-	for (; cr < 4 && ac->ac_status == AC_STATUS_CONTINUE; cr++) {
+	for (; cr < EXT4_MB_NUM_CRS && ac->ac_status == AC_STATUS_CONTINUE; cr++) {
 		ac->ac_criteria = cr;
 		/*
 		 * searching for the right group start
@@ -2684,7 +2684,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 			 * spend a lot of time loading imperfect groups
 			 */
 			if ((prefetch_grp == group) &&
-			    (cr > 1 ||
+			    (cr > CR1 ||
 			     prefetch_ios < sbi->s_mb_prefetch_limit)) {
 				unsigned int curr_ios = prefetch_ios;
 
@@ -2726,9 +2726,9 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 			}
 
 			ac->ac_groups_scanned++;
-			if (cr == 0)
+			if (cr == CR0)
 				ext4_mb_simple_scan_group(ac, &e4b);
-			else if (cr == 1 && sbi->s_stripe &&
+			else if (cr == CR1 && sbi->s_stripe &&
 					!(ac->ac_g_ex.fe_len % sbi->s_stripe))
 				ext4_mb_scan_aligned(ac, &e4b);
 			else
@@ -2768,7 +2768,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 			ac->ac_b_ex.fe_len = 0;
 			ac->ac_status = AC_STATUS_CONTINUE;
 			ac->ac_flags |= EXT4_MB_HINT_FIRST;
-			cr = 3;
+			cr = CR3;
 			goto repeat;
 		}
 	}
@@ -2891,36 +2891,36 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset)
 	seq_printf(seq, "\tgroups_scanned: %u\n", atomic_read(&sbi->s_bal_groups_scanned));
 
 	seq_puts(seq, "\tcr0_stats:\n");
-	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[0]));
+	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR0]));
 	seq_printf(seq, "\t\tgroups_considered: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_groups_considered[0]));
+		   atomic64_read(&sbi->s_bal_cX_groups_considered[CR0]));
 	seq_printf(seq, "\t\tuseless_loops: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_failed[0]));
+		   atomic64_read(&sbi->s_bal_cX_failed[CR0]));
 	seq_printf(seq, "\t\tbad_suggestions: %u\n",
 		   atomic_read(&sbi->s_bal_cr0_bad_suggestions));
 
 	seq_puts(seq, "\tcr1_stats:\n");
-	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[1]));
+	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR1]));
 	seq_printf(seq, "\t\tgroups_considered: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_groups_considered[1]));
+		   atomic64_read(&sbi->s_bal_cX_groups_considered[CR1]));
 	seq_printf(seq, "\t\tuseless_loops: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_failed[1]));
+		   atomic64_read(&sbi->s_bal_cX_failed[CR1]));
 	seq_printf(seq, "\t\tbad_suggestions: %u\n",
 		   atomic_read(&sbi->s_bal_cr1_bad_suggestions));
 
 	seq_puts(seq, "\tcr2_stats:\n");
-	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[2]));
+	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR2]));
 	seq_printf(seq, "\t\tgroups_considered: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_groups_considered[2]));
+		   atomic64_read(&sbi->s_bal_cX_groups_considered[CR2]));
 	seq_printf(seq, "\t\tuseless_loops: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_failed[2]));
+		   atomic64_read(&sbi->s_bal_cX_failed[CR2]));
 
 	seq_puts(seq, "\tcr3_stats:\n");
-	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[3]));
+	seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR3]));
 	seq_printf(seq, "\t\tgroups_considered: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_groups_considered[3]));
+		   atomic64_read(&sbi->s_bal_cX_groups_considered[CR3]));
 	seq_printf(seq, "\t\tuseless_loops: %llu\n",
-		   atomic64_read(&sbi->s_bal_cX_failed[3]));
+		   atomic64_read(&sbi->s_bal_cX_failed[CR3]));
 	seq_printf(seq, "\textents_scanned: %u\n", atomic_read(&sbi->s_bal_ex_scanned));
 	seq_printf(seq, "\t\tgoal_hits: %u\n", atomic_read(&sbi->s_bal_goals));
 	seq_printf(seq, "\t\t2^n_hits: %u\n", atomic_read(&sbi->s_bal_2orders));
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index 165a17893c81..f0087a85e366 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -95,6 +95,16 @@
  */
 #define MB_NUM_ORDERS(sb)		((sb)->s_blocksize_bits + 2)
 
+/*
+ * All possible allocation criteria for mballoc
+ */
+enum criteria {
+	CR0,
+	CR1,
+	CR2,
+	CR3,
+};
+
 struct ext4_free_data {
 	/* this links the free block information from sb_info */
 	struct list_head		efd_list;

From patchwork Fri Jan 27 12:37:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ojaswin Mujoo
X-Patchwork-Id: 13118637
Received: from pps.reinject
(localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3nce8m8vex-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:00 +0000 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RCKOWl001554; Fri, 27 Jan 2023 12:38:00 GMT Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3nce8m8vec-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:00 +0000 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30R3aZGL028677; Fri, 27 Jan 2023 12:37:58 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma06ams.nl.ibm.com (PPS) with ESMTPS id 3n87affhcq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:37:58 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCbuME15401514 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:37:56 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EF8FF20049; Fri, 27 Jan 2023 12:37:55 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 25BFD20040; Fri, 27 Jan 2023 12:37:54 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:37:53 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 05/11] ext4: Add per CR extent 
scanned counter Date: Fri, 27 Jan 2023 18:07:32 +0530 Message-Id: X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 5LjGqIYMMtTPIwktfANJT_7iolLfImSU X-Proofpoint-GUID: 1L0AmnUo-F8HudI_5BbIMdlmzY97eyBG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_07,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 bulkscore=0 mlxscore=0 spamscore=0 impostorscore=0 malwarescore=0 suspectscore=0 mlxlogscore=999 priorityscore=1501 adultscore=0 phishscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This gives better visibility into the number of extents scanned in each particular CR. For example, this information can be used to see how our block group scanning logic is performing when the BG is fragmented.
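For illustration only, the bookkeeping described above can be sketched in userspace C. This is a minimal model, not the kernel code: the struct and function names (alloc_ctx, sb_stats, note_extent, collect_stats) are hypothetical stand-ins, NUM_CRS is assumed to play the role of EXT4_MB_NUM_CRS, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* NUM_CRS is an assumption standing in for EXT4_MB_NUM_CRS. */
enum criteria { CR0, CR1, CR2, CR3, NUM_CRS };

/* Hypothetical stand-in for the per-request allocation context. */
struct alloc_ctx {
	enum criteria criteria;        /* criteria currently being scanned */
	uint16_t found;                /* extents scanned, all criteria */
	uint16_t cx_found[NUM_CRS];    /* extents scanned, per criteria */
};

/* Hypothetical stand-in for the per-filesystem statistics. */
struct sb_stats {
	atomic_uint ex_scanned;
	atomic_uint cx_ex_scanned[NUM_CRS];
};

/* Called once for every extent the scanner looks at. */
static void note_extent(struct alloc_ctx *ac)
{
	assert(ac->criteria < NUM_CRS);
	ac->found++;
	ac->cx_found[ac->criteria]++;
}

/* Called once per request: fold the local counts into the global ones. */
static void collect_stats(const struct alloc_ctx *ac, struct sb_stats *st)
{
	atomic_fetch_add(&st->ex_scanned, ac->found);
	for (int i = 0; i < NUM_CRS; i++)
		atomic_fetch_add(&st->cx_ex_scanned[i], ac->cx_found[i]);
}
```

Keeping the per-criteria counts in the request context and folding them in once per request means the shared counters are touched NUM_CRS times per allocation rather than once per scanned extent.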
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/ext4.h | 1 + fs/ext4/mballoc.c | 12 ++++++++++++ fs/ext4/mballoc.h | 1 + 3 files changed, 14 insertions(+) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 6037b8e0af86..4ba2c95915eb 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1633,6 +1633,7 @@ struct ext4_sb_info { atomic_t s_bal_success; /* we found long enough chunks */ atomic_t s_bal_allocated; /* in blocks */ atomic_t s_bal_ex_scanned; /* total extents scanned */ + atomic_t s_bal_cX_ex_scanned[EXT4_MB_NUM_CRS]; /* total extents scanned */ atomic_t s_bal_groups_scanned; /* number of groups scanned */ atomic_t s_bal_goals; /* goal hits */ atomic_t s_bal_breaks; /* too long searches */ diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 323604a2ff45..07a50a13751c 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2077,6 +2077,7 @@ static void ext4_mb_measure_extent(struct ext4_allocation_context *ac, BUG_ON(ac->ac_status != AC_STATUS_CONTINUE); ac->ac_found++; + ac->ac_cX_found[ac->ac_criteria]++; /* * The special case - take what you catch first @@ -2249,6 +2250,7 @@ void ext4_mb_simple_scan_group(struct ext4_allocation_context *ac, break; } ac->ac_found++; + ac->ac_cX_found[ac->ac_criteria]++; ac->ac_b_ex.fe_len = 1 << i; ac->ac_b_ex.fe_start = k << i; @@ -2362,6 +2364,7 @@ void ext4_mb_scan_aligned(struct ext4_allocation_context *ac, max = mb_find_extent(e4b, i, sbi->s_stripe, &ex); if (max >= sbi->s_stripe) { ac->ac_found++; + ac->ac_cX_found[ac->ac_criteria]++; ex.fe_logical = 0xDEADF00D; /* debug value */ ac->ac_b_ex = ex; ext4_mb_use_best_found(ac, e4b); @@ -2894,6 +2897,7 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR0])); seq_printf(seq, "\t\tgroups_considered: %llu\n", atomic64_read(&sbi->s_bal_cX_groups_considered[CR0])); + seq_printf(seq, "\t\textents_scanned: %u\n", 
atomic_read(&sbi->s_bal_cX_ex_scanned[CR0])); seq_printf(seq, "\t\tuseless_loops: %llu\n", atomic64_read(&sbi->s_bal_cX_failed[CR0])); seq_printf(seq, "\t\tbad_suggestions: %u\n", @@ -2903,6 +2907,7 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR1])); seq_printf(seq, "\t\tgroups_considered: %llu\n", atomic64_read(&sbi->s_bal_cX_groups_considered[CR1])); + seq_printf(seq, "\t\textents_scanned: %u\n", atomic_read(&sbi->s_bal_cX_ex_scanned[CR1])); seq_printf(seq, "\t\tuseless_loops: %llu\n", atomic64_read(&sbi->s_bal_cX_failed[CR1])); seq_printf(seq, "\t\tbad_suggestions: %u\n", @@ -2912,6 +2917,7 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR2])); seq_printf(seq, "\t\tgroups_considered: %llu\n", atomic64_read(&sbi->s_bal_cX_groups_considered[CR2])); + seq_printf(seq, "\t\textents_scanned: %u\n", atomic_read(&sbi->s_bal_cX_ex_scanned[CR2])); seq_printf(seq, "\t\tuseless_loops: %llu\n", atomic64_read(&sbi->s_bal_cX_failed[CR2])); @@ -2919,6 +2925,7 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR3])); seq_printf(seq, "\t\tgroups_considered: %llu\n", atomic64_read(&sbi->s_bal_cX_groups_considered[CR3])); + seq_printf(seq, "\t\textents_scanned: %u\n", atomic_read(&sbi->s_bal_cX_ex_scanned[CR3])); seq_printf(seq, "\t\tuseless_loops: %llu\n", atomic64_read(&sbi->s_bal_cX_failed[CR3])); seq_printf(seq, "\textents_scanned: %u\n", atomic_read(&sbi->s_bal_ex_scanned)); @@ -4216,7 +4223,12 @@ static void ext4_mb_collect_stats(struct ext4_allocation_context *ac) atomic_add(ac->ac_b_ex.fe_len, &sbi->s_bal_allocated); if (ac->ac_b_ex.fe_len >= ac->ac_o_ex.fe_len) atomic_inc(&sbi->s_bal_success); + atomic_add(ac->ac_found, &sbi->s_bal_ex_scanned); + for (int i=0; i<EXT4_MB_NUM_CRS; i++) { + atomic_add(ac->ac_cX_found[i], &sbi->s_bal_cX_ex_scanned[i]); + }
+ atomic_add(ac->ac_groups_scanned, &sbi->s_bal_groups_scanned); if (ac->ac_g_ex.fe_start == ac->ac_b_ex.fe_start && ac->ac_g_ex.fe_group == ac->ac_b_ex.fe_group) diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index f0087a85e366..004b8d163cc9 100644 --- a/fs/ext4/mballoc.h +++ b/fs/ext4/mballoc.h @@ -193,6 +193,7 @@ struct ext4_allocation_context { __u16 ac_groups_scanned; __u16 ac_groups_linear_remaining; __u16 ac_found; + __u16 ac_cX_found[EXT4_MB_NUM_CRS]; __u16 ac_tail; __u16 ac_buddy; __u8 ac_status; From patchwork Fri Jan 27 12:37:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ojaswin Mujoo X-Patchwork-Id: 13118641 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D704BC54EAA for ; Fri, 27 Jan 2023 12:38:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234292AbjA0Miy (ORCPT ); Fri, 27 Jan 2023 07:38:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234206AbjA0Mia (ORCPT ); Fri, 27 Jan 2023 07:38:30 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E88223859; Fri, 27 Jan 2023 04:38:06 -0800 (PST) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RBET1Y014380; Fri, 27 Jan 2023 12:38:03 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=M1tb4cjH4bgJJnWusS9MEZJss0JDmzKW4zB+owgJxGA=; 
b=btuC1X9nWwAj2GbGt8lRrwSlWKQNLH51G9HP4KFSIU9q8LugLCs2NK0TEyZDFfkEpMN9 8s14ML+FKTCqRlTwsLxB9LNdTDFV6jC0FcGgMpzWGKz/4pCSmiqiwjjRU5r9vDY8VoUI kfEQP+PnUlGy6VbZ9F/avhyEWSllK3Tzw6Khtz056MFnfA4wTKchEqOsjnG+BMb00Vcd yL0uUayJgrGxUm4QAJhn/Dfhf3vE8tJ/11vlmQgGjwr8aae+ipbQBdXEVcmXmg62al3/ LzNoKcMb393Db1Kbxz6yKwXP7lhBwIXZfULnIJhiOQdmFIheTHiWDYHBNXCBqkb6SVjq 6g== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncdj7hwyu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:02 +0000 Received: from m0098416.ppops.net (m0098416.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RBr2Zi010709; Fri, 27 Jan 2023 12:38:02 GMT Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncdj7hwy9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:02 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30QKVxYn027489; Fri, 27 Jan 2023 12:38:00 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma04ams.nl.ibm.com (PPS) with ESMTPS id 3n87p6fhs0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:00 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCbw1J24314518 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:37:58 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1FD8D2004B; Fri, 27 Jan 2023 12:37:58 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5714220040; Fri, 27 Jan 2023 12:37:56 +0000 (GMT) Received: from 
li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:37:56 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 06/11] ext4: Add counter to track successful allocation of goal length Date: Fri, 27 Jan 2023 18:07:33 +0530 Message-Id: X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: k-7xJYDov3F73zXb2tDJk4djzdO9ZhOs X-Proofpoint-ORIG-GUID: S46KAwtfGnQooOLJOPtt0wa_y-m6rU0s X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_06,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 phishscore=0 spamscore=0 malwarescore=0 clxscore=1015 mlxlogscore=999 priorityscore=1501 impostorscore=0 mlxscore=0 bulkscore=0 adultscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Track the number of allocations where the length of blocks allocated is equal to the length of goal blocks (post normalization). This metric could be useful when making changes to the allocator logic in the future, as it gives us visibility into how often we trim our requests. PS: ac_b_ex.fe_len might get modified due to preallocation efforts, and hence we use ac_f_ex.fe_len instead, since we want to compare how much the allocator was actually able to find.
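As a rough userspace sketch of the comparison described above: the types and names below (extent, alloc_ctx, stats, collect_stats) are hypothetical simplifications, with goal, found and best assumed to mirror ac_g_ex, ac_f_ex and ac_b_ex, and plain counters standing in for atomic_t.

```c
#include <assert.h>

/* Hypothetical, simplified extent: group, start and length in blocks. */
struct extent { unsigned group, start, len; };

struct alloc_ctx {
	struct extent goal;   /* normalized goal (mirrors ac_g_ex) */
	struct extent found;  /* what the allocator found (mirrors ac_f_ex) */
	struct extent best;   /* may be altered by preallocation (mirrors ac_b_ex) */
};

struct stats { unsigned goals, len_goals; };

static void collect_stats(const struct alloc_ctx *ac, struct stats *st)
{
	assert(ac && st);
	/* Position hit: allocation landed exactly at the goal. */
	if (ac->goal.start == ac->best.start && ac->goal.group == ac->best.group)
		st->goals++;
	/* Length hit: compare against found, not best, because best may
	 * have been changed after the search by preallocation. */
	if (ac->found.len == ac->goal.len)
		st->len_goals++;
}
```

In this model a request can count as a length hit without being a position hit, which is the case the new counter is meant to expose.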
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/ext4.h | 1 + fs/ext4/mballoc.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 4ba2c95915eb..d8fa01e54e81 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1636,6 +1636,7 @@ struct ext4_sb_info { atomic_t s_bal_cX_ex_scanned[EXT4_MB_NUM_CRS]; /* total extents scanned */ atomic_t s_bal_groups_scanned; /* number of groups scanned */ atomic_t s_bal_goals; /* goal hits */ + atomic_t s_bal_len_goals; /* len goal hits */ atomic_t s_bal_breaks; /* too long searches */ atomic_t s_bal_2orders; /* 2^order hits */ atomic_t s_bal_cr0_bad_suggestions; diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 07a50a13751c..c4ab8f412d32 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2930,6 +2930,7 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) atomic64_read(&sbi->s_bal_cX_failed[CR3])); seq_printf(seq, "\textents_scanned: %u\n", atomic_read(&sbi->s_bal_ex_scanned)); seq_printf(seq, "\t\tgoal_hits: %u\n", atomic_read(&sbi->s_bal_goals)); + seq_printf(seq, "\t\tlen_goal_hits: %u\n", atomic_read(&sbi->s_bal_len_goals)); seq_printf(seq, "\t\t2^n_hits: %u\n", atomic_read(&sbi->s_bal_2orders)); seq_printf(seq, "\t\tbreaks: %u\n", atomic_read(&sbi->s_bal_breaks)); seq_printf(seq, "\t\tlost: %u\n", atomic_read(&sbi->s_mb_lost_chunks)); @@ -4233,6 +4234,8 @@ static void ext4_mb_collect_stats(struct ext4_allocation_context *ac) if (ac->ac_g_ex.fe_start == ac->ac_b_ex.fe_start && ac->ac_g_ex.fe_group == ac->ac_b_ex.fe_group) atomic_inc(&sbi->s_bal_goals); + if (ac->ac_f_ex.fe_len == ac->ac_g_ex.fe_len) + atomic_inc(&sbi->s_bal_len_goals); if (ac->ac_found > sbi->s_mb_max_to_scan) atomic_inc(&sbi->s_bal_breaks); } From patchwork Fri Jan 27 12:37:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ojaswin Mujoo X-Patchwork-Id: 13118642 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 937FBC61DA4 for ; Fri, 27 Jan 2023 12:39:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234327AbjA0MjB (ORCPT ); Fri, 27 Jan 2023 07:39:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233929AbjA0Miw (ORCPT ); Fri, 27 Jan 2023 07:38:52 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B278D4671C; Fri, 27 Jan 2023 04:38:08 -0800 (PST) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RBEO7T013729; Fri, 27 Jan 2023 12:38:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=bscJQYjzXho0Z4VD3EruBiAZYpwF3KYqE3WK1sXQOEw=; b=kS8OBGZEJzpPiqgYz5pkZCL5EHS+TIQZkCOZKrWMcnyM8S3Rak/eRRiRcDFvMiJZQdrx 1XEs78OCkGPNunTKvc/P3GyfuvpjGw4NhLFZYyZTYVAqPJs8XtbdXiLRQqPF2dKUqRK4 XU+299p4vOClIsHWRgvOc9omwQquynJ6YtfYfrf7OaKtFIXELYP4TUajJztxnYa26Q4o Ys6SlA27+w477qLhTcOyAXh0w37xMN5iipk9IUG3SersTVO+S9itRSOpQf+cp5dW9/Rz uXkoZbx5JKhvWd98pQ23FU8yARJq6mUKLawPqOAl7qKNSfZ6zOgLIWLpfhc8w1nY66A0 LQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncdj7hx0t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:05 +0000 Received: from m0098416.ppops.net (m0098416.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RBOjIK021632; Fri, 27 Jan 2023 12:38:04 GMT Received: from ppma03ams.nl.ibm.com 
(62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncdj7hx06-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:04 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30R7xVts019104; Fri, 27 Jan 2023 12:38:03 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma03ams.nl.ibm.com (PPS) with ESMTPS id 3n87p6qgt5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:02 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCc0Kv44040452 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:38:00 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9E87B20043; Fri, 27 Jan 2023 12:38:00 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 837FA20040; Fri, 27 Jan 2023 12:37:58 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:37:58 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 07/11] ext4: Avoid scanning smaller extents in BG during CR1 Date: Fri, 27 Jan 2023 18:07:34 +0530 Message-Id: <6fefb97af05081d344185334a36e90f093ccf310.1674822311.git.ojaswin@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: oDJjUhbojVGgpx2vkGJFz0rNxZQRerh8 X-Proofpoint-ORIG-GUID: pcP1HcbRSyTPyMpjNCo62SJkP5I_C3n2 X-Proofpoint-Virus-Version: 
vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_06,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 phishscore=0 spamscore=0 malwarescore=0 clxscore=1015 mlxlogscore=813 priorityscore=1501 impostorscore=0 mlxscore=0 bulkscore=0 adultscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When we are inside ext4_mb_complex_scan_group() in CR1, we can be sure that this group has at least 1 big enough continuous free extent to satisfy our request because (free / fragments) > goal length. Hence, instead of wasting time looping over smaller free extents, only try to consider the free extent if we are sure that it has enough continuous free space to satisfy goal length. This is particularly useful when scanning highly fragmented BGs in CR1 as, without this patch, the allocator might stop scanning early before reaching the big enough free extent (due to ac_found > mb_max_to_scan), which causes us to unnecessarily trim the request.
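The effect of skipping short free runs can be sketched in plain C. This is a simplified model under stated assumptions, not the kernel implementation: the map uses one byte per cluster (0 = free, 1 = in use) instead of a packed bitmap, and next_free, next_used and scan_group are hypothetical helpers. The measured counter models how many runs reach the costly measurement step, which is what ac_found limits.

```c
#include <assert.h>

/* Assumption: one byte per cluster, 0 = free, 1 = in use. */
static int next_free(const unsigned char *map, int max, int i)
{
	while (i < max && map[i])
		i++;
	return i;
}

static int next_used(const unsigned char *map, int max, int i)
{
	while (i < max && !map[i])
		i++;
	return i;
}

/*
 * Return the start of the first free run of at least `goal` clusters,
 * or -1 if there is none. `*measured` counts the runs that were passed
 * on for measurement; with skip_small set, short runs never get there.
 */
static int scan_group(const unsigned char *map, int max, int goal,
		      int skip_small, int *measured)
{
	int i = 0;

	assert(goal > 0);
	*measured = 0;
	while (i < max) {
		i = next_free(map, max, i);
		if (i >= max)
			break;
		int j = next_used(map, max, i);
		int freelen = j - i;

		if (skip_small && freelen < goal) {
			i = j;          /* jump over the short run */
			continue;
		}
		(*measured)++;
		if (freelen >= goal)
			return i;
		i = j;
	}
	return -1;
}
```

With a fragmented map, the skipping variant reaches the large run after measuring a single extent, while the plain variant spends one measurement on every short run before it.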
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/mballoc.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index c4ab8f412d32..14529d2fe65f 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2279,7 +2279,7 @@ void ext4_mb_complex_scan_group(struct ext4_allocation_context *ac, struct super_block *sb = ac->ac_sb; void *bitmap = e4b->bd_bitmap; struct ext4_free_extent ex; - int i; + int i, j, freelen; int free; free = e4b->bd_info->bb_free; @@ -2306,6 +2306,23 @@ void ext4_mb_complex_scan_group(struct ext4_allocation_context *ac, break; } + if (ac->ac_criteria < CR2) { + /* + * In CR1, we are sure that this group will + * have a large enough continuous free extent, so skip + * over the smaller free extents + */ + j = mb_find_next_bit(bitmap, + EXT4_CLUSTERS_PER_GROUP(sb), i); + freelen = j - i; + + if (freelen < ac->ac_g_ex.fe_len) { + i = j; + free -= freelen; + continue; + } + } + mb_find_extent(e4b, i, ac->ac_g_ex.fe_len, &ex); if (WARN_ON(ex.fe_len <= 0)) break; From patchwork Fri Jan 27 12:37:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ojaswin Mujoo X-Patchwork-Id: 13118643 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEA57C61DA7 for ; Fri, 27 Jan 2023 12:39:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234316AbjA0MjS (ORCPT ); Fri, 27 Jan 2023 07:39:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234314AbjA0Miz (ORCPT ); Fri, 27 Jan 2023 07:38:55 -0500 Received: from mx0b-001b2d01.pphosted.com 
(mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3682677AC; Fri, 27 Jan 2023 04:38:11 -0800 (PST) Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RCM9Q8006083; Fri, 27 Jan 2023 12:38:08 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=25XjohbFV2CvFZCrVi5Ol25gUnamkQMJyOCZ53Pxucs=; b=Ml+0HC/YzZNmKnIihe+BH7Vey+OgGBo+EFMiiDP+r3Q/bd4285J4Oyp7J6hP7fMgDuAO Udvv1fO2r4qAwVTCh2KlQn4bOkCVDbvE7c6puUXLVeTtBdT8MeRXx3xBdgAUWrs+f2k4 Augr4JJA9IH8XWyJSKHKj/qA/K+mAhaI26olsRv0e5+ndXO+itx+dNkfEZW/iD06Wi7U 1YLNv9z792bIJjqL4hZcfOIYrtu921/8E1c/X18R/ha7YLwr4l8V7HeTQ4PemDYZEF7G CSX7S+MAjkS/R7MU0DAidkyqM0S6BCgb3c61PElgBOj/Ql+ydVQNNs4LluZvvV2ixSZu gw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3nce8m8vhq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:07 +0000 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RCXd7s013631; Fri, 27 Jan 2023 12:38:07 GMT Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3nce8m8vh2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:07 +0000 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30R5jmcL014941; Fri, 27 Jan 2023 12:38:05 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma06ams.nl.ibm.com (PPS) with ESMTPS id 3n87affhcx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:05 +0000 Received: from smtpav06.fra02v.mail.ibm.com 
(smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCc3cC22151574 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:38:03 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0BB0220040; Fri, 27 Jan 2023 12:38:03 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3171520043; Fri, 27 Jan 2023 12:38:01 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:38:00 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 08/11] ext4: Don't skip prefetching BLOCK_UNINIT groups Date: Fri, 27 Jan 2023 18:07:35 +0530 Message-Id: <4881693a4f5ba1fed367310b27c793e4e78520d3.1674822311.git.ojaswin@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: pNaIHWS1mSNMwSyOrlEmrsCHnjwazt_I X-Proofpoint-GUID: Src4obuxj8IyQs2NKNTbDUMwAoNocL00 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_07,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 bulkscore=0 mlxscore=0 spamscore=0 impostorscore=0 malwarescore=0 suspectscore=0 mlxlogscore=999 priorityscore=1501 adultscore=0 phishscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently, ext4_mb_prefetch() and ext4_mb_prefetch_fini() skip BLOCK_UNINIT groups since fetching their bitmaps 
doesn't need disk IO. As a consequence, we end up not initializing the buddy structures and CR0/1 lists for these BGs, even though it can be done without any disk IO overhead. Hence, don't skip such BGs during prefetch and prefetch_fini. This improves the accuracy of CR0/1 allocation as earlier, we could have essentially empty BLOCK_UNINIT groups being ignored by CR0/1 due to their buddy not being initialized, leading to slower CR2 allocations. With this patch CR0/1 will be able to discover these groups as well, thus improving performance. Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/mballoc.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 14529d2fe65f..48726a831264 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2557,9 +2557,7 @@ ext4_group_t ext4_mb_prefetch(struct super_block *sb, ext4_group_t group, */ if (!EXT4_MB_GRP_TEST_AND_SET_READ(grp) && EXT4_MB_GRP_NEED_INIT(grp) && - ext4_free_group_clusters(sb, gdp) > 0 && - !(ext4_has_group_desc_csum(sb) && - (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))) { + ext4_free_group_clusters(sb, gdp) > 0 ) { bh = ext4_read_block_bitmap_nowait(sb, group, true); if (bh && !IS_ERR(bh)) { if (!buffer_uptodate(bh) && cnt) @@ -2600,9 +2598,7 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group, grp = ext4_get_group_info(sb, group); if (EXT4_MB_GRP_NEED_INIT(grp) && - ext4_free_group_clusters(sb, gdp) > 0 && - !(ext4_has_group_desc_csum(sb) && - (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))) { + ext4_free_group_clusters(sb, gdp) > 0) { if (ext4_mb_init_group(sb, group, GFP_NOFS)) break; } From patchwork Fri Jan 27 12:37:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ojaswin Mujoo X-Patchwork-Id: 13118644 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B39E6C61DA4 for ; Fri, 27 Jan 2023 12:39:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234337AbjA0Mjd (ORCPT ); Fri, 27 Jan 2023 07:39:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234332AbjA0MjB (ORCPT ); Fri, 27 Jan 2023 07:39:01 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED5FC4E52A; Fri, 27 Jan 2023 04:38:13 -0800 (PST) Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RB1IX3026721; Fri, 27 Jan 2023 12:38:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=H4qhakP4j4IPrNknMbV0Ue+ap7BZD6C65A+YztZVLtk=; b=ZXcF52JZQgIQYnxh/YITf5cPSfeZa9SObyY4PMF9CePKC+CrIS2AU9LWfRvW4Rkzi+Ma KNkt6rfLV8eEQ32nDRcJzNGaJIKC6RASP2gPniK55JPCTeY233Q5QaQpvBO3LNvijFnJ o50zxzUnw0M/EOdsAjEqhhQTu1tHnhUowwwED/4gfM3o1slVVVbXf7wjYTM3w3lMJnhX YlmuB/kSIjgdCR5Ek/GVAVsfFsfXjzt76iI9U+f6+y/cETqOlZGJgOE5Map7UZPahSZX 5/45quUdZvDQ4GXuUCYxQwnKgp3Qt+JoDwZXabhDsieZ5aZ5H1vf/Hc6f3Gp9uZ8ziyl 0A== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncb55w9jn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:10 +0000 Received: from m0098419.ppops.net (m0098419.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RCToMI010837; Fri, 27 Jan 2023 12:38:09 GMT Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by 
mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncb55w9j4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:09 +0000 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30QKRNG8006374; Fri, 27 Jan 2023 12:38:08 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma02fra.de.ibm.com (PPS) with ESMTPS id 3n87p6dduw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:07 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCc5wI21365072 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:38:05 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7217120043; Fri, 27 Jan 2023 12:38:05 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 77CCF20040; Fri, 27 Jan 2023 12:38:03 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:38:03 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 09/11] ext4: Ensure ext4_mb_prefetch_fini() is called for all prefetched BGs Date: Fri, 27 Jan 2023 18:07:36 +0530 Message-Id: <7540e4069b22fce42dbef34ee0796d5cf5d82fe3.1674822311.git.ojaswin@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: Irr7xmvtPHPR9uAFXHZeB6fyyfaZRIIc X-Proofpoint-ORIG-GUID: WfAo5dPNbML428YRoRBRWVokQpExi_5j X-Proofpoint-Virus-Version: vendor=baseguard 
engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_06,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 bulkscore=0 spamscore=0 adultscore=0 mlxlogscore=937 clxscore=1015 suspectscore=0 priorityscore=1501 phishscore=0 mlxscore=0 malwarescore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

Before this patch, the call sequence in ext4_run_li_request is as follows:

  /*
   * nr = no. of BGs we want to fetch (=s_mb_prefetch)
   * prefetch_ios = no. of BGs not uptodate after
   *                ext4_read_block_bitmap_nowait()
   */
  next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios);
  ext4_mb_prefetch_fini(sb, next_group, prefetch_ios);

ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in the range [next_group - prefetch_ios, next_group). This is incorrect whenever prefetch_ios < nr, because ext4_mb_prefetch_fini() then ignores some of the BGs that might need initialization. This issue is more notable now that the previous patch enables "fetching" of BLOCK_UNINIT BGs, which are marked buffer_uptodate by default.

Fix this by passing nr to ext4_mb_prefetch_fini() instead of prefetch_ios so that it considers the right range of groups.

Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT groups that would need buddy initialization.
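To illustrate the range problem described above, here is a minimal user-space sketch in C. It is not kernel code: struct toy_group and the toy_* functions are hypothetical stand-ins that only model which groups are visited by ext4_mb_prefetch() and ext4_mb_prefetch_fini(), assuming fini walks backwards from next_group over the count it is given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block group; not the kernel's ext4_group_info. */
struct toy_group {
	bool bitmap_uptodate;   /* true models a group that needs no disk IO */
	bool buddy_initialized;
};

/*
 * Models the prefetch step: visit nr groups starting at 'group', count in
 * *ios the ones whose bitmap was not already up to date, and return the
 * group following the last one visited (wrapping at ngroups).
 */
size_t toy_prefetch(struct toy_group *g, size_t ngroups, size_t group,
		    unsigned int nr, unsigned int *ios)
{
	assert(ngroups > 0 && group < ngroups);
	for (; nr > 0; nr--) {
		if (!g[group].bitmap_uptodate) {
			g[group].bitmap_uptodate = true; /* pretend the read completed */
			(*ios)++;
		}
		if (++group >= ngroups)
			group = 0;
	}
	return group;
}

/*
 * Models the fini step: walk backwards over 'count' groups ending just
 * before next_group and initialize their buddies.
 */
void toy_prefetch_fini(struct toy_group *g, size_t ngroups,
		       size_t next_group, unsigned int count)
{
	size_t group = next_group;

	for (; count > 0; count--) {
		if (group == 0)
			group = ngroups;
		group--;
		g[group].buddy_initialized = true;
	}
}

/* Number of groups that still have no buddy. */
size_t toy_count_uninitialized(const struct toy_group *g, size_t ngroups)
{
	size_t n = 0;

	for (size_t i = 0; i < ngroups; i++)
		if (!g[i].buddy_initialized)
			n++;
	return n;
}

/*
 * One prefetch + fini pass over four groups, two of which are already up
 * to date. pass_nr selects the behavior proposed by this patch (count = nr)
 * over the old one (count = prefetch_ios).
 */
size_t toy_run(bool pass_nr)
{
	struct toy_group g[4] = {
		{ .bitmap_uptodate = true },
		{ .bitmap_uptodate = false },
		{ .bitmap_uptodate = true },
		{ .bitmap_uptodate = false },
	};
	unsigned int nr = 4, ios = 0;
	size_t next = toy_prefetch(g, 4, 0, nr, &ios);

	toy_prefetch_fini(g, 4, next, pass_nr ? nr : ios);
	return toy_count_uninitialized(g, 4);
}
```

In this model, passing prefetch_ios (2 here) makes fini cover only the last two groups and leaves two groups without a buddy, while passing nr (4) covers every prefetched group.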
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/mballoc.c | 4 ---- fs/ext4/super.c | 11 ++++------- 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 48726a831264..410c9636907b 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2702,8 +2702,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) if ((prefetch_grp == group) && (cr > CR1 || prefetch_ios < sbi->s_mb_prefetch_limit)) { - unsigned int curr_ios = prefetch_ios; - nr = sbi->s_mb_prefetch; if (ext4_has_feature_flex_bg(sb)) { nr = 1 << sbi->s_log_groups_per_flex; @@ -2712,8 +2710,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) } prefetch_grp = ext4_mb_prefetch(sb, group, nr, &prefetch_ios); - if (prefetch_ios == curr_ios) - nr = 0; } /* This now checks without needing the buddy page */ diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 72ead3b56706..9dbb09cfc8f7 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -3636,16 +3636,13 @@ static int ext4_run_li_request(struct ext4_li_request *elr) ext4_group_t group = elr->lr_next_group; unsigned int prefetch_ios = 0; int ret = 0; + int nr = EXT4_SB(sb)->s_mb_prefetch; u64 start_time; if (elr->lr_mode == EXT4_LI_MODE_PREFETCH_BBITMAP) { - elr->lr_next_group = ext4_mb_prefetch(sb, group, - EXT4_SB(sb)->s_mb_prefetch, &prefetch_ios); - if (prefetch_ios) - ext4_mb_prefetch_fini(sb, elr->lr_next_group, - prefetch_ios); - trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group, - prefetch_ios); + elr->lr_next_group = ext4_mb_prefetch(sb, group, nr, &prefetch_ios); + ext4_mb_prefetch_fini(sb, elr->lr_next_group, nr); + trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group, nr); if (group >= elr->lr_next_group) { ret = 1; if (elr->lr_first_not_zeroed != ngroups && From patchwork Fri Jan 27 12:37:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
Ojaswin Mujoo X-Patchwork-Id: 13118645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA70BC38142 for ; Fri, 27 Jan 2023 12:40:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234354AbjA0Mjt (ORCPT ); Fri, 27 Jan 2023 07:39:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234353AbjA0MjD (ORCPT ); Fri, 27 Jan 2023 07:39:03 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F6286C127; Fri, 27 Jan 2023 04:38:16 -0800 (PST) Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RB9Wp2026743; Fri, 27 Jan 2023 12:38:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=ViiwjZsYS8rJfVsorx/BK548xVcoUVwZZK3Mhl/vduY=; b=RzrfbvGMowJmW0yytR/6KWkBaEma0ssg9kMSkMNWEeRFHhELtJLFl1quPRS5hsszZw2V oaL4Nl8PAgMCX7zSiN7wsDcA8uj7uWNcohEaCsxM51WTzghHEbdY2Z4Gk2Fr+Jy3Irbl N+YfPoa4B+KXUUoUSDFNNsRH7Q39M7CGdfDj5oAMA1XWO1jYdKWefP8rcajaSl6tlF+Q 0nmDW6OczfHAiVY8oniAXrtZ8PboZoJfk64IahgLgbq3xSYcVLLJwJDKV/ZjHPdGSpzh hg49IxhLdGyHSprkwh5h1xFVwttqDkOOLnvjNYDUNEqtO3bn+1R+GVZ2kPh6InA/9fLx SQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncb55w9kb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:12 +0000 Received: from m0098419.ppops.net (m0098419.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RCDHD0012119; Fri, 27 Jan 2023 
12:38:12 GMT Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncb55w9ju-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:12 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30R5NAKZ010329; Fri, 27 Jan 2023 12:38:10 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma03ams.nl.ibm.com (PPS) with ESMTPS id 3n87p6qgt9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:10 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCc7BL44040462 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:38:07 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D573520043; Fri, 27 Jan 2023 12:38:07 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DBE4C20040; Fri, 27 Jan 2023 12:38:05 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:38:05 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 10/11] ext4: Abstract out logic to search average fragment list Date: Fri, 27 Jan 2023 18:07:37 +0530 Message-Id: <3f0afae57eeaf47aa4b980eddc5e54efc78efa66.1674822311.git.ojaswin@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 8s6nDf2jbjAUlnTqX295flokxyhpNU5b X-Proofpoint-ORIG-GUID: 
4OR5m4pUhQrn4sSmhQNL4_mtsa7ErGzl X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_06,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 bulkscore=0 spamscore=0 adultscore=0 mlxlogscore=999 clxscore=1015 suspectscore=0 priorityscore=1501 phishscore=0 mlxscore=0 malwarescore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270113 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

Make the logic for searching the average fragment list of a given order reusable by abstracting it out into a separate function. This also avoids code duplication in upcoming patches.

No functional changes.

Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/mballoc.c | 51 ++++++++++++++++++++++++++++++----------------- 1 file changed, 33 insertions(+), 18 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 410c9636907b..1ce1174aea52 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -902,6 +902,37 @@ static void ext4_mb_choose_next_group_cr0(struct ext4_allocation_context *ac, } } +/* + * Find a suitable group of given order from the average fragments list.
+ */ +static struct ext4_group_info * +ext4_mb_find_good_group_avg_frag_lists(struct ext4_allocation_context *ac, int order) +{ + struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); + struct list_head *frag_list = &sbi->s_mb_avg_fragment_size[order]; + rwlock_t *frag_list_lock = &sbi->s_mb_avg_fragment_size_locks[order]; + struct ext4_group_info *grp = NULL, *iter; + enum criteria cr = ac->ac_criteria; + + if (list_empty(frag_list)) + return NULL; + read_lock(frag_list_lock); + if (list_empty(frag_list)) { + read_unlock(frag_list_lock); + return NULL; + } + list_for_each_entry(iter, frag_list, bb_avg_fragment_size_node) { + if (sbi->s_mb_stats) + atomic64_inc(&sbi->s_bal_cX_groups_considered[cr]); + if (likely(ext4_mb_good_group(ac, iter->bb_group, cr))) { + grp = iter; + break; + } + } + read_unlock(frag_list_lock); + return grp; +} + /* * Choose next group by traversing average fragment size list of suitable * order. Updates *new_cr if cr level needs an update. @@ -910,7 +941,7 @@ static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac, enum criteria *new_cr, ext4_group_t *group, ext4_group_t ngroups) { struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); - struct ext4_group_info *grp = NULL, *iter; + struct ext4_group_info *grp = NULL; int i; if (unlikely(ac->ac_flags & EXT4_MB_CR1_OPTIMIZED)) { @@ -920,23 +951,7 @@ static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac, for (i = mb_avg_fragment_size_order(ac->ac_sb, ac->ac_g_ex.fe_len); i < MB_NUM_ORDERS(ac->ac_sb); i++) { - if (list_empty(&sbi->s_mb_avg_fragment_size[i])) - continue; - read_lock(&sbi->s_mb_avg_fragment_size_locks[i]); - if (list_empty(&sbi->s_mb_avg_fragment_size[i])) { - read_unlock(&sbi->s_mb_avg_fragment_size_locks[i]); - continue; - } - list_for_each_entry(iter, &sbi->s_mb_avg_fragment_size[i], - bb_avg_fragment_size_node) { - if (sbi->s_mb_stats) - atomic64_inc(&sbi->s_bal_cX_groups_considered[CR1]); - if (likely(ext4_mb_good_group(ac, iter->bb_group, 
CR1))) { - grp = iter; - break; - } - } - read_unlock(&sbi->s_mb_avg_fragment_size_locks[i]); + grp = ext4_mb_find_good_group_avg_frag_lists(ac, i); if (grp) break; } From patchwork Fri Jan 27 12:37:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ojaswin Mujoo X-Patchwork-Id: 13118646 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12481C636BD for ; Fri, 27 Jan 2023 12:40:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232254AbjA0MkG (ORCPT ); Fri, 27 Jan 2023 07:40:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234313AbjA0MjR (ORCPT ); Fri, 27 Jan 2023 07:39:17 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DA457C703; Fri, 27 Jan 2023 04:38:21 -0800 (PST) Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 30RCNH3x016842; Fri, 27 Jan 2023 12:38:16 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=vA9w2AqADgE+8GLQA2K3wl60BfEzSK5alu9EOQk+mOU=; b=W91QI8F0G/3cKTxGX5ujs9buJYyjfu8Hx/kHndJwK0KG0zqqWbHmCdT86ZvzcNZqlyVt bkH/4qBnjCGUqB0x5qwk35KASVhHJsb9j91oEVkowtBMhH+an801S5O+xIxF996V8l3I NCWnljci0AuklfNzgHncQqcdf4SRWkhBEWpg+x/tlQtRQsr2uAB3jO7MYUIdY66laffZ o1VohP1SNCsl1cx+46NfWaWWMPqeBA6kLwhns4Jq9G7x68HdOzQDI3bJtLlBUiYgdRRa xoxjJ6d/YubfRqTYC7eOltwFYrjUS1O0c2envE4jpInqvA0DT2rUbLPgJSlaW2Erq9+H YQ== Received: from pps.reinject (localhost [127.0.0.1]) by 
mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncejugc9y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:16 +0000 Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 30RCNEg0016698; Fri, 27 Jan 2023 12:38:15 GMT Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ncejugc8x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:15 +0000 Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1]) by ppma06fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 30R9u6FN029861; Fri, 27 Jan 2023 12:38:13 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma06fra.de.ibm.com (PPS) with ESMTPS id 3n87afdems-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 27 Jan 2023 12:38:13 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 30RCcAuW24642236 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 27 Jan 2023 12:38:10 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B14BB20043; Fri, 27 Jan 2023 12:38:10 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 755EF20040; Fri, 27 Jan 2023 12:38:08 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.40.88]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 27 Jan 2023 12:38:08 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , Ritesh Harjani Subject: [RFC 11/11] ext4: Add allocation criteria 1.5 (CR1_5) Date: Fri, 
27 Jan 2023 18:07:38 +0530 Message-Id: <08173ee255f70cdc8de9ac3aa2e851f9d74acb12.1674822312.git.ojaswin@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: References: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: TgljfCq8sVTfVOkJ9_mkAl6M6ASXEfxg X-Proofpoint-GUID: FSTixGVukMGXaNze1LdmIYi3lMbYihah X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-01-27_08,2023-01-27_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 mlxscore=0 lowpriorityscore=0 phishscore=0 adultscore=0 spamscore=0 bulkscore=0 suspectscore=0 impostorscore=0 clxscore=1015 mlxlogscore=999 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301270118 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

CR1_5 aims to optimize allocations which can't be satisfied in CR1. The fact that we couldn't find a group in CR1 suggests that it would be difficult to find a continuous extent to completely satisfy our allocations. So before falling to the slower CR2, in CR1.5 we proactively trim the preallocations so we can find a group with (free / fragments) big enough. This speeds up our allocation at the cost of slightly reduced preallocation.

The patch also adds a new sysfs tunable:

* /sys/fs/ext4//mb_cr1_5_max_trim_order

This controls how much CR1.5 can trim a request before falling to CR2. For example, for a request of order 7 and max trim order 2, CR1.5 can trim this down to order 5.
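The trim limit can be sketched with a little arithmetic. The following user-space C is only an illustration under assumptions: toy_order(), toy_min_order() and toy_trim_candidates() are hypothetical names, "order" is taken to mean floor(log2(length)), and the patch itself derives the order with fls(), so the exact values seen in the kernel may differ.

```c
#include <assert.h>
#include <stddef.h>

/* floor(log2(len)) for len > 0; a hypothetical helper, not the kernel's fls(). */
int toy_order(unsigned int len)
{
	int order = 0;

	assert(len > 0);
	while ((len >>= 1) != 0)
		order++;
	return order;
}

/* Lowest order the goal may be trimmed to, clamped at 0. */
int toy_min_order(int order, int max_trim_order)
{
	int min_order = order - max_trim_order;

	return min_order < 0 ? 0 : min_order;
}

/*
 * Fill out[] with the goal lengths that would be tried, largest first.
 * Trimming stops once a candidate would be smaller than the original
 * (un-normalized) request length, or once the max trim order is reached.
 * Returns the number of candidates written.
 */
size_t toy_trim_candidates(unsigned int goal_len, unsigned int orig_len,
			   int max_trim_order, unsigned int *out,
			   size_t out_cap)
{
	int order = toy_order(goal_len);
	int min_order = toy_min_order(order, max_trim_order);
	size_t n = 0;

	for (int i = order; i >= min_order && n < out_cap; i--) {
		unsigned int candidate = 1u << i;

		if (orig_len > candidate)
			break; /* never trim below what was actually requested */
		out[n++] = candidate;
	}
	return n;
}
```

For a goal of 128 blocks (order 7), an original request of 16 blocks and a max trim order of 2, the candidates are 128, 64 and 32, i.e. the goal is trimmed no further than order 5. If the original request were 100 blocks, only 128 would be tried before falling back to CR2.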
Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) --- fs/ext4/ext4.h | 7 +++- fs/ext4/mballoc.c | 97 ++++++++++++++++++++++++++++++++++++++++++++--- fs/ext4/mballoc.h | 14 +++++++ fs/ext4/sysfs.c | 2 + 4 files changed, 113 insertions(+), 7 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index d8fa01e54e81..879aac5e39a9 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -132,7 +132,7 @@ enum SHIFT_DIRECTION { * criteria the slower the allocation. We start at lower criterias and keep * falling back to higher ones if we are not able to find any blocks. */ -#define EXT4_MB_NUM_CRS 4 +#define EXT4_MB_NUM_CRS 5 /* * Flags used in mballoc's allocation_context flags field. @@ -175,6 +175,9 @@ enum SHIFT_DIRECTION { #define EXT4_MB_CR0_OPTIMIZED 0x8000 /* Avg fragment size rb tree lookup succeeded at least once for cr = 1 */ #define EXT4_MB_CR1_OPTIMIZED 0x00010000 +/* Avg fragment size rb tree lookup succeeded at least once for cr = 1.5 */ +#define EXT4_MB_CR1_5_OPTIMIZED 0x00020000 + struct ext4_allocation_request { /* target inode for block we're allocating */ struct inode *inode; @@ -1627,6 +1630,7 @@ struct ext4_sb_info { unsigned long s_mb_last_start; unsigned int s_mb_prefetch; unsigned int s_mb_prefetch_limit; + unsigned int s_mb_cr1_5_max_trim_order; /* stats for buddy allocator */ atomic_t s_bal_reqs; /* number of reqs with len > 1 */ @@ -1641,6 +1645,7 @@ struct ext4_sb_info { atomic_t s_bal_2orders; /* 2^order hits */ atomic_t s_bal_cr0_bad_suggestions; atomic_t s_bal_cr1_bad_suggestions; + atomic_t s_bal_cr1_5_bad_suggestions; atomic64_t s_bal_cX_groups_considered[EXT4_MB_NUM_CRS]; atomic64_t s_bal_cX_hits[EXT4_MB_NUM_CRS]; atomic64_t s_bal_cX_failed[EXT4_MB_NUM_CRS]; /* cX loop didn't find blocks */ diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 1ce1174aea52..8e9032f94966 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -960,6 +960,67 @@ static void ext4_mb_choose_next_group_cr1(struct ext4_allocation_context *ac, 
*group = grp->bb_group; ac->ac_flags |= EXT4_MB_CR1_OPTIMIZED; } else { + *new_cr = CR1_5; + } +} + +/* + * We couldn't find a group in CR1 so try to find the highest free fragment + * order we have and proactively trim the goal request length to that order to + * find a suitable group faster. + * + * This optimizes allocation speed at the cost of slightly reduced + * preallocations. However, we make sure that we don't trim the request too + * much and fall to CR2 in that case. + */ +static void ext4_mb_choose_next_group_cr1_5(struct ext4_allocation_context *ac, + enum criteria *new_cr, ext4_group_t *group, ext4_group_t ngroups) +{ + struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); + struct ext4_group_info *grp = NULL; + int i, order, min_order; + + if (unlikely(ac->ac_flags & EXT4_MB_CR1_5_OPTIMIZED)) { + if (sbi->s_mb_stats) + atomic_inc(&sbi->s_bal_cr1_5_bad_suggestions); + } + + /* + * mb_avg_fragment_size_order() returns order in a way that makes + * retrieving back the length using (1 << order) inaccurate. Hence, use + * fls() instead since we need to know the actual length while modifying + * goal length. + */ + order = fls(ac->ac_g_ex.fe_len); + min_order = order - sbi->s_mb_cr1_5_max_trim_order; + if (min_order < 0) + min_order = 0; + + for (i = order; i >= min_order; i--) { + if (ac->ac_o_ex.fe_len <= (1 << i)) { + /* + * Scale down goal len to make sure we find something + * in the free fragments list. Basically, reduce + * preallocations. 
+ */ + ac->ac_g_ex.fe_len = 1 << i; + } else { + break; + } + + grp = ext4_mb_find_good_group_avg_frag_lists(ac, + mb_avg_fragment_size_order(ac->ac_sb, + ac->ac_g_ex.fe_len)); + if (grp) + break; + } + + if (grp) { + *group = grp->bb_group; + ac->ac_flags |= EXT4_MB_CR1_5_OPTIMIZED; + } else { + /* Reset goal length to original goal length before falling into CR2 */ + ac->ac_g_ex.fe_len = ac->ac_orig_goal_len; *new_cr = CR2; } } @@ -1026,6 +1087,8 @@ static void ext4_mb_choose_next_group(struct ext4_allocation_context *ac, ext4_mb_choose_next_group_cr0(ac, new_cr, group, ngroups); } else if (*new_cr == CR1) { ext4_mb_choose_next_group_cr1(ac, new_cr, group, ngroups); + } else if (*new_cr == CR1_5) { + ext4_mb_choose_next_group_cr1_5(ac, new_cr, group, ngroups); } else { /* * TODO: For CR=2, we can arrange groups in an rb tree sorted by @@ -2323,7 +2386,7 @@ void ext4_mb_complex_scan_group(struct ext4_allocation_context *ac, if (ac->ac_criteria < CR2) { /* - * In CR1, we are sure that this group will + * In CR1 and CR1_5, we are sure that this group will * have a large enough continuous free extent, so skip * over the smaller free extents */ @@ -2453,6 +2516,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac, return true; case CR1: + case CR1_5: if ((free / fragments) >= ac->ac_g_ex.fe_len) return true; break; @@ -2715,7 +2779,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) * spend a lot of time loading imperfect groups */ if ((prefetch_grp == group) && - (cr > CR1 || + (cr > CR1_5 || prefetch_ios < sbi->s_mb_prefetch_limit)) { nr = sbi->s_mb_prefetch; if (ext4_has_feature_flex_bg(sb)) { @@ -2755,8 +2819,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) ac->ac_groups_scanned++; if (cr == CR0) ext4_mb_simple_scan_group(ac, &e4b); - else if (cr == CR1 && sbi->s_stripe && - !(ac->ac_g_ex.fe_len % sbi->s_stripe)) + else if ((cr == CR1 || cr == CR1_5) && sbi->s_stripe && + !(ac->ac_g_ex.fe_len % sbi->s_stripe)) 
ext4_mb_scan_aligned(ac, &e4b); else ext4_mb_complex_scan_group(ac, &e4b); @@ -2770,6 +2834,11 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) /* Processed all groups and haven't found blocks */ if (sbi->s_mb_stats && i == ngroups) atomic64_inc(&sbi->s_bal_cX_failed[cr]); + + if (i == ngroups && ac->ac_criteria == CR1_5) + /* Reset goal length to original goal length before + * falling into CR2 */ + ac->ac_g_ex.fe_len = ac->ac_orig_goal_len; } if (ac->ac_b_ex.fe_len > 0 && ac->ac_status != AC_STATUS_FOUND && @@ -2937,6 +3006,16 @@ int ext4_seq_mb_stats_show(struct seq_file *seq, void *offset) seq_printf(seq, "\t\tbad_suggestions: %u\n", atomic_read(&sbi->s_bal_cr1_bad_suggestions)); + seq_puts(seq, "\tcr1.5_stats:\n"); + seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR1_5])); + seq_printf(seq, "\t\tgroups_considered: %llu\n", + atomic64_read(&sbi->s_bal_cX_groups_considered[CR1_5])); + seq_printf(seq, "\t\textents_scanned: %u\n", atomic_read(&sbi->s_bal_cX_ex_scanned[CR1_5])); + seq_printf(seq, "\t\tuseless_loops: %llu\n", + atomic64_read(&sbi->s_bal_cX_failed[CR1_5])); + seq_printf(seq, "\t\tbad_suggestions: %u\n", + atomic_read(&sbi->s_bal_cr1_5_bad_suggestions)); + seq_puts(seq, "\tcr2_stats:\n"); seq_printf(seq, "\t\thits: %llu\n", atomic64_read(&sbi->s_bal_cX_hits[CR2])); seq_printf(seq, "\t\tgroups_considered: %llu\n", @@ -3452,6 +3531,8 @@ int ext4_mb_init(struct super_block *sb) sbi->s_mb_stream_request = MB_DEFAULT_STREAM_THRESHOLD; sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS; sbi->s_mb_max_inode_prealloc = MB_DEFAULT_MAX_INODE_PREALLOC; + sbi->s_mb_cr1_5_max_trim_order = MB_DEFAULT_CR1_5_TRIM_ORDER; + /* * The default group preallocation is 512, which for 4k block * sizes translates to 2 megabytes. 
However for bigalloc file @@ -4218,6 +4299,7 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac, * placement or satisfy big request as is */ ac->ac_g_ex.fe_logical = start; ac->ac_g_ex.fe_len = EXT4_NUM_B2C(sbi, size); + ac->ac_orig_goal_len = ac->ac_g_ex.fe_len; /* define goal start in order to merge */ if (ar->pright && (ar->lright == (start + size))) { @@ -4258,8 +4340,10 @@ static void ext4_mb_collect_stats(struct ext4_allocation_context *ac) if (ac->ac_g_ex.fe_start == ac->ac_b_ex.fe_start && ac->ac_g_ex.fe_group == ac->ac_b_ex.fe_group) atomic_inc(&sbi->s_bal_goals); - if (ac->ac_f_ex.fe_len == ac->ac_g_ex.fe_len) + /* did we allocate as much as normalizer originally wanted? */ + if (ac->ac_f_ex.fe_len == ac->ac_orig_goal_len) atomic_inc(&sbi->s_bal_len_goals); + if (ac->ac_found > sbi->s_mb_max_to_scan) atomic_inc(&sbi->s_bal_breaks); } @@ -4652,7 +4736,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) pa = ac->ac_pa; - if (ac->ac_b_ex.fe_len < ac->ac_g_ex.fe_len) { + if (ac->ac_b_ex.fe_len < ac->ac_orig_goal_len) { int winl; int wins; int win; @@ -5281,6 +5365,7 @@ ext4_mb_initialize_context(struct ext4_allocation_context *ac, ac->ac_o_ex.fe_start = block; ac->ac_o_ex.fe_len = len; ac->ac_g_ex = ac->ac_o_ex; + ac->ac_orig_goal_len = ac->ac_g_ex.fe_len; ac->ac_flags = ar->flags; /* we have to define context: we'll work with a file or diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index 004b8d163cc9..c1b0bf2f6f4d 100644 --- a/fs/ext4/mballoc.h +++ b/fs/ext4/mballoc.h @@ -90,6 +90,13 @@ */ #define MB_DEFAULT_LINEAR_SCAN_THRESHOLD 16 +/* + * The maximum order upto which CR1.5 can trim a particular allocation request. + * Example, if we have an order 7 request and max trim order of 3, CR1.5 can + * trim this upto order 4. 
+ */ +#define MB_DEFAULT_CR1_5_TRIM_ORDER 3 + /* * Number of valid buddy orders */ @@ -101,6 +108,7 @@ enum criteria { CR0, CR1, + CR1_5, CR2, CR3, }; @@ -188,6 +196,12 @@ struct ext4_allocation_context { /* copy of the best found extent taken before preallocation efforts */ struct ext4_free_extent ac_f_ex; + /* + * goal len can change in CR1.5, so save the original len. This is + * used while adjusting the PA window and for accounting. + */ + ext4_grpblk_t ac_orig_goal_len; + __u32 ac_groups_considered; __u32 ac_flags; /* allocation hints */ __u16 ac_groups_scanned; diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c index d233c24ea342..5ba884a0246e 100644 --- a/fs/ext4/sysfs.c +++ b/fs/ext4/sysfs.c @@ -224,6 +224,7 @@ EXT4_RW_ATTR_SBI_UI(warning_ratelimit_interval_ms, s_warning_ratelimit_state.int EXT4_RW_ATTR_SBI_UI(warning_ratelimit_burst, s_warning_ratelimit_state.burst); EXT4_RW_ATTR_SBI_UI(msg_ratelimit_interval_ms, s_msg_ratelimit_state.interval); EXT4_RW_ATTR_SBI_UI(msg_ratelimit_burst, s_msg_ratelimit_state.burst); +EXT4_RW_ATTR_SBI_UI(mb_cr1_5_max_trim_order, s_mb_cr1_5_max_trim_order); #ifdef CONFIG_EXT4_DEBUG EXT4_RW_ATTR_SBI_UL(simulate_fail, s_simulate_fail); #endif @@ -275,6 +276,7 @@ static struct attribute *ext4_attrs[] = { ATTR_LIST(warning_ratelimit_burst), ATTR_LIST(msg_ratelimit_interval_ms), ATTR_LIST(msg_ratelimit_burst), + ATTR_LIST(mb_cr1_5_max_trim_order), ATTR_LIST(errors_count), ATTR_LIST(warning_count), ATTR_LIST(msg_count),