From patchwork Thu Aug 15 19:31:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krister Johansen X-Patchwork-Id: 13765094 Received: from giant.ash.relay.mailchannels.net (giant.ash.relay.mailchannels.net [23.83.222.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 383F380B for ; Thu, 15 Aug 2024 19:31:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=23.83.222.68 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723750321; cv=pass; b=eBcoBsRj8O0uisfOVphFWpyMAPpGgk9z8OagtfPO9rOeMMayEHGSyqlcE5uA+0JC+ISpATf11hxJdAC4e8fsUOuS3C+9UPrHA8pBkHsQ+Wr8dwAoZMfgU7FQGdyZsQW6voWFuMH/GCjCQ6pfkp91q8JWhhSsWhLKgtCNHzh2dQc= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723750321; c=relaxed/simple; bh=5LmSysuK5yLZB0f0RQUaLLKvwe376pvZ2JeRv9QcNfQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=uavLNHczfqQ23Z3wN+Vzsh0/Je8XFt4nbgHY922C1dtC0rrJMlUEWUEW/2On5wbXfimoXk/48oA7TPZx7JCifU7i2pKQ4xaA/rlEUo/acgvX47TC8Wt96lcT+089sUIPC/gAwmYa/6kdC3O23Giw5sQDZQBd2FJpik0IftjvK1I= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=templeofstupid.com; spf=pass smtp.mailfrom=templeofstupid.com; dkim=pass (2048-bit key) header.d=templeofstupid.com header.i=@templeofstupid.com header.b=QrL4QAMG; arc=pass smtp.client-ip=23.83.222.68 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=templeofstupid.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=templeofstupid.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=templeofstupid.com header.i=@templeofstupid.com header.b="QrL4QAMG" X-Sender-Id: 
dreamhost|x-authsender|kjlx@templeofstupid.com Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 3F6BF2C6664 for ; Thu, 15 Aug 2024 19:31:52 +0000 (UTC) Received: from pdx1-sub0-mail-a210.dreamhost.com (unknown [127.0.0.6]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id E65DC2C3AE3 for ; Thu, 15 Aug 2024 19:31:51 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1723750311; a=rsa-sha256; cv=none; b=U857zv/idmlahWiz2ZhGQEA5E+ClDPDTdny+QRyDlghPeKxDbWEb8VDnxjcVnZZCUg/KSq 5EN93M9gUT3iTgRneNPp42HNJoJ9aR+tMvu96w6JdxlruL/BRUlf7wwiXWeMIq36VEbiPf dzcvKJpmeBZFMkNXvMy9fwF00n1lbmD69CVWodkcNAb4mVpDSvevG6gvbJ8p+TS0CKJQzH RhSf3CdvDq0E+kd1FwR+QHt/E+fAZAws9IZgZ13ZY30lAbX80nl2S5ja0f6cGG2rIPkVl9 PZrM9YFu7h4rx7RWhY/T6aP/4yw7An4YvOio72ra/HzIlo3RSyCVvH6QlT0VPQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1723750311; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references:dkim-signature; bh=MiHOZpQ51dTUKfpN0a1BAL0ZPQ6v3YA3kaeuXRxm4ho=; b=LIlGppoe/skQsLWWqgrwILp/bXjUjLyHDITbEL1idQDlNWZTt0bt9WGoygNMqZcvUqwEPJ g+s9/wqymdoECb6o1jk/A/ymFiZHfLqdShCmt4lY5OVsEbKR+8SnIWlAOzFSeHuPsJPOem dgd7YDvkoLX7mt8FkHRPtXCHmNmzKAh4FBxdHK5hiLRXDxolYAL6Uzqmw1MmIqRIpuBXXj FZCTh0Vegv4sNBjkUTbMkNKAcd3PyduLObMypVxZIFGnmH3JFT0sTD/JWjY5hiGT8/uZ3I Y4bJ2HSi3geBVjmAdlbHcWPRSxfdA71re36C4UyoZ22QtkbFu+ymF5phaX766w== ARC-Authentication-Results: i=1; rspamd-587694846-mn8tv; auth=pass smtp.auth=dreamhost smtp.mailfrom=kjlx@templeofstupid.com X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com X-MC-Relay: Neutral X-MailChannels-SenderId: dreamhost|x-authsender|kjlx@templeofstupid.com X-MailChannels-Auth-Id: dreamhost X-Society-Exultant: 585ff78017668e43_1723750312162_3714702154 X-MC-Loop-Signature: 1723750312162:3578975185 
X-MC-Ingress-Time: 1723750312161 Received: from pdx1-sub0-mail-a210.dreamhost.com (pop.dreamhost.com [64.90.62.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.111.47.101 (trex/7.0.2); Thu, 15 Aug 2024 19:31:52 +0000 Received: from kmjvbox.templeofstupid.com (c-73-70-109-47.hsd1.ca.comcast.net [73.70.109.47]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kjlx@templeofstupid.com) by pdx1-sub0-mail-a210.dreamhost.com (Postfix) with ESMTPSA id 4WlFcR4x7HzC6 for ; Thu, 15 Aug 2024 12:31:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=templeofstupid.com; s=dreamhost; t=1723750311; bh=MiHOZpQ51dTUKfpN0a1BAL0ZPQ6v3YA3kaeuXRxm4ho=; h=Date:From:To:Cc:Subject:Content-Type; b=QrL4QAMGr0HrnNFv6n/xesOqiMoHND6CxwTt2g94T2So/Y7S2kKtyhOdd0YZM9RtL xqUyB2y50Ud3Vl0qejVhDgR4VYyr6maQWlG4vU2AQ9Wt/4aJ4XBpi72tqOUbS9R4/F khG/+7Fnf/rza9Gil7pNWo+t/H3e+HaGjQkoCKPRS+BzW/S1/12VY/y6R/p9KjESPZ dEPXq7IadypwvHYT4qyVV1pXvxgWjkwvbgDTxsGSKJK3i/jXGsmJ5zsbc5OA7YftHg AIR3GV8Z6jw082maJQz7a0ak1g/GrUx8vN/cTUB2JqQQ+KsoTBf/q2fi89+y3VFu40 +/eHGBfqxMdTg== Received: from johansen (uid 1000) (envelope-from kjlx@templeofstupid.com) id e0064 by kmjvbox.templeofstupid.com (DragonFly Mail Agent v0.12); Thu, 15 Aug 2024 12:31:50 -0700 Date: Thu, 15 Aug 2024 12:31:50 -0700 From: Krister Johansen To: Chandan Babu R , "Darrick J. 
Wong" , Dave Chinner
Cc: Dave Chinner , Zorro Lang , linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: [PATCH 1/5] xfs: count the number of blocks in a per-ag reservation
Message-ID:
References:
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

In order to get the AGFL reservation, alloc_set_aside, and ag_max_usable
calculations correct in the face of per-AG reservations, we need to
understand the number of blocks that a per-AG reservation can leave free
in a worst-case scenario. Compute the number of blocks used for a per-ag
reservation by using AG 0's reservation. Other code already assumes AG
0's reservation is at least as large as any other AG's. Subsequent
patches will use the block count to construct a more accurate set of
parameters.

The reservation is counted after log_mount_finish because reservations
are temporarily enabled for this operation. An updated alloc_set_aside
and ag_max_usable need to be computed before enabling reservations at
the end of a RW mount.

Signed-off-by: Krister Johansen
---
 fs/xfs/xfs_fsops.c | 21 +++++++++++++++++++++
 fs/xfs/xfs_fsops.h |  1 +
 fs/xfs/xfs_mount.c |  7 +++++++
 fs/xfs/xfs_mount.h |  7 +++++++
 4 files changed, 36 insertions(+)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index c211ea2b63c4..fefc20df8a2e 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -551,6 +551,27 @@ xfs_fs_reserve_ag_blocks(
 	return error;
 }

+/*
+ * Count the number of reserved blocks that an AG has requested.
+ */
+uint
+xfs_fs_count_reserved_ag_blocks(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		agno)
+{
+
+	struct xfs_perag	*pag;
+	uint			blocks = 0;
+
+	pag = xfs_perag_grab(mp, agno);
+	if (!pag)
+		return blocks;
+
+	blocks = pag->pag_meta_resv.ar_asked + pag->pag_rmapbt_resv.ar_asked;
+	xfs_perag_rele(pag);
+	return blocks;
+}
+
 /*
  * Free space reserved for per-AG metadata.
  */
diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
index 3e2f73bcf831..75f5fa1a38f4 100644
--- a/fs/xfs/xfs_fsops.h
+++ b/fs/xfs/xfs_fsops.h
@@ -12,6 +12,7 @@ int xfs_reserve_blocks(struct xfs_mount *mp, uint64_t request);
 int xfs_fs_goingdown(struct xfs_mount *mp, uint32_t inflags);

 int xfs_fs_reserve_ag_blocks(struct xfs_mount *mp);
+uint xfs_fs_count_reserved_ag_blocks(struct xfs_mount *mp, xfs_agnumber_t agno);
 void xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);

 #endif /* __XFS_FSOPS_H__ */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 09eef1721ef4..d6ba67a29e3a 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -952,6 +952,13 @@ xfs_mountfs(
 		xfs_warn(mp,
 	"ENOSPC reserving per-AG metadata pool, log recovery may fail.");
 	error = xfs_log_mount_finish(mp);
+	/*
+	 * Before disabling the temporary per-ag reservation, count up the
+	 * reserved blocks in AG 0. This will be used to determine how to
+	 * re-size the AGFL reserve and alloc_set_aside prior to enabling
+	 * reservations if the mount is RW.
+	 */
+	mp->m_ag_resblk_count = xfs_fs_count_reserved_ag_blocks(mp, 0);
 	xfs_fs_unreserve_ag_blocks(mp);
 	if (error) {
 		xfs_warn(mp, "log mount finish failed");
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index d0567dfbc036..800788043ca6 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -213,6 +213,13 @@ typedef struct xfs_mount {
 	uint64_t		m_resblks;	/* total reserved blocks */
 	uint64_t		m_resblks_avail;/* available reserved blocks */
 	uint64_t		m_resblks_save;	/* reserved blks @ remount,ro */
+
+	/*
+	 * Number of per-ag resv blocks for a single AG. Derived from AG 0
+	 * under the assumption no per-AG reservations will be larger than that
+	 * one.
+	 */
+	uint			m_ag_resblk_count;
 	struct delayed_work	m_reclaim_work;	/* background inode reclaim */
 	struct dentry		*m_debugfs;	/* debugfs parent */
 	struct xfs_kobj		m_kobj;

From patchwork Thu Aug 15 19:32:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Krister Johansen
X-Patchwork-Id: 13765109
Date: Thu, 15 Aug 2024 12:32:45 -0700
From: Krister Johansen
To: Chandan Babu R , "Darrick J.
Wong" , Dave Chinner
Cc: Dave Chinner , Zorro Lang , linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: [PATCH 2/5] xfs: move calculation in xfs_alloc_min_freelist to its own function
Message-ID: <7e4310e32168cc4aa9f3a0782a6f6fabcaf476d5.1723688622.git.kjlx@templeofstupid.com>
References:
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

This splits xfs_alloc_min_freelist into two pieces. The piece that
remains in the function determines the appropriate level to pass to the
calculation function. The calculation is extracted into its own inline
function so that it can be used in multiple locations in xfs_alloc.

No functional change. A subsequent patch will leverage this split.

Signed-off-by: Krister Johansen
---
 fs/xfs/libxfs/xfs_alloc.c | 75 +++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 59326f84f6a5..17e029bb1b6d 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -79,6 +79,46 @@ xfs_prealloc_blocks(
 	return XFS_IBT_BLOCK(mp) + 1;
 }

+static inline unsigned int
+xfs_alloc_min_freelist_calc(
+	const unsigned int bno_level,
+	const unsigned int cnt_level,
+	const unsigned int rmap_level)
+{
+	unsigned int		min_free;
+
+	/*
+	 * For a btree shorter than the maximum height, the worst case is that
+	 * every level gets split and a new level is added, then while inserting
+	 * another entry to refill the AGFL, every level under the old root gets
+	 * split again. This is:
+	 *
+	 *   (full height split reservation) + (AGFL refill split height)
+	 * = (current height + 1) + (current height - 1)
+	 * = (new height) + (new height - 2)
+	 * = 2 * new height - 2
+	 *
+	 * For a btree of maximum height, the worst case is that every level
+	 * under the root gets split, then while inserting another entry to
+	 * refill the AGFL, every level under the root gets split again. This is
+	 * also:
+	 *
+	 *   2 * (current height - 1)
+	 * = 2 * (new height - 1)
+	 * = 2 * new height - 2
+	 */
+
+	/* space needed by-bno freespace btree */
+	min_free = bno_level * 2 - 2;
+	/* space needed by-size freespace btree */
+	min_free += cnt_level * 2 - 2;
+	/* space needed reverse mapping used space btree */
+	if (rmap_level)
+		min_free += rmap_level * 2 - 2;
+
+	return min_free;
+}
+
 /*
  * The number of blocks per AG that we withhold from xfs_dec_fdblocks to
  * guarantee that we can refill the AGFL prior to allocating space in a nearly
@@ -152,7 +192,6 @@ xfs_alloc_ag_max_usable(
 	return mp->m_sb.sb_agblocks - blocks;
 }

-
 static int
 xfs_alloc_lookup(
 	struct xfs_btree_cur	*cur,
@@ -2449,39 +2488,13 @@ xfs_alloc_min_freelist(
 	const unsigned int	bno_level = pag ? pag->pagf_bno_level : 1;
 	const unsigned int	cnt_level = pag ? pag->pagf_cnt_level : 1;
 	const unsigned int	rmap_level = pag ? pag->pagf_rmap_level : 1;
-	unsigned int		min_free;

 	ASSERT(mp->m_alloc_maxlevels > 0);

-	/*
-	 * For a btree shorter than the maximum height, the worst case is that
-	 * every level gets split and a new level is added, then while inserting
-	 * another entry to refill the AGFL, every level under the old root gets
-	 * split again. This is:
-	 *
-	 *   (full height split reservation) + (AGFL refill split height)
-	 * = (current height + 1) + (current height - 1)
-	 * = (new height) + (new height - 2)
-	 * = 2 * new height - 2
-	 *
-	 * For a btree of maximum height, the worst case is that every level
-	 * under the root gets split, then while inserting another entry to
-	 * refill the AGFL, every level under the root gets split again. This is
-	 * also:
-	 *
-	 *   2 * (current height - 1)
-	 * = 2 * (new height - 1)
-	 * = 2 * new height - 2
-	 */
-
-	/* space needed by-bno freespace btree */
-	min_free = min(bno_level + 1, mp->m_alloc_maxlevels) * 2 - 2;
-	/* space needed by-size freespace btree */
-	min_free += min(cnt_level + 1, mp->m_alloc_maxlevels) * 2 - 2;
-	/* space needed reverse mapping used space btree */
-	if (xfs_has_rmapbt(mp))
-		min_free += min(rmap_level + 1, mp->m_rmap_maxlevels) * 2 - 2;
-	return min_free;
+	return xfs_alloc_min_freelist_calc(
+		min(bno_level + 1, mp->m_alloc_maxlevels),
+		min(cnt_level + 1, mp->m_alloc_maxlevels),
+		xfs_has_rmapbt(mp) ?
min(rmap_level + 1, mp->m_rmap_maxlevels) : 0);
 }

 /*

From patchwork Thu Aug 15 19:33:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Krister Johansen
X-Patchwork-Id: 13765143
Date: Thu, 15 Aug 2024 12:33:45 -0700
From: Krister Johansen
To: Chandan Babu R , "Darrick J.
Wong" , Dave Chinner
Cc: Dave Chinner , Zorro Lang , linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: [PATCH 3/5] xfs: make alloc_set_aside and friends aware of per-AG reservations
Message-ID: <4d4b1e18ffb159d167e239cb66ccaf5e3a27236c.1723688622.git.kjlx@templeofstupid.com>
References:
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

The code in xfs_alloc_set_aside and xfs_alloc_ag_max_usable assumes that
the amount of free space that remains when a filesystem is at ENOSPC
corresponds to the filesystem actually having consumed almost all the
available free space. These functions control how much space is set
aside to refill the AGFL when a filesystem is almost out of space.

With per-AG reservations, an AG has more space available at ENOSPC than
it did in the past. This leads to situations where the reservation code
informs callers that an ENOSPC condition is present, yet the filesystem
still has free blocks. As a result, under certain edge cases,
allocations that need to refill the AGFL at a reservation-induced ENOSPC
may not have enough space set aside to complete that operation
successfully. This is because there is more free-space metadata to track
than there used to be. The result is ENOSPC-related shutdowns in paths
that only partially succeed at satisfying their allocations.

Fix this by determining the size of the free space that remains when a
filesystem's reservation is unused but all remaining blocks have been
consumed. Use this remaining space to determine the size of the b-trees
that manage the space, and correspondingly, the number of blocks needed
to refill the AGFL if we have a split at or near ENOSPC.
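
As a rough illustration of the sizing described above, the following
standalone C sketch models the arithmetic outside the kernel. It is not
the kernel code: btree_height(), min_freelist_calc() and
agfl_reserve_sketch() are hypothetical userspace stand-ins, the
minimum-records values and the BMBT depth are passed in as plain
parameters instead of being read from a struct xfs_mount, and the
fallback of 4 + 4 blocks simply mirrors the pre-existing set-aside.

```c
#include <assert.h>

/* Per-tree AGFL need of (2 * level - 2) blocks, summed over the trees. */
static unsigned int min_freelist_calc(unsigned int bno_level,
				      unsigned int cnt_level,
				      unsigned int rmap_level)
{
	unsigned int min_free;

	min_free = bno_level * 2 - 2;
	min_free += cnt_level * 2 - 2;
	if (rmap_level)		/* 0 means "no rmap btree" */
		min_free += rmap_level * 2 - 2;
	return min_free;
}

/*
 * Hypothetical helper: height of a b-tree that indexes `records` records
 * when every block holds only the minimum number of entries.
 */
static unsigned int btree_height(unsigned int leaf_minrecs,
				 unsigned int node_minrecs,
				 unsigned long long records)
{
	unsigned long long blocks;
	unsigned int height = 1;

	assert(leaf_minrecs > 0 && node_minrecs > 1);
	blocks = (records + leaf_minrecs - 1) / leaf_minrecs;
	while (blocks > 1) {
		blocks = (blocks + node_minrecs - 1) / node_minrecs;
		height++;
	}
	return height;
}

/* Sketch of the per-AG set-aside when a reservation of resblk_count exists. */
static unsigned int agfl_reserve_sketch(unsigned int resblk_count,
					unsigned int bmbt_maxlevels,
					unsigned int leaf_minrecs,
					unsigned int node_minrecs,
					unsigned int rmap_maxlevels)
{
	unsigned int free_height, agfl_resv, dep_alloc_sz, agfl_min_refill;

	if (!resblk_count)
		return 4 + 4;	/* no reservation: the old fixed set-aside */

	/* Worst case: the reserved space is tracked as single-block records. */
	free_height = btree_height(leaf_minrecs, node_minrecs, resblk_count);

	/* Full-height split of both free space trees for the first allocation. */
	agfl_resv = free_height * 2;
	/* Each dependent allocation may split (height - 2) levels per tree. */
	dep_alloc_sz = (free_height > 2 ? free_height - 2 : 0) * 2;
	/* Keep the AGFL at minimum fullness once the trees have grown a level. */
	agfl_min_refill = min_freelist_calc(free_height + 1, free_height + 1,
					    rmap_maxlevels);

	return agfl_resv + agfl_min_refill + bmbt_maxlevels * dep_alloc_sz;
}
```

With made-up inputs of a 17-block reservation, 4 minimum records per
leaf and node block, a BMBT depth of 5 and no rmapbt, btree_height()
gives 3 and the sketch sets aside 6 + 12 + 5 * 2 = 28 blocks per AG,
compared with 8 when no reservation is present.
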

Signed-off-by: Krister Johansen
---
 fs/xfs/libxfs/xfs_alloc.c | 85 +++++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_mount.c        | 16 ++++++++
 2 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 17e029bb1b6d..826f527d20f2 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -26,6 +26,7 @@
 #include "xfs_ag.h"
 #include "xfs_ag_resv.h"
 #include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
 #include "xfs_health.h"
 #include "xfs_extfree_item.h"

@@ -131,9 +132,81 @@ xfs_alloc_min_freelist_calc(
  * fdblocks to ensure user allocation does not overcommit the space the
  * filesystem needs for the AGFLs. The rmap btree uses a per-AG reservation to
  * withhold space from xfs_dec_fdblocks, so we do not account for that here.
+ *
+ * This value should be used on filesystems that do not have a per-AG
+ * reservation enabled. If per-AG reservations are on, then this value needs to
+ * be scaled to the size of the metadata used to track the freespace that the
+ * reservation prevents from being consumed.
  */
 #define XFS_ALLOCBT_AGFL_RESERVE	4

+/*
+ * Calculate the number of blocks that should be reserved on a per-AG basis when
+ * per-AG reservations are in use. This is necessary because the per-AG
+ * reservations result in ENOSPC occurring before the filesystem is truly empty.
+ * This means that in cases where the reservations are enabled, additional space
+ * needs to be set aside to manage the freespace data structures that remain
+ * because of space held by the reservation. This function attempts to
+ * determine how much free space will remain, in a worst-case scenario, and then
+ * how much space is needed to manage the metadata for the space that remains.
+ * Failure to do this correctly results in users getting ENOSPC errors in the
+ * middle of dependent allocations when they are close to hitting the
+ * reservation-induced limits.
+ */
+static unsigned int
+xfs_allocbt_agfl_reserve(
+	struct xfs_mount	*mp)
+{
+	unsigned int	ndependent_allocs, free_height, agfl_resv, dep_alloc_sz;
+	unsigned int	agfl_min_refill;
+
+	if (!mp->m_ag_resblk_count)
+		return XFS_ALLOCBT_AGFL_RESERVE + 4;
+
+	/*
+	 * Worst case, the number of dependent allocations will be a split for
+	 * every level in the BMBT. Use the max BMBT levels for this filesystem
+	 * to determine how many dependent allocations we'd see at the most.
+	 */
+	ndependent_allocs = XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK);
+
+	/*
+	 * Assume that worst case, the free space trees are managing
+	 * single-block free records when all per-ag reservations are at their
+	 * maximum size. Use m_ag_resblk_count, which is the maximum per-AG
+	 * reserved space, to calculate the number of b-tree blocks needed to
+	 * index this free space, and use that to determine the maximum height
+	 * of the free space b-tree in this case.
+	 */
+	free_height = xfs_btree_compute_maxlevels(mp->m_alloc_mnr,
+			mp->m_ag_resblk_count);
+
+	/*
+	 * Assume that a data extent allocation can perform a full-height
+	 * split, but that subsequent splits from dependent allocations will be
+	 * (height - 2). These values are multiplied by 2 because they count
+	 * both freespace trees (bnobt and cntbt).
+	 */
+	agfl_resv = free_height * 2;
+	dep_alloc_sz = (max(free_height, 2) - 2) * 2;
+
+	/*
+	 * Finally, ensure that we have enough blocks reserved to keep the agfl
+	 * at its minimum fullness for any dependent allocation once our
+	 * freespace tree reaches its maximum height. In this case we need to
+	 * compute the free_height + 1, and max rmap which would be our worst
+	 * case scenario. If this function doesn't account for agfl fullness,
+	 * it will underestimate the amount of space that must remain free to
+	 * continue allocating.
+	 */
+	agfl_min_refill = xfs_alloc_min_freelist_calc(
+			free_height + 1,
+			free_height + 1,
+			xfs_has_rmapbt(mp) ? mp->m_rmap_maxlevels : 0);
+
+	return agfl_resv + agfl_min_refill + (ndependent_allocs * dep_alloc_sz);
+}
+
 /*
  * Compute the number of blocks that we set aside to guarantee the ability to
  * refill the AGFL and handle a full bmap btree split.
@@ -150,13 +223,19 @@ xfs_alloc_min_freelist_calc(
  * aside a few blocks which will not be reserved in delayed allocation.
  *
  * For each AG, we need to reserve enough blocks to replenish a totally empty
- * AGFL and 4 more to handle a potential split of the file's bmap btree.
+ * AGFL and 4 more to handle a potential split of the file's bmap btree if no AG
+ * reservation is enabled.
+ *
+ * If per-AG reservations are enabled, then the size of the per-AG reservation
+ * needs to be factored into the space that is set aside to replenish an empty
+ * AGFL when the filesystem is at a reservation-induced ENOSPC (instead of
+ * actually empty).
  */
 unsigned int
 xfs_alloc_set_aside(
 	struct xfs_mount	*mp)
 {
-	return mp->m_sb.sb_agcount * (XFS_ALLOCBT_AGFL_RESERVE + 4);
+	return mp->m_sb.sb_agcount * xfs_allocbt_agfl_reserve(mp);
 }

 /*
@@ -180,7 +259,7 @@ xfs_alloc_ag_max_usable(
 	unsigned int		blocks;

 	blocks = XFS_BB_TO_FSB(mp, XFS_FSS_TO_BB(mp, 4)); /* ag headers */
-	blocks += XFS_ALLOCBT_AGFL_RESERVE;
+	blocks += xfs_allocbt_agfl_reserve(mp);
 	blocks += 3;			/* AGF, AGI btree root blocks */
 	if (xfs_has_finobt(mp))
 		blocks++;		/* finobt root block */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index d6ba67a29e3a..ec1f7925b31f 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -987,6 +987,22 @@ xfs_mountfs(
 		xfs_qm_mount_quotas(mp);
 	}

+	/*
+	 * Prior to enabling the reservations as part of completing a RW mount,
+	 * recompute the alloc_set_aside and ag_max_usable values to account for
+	 * the size of the free space that the reservation occupies. Since the
+	 * reservation keeps some free space from being utilized, these values
+	 * need to account for the space that must also be set aside to do AGFL
+	 * management during transactions with dependent allocations. The
+	 * reservation initialization code uses the set_aside value and modifies
+	 * ag_max_usable, which means this needs to get configured before the
+	 * reservation is enabled for real. The earlier temporary
+	 * enabling of the reservation allows this code to estimate the size of
+	 * the reservation in order to perform its calculations.
+	 */
+	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
+	mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
+
 	/*
 	 * Now we are mounted, reserve a small amount of unused space for
 	 * privileged transactions. This is needed so that transaction

From patchwork Thu Aug 15 19:34:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krister Johansen X-Patchwork-Id: 13765123 Received: from tiger.tulip.relay.mailchannels.net (tiger.tulip.relay.mailchannels.net [23.83.218.248]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC4DF1494C4 for ; Thu, 15 Aug 2024 20:29:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=23.83.218.248 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723753783; cv=pass; b=mvr2QHRmmxpIwKmFmHJUBoeRvw7RHi9qFgoH+3r+uXogXWYnD4cpLW42GLX7qHURyclf+KINxvjC5ssup9ScwzT+aVZZQ+Zo7lp3KZGwkwMEtkaLe6g5OBb0pEwBpfjSI5JMNXPklGxPdjBxq0jorSnkbbiecR5RXynHTGrhFa4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723753783; c=relaxed/simple; bh=f/ym/HhI+ad2ud3xHFvHvQqkvkiRfRFQ2kpUJ75CE8Y=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; 
b=Zxb6eAtAePcZ+K782VNcqaCBIaZKlkW+3mgzca/stKVX/OXKgMooQqWap3hzX7oawJ8krT61TC8KcYHAxpUmJli1zJnksv/SLzhAspKTPqZKeZPS06S5uTbmpYabeWDMqg+HUGelGYi84HoO3tByQeO1fU7872wNPBzsDnsXgHk= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=templeofstupid.com; spf=pass smtp.mailfrom=templeofstupid.com; dkim=pass (2048-bit key) header.d=templeofstupid.com header.i=@templeofstupid.com header.b=PEM1nKIT; arc=pass smtp.client-ip=23.83.218.248 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=templeofstupid.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=templeofstupid.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=templeofstupid.com header.i=@templeofstupid.com header.b="PEM1nKIT" X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id C814CC5186 for ; Thu, 15 Aug 2024 19:34:51 +0000 (UTC) Received: from pdx1-sub0-mail-a210.dreamhost.com (unknown [127.0.0.6]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id 7E41AC51C0 for ; Thu, 15 Aug 2024 19:34:51 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1723750491; a=rsa-sha256; cv=none; b=3fcPA54oTw5IU9TlWNtbLzpLvkluRxpQZZtAtOakDNQYRtWUbUAiKWKWpRs8QaJsXhDgd3 jrBAzmnVn1Zsmev/SQjYeBgoHmVS5AQS0aacPtpXU/6gbOOCSvpl0cfzcPl6AURAZ+BOjv 2cLX4dro61dq3ceJmRdXbnHMNgw1wlyKG/fND41I774l9rEUKglBDtoF2Q0yZdq4mNg9oC Qiee5GcPICbvwF8/ULL7JtjhjG7QJdGcAfQHnG6oHljNahXJ0MDMnxwJvWOtB2BWXZnAny 4HtQWClAMmpzqYhaQoIeB95IQMcXOhPyAbNuwSYSZg4GEb90vO8tZrqP3x5pMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1723750491; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
in-reply-to:in-reply-to:references:references:dkim-signature; bh=OTZU2GHi0cFIUNIMsupcprXCrGD2KiplaAerMY4u130=; b=DqwlCoT+gQ8bpk0O1ql+2aL7enKdV6XEp0107a5Z1GFj20f1+SoTEY1+wjFHwCQ+5oA14x Y0AQDvtgYvLLQfVsIDxT2BbsK/4venp4UARQx/zpC5rUYFE5CjE4z5GgcgTSwXH3Wzq7EH jldcKL2JAWDRF9OazyoYH6NE/rdqTUo1S4QWXhTg8izBoqWKPMif1qjLF7zgkduFPEDZ/S BbWRWtTjVWbopd3j/o3SK2XZL8tiZJ/+PUV4vE66sLpdMUwGFaX2wn77bX0EJtoGHq2aKT /gRJmqhEFUqDBDdhyGuSTlQ2wKO+hV7rbOH0mSOgxl4MPiPOGFZH7scIi+tgEQ== ARC-Authentication-Results: i=1; rspamd-587694846-g59mf; auth=pass smtp.auth=dreamhost smtp.mailfrom=kjlx@templeofstupid.com X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com X-MC-Relay: Neutral X-MailChannels-SenderId: dreamhost|x-authsender|kjlx@templeofstupid.com X-MailChannels-Auth-Id: dreamhost X-Bottle-Oafish: 307597c0594cc764_1723750491730_2484953318 X-MC-Loop-Signature: 1723750491730:4149103670 X-MC-Ingress-Time: 1723750491730 Received: from pdx1-sub0-mail-a210.dreamhost.com (pop.dreamhost.com [64.90.62.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.117.75.38 (trex/7.0.2); Thu, 15 Aug 2024 19:34:51 +0000 Received: from kmjvbox.templeofstupid.com (c-73-70-109-47.hsd1.ca.comcast.net [73.70.109.47]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kjlx@templeofstupid.com) by pdx1-sub0-mail-a210.dreamhost.com (Postfix) with ESMTPSA id 4WlFgv09CJzK2 for ; Thu, 15 Aug 2024 12:34:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=templeofstupid.com; s=dreamhost; t=1723750491; bh=OTZU2GHi0cFIUNIMsupcprXCrGD2KiplaAerMY4u130=; h=Date:From:To:Cc:Subject:Content-Type; b=PEM1nKIT1BH7+ik2YKZt7HZNGJQd8p5ZuaWO0oIdN0RyUFML+/ed22ceUzv/2f9AA 6xfVbmYtENt1Znvug+lXYCGnjRS2gKdk4gbau+QQeeQSfvRahO8u3pnH/GSg6y5v8x TloOJ9y1jYB5NIHEnjbNqjSfrZ1nrS23we2XoiKK6T1q3Xmn9nDkwi9de5cWezB7DV 
yEE2hFZTx0QZLHygLxRgiEng93Z5eZO1t0u4hwyDOAAGDR4fClSptLizTlale09+S/ afuEzAIwiFelf6rKfWKqDJy9V0PtNDgtn7JlRkHf1v+pCscZuzA4pla9d0JWkBcFa8 AU5rea3DlajRg== Received: from johansen (uid 1000) (envelope-from kjlx@templeofstupid.com) id e0064 by kmjvbox.templeofstupid.com (DragonFly Mail Agent v0.12); Thu, 15 Aug 2024 12:34:49 -0700 Date: Thu, 15 Aug 2024 12:34:49 -0700 From: Krister Johansen To: Chandan Babu R , "Darrick J. Wong" , Dave Chinner Cc: Dave Chinner , Zorro Lang , linux-xfs@vger.kernel.org, fstests@vger.kernel.org Subject: [PATCH 4/5] xfs: push the agfl set aside into xfs_alloc_space_available Message-ID: References: Precedence: bulk X-Mailing-List: fstests@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The blocks that have been set aside for dependent allocations and freelist refilling at reservation induced ENOSPC are deducted from m_ag_max_usable, which prevents them from being factored into the longest_free_extent and maxlen calculations. However, it's still possible to eat into this space by making multiple small allocations. Catch this case by withholding the space that's been set aside in xfs_alloc_space_available's available space calculation. Signed-off-by: Krister Johansen --- fs/xfs/libxfs/xfs_alloc.c | 15 +++++++++++++-- fs/xfs/libxfs/xfs_alloc.h | 1 + fs/xfs/xfs_mount.c | 1 + fs/xfs/xfs_mount.h | 5 +++++ 4 files changed, 20 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 826f527d20f2..4dd401d407c2 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -153,7 +153,7 @@ xfs_alloc_min_freelist_calc( * middle of dependent allocations when they are close to hitting the * reservation-induced limits. 
  */
-static unsigned int
+unsigned int
 xfs_allocbt_agfl_reserve(
 	struct xfs_mount *mp)
 {
@@ -2593,6 +2593,7 @@ xfs_alloc_space_available(
 	xfs_extlen_t reservation; /* blocks that are still reserved */
 	int available;
 	xfs_extlen_t agflcount;
+	xfs_extlen_t set_aside = 0;
 
 	if (flags & XFS_ALLOC_FLAG_FREEING)
 		return true;
@@ -2605,6 +2606,16 @@ xfs_alloc_space_available(
 	if (longest < alloc_len)
 		return false;
 
+	/*
+	 * Withhold from the available space any that has been set aside as a
+	 * reserve for refilling the AGFL close to ENOSPC. In the case where a
+	 * dependent allocation is in progress, allow that space to be consumed
+	 * so that the dependent allocation may complete successfully. Without
+	 * this, we may ENOSPC in the middle of the allocation chain and
+	 * shut down the filesystem.
+	 */
+	if (args->tp->t_highest_agno == NULLAGNUMBER)
+		set_aside = args->mp->m_ag_agfl_setaside;
 	/*
 	 * Do we have enough free space remaining for the allocation? Don't
 	 * account extra agfl blocks because we are about to defer free them,
@@ -2612,7 +2623,7 @@ xfs_alloc_space_available(
 	 */
 	agflcount = min_t(xfs_extlen_t, pag->pagf_flcount, min_free);
 	available = (int)(pag->pagf_freeblks + agflcount -
-			  reservation - min_free - args->minleft);
+			  reservation - min_free - args->minleft - set_aside);
 	if (available < (int)max(args->total, alloc_len))
 		return false;
 
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index fae170825be0..7e92c4c455a1 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -70,6 +70,7 @@ typedef struct xfs_alloc_arg {
 /* freespace limit calculations */
 unsigned int xfs_alloc_set_aside(struct xfs_mount *mp);
 unsigned int xfs_alloc_ag_max_usable(struct xfs_mount *mp);
+unsigned int xfs_allocbt_agfl_reserve(struct xfs_mount *mp);
 
 xfs_extlen_t xfs_alloc_longest_free_extent(struct xfs_perag *pag,
 		xfs_extlen_t need, xfs_extlen_t reserved);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index ec1f7925b31f..1bc80983310a 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1002,6 +1002,7 @@ xfs_mountfs(
 	 */
 	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
 	mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
+	mp->m_ag_agfl_setaside = xfs_allocbt_agfl_reserve(mp);
 
 	/*
 	 * Now we are mounted, reserve a small amount of unused space for
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 800788043ca6..4a9321424954 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -220,6 +220,11 @@ typedef struct xfs_mount {
 	 * one.
 	 */
 	uint m_ag_resblk_count;
+	/*
+	 * Blocks set aside to refill the agfl at ENOSPC and satisfy any
+	 * dependent allocation resulting from a chain of BMBT splits.
+	 */
+	uint m_ag_agfl_setaside;
 	struct delayed_work m_reclaim_work; /* background inode reclaim */
 	struct dentry *m_debugfs; /* debugfs parent */
 	struct xfs_kobj m_kobj;

From patchwork Thu Aug 15 19:35:49 2024
X-Patchwork-Submitter: Krister Johansen
X-Patchwork-Id: 13765119
Date: Thu, 15 Aug 2024 12:35:49 -0700
From: Krister Johansen
To: Chandan Babu R, "Darrick J. Wong", Dave Chinner
Cc: Dave Chinner, Zorro Lang, linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: [PATCH 5/5] xfs: include min freelist in m_ag_max_usable
Message-ID: <4b9b30af3719389701c2dd00f8cb20f12043b3ee.1723688622.git.kjlx@templeofstupid.com>
X-Mailing-List: fstests@vger.kernel.org

If agfl_reserve blocks are conditionally withheld from consideration in
xfs_alloc_space_available, then m_ag_max_usable overstates the maximum
available space on an empty filesystem by the number of blocks that mkfs
placed into the AGFL on our behalf. While this space _is_ technically
free, it's not usable for a maximum-sized allocation on an empty
filesystem, because the blocks must remain in the AGFL in order for an
allocation to succeed. Without this change, stripe-aligned allocations
on an empty AG pick a size that they can't actually get, which leads to
allocations that can't be satisfied and consequently come back
unaligned.

Signed-off-by: Krister Johansen
---
 fs/xfs/libxfs/xfs_alloc.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 4dd401d407c2..26447e6061b3 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -246,7 +246,9 @@ xfs_alloc_set_aside(
  *	- the AG superblock, AGF, AGI and AGFL
  *	- the AGF (bno and cnt) and AGI btree root blocks, and optionally
  *	  the AGI free inode and rmap btree root blocks.
- *	- blocks on the AGFL according to xfs_alloc_set_aside() limits
+ *	- blocks on the AGFL when the filesystem is empty
+ *	- blocks needed on the AGFL while performing dependent allocations close
+ *	  to ENOSPC as given by xfs_allocbt_agfl_reserve()
  *	- the rmapbt root block
  *
  * The AG headers are sector sized, so the amount of space they take up is
@@ -259,6 +261,13 @@ xfs_alloc_ag_max_usable(
 	unsigned int blocks;
 
 	blocks = XFS_BB_TO_FSB(mp, XFS_FSS_TO_BB(mp, 4)); /* ag headers */
+	/*
+	 * Minimal freelist length when filesystem is completely empty.
+	 * xfs_alloc_min_freelist needs m_alloc_maxlevels so this is computed in
+	 * our second invocation of xfs_alloc_ag_max_usable.
+	 */
+	if (mp->m_alloc_maxlevels > 0)
+		blocks += xfs_alloc_min_freelist(mp, NULL);
 	blocks += xfs_allocbt_agfl_reserve(mp);
 	blocks += 3; /* AGF, AGI btree root blocks */
 	if (xfs_has_finobt(mp))