From patchwork Tue Nov 26 01:24:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885406 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5E0F9DF49; Tue, 26 Nov 2024 01:24:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584299; cv=none; b=oswIHX2RtAqqaKatWtjvvejlpwgZ++DpHn0QEJ6plqHydBvFxQFKu/h4e4qlrJHahs6/vo8CI7tjzwtOwAYSO99vNSzjCyUCCaCVYjKTg9sk9d5ysZloF4rheX5ww0IqTsR57wJ1I3pt/kw6bEjWsuY2sHPU/God9ylAMNqV/IQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584299; c=relaxed/simple; bh=Pizfx7xfbsJPj1lI6/tM0ZwA3V6zijokxNO54cioPLk=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=im59YrGGCkPMKBx9netjDT2cGoouq0DYzek2dyQ35O0HasrB/7cBDUnIwyNqcuyr7I1LkiNj145t80kwgfKbL+y0BwmSIM18eB1UybYbcb0XISKErmeMxwypKAOVnd/ytCZEGsPw0DxkjIUqFrtwlo2EKjrgY2UAfhQiqMNCrrs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TRgBWxVY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TRgBWxVY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2EB2CC4CECE; Tue, 26 Nov 2024 01:24:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584299; bh=Pizfx7xfbsJPj1lI6/tM0ZwA3V6zijokxNO54cioPLk=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=TRgBWxVYrJoBXgRfFz1OrG2HgnBuX/NJcHaBV3Saombqjq/J0o0MsfDHyhrw8uQCf 
7FdxCQWpB8Tzd9S2Uur80Mk8UPqiZySNdXhC8AYXoZGJpGN5YyCj+i/yQPWQwFdXby mO4f/b8qtPU5MTYG0No6EZ/yyJj2fNaeHKS7WaGdy+GaJyJWYIEncWMiWH8qLp7iKA nRsutgDfyFQblgBSSHESQxDSRhQMz7PlY7xOt8wRCcZia5f25SRRhUidnfjikEixav 9T/H1uEv9eZHSXFwZp/laDFs6vXC0wYbQZZ3vL/WrKjZ5uCk0bb286iwK7jTKeQFRj NhX4QHQnLVZYg== Date: Mon, 25 Nov 2024 17:24:58 -0800 Subject: [PATCH 01/21] xfs: fix off-by-one error in fsmap's end_daddr usage From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, wozizhi@huawei.com, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397820.4032920.11184703272397099638.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong In commit ca6448aed4f10a, we created an "end_daddr" variable to fix fsmap reporting when the end of the range requested falls in the middle of an unknown (aka free on the rmapbt) region. Unfortunately, I didn't notice that the code sets end_daddr to the last sector of the device but then uses that quantity to compute the length of the synthesized mapping. Zizhi Wo later observed that when end_daddr isn't set, we still don't report the last fsblock on a device because in that case (aka when info->last is true), the info->high mapping that we pass to xfs_getfsmap_group_helper has a startblock that points to the last fsblock. This is also wrong because the code uses startblock to compute the length of the synthesized mapping. Fix the second problem by setting end_daddr unconditionally, and fix the first problem by setting start_daddr to one past the end of the range to query. Cc: # v6.11 Fixes: ca6448aed4f10a ("xfs: Fix missing interval for missing_owner in xfs fsmap") Signed-off-by: "Darrick J. 
Wong" Reported-by: Zizhi Wo Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_fsmap.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 82f2e0dd224997..3290dd8524a69a 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -163,7 +163,8 @@ struct xfs_getfsmap_info { xfs_daddr_t next_daddr; /* next daddr we expect */ /* daddr of low fsmap key when we're using the rtbitmap */ xfs_daddr_t low_daddr; - xfs_daddr_t end_daddr; /* daddr of high fsmap key */ + /* daddr of high fsmap key, or the last daddr on the device */ + xfs_daddr_t end_daddr; u64 missing_owner; /* owner of holes */ u32 dev; /* device id */ /* @@ -387,8 +388,8 @@ xfs_getfsmap_group_helper( * we calculated from userspace's high key to synthesize the record. * Note that if the btree query found a mapping, there won't be a gap. */ - if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL) - frec->start_daddr = info->end_daddr; + if (info->last) + frec->start_daddr = info->end_daddr + 1; else frec->start_daddr = xfs_gbno_to_daddr(xg, startblock); @@ -736,11 +737,10 @@ xfs_getfsmap_rtdev_rtbitmap_helper( * we calculated from userspace's high key to synthesize the record. * Note that if the btree query found a mapping, there won't be a gap. 
*/ - if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL) { - frec.start_daddr = info->end_daddr; - } else { + if (info->last) + frec.start_daddr = info->end_daddr + 1; + else frec.start_daddr = xfs_rtb_to_daddr(mp, start_rtb); - } frec.len_daddr = XFS_FSB_TO_BB(mp, rtbcount); return xfs_getfsmap_helper(tp, info, &frec); @@ -933,7 +933,10 @@ xfs_getfsmap( struct xfs_trans *tp = NULL; struct xfs_fsmap dkeys[2]; /* per-dev keys */ struct xfs_getfsmap_dev handlers[XFS_GETFSMAP_DEVS]; - struct xfs_getfsmap_info info = { NULL }; + struct xfs_getfsmap_info info = { + .fsmap_recs = fsmap_recs, + .head = head, + }; bool use_rmap; int i; int error = 0; @@ -998,9 +1001,6 @@ xfs_getfsmap( info.next_daddr = head->fmh_keys[0].fmr_physical + head->fmh_keys[0].fmr_length; - info.end_daddr = XFS_BUF_DADDR_NULL; - info.fsmap_recs = fsmap_recs; - info.head = head; /* For each device we support... */ for (i = 0; i < XFS_GETFSMAP_DEVS; i++) { @@ -1013,17 +1013,23 @@ xfs_getfsmap( break; /* - * If this device number matches the high key, we have - * to pass the high key to the handler to limit the - * query results. If the device number exceeds the - * low key, zero out the low key so that we get - * everything from the beginning. + * If this device number matches the high key, we have to pass + * the high key to the handler to limit the query results, and + * set the end_daddr so that we can synthesize records at the + * end of the query range or device. */ if (handlers[i].dev == head->fmh_keys[1].fmr_device) { dkeys[1] = head->fmh_keys[1]; info.end_daddr = min(handlers[i].nr_sectors - 1, dkeys[1].fmr_physical); + } else { + info.end_daddr = handlers[i].nr_sectors - 1; } + + /* + * If the device number exceeds the low key, zero out the low + * key so that we get everything from the beginning. 
+ */ if (handlers[i].dev > head->fmh_keys[0].fmr_device) memset(&dkeys[0], 0, sizeof(struct xfs_fsmap)); From patchwork Tue Nov 26 01:25:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885407 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03CC8BA27 for ; Tue, 26 Nov 2024 01:25:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584315; cv=none; b=phUiUKfYKPYaUdkkx5IDmS4HGPBQNk8T1M+1GQU7RxDiXSpJAX//1/b0X2jPabza89rk37xlz1nwoDaEQaJQPbh9bJPtNTIIze/662sNRjqai5XngAThAb9ncdV05x0zfTzw4VOewoxKQ493K6CzJ41FNHhH/4bLSIzVQNyroOY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584315; c=relaxed/simple; bh=36ADRXqr99px5o5LNsS5+a1CZt14KdYi+ujGXtVocJ0=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=t8jjjpN0JDtp0Iz5xc7I5rhx08A0c6hU/vP6WrnF34CBBhmVod5qvKfkhuxZKkcHrI8c/9N+91f+1Hj7qlIacbLRpADPtRMMRMoBRkL72BSAMP3Um5zOgKEbCpxGIMUG4IJTVLu9wVUMH7faeOdGeeny0cOahdZc8uY0KygXkzw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=beQeK20O; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="beQeK20O" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C9480C4CECE; Tue, 26 Nov 2024 01:25:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584314; bh=36ADRXqr99px5o5LNsS5+a1CZt14KdYi+ujGXtVocJ0=; 
h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=beQeK20OeKG1atUJ1H8QIt6PVt3rtfNdfCvaocu9WweqdpBbT2vtAihnYEXZNWkdM KRYNCQ2R6cS2TyOdOwcSp9X1cNkOh991fkcPDBD2ko+xYFS9DHzUN/5Sdp3+KjX5VD kHYpI2Cpizv0wlafzGb6357YNhSgcI0uQKQA7+wfJV3tCGvCEEwxFRR64aTfE7VTyV tRqwp2XlIn++LG8uRBbuMLjnhxgekOK3LtsZTDyIzLlWcdZeQ9AY3+DFwWY8zvlovS oLuRZ7934nB3AvFj2O3PcOAoxz01c9ZuR+j3Nj8hvv9iGLUw7XC07NV4qJWSs+0E7B qk7fpJYjfYPqA== Date: Mon, 25 Nov 2024 17:25:14 -0800 Subject: [PATCH 02/21] xfs: metapath scrubber should use the already loaded inodes From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397837.4032920.10276485588764375439.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Don't waste time in xchk_setup_metapath_dqinode doing a second lookup of the quota inodes, just grab them from the quotainfo structure. The whole point of this scrubber is to make sure that the dirents exist, so it's completely silly to do lookups. Signed-off-by: "Darrick J. 
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/metapath.c | 41 +++++++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 16 deletions(-) diff --git a/fs/xfs/scrub/metapath.c b/fs/xfs/scrub/metapath.c index b78db651346518..80467d6bc76389 100644 --- a/fs/xfs/scrub/metapath.c +++ b/fs/xfs/scrub/metapath.c @@ -196,36 +196,45 @@ xchk_setup_metapath_dqinode( struct xfs_scrub *sc, xfs_dqtype_t type) { + struct xfs_quotainfo *qi = sc->mp->m_quotainfo; struct xfs_trans *tp = NULL; struct xfs_inode *dp = NULL; struct xfs_inode *ip = NULL; - const char *path; int error; + if (!qi) + return -ENOENT; + + switch (type) { + case XFS_DQTYPE_USER: + ip = qi->qi_uquotaip; + break; + case XFS_DQTYPE_GROUP: + ip = qi->qi_gquotaip; + break; + case XFS_DQTYPE_PROJ: + ip = qi->qi_pquotaip; + break; + default: + ASSERT(0); + return -EINVAL; + } + if (!ip) + return -ENOENT; + error = xfs_trans_alloc_empty(sc->mp, &tp); if (error) return error; error = xfs_dqinode_load_parent(tp, &dp); - if (error) - goto out_cancel; - - error = xfs_dqinode_load(tp, dp, type, &ip); - if (error) - goto out_dp; - xfs_trans_cancel(tp); - tp = NULL; + if (error) + return error; - path = kasprintf(GFP_KERNEL, "%s", xfs_dqinode_path(type)); - error = xchk_setup_metapath_scan(sc, dp, path, ip); + error = xchk_setup_metapath_scan(sc, dp, + kstrdup(xfs_dqinode_path(type), GFP_KERNEL), ip); - xfs_irele(ip); -out_dp: xfs_irele(dp); -out_cancel: - if (tp) - xfs_trans_cancel(tp); return error; } #else From patchwork Tue Nov 26 01:25:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885408 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 987687462 for ; Tue, 26 Nov 2024 01:25:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584330; cv=none; b=WoLOEjJy2l07eQBJyifRAYg0wLC1a8w77F+LVFr/wHR/ObdZCeQdqSx10E+IEXUaJbezeNBxgj6rPaC0TbkjFXEoDKvqH3PDVz1SnahvZH76US4ORXfa7XalxMskgzxo5W/yal5xeKaX39gDIvQ9MItko+AQWHKqJ90sA49jMR0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584330; c=relaxed/simple; bh=G8P+ogE+bxuBAmqY4Lj424BQNciA8LSuhnc1xgfRdsY=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GpvLI2I8Utz/qLlOJ8fHmWEWA9csPw6A+U7bBZZ/jX4o+aLQXgOllQLh91T0NjTEm5F5DY1alvGnuq2ya59ZcoCZyHSYXhTN4goYC8mHAAE5zrxx2mlBnCLy0PFa8LhUK1Zgdf5fqVK2GQOqYxkeJxMsE40XnEcrgiRcDnWrUq0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=C44VgRLI; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="C44VgRLI" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 67A01C4CECE; Tue, 26 Nov 2024 01:25:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584330; bh=G8P+ogE+bxuBAmqY4Lj424BQNciA8LSuhnc1xgfRdsY=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=C44VgRLIdhaRsfhwGOLF3Wk4PAJDCF+2oDEYoVMi2dZKbpo5R1+9Y5EVQSF7lwo/E FIm378jtiiEy60HfD3LWjUpy2KPK8KL3vIeR+BCxv6d8OE0KEU2IbNy2zWMyAnaYr2 Sc2mGyexFB2xn+lKLzIURhtohG4ABaylKtlcqfBSV0FLUzqZubfccnp81TK1SW3fOk 
Qqg4+oKzNRvWjyzy/BPqXOG4AWjIpGkfObKWp3oPFeFtQ3NdBYowraKQ7Awi36GHvE zIiZLctvGmT6zy7Fcjm242eG3+vxvR5Z39P7bHDM6fFaK5TJicSn33WJHl/TALYRCz G0zazquAHrkMw== Date: Mon, 25 Nov 2024 17:25:29 -0800 Subject: [PATCH 03/21] xfs: keep quota directory inode loaded From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397854.4032920.7776347980322455777.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong In the same vein as the previous patch, there's no point in the metapath scrub setup function doing a lookup on the quota metadir just so it can validate that lookups work correctly. Instead, retain the quota directory inode in memory for the lifetime of the mount so that we can check this meaningfully. Signed-off-by: "Darrick J. 
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/metapath.c | 37 ++++++------------------------------- fs/xfs/xfs_qm.c | 47 +++++++++++++++++++++++++---------------------- fs/xfs/xfs_qm.h | 1 + 3 files changed, 32 insertions(+), 53 deletions(-) diff --git a/fs/xfs/scrub/metapath.c b/fs/xfs/scrub/metapath.c index 80467d6bc76389..c678cba1ffc3f7 100644 --- a/fs/xfs/scrub/metapath.c +++ b/fs/xfs/scrub/metapath.c @@ -171,23 +171,13 @@ static int xchk_setup_metapath_quotadir( struct xfs_scrub *sc) { - struct xfs_trans *tp; - struct xfs_inode *dp = NULL; - int error; + struct xfs_quotainfo *qi = sc->mp->m_quotainfo; - error = xfs_trans_alloc_empty(sc->mp, &tp); - if (error) - return error; + if (!qi || !qi->qi_dirip) + return -ENOENT; - error = xfs_dqinode_load_parent(tp, &dp); - xfs_trans_cancel(tp); - if (error) - return error; - - error = xchk_setup_metapath_scan(sc, sc->mp->m_metadirip, - kasprintf(GFP_KERNEL, "quota"), dp); - xfs_irele(dp); - return error; + return xchk_setup_metapath_scan(sc, sc->mp->m_metadirip, + kstrdup("quota", GFP_KERNEL), qi->qi_dirip); } /* Scan a quota inode under the /quota directory. */ @@ -197,10 +187,7 @@ xchk_setup_metapath_dqinode( xfs_dqtype_t type) { struct xfs_quotainfo *qi = sc->mp->m_quotainfo; - struct xfs_trans *tp = NULL; - struct xfs_inode *dp = NULL; struct xfs_inode *ip = NULL; - int error; if (!qi) return -ENOENT; @@ -222,20 +209,8 @@ xchk_setup_metapath_dqinode( if (!ip) return -ENOENT; - error = xfs_trans_alloc_empty(sc->mp, &tp); - if (error) - return error; - - error = xfs_dqinode_load_parent(tp, &dp); - xfs_trans_cancel(tp); - if (error) - return error; - - error = xchk_setup_metapath_scan(sc, dp, + return xchk_setup_metapath_scan(sc, qi->qi_dirip, kstrdup(xfs_dqinode_path(type), GFP_KERNEL), ip); - - xfs_irele(dp); - return error; } #else # define xchk_setup_metapath_quotadir(...) 
(-ENOENT) diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index b928b036990bc3..a4fa21dfd6b4ad 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -241,6 +241,10 @@ xfs_qm_destroy_quotainos( xfs_irele(qi->qi_pquotaip); qi->qi_pquotaip = NULL; } + if (qi->qi_dirip) { + xfs_irele(qi->qi_dirip); + qi->qi_dirip = NULL; + } } /* @@ -648,8 +652,7 @@ xfs_qm_init_timelimits( static int xfs_qm_load_metadir_qinos( struct xfs_mount *mp, - struct xfs_quotainfo *qi, - struct xfs_inode **dpp) + struct xfs_quotainfo *qi) { struct xfs_trans *tp; int error; @@ -658,7 +661,7 @@ xfs_qm_load_metadir_qinos( if (error) return error; - error = xfs_dqinode_load_parent(tp, dpp); + error = xfs_dqinode_load_parent(tp, &qi->qi_dirip); if (error == -ENOENT) { /* no quota dir directory, but we'll create one later */ error = 0; @@ -668,21 +671,21 @@ xfs_qm_load_metadir_qinos( goto out_trans; if (XFS_IS_UQUOTA_ON(mp)) { - error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_USER, + error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_USER, &qi->qi_uquotaip); if (error && error != -ENOENT) goto out_trans; } if (XFS_IS_GQUOTA_ON(mp)) { - error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_GROUP, + error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_GROUP, &qi->qi_gquotaip); if (error && error != -ENOENT) goto out_trans; } if (XFS_IS_PQUOTA_ON(mp)) { - error = xfs_dqinode_load(tp, *dpp, XFS_DQTYPE_PROJ, + error = xfs_dqinode_load(tp, qi->qi_dirip, XFS_DQTYPE_PROJ, &qi->qi_pquotaip); if (error && error != -ENOENT) goto out_trans; @@ -698,34 +701,33 @@ xfs_qm_load_metadir_qinos( STATIC int xfs_qm_create_metadir_qinos( struct xfs_mount *mp, - struct xfs_quotainfo *qi, - struct xfs_inode **dpp) + struct xfs_quotainfo *qi) { int error; - if (!*dpp) { - error = xfs_dqinode_mkdir_parent(mp, dpp); + if (!qi->qi_dirip) { + error = xfs_dqinode_mkdir_parent(mp, &qi->qi_dirip); if (error && error != -EEXIST) return error; } if (XFS_IS_UQUOTA_ON(mp) && !qi->qi_uquotaip) { - error = xfs_dqinode_metadir_create(*dpp, 
XFS_DQTYPE_USER, - &qi->qi_uquotaip); + error = xfs_dqinode_metadir_create(qi->qi_dirip, + XFS_DQTYPE_USER, &qi->qi_uquotaip); if (error) return error; } if (XFS_IS_GQUOTA_ON(mp) && !qi->qi_gquotaip) { - error = xfs_dqinode_metadir_create(*dpp, XFS_DQTYPE_GROUP, - &qi->qi_gquotaip); + error = xfs_dqinode_metadir_create(qi->qi_dirip, + XFS_DQTYPE_GROUP, &qi->qi_gquotaip); if (error) return error; } if (XFS_IS_PQUOTA_ON(mp) && !qi->qi_pquotaip) { - error = xfs_dqinode_metadir_create(*dpp, XFS_DQTYPE_PROJ, - &qi->qi_pquotaip); + error = xfs_dqinode_metadir_create(qi->qi_dirip, + XFS_DQTYPE_PROJ, &qi->qi_pquotaip); if (error) return error; } @@ -770,7 +772,6 @@ xfs_qm_init_metadir_qinos( struct xfs_mount *mp) { struct xfs_quotainfo *qi = mp->m_quotainfo; - struct xfs_inode *dp = NULL; int error; if (!xfs_has_quota(mp)) { @@ -779,20 +780,22 @@ xfs_qm_init_metadir_qinos( return error; } - error = xfs_qm_load_metadir_qinos(mp, qi, &dp); + error = xfs_qm_load_metadir_qinos(mp, qi); if (error) goto out_err; - error = xfs_qm_create_metadir_qinos(mp, qi, &dp); + error = xfs_qm_create_metadir_qinos(mp, qi); if (error) goto out_err; - xfs_irele(dp); + /* The only user of the quota dir inode is online fsck */ +#if !IS_ENABLED(CONFIG_XFS_ONLINE_SCRUB) + xfs_irele(qi->qi_dirip); + qi->qi_dirip = NULL; +#endif return 0; out_err: xfs_qm_destroy_quotainos(mp->m_quotainfo); - if (dp) - xfs_irele(dp); return error; } diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h index e919c7f62f5780..35b64bc3a7a867 100644 --- a/fs/xfs/xfs_qm.h +++ b/fs/xfs/xfs_qm.h @@ -55,6 +55,7 @@ struct xfs_quotainfo { struct xfs_inode *qi_uquotaip; /* user quota inode */ struct xfs_inode *qi_gquotaip; /* group quota inode */ struct xfs_inode *qi_pquotaip; /* project quota inode */ + struct xfs_inode *qi_dirip; /* quota metadir */ struct list_lru qi_lru; int qi_dquots; struct mutex qi_quotaofflock;/* to serialize quotaoff */ From patchwork Tue Nov 26 01:25:45 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885409 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A7FAF946C; Tue, 26 Nov 2024 01:25:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584346; cv=none; b=FfvIgxRx/KOHmpFQB7qWfJxzv1xfd3vFv8LmZYHWMXrgMPrdANoYbfljFyzxasNU2aOs5R773MWcV8mfIyqqcDTPJQNbYsxokdJ7cTRKZ/HFfv321tOhk4QA8fIFSb19Sp/2IJuNSwa8GZFBeF8kwL0perEo5+7iyiTofHMyCA4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584346; c=relaxed/simple; bh=hXAq9p/Jf6bf4ZMMvEx6UyR0/luXHzK6s2KR1Z7rWj8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ad8mpakSPmO97FMD5bG+KpPptI/zmblJgR0Dd20LO87lxzxoOaI5/5NfCTvE80wG76dEpVAx1m8orxqa0l7QrRDsyLJAsDi7gz8hMozXv+PIq+DafF4fGNvPmxOBdZG/awUBdQvJs4+kuVafLfu4x8uX9ovkpNsOwPPraeWrVOA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HltGL60Z; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HltGL60Z" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 16566C4CECE; Tue, 26 Nov 2024 01:25:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584346; bh=hXAq9p/Jf6bf4ZMMvEx6UyR0/luXHzK6s2KR1Z7rWj8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=HltGL60ZqzAmjVie7oYsq32gfvg/l6Qqkwc9f8tSggT+e/8lwc7mPL5nvnq1zzRqM oY7WknqK0xJbhaLSvjoLAXq1C+JO5sRSfx83ac55jCxd4t88jMLKNCj0qbPiuQ1PM9 
c5QT0gCVO4mwE7yNL9gwL+Zbs8AEeIxym265H7Uf89EXEJsCodzQ/Lpg18Q/vE340e 0roE3PNmfzVnzmceYM3uLjKbBToyqQ38r5cHHSlBFRzsmHuP97hlsA3k68SRTBs37C Z30aDTEWVmuqfrVbWKUh8x8TJlh3Z+A/zPlmk1qPOdtbV59zakyGiV7PsfSwebGv/k hR0q9YxcgEwYg== Date: Mon, 25 Nov 2024 17:25:45 -0800 Subject: [PATCH 04/21] xfs: return a 64-bit block count from xfs_btree_count_blocks From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397871.4032920.47151735139743461.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong With the nrext64 feature enabled, it's possible for a data fork to have 2^48 extent mappings. Even with a 64k fsblock size, that maps out to a bmbt containing more than 2^32 blocks. Therefore, this predicate must return a u64 count to avoid an integer wraparound that will cause scrub to do the wrong thing. It's unlikely that any such filesystem currently exists, because the incore bmbt would consume more than 64GB of kernel memory on its own, and so far nobody except me has driven a filesystem that far, judging from the lack of complaints. Cc: # v5.19 Fixes: df9ad5cc7a5240 ("xfs: Introduce macros to represent new maximum extent counts for data/attr forks") Signed-off-by: "Darrick J. 
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_btree.c | 4 ++-- fs/xfs/libxfs/xfs_btree.h | 2 +- fs/xfs/libxfs/xfs_ialloc_btree.c | 4 +++- fs/xfs/scrub/agheader.c | 6 +++--- fs/xfs/scrub/agheader_repair.c | 6 +++--- fs/xfs/scrub/fscounters.c | 2 +- fs/xfs/scrub/ialloc.c | 4 ++-- fs/xfs/scrub/refcount.c | 2 +- fs/xfs/xfs_bmap_util.c | 2 +- 9 files changed, 17 insertions(+), 15 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 2b5fc5fd16435d..c748866ef92368 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -5144,7 +5144,7 @@ xfs_btree_count_blocks_helper( int level, void *data) { - xfs_extlen_t *blocks = data; + xfs_filblks_t *blocks = data; (*blocks)++; return 0; @@ -5154,7 +5154,7 @@ xfs_btree_count_blocks_helper( int xfs_btree_count_blocks( struct xfs_btree_cur *cur, - xfs_extlen_t *blocks) + xfs_filblks_t *blocks) { *blocks = 0; return xfs_btree_visit_blocks(cur, xfs_btree_count_blocks_helper, diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 3b739459ebb0f4..c5bff273cae255 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -484,7 +484,7 @@ typedef int (*xfs_btree_visit_blocks_fn)(struct xfs_btree_cur *cur, int level, int xfs_btree_visit_blocks(struct xfs_btree_cur *cur, xfs_btree_visit_blocks_fn fn, unsigned int flags, void *data); -int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_extlen_t *blocks); +int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_filblks_t *blocks); union xfs_btree_rec *xfs_btree_rec_addr(struct xfs_btree_cur *cur, int n, struct xfs_btree_block *block); diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index 9b34896dd1a32f..6f270d8f4270cb 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -744,6 +744,7 @@ xfs_finobt_count_blocks( { struct xfs_buf *agbp = NULL; struct xfs_btree_cur *cur; + xfs_filblks_t blocks; int error; error = 
xfs_ialloc_read_agi(pag, tp, 0, &agbp); @@ -751,9 +752,10 @@ xfs_finobt_count_blocks( return error; cur = xfs_finobt_init_cursor(pag, tp, agbp); - error = xfs_btree_count_blocks(cur, tree_blocks); + error = xfs_btree_count_blocks(cur, &blocks); xfs_btree_del_cursor(cur, error); xfs_trans_brelse(tp, agbp); + *tree_blocks = blocks; return error; } diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c index 61f80a6410c738..1d41b85478da9d 100644 --- a/fs/xfs/scrub/agheader.c +++ b/fs/xfs/scrub/agheader.c @@ -458,7 +458,7 @@ xchk_agf_xref_btreeblks( { struct xfs_agf *agf = sc->sa.agf_bp->b_addr; struct xfs_mount *mp = sc->mp; - xfs_agblock_t blocks; + xfs_filblks_t blocks; xfs_agblock_t btreeblks; int error; @@ -507,7 +507,7 @@ xchk_agf_xref_refcblks( struct xfs_scrub *sc) { struct xfs_agf *agf = sc->sa.agf_bp->b_addr; - xfs_agblock_t blocks; + xfs_filblks_t blocks; int error; if (!sc->sa.refc_cur) @@ -840,7 +840,7 @@ xchk_agi_xref_fiblocks( struct xfs_scrub *sc) { struct xfs_agi *agi = sc->sa.agi_bp->b_addr; - xfs_agblock_t blocks; + xfs_filblks_t blocks; int error = 0; if (!xfs_has_inobtcounts(sc->mp)) diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c index 0fad0baaba2f69..b45d2b32051a63 100644 --- a/fs/xfs/scrub/agheader_repair.c +++ b/fs/xfs/scrub/agheader_repair.c @@ -256,7 +256,7 @@ xrep_agf_calc_from_btrees( struct xfs_agf *agf = agf_bp->b_addr; struct xfs_mount *mp = sc->mp; xfs_agblock_t btreeblks; - xfs_agblock_t blocks; + xfs_filblks_t blocks; int error; /* Update the AGF counters from the bnobt. 
*/ @@ -946,7 +946,7 @@ xrep_agi_calc_from_btrees( if (error) goto err; if (xfs_has_inobtcounts(mp)) { - xfs_agblock_t blocks; + xfs_filblks_t blocks; error = xfs_btree_count_blocks(cur, &blocks); if (error) @@ -959,7 +959,7 @@ xrep_agi_calc_from_btrees( agi->agi_freecount = cpu_to_be32(freecount); if (xfs_has_finobt(mp) && xfs_has_inobtcounts(mp)) { - xfs_agblock_t blocks; + xfs_filblks_t blocks; cur = xfs_finobt_init_cursor(sc->sa.pag, sc->tp, agi_bp); error = xfs_btree_count_blocks(cur, &blocks); diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c index 4a50f8e0004092..ca23cf4db6c5ef 100644 --- a/fs/xfs/scrub/fscounters.c +++ b/fs/xfs/scrub/fscounters.c @@ -261,7 +261,7 @@ xchk_fscount_btreeblks( struct xchk_fscounters *fsc, xfs_agnumber_t agno) { - xfs_extlen_t blocks; + xfs_filblks_t blocks; int error; error = xchk_ag_init_existing(sc, agno, &sc->sa); diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c index abad54c3621d44..4dc7c83dc08a40 100644 --- a/fs/xfs/scrub/ialloc.c +++ b/fs/xfs/scrub/ialloc.c @@ -650,8 +650,8 @@ xchk_iallocbt_xref_rmap_btreeblks( struct xfs_scrub *sc) { xfs_filblks_t blocks; - xfs_extlen_t inobt_blocks = 0; - xfs_extlen_t finobt_blocks = 0; + xfs_filblks_t inobt_blocks = 0; + xfs_filblks_t finobt_blocks = 0; int error; if (!sc->sa.ino_cur || !sc->sa.rmap_cur || diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c index 2b6be75e942415..1c5e45cc64190c 100644 --- a/fs/xfs/scrub/refcount.c +++ b/fs/xfs/scrub/refcount.c @@ -491,7 +491,7 @@ xchk_refcount_xref_rmap( struct xfs_scrub *sc, xfs_filblks_t cow_blocks) { - xfs_extlen_t refcbt_blocks = 0; + xfs_filblks_t refcbt_blocks = 0; xfs_filblks_t blocks; int error; diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index a59bbe767a7dc4..0836fea2d6d814 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -103,7 +103,7 @@ xfs_bmap_count_blocks( struct xfs_mount *mp = ip->i_mount; struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork); struct 
xfs_btree_cur *cur; - xfs_extlen_t btblocks = 0; + xfs_filblks_t btblocks = 0; int error; *nextents = 0; From patchwork Tue Nov 26 01:26:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885410 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE1E18F40; Tue, 26 Nov 2024 01:26:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584362; cv=none; b=cI6fnQBEXb9hdOlrxKM6jm0qcGon0hTdEKwKhzfoIR7oyzjs5s+v+idSaJTI6y5R1FaLJynWUiLapThSve7AtJ0rj4oj9ZEaD8Hz29M/fSLi4MzxOSqr6mllZj5LrdQ+z7cXrR9LFkqqdGEUmAmW+rr/ixHp6m7MyOIs4vVVCcw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584362; c=relaxed/simple; bh=wrLq6yPT9pDVWwYoAd9TNkIvx+ryLd43WU7ftb1pqzY=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CwUUZv4d45gEGgj1PpGwHH1bbz6QlhGakxnD93Tj7Cfmee28dMLlBT8VpMAKXs+xO6Kben49uNBHS0V80UFnBHwd0e7Vzb/wWodR6US6t1i9q6/Ftp/zkXUYtCD6Axj3QhE+JRTlNJBmiar3SMcJugbNrSGv0fkdKl4B3+D+dA4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=irpPdkEV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="irpPdkEV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9CDEC4CECE; Tue, 26 Nov 2024 01:26:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584361; bh=wrLq6yPT9pDVWwYoAd9TNkIvx+ryLd43WU7ftb1pqzY=; 
h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=irpPdkEVt7s/y5NSRMSzZy6wQVBCIYHTHRlLuYBjBwEe5RVdUNQun4OyrpmP0Wgkh Umlwn3vfoANEzthBOGnXnRXz9yhhUWyc7/v8F4YOMA0BqfOXJFap2QHA8f4YzkdN6M hsYjOaZ4n8ckDmZqf+t05hsNLJ2doLncwc4VOlQRJPWjmEqrGbI0sAP64ADtzgClWW PDAhDsKaTUSTYjMEahiv6N9xa5RoS9hbqxMpLE/iIRm4x/E2z8johVr6a086UyM/jm BWHOtKslow9KlCwpV3mpppZPxa/n+bTAVyFsofbgjkHrY+iTANqP7fMdMJXqPEJv+7 /UIwXV/HRMJDQ== Date: Mon, 25 Nov 2024 17:26:01 -0800 Subject: [PATCH 05/21] xfs: don't drop errno values when we fail to ficlone the entire range From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397889.4032920.12946980376907187230.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Way back when we first implemented FICLONE for XFS, life was simple -- either the entire remapping completed, or something happened and we had to return an errno explaining what happened. Neither of those ioctls support returning partial results, so it's all or nothing. Then things got complicated when copy_file_range came along, because it actually can return the number of bytes copied, so commit 3f68c1f562f1e4 tried to make it so that we could return a partial result if the REMAP_FILE_CAN_SHORTEN flag is set. This is also how FIDEDUPERANGE can indicate that the kernel performed a partial deduplication. Unfortunately, the logic is wrong if an error stops the remapping and CAN_SHORTEN is not set. Because those callers cannot return partial results, it is an error for ->remap_file_range to return a positive quantity that is less than the @len passed in. 
Implementations really should be returning a negative errno in this case, because that's what btrfs (which introduced FICLONE{,RANGE}) did. Therefore, ->remap_range implementations cannot silently drop an errno that they might have when the number of bytes remapped is less than the number of bytes requested and CAN_SHORTEN is not set. Found by running generic/562 on a 64k fsblock filesystem and wondering why it reported corrupt files. Cc: # v4.20 Fixes: 3fc9f5e409319e ("xfs: remove xfs_reflink_remap_range") Really-Fixes: 3f68c1f562f1e4 ("xfs: support returning partial reflink results") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_file.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index c6de6b865ef11c..73562ff1c956f0 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1228,6 +1228,14 @@ xfs_file_remap_range( xfs_iunlock2_remapping(src, dest); if (ret) trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); + /* + * If the caller did not set CAN_SHORTEN, then it is not prepared to + * handle partial results -- either the whole remap succeeds, or we + * must say why it did not. In this case, any error should be returned + * to the caller. + */ + if (ret && remapped < len && !(remap_flags & REMAP_FILE_CAN_SHORTEN)) + return ret; return remapped > 0 ? remapped : ret; } From patchwork Tue Nov 26 01:26:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885411 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B469ABE46; Tue, 26 Nov 2024 01:26:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584377; cv=none; b=cGLuBpQ/K8bikM9Elp93pbDWA8r6rLtROtAVuEzlpAjEPRGSzvrCMniwo0wsyk5tEzG9zMYjlSNW/0nygOpEsemKz2kz/6Gr/5N/Wy8Hu2ZZhyML97OaP0iAaDtk5+9exhtqk7Jty7q4AyjjClQOxw94qaPXpw+4tzeN3qCDOOI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584377; c=relaxed/simple; bh=HOrKD3PoWR8/f43f707a6hbU3giSuSbxx7z1a3QieSM=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=iQ6jkTHIycW+7mZly0CyTNjR00bAY23vy8WdupdiiIKv1PyMgK769R98Sikk02ExSNjPDGnSBZ/Esqor0piH6lh9rGavjc322Z1pmmDBI6NlZoFSqaca+QYJWdHP0WPYSGYvs2EjTuu4GRoGBdqYyCyOA5WdbmrY8wRwlCiyt3E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QA5wWp0U; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QA5wWp0U" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4CE58C4CECE; Tue, 26 Nov 2024 01:26:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584377; bh=HOrKD3PoWR8/f43f707a6hbU3giSuSbxx7z1a3QieSM=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=QA5wWp0U78EeutZX209iT7ri9PX3Fn6T84kQTQ6inz4dOlNcYJzpmv6MwhgZjuCLS mNdbd72/+1DVwYDTeAIPYq4BcmSvV5O9Q1BdypDEv5WU3KQWLULDgeBwNUmkkfR7VI Jw5odD+p3niWruvp2hy3boItWt+Yy7wdjr5seRxxcaIDSJXAOZxr+340WTtkWuFmKf 
tiRtqI5u/XAiZ/hExdZkxeShsTdxb55czLbtyfq2kRpdkyMymoIimVVeljpV2VzZdj an1/E2fNLdLrDdH31nGaay38LxrCd1uKkNltrlvyjnVtPZ4EnBL+D4VpnF15UUntGW 0SesBL4uLu7Sw== Date: Mon, 25 Nov 2024 17:26:16 -0800 Subject: [PATCH 06/21] xfs: separate healthy clearing mask during repair From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397906.4032920.11656317799554030362.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong In commit d9041681dd2f53 we introduced some XFS_SICK_*ZAPPED flags so that the inode record repair code could clean up a damaged inode record enough to iget the inode but still be able to remember that the higher level repair code needs to be called. As part of that, we introduced a xchk_mark_healthy_if_clean helper that is supposed to cause the ZAPPED state to be removed if that higher level metadata actually checks out. This was done by setting additional bits in sick_mask hoping that xchk_update_health will clear all those bits after a healthy scrub. Unfortunately, that's not quite what sick_mask means -- bits in that mask are indeed cleared if the metadata is healthy, but they're set if the metadata is NOT healthy. fsck is only intended to set the ZAPPED bits explicitly. If something else sets the CORRUPT/XCORRUPT state after the xchk_mark_healthy_if_clean call, we end up marking the metadata zapped. This can happen if the following sequence happens: 1. Scrub runs, discovers that the metadata is fine but could be optimized and calls xchk_mark_healthy_if_clean on a ZAPPED flag. That causes the ZAPPED flag to be set in sick_mask because the metadata is not CORRUPT or XCORRUPT. 2. Repair runs to optimize the metadata. 3. 
Some other metadata used for cross-referencing in (1) becomes corrupt. 4. Post-repair scrub runs, but this time it sets CORRUPT or XCORRUPT due to the events in (3). 5. Now xchk_update_health sets the ZAPPED flag on the metadata we just repaired. This is not the correct state. Fix this by moving the "if healthy" mask to a separate field, and only ever using it to clear the sick state. Cc: # v6.8 Fixes: d9041681dd2f53 ("xfs: set inode sick state flags when we zap either ondisk fork") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/health.c | 57 ++++++++++++++++++++++++++++--------------------- fs/xfs/scrub/scrub.h | 6 +++++ 2 files changed, 39 insertions(+), 24 deletions(-) diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c index ce86bdad37fa42..ccc6ca5934ca6a 100644 --- a/fs/xfs/scrub/health.c +++ b/fs/xfs/scrub/health.c @@ -71,7 +71,8 @@ /* Map our scrub type to a sick mask and a set of health update functions. */ enum xchk_health_group { - XHG_FS = 1, + XHG_NONE = 1, + XHG_FS, XHG_AG, XHG_INO, XHG_RTGROUP, @@ -83,6 +84,7 @@ struct xchk_health_map { }; static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = { + [XFS_SCRUB_TYPE_PROBE] = { XHG_NONE, 0 }, [XFS_SCRUB_TYPE_SB] = { XHG_AG, XFS_SICK_AG_SB }, [XFS_SCRUB_TYPE_AGF] = { XHG_AG, XFS_SICK_AG_AGF }, [XFS_SCRUB_TYPE_AGFL] = { XHG_AG, XFS_SICK_AG_AGFL }, @@ -133,7 +135,7 @@ xchk_mark_healthy_if_clean( { if (!(sc->sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT | XFS_SCRUB_OFLAG_XCORRUPT))) - sc->sick_mask |= mask; + sc->healthy_mask |= mask; } /* @@ -189,6 +191,7 @@ xchk_update_health( { struct xfs_perag *pag; struct xfs_rtgroup *rtg; + unsigned int mask = sc->sick_mask; bool bad; /* @@ -203,50 +206,56 @@ xchk_update_health( return; } - if (!sc->sick_mask) - return; - bad = (sc->sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT | XFS_SCRUB_OFLAG_XCORRUPT)); + if (!bad) + mask |= sc->healthy_mask; switch (type_to_health_flag[sc->sm->sm_type].group) { + case XHG_NONE:
+ break; case XHG_AG: + if (!mask) + return; pag = xfs_perag_get(sc->mp, sc->sm->sm_agno); if (bad) - xfs_group_mark_corrupt(pag_group(pag), sc->sick_mask); + xfs_group_mark_corrupt(pag_group(pag), mask); else - xfs_group_mark_healthy(pag_group(pag), sc->sick_mask); + xfs_group_mark_healthy(pag_group(pag), mask); xfs_perag_put(pag); break; case XHG_INO: if (!sc->ip) return; - if (bad) { - unsigned int mask = sc->sick_mask; - - /* - * If we're coming in for repairs then we don't want - * sickness flags to propagate to the incore health - * status if the inode gets inactivated before we can - * fix it. - */ - if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) - mask |= XFS_SICK_INO_FORGET; + /* + * If we're coming in for repairs then we don't want sickness + * flags to propagate to the incore health status if the inode + * gets inactivated before we can fix it. + */ + if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) + mask |= XFS_SICK_INO_FORGET; + if (!mask) + return; + if (bad) xfs_inode_mark_corrupt(sc->ip, mask); - } else - xfs_inode_mark_healthy(sc->ip, sc->sick_mask); + else + xfs_inode_mark_healthy(sc->ip, mask); break; case XHG_FS: + if (!mask) + return; if (bad) - xfs_fs_mark_corrupt(sc->mp, sc->sick_mask); + xfs_fs_mark_corrupt(sc->mp, mask); else - xfs_fs_mark_healthy(sc->mp, sc->sick_mask); + xfs_fs_mark_healthy(sc->mp, mask); break; case XHG_RTGROUP: + if (!mask) + return; rtg = xfs_rtgroup_get(sc->mp, sc->sm->sm_agno); if (bad) - xfs_group_mark_corrupt(rtg_group(rtg), sc->sick_mask); + xfs_group_mark_corrupt(rtg_group(rtg), mask); else - xfs_group_mark_healthy(rtg_group(rtg), sc->sick_mask); + xfs_group_mark_healthy(rtg_group(rtg), mask); xfs_rtgroup_put(rtg); break; default: diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index a7fda3e2b01377..5dbbe93cb49bfa 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -184,6 +184,12 @@ struct xfs_scrub { */ unsigned int sick_mask; + /* + * Clear these XFS_SICK_* flags but only if the scan is ok. 
Useful for + * removing ZAPPED flags after a repair. + */ + unsigned int healthy_mask; + /* next time we want to cond_resched() */ struct xchk_relax relax; From patchwork Tue Nov 26 01:26:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885412 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17C0616415; Tue, 26 Nov 2024 01:26:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584393; cv=none; b=oKo5IHZhEN3z0r3pKuEgyuMndneibAE48MvQz/IfDddA6QazcNJgYLVznHNHF2YvoUROqFd0JhpcbV9o6w/QKN4A0jVeYTF5R2e34en5p9ie+z+vj0kcEfafxRD+iWj5E82O4fQt6mmsk4aTsrpm8K43rXuVj2NTyL6X+vr1wso= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584393; c=relaxed/simple; bh=0n2zX6/mxPc+XZPbT+FTVtajLFs2Gy5QQQ2igxVDaVc=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=IXJps+kAZgrC89J23qWo1HQMjJKnR+c6NexbQK8qT4Wm0v0RIbc0l1kYC+nZIjuXCu852wFi/JF1MsP26z2yknuD9iV1tUXZTOtC2GK2kO5PoUqKsonL8JDLZCfO3Zr5cTrKkK72VhhgX/nXor+URvTfZUdVmtL8gYZiMTR+7is= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dIJPbBvS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dIJPbBvS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E35A8C4CECF; Tue, 26 Nov 2024 01:26:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584393; 
bh=0n2zX6/mxPc+XZPbT+FTVtajLFs2Gy5QQQ2igxVDaVc=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=dIJPbBvSvrqh9+ROlDvgOP/zEWk+NaccKCWxZ04vvHwnOz8u0qO6OX5bUNafboI7U iwFJpPXxL7coZvlxkQ8GgWSWsggVsdR6Prwmoh9lywhlujknMH3mqykj1JTr0AMP01 F2TayOmHOcctu67qYw6ydbFr1QXRDWMHdnDAbO9pdPAbuG0LDqP/nnhwN6SlxYhcBv l0KRlf57MnNFbzTrxsazJfvPSAxk6ex0NwTQCdZxMjZ5K+3k+vUllzzOAa57MJSo2n vVmUiL1LnPl5tVl4Lmgxqn5nbU9eVaEXPBU8Clfaaa26h2YA5TC0PNEGp3+K6SEb8/ cYXydNZpFmzfw== Date: Mon, 25 Nov 2024 17:26:32 -0800 Subject: [PATCH 07/21] xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397923.4032920.8428901441460084038.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong If we need to reset a symlink target to the "durr it's busted" string, then we clear the zapped flag as well. However, this should be using the provided helper so that we don't set the zapped state on an otherwise ok symlink. Cc: # v6.10 Fixes: 2651923d8d8db0 ("xfs: online repair of symbolic links") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/symlink_repair.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/symlink_repair.c b/fs/xfs/scrub/symlink_repair.c index d015a86ef460fb..953ce7be78dc2f 100644 --- a/fs/xfs/scrub/symlink_repair.c +++ b/fs/xfs/scrub/symlink_repair.c @@ -36,6 +36,7 @@ #include "scrub/tempfile.h" #include "scrub/tempexch.h" #include "scrub/reap.h" +#include "scrub/health.h" /* * Symbolic Link Repair @@ -233,7 +234,7 @@ xrep_symlink_salvage( * target zapped flag. 
*/ if (buflen == 0) { - sc->sick_mask |= XFS_SICK_INO_SYMLINK_ZAPPED; + xchk_mark_healthy_if_clean(sc, XFS_SICK_INO_SYMLINK_ZAPPED); sprintf(target_buf, DUMMY_TARGET); } From patchwork Tue Nov 26 01:26:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885413 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3655BA3F for ; Tue, 26 Nov 2024 01:26:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584409; cv=none; b=u3ZYIcUpfh3JFZ8Ui59N7a0X5/5GT934gBtLzmqlC6U3ckL844BXkctCxMv5i5ZtOnRK1nIf2FrHt5Tyqp0L8YV3T2nvnOtCBv5hYlBaSONpCOKVjasw1D/jgI2elBAdJcnAIcVhov7ZkdSlJG6ZGaWLc7Pc/6cjuG49K0P2+l8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584409; c=relaxed/simple; bh=BNrhj+AvAoG4Bq2DF5Pg+gmIHa7HLyYcblVoYEuiGSw=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ppMRTCyfk+iak/767zGzZLdyIt1pwoa5xSHcHa7xVmiq4n3TbDeCXjXteJ06x2UnsQizhqIldVWqqkcEhoEE6R35447r74fuIqdzhZzBHSTtLV3fVuIecqzYACDM5r9Sm20rOXkRvl4/CvYePYHe1ZZUPXAOMV6VS8LAXCRfRUE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PupDOKzc; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PupDOKzc" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 86B98C4CECF; Tue, 26 Nov 2024 01:26:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584408; 
bh=BNrhj+AvAoG4Bq2DF5Pg+gmIHa7HLyYcblVoYEuiGSw=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=PupDOKzcpd0zJtNGhsWk6wI4p3+clUdRgt8uBfzw9uXYZeSM76BmqnkLiLrv/p2W1 LQ7WnO0jPy3bOh7buXOPwKxwWzMm5vKWsbMNckgoAsRlcxLMxQUalqdSczF2OPjaaJ x5LPbZCOnZupkwbYTgckNsMRnZt2193DCe/6tg2kghliQuUtVCv4HG+tcNMCoKvKpx 6thE7tbX+dl5uZ7QEYmsdMj77ZMTtCKlQvAK0SeMFosAIlHYjsY6/h0PEp6yL9SSOy Ii+d+1UHRTOFbcQAk1pRF4ggdQ9hYHMuKQ5KpVVicl8ko7flYRExuIjIEQ9DKKdoNd sXVhr9ZQeW3OA== Date: Mon, 25 Nov 2024 17:26:48 -0800 Subject: [PATCH 08/21] xfs: mark metadir repair tempfiles with IRECOVERY From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397940.4032920.13386336664110619158.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Once in a long while, xfs/566 and xfs/801 report directory corruption in one of the metadata subdirectories while it's forcibly rebuilding all filesystem metadata. I observed the following sequence of events: 1. Initiate a repair of the parent pointers for the /quota/user file. This is the secret file containing user quota data. 2. The pptr repair thread creates a temporary file and begins staging parent pointers in the ondisk metadata in preparation for an exchange-range to commit the new pptr data. 3. At the same time, initiate a repair of the /quota directory itself. 4. The dir repair thread finds the temporary file from (2), scans it for parent pointers, and stages a dirent in its own temporary dir in preparation to commit the fixed directory. 5. The parent pointer repair completes and frees the temporary file. 6. The dir repair commits the new directory and scans it again. 
It finds the dirent that points to the old temporary file in (2) and marks the directory corrupt. Oops! Repair code must never scan the temporary files that other repair functions create to stage new metadata. They're not supposed to do that, but the predicate function xrep_is_tempfile is incorrect because it assumes that any XFS_DIFLAG2_METADATA file cannot ever be a temporary file, but xrep_tempfile_adjust_directory_tree creates exactly that. Fix this by setting the IRECOVERY flag on temporary metadata directory inodes and using that to correct the predicate. Repair code is supposed to erase all the data in temporary files before releasing them, so it's ok if a thread scans the temporary file after we drop IRECOVERY. Fixes: bb6cdd5529ff67 ("xfs: hide metadata inodes from everyone because they are special") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/tempfile.c | 10 ++++++++-- fs/xfs/xfs_inode.h | 2 +- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/fs/xfs/scrub/tempfile.c b/fs/xfs/scrub/tempfile.c index 4b7f7860e37ece..dc3802c7f678ce 100644 --- a/fs/xfs/scrub/tempfile.c +++ b/fs/xfs/scrub/tempfile.c @@ -223,6 +223,7 @@ xrep_tempfile_adjust_directory_tree( if (error) goto out_ilock; + xfs_iflags_set(sc->tempip, XFS_IRECOVERY); xfs_qm_dqdetach(sc->tempip); out_ilock: xrep_tempfile_iunlock(sc); @@ -246,6 +247,8 @@ xrep_tempfile_remove_metadir( ASSERT(sc->tp == NULL); + xfs_iflags_clear(sc->tempip, XFS_IRECOVERY); + xfs_ilock(sc->tempip, XFS_IOLOCK_EXCL); sc->temp_ilock_flags |= XFS_IOLOCK_EXCL; @@ -945,10 +948,13 @@ xrep_is_tempfile( /* * Files in the metadata directory tree also have S_PRIVATE set and - * IOP_XATTR unset, so we must distinguish them separately. + * IOP_XATTR unset, so we must distinguish them separately. 
We (ab)use + * the IRECOVERY flag to mark temporary metadir inodes knowing that the + * end of log recovery clears IRECOVERY, so the only ones that can + * exist during online repair are the ones we create. */ if (xfs_has_metadir(mp) && (ip->i_diflags2 & XFS_DIFLAG2_METADATA)) - return false; + return __xfs_iflags_test(ip, XFS_IRECOVERY); if (IS_PRIVATE(inode) && !(inode->i_opflags & IOP_XATTR)) return true; diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 2a4485fb990846..bd6b37beabacdd 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -231,7 +231,7 @@ xfs_iflags_clear(xfs_inode_t *ip, unsigned long flags) } static inline int -__xfs_iflags_test(xfs_inode_t *ip, unsigned long flags) +__xfs_iflags_test(const struct xfs_inode *ip, unsigned long flags) { return (ip->i_flags & flags); } From patchwork Tue Nov 26 01:27:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885414 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD8D0BE46; Tue, 26 Nov 2024 01:27:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584424; cv=none; b=M/6N4zM8M0PK/saZnlUSAiqR/6sKN3EpoCLSDRLkqdb9Da0Rm2ME82D6QaWPem5XEGHnhXKbRzHd/XSZ3IPg9a5qFyIvNSYcIL61R9Nuwt6A+UtO62V8zxhWafUiQIkMemHCBSmdFkXlesEYHB+Rl0Dtg9CCXIBUK7tSZJmuPyE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584424; c=relaxed/simple; bh=hvtg8CFJW6YkaVNSXT1MmNPelQ5diQxgZ8Yqp7xq4Gw=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=CE16pWOhufXUfuZIDN37jLCBanGvQpJELYR8qtDmnqwioVucqstFybuDhin7I+gEizVXxMcLZDc0yRS7ULzy3/cYzqemUPgvqZqD8oOq0bpqRpLLa5CSdc+J1gJ/YvcWRXmXrOoFJXZ59/vwUtnXTsZyztDkYI3DMrF2xukZ0Ho= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dy5rNaI6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dy5rNaI6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D08AC4CECE; Tue, 26 Nov 2024 01:27:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584424; bh=hvtg8CFJW6YkaVNSXT1MmNPelQ5diQxgZ8Yqp7xq4Gw=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=dy5rNaI6DuM3ykPUHrnn/CLB780LGj2xnEaKUK7UOMhytegqU9nCibsTlo+D7thTz 2B4Py1ORSfELIvsGslDVigxBfWsD9as8ktGD0xZ0lvJx4B6Tn6cyvGxYFG67jAcVNe ySlwUPY2l9XW/BOlRL6ywOODmGIvyEERyYaVmfaCIvYW9wLsOC5QXbESDznSJQI4q7 WXDKCqYHbYdD8l5NKywTkdWwwr3dHlCJ34cTzC9P/kXXen+tUqhvBZ9CjaYrw/S00H VSlEnkOcAZ882NRJDqBV2NmXVLC2OomeiGOUDtBfQd2LHMoeIVYmzl0pqT+hl9pSax 0MEMg92jgmJsg== Date: Mon, 25 Nov 2024 17:27:03 -0800 Subject: [PATCH 09/21] xfs: fix null bno_hint handling in xfs_rtallocate_rtg From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397957.4032920.17159744103545265309.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong xfs_bmap_rtalloc initializes the bno_hint variable to NULLRTBLOCK (aka NULLFSBLOCK). If the allocation request is for a file range that's adjacent to an existing mapping, it will then change bno_hint to the blkno hint in the bmalloca structure. 
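A minimal userspace sketch of those two states follows. The names are invented for illustration, and the sentinel is assumed to be a 64-bit value with every bit set. The point it shows is that a plain truth test cannot tell the "no hint" sentinel apart from a real block number, whereas an explicit comparison can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for NULLRTBLOCK: a 64-bit block number with all bits set. */
#define SKETCH_NULLRTBLOCK	((uint64_t)-1)

/* Truth test: any nonzero value counts as a hint, including the sentinel. */
bool sketch_hint_truthy(uint64_t bno_hint)
{
	return bno_hint != 0;
}

/* Sentinel test: every value except the sentinel counts as a hint. */
bool sketch_hint_present(uint64_t bno_hint)
{
	return bno_hint != SKETCH_NULLRTBLOCK;
}
```

With these definitions, the sentinel passes the truth test but is rejected by the sentinel test, while block number 0 is rejected by the truth test and accepted by the sentinel test.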
In other words, bno_hint is either a rt block number, or it's all 1s. Unfortunately, commit ec12f97f1b8a8f didn't take the NULLRTBLOCK state into account, which means that it tries to translate that into a realtime extent number. We then end up with an obnoxiously high rtx number and pointlessly feed that to the near allocator. This often fails and falls back to the by-size allocator. Seeing as we had no locality hint anyway, this is a waste of time. Fix the code to detect a lack of bno_hint correctly. This was detected by running xfs/009 with metadir enabled and a 28k rt extent size. Cc: # v6.12 Fixes: ec12f97f1b8a8f ("xfs: make the rtalloc start hint a xfs_rtblock_t") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_rtalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 0cb534d71119a5..fcfa6e0eb3ad2a 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1827,7 +1827,7 @@ xfs_rtallocate_rtg( * For an allocation to an empty file at offset 0, pick an extent that * will space things out in the rt area. */ - if (bno_hint) + if (bno_hint != NULLFSBLOCK) start = xfs_rtb_to_rtx(args.mp, bno_hint); else if (!xfs_has_rtgroups(args.mp) && initial_user_data) start = xfs_rtpick_extent(args.rtg, tp, maxlen); From patchwork Tue Nov 26 01:27:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885415 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 65493946C for ; Tue, 26 Nov 2024 01:27:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584440; cv=none; b=SYoqcxtbN3YVy6de8KUB4h5RgSSsHtNZtAJnM/FWjDQ8KPHeQA1AuE8Rtq4AReU/S3PZB1/sV4DQg+lUp58u6WqCUOnuIXULuSCoflYgD6fYh+GNmKmgSguX+8XY0tFas11PMCp3rJBt+9g2rhdOW0onO5ZZ+F63w8z6IuLwwQk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584440; c=relaxed/simple; bh=+Bz2U8Uf+86gssWiN9CQ6X/u/z4vN/05CWpuE7um/Dc=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=lqBzNFbVAM9Uug0rUtVo5mIV7DMw+QqSBMBTKnoPvSmVVqYaxKHIL/G91taqTctWUeiAPSvt6Se5lZB9v7bTeZkhrSZthD8FHqf8jq4BfghOD0UocDqs+gHKy5vGe9jIBuS3Oqy1f6lngFa7eh3m2nBFkA1sElubok8YvjihsAE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=swjHfmWL; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="swjHfmWL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CBA53C4CECE; Tue, 26 Nov 2024 01:27:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584439; bh=+Bz2U8Uf+86gssWiN9CQ6X/u/z4vN/05CWpuE7um/Dc=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=swjHfmWLYuWRumorWtGrioTfnNrfqWwYTmTLEV8TLlC/MjJDQvCms6hwg+1stQcUT uPgBpANrDOYBKxVhAzWbqkwi9ENRUPmYKG1XiVttCDGlLGjGAasgYLdZ9kPC3fNp/5 7pyAXTw83mIh2pxHLomHaO7mljx5D6CVtkBzv4ciUHCWAWoVOHyOMo1eFaa7ZEyV23 
eslSuiq3nO8c1LLvleK4asYt7rw+SlJ3uXnFLR9qdK//a2k6fUkgcclCZ6MlEkscfW H6WzUFxChUkgWIJrYe1hStOXmOtdNY6TibxRC2Nj4KCdtcUwryX+Y9hv2cI789SE0E QYMMhRNsxBZ7Q== Date: Mon, 25 Nov 2024 17:27:19 -0800 Subject: [PATCH 10/21] xfs: fix error bailout in xfs_rtginode_create From: "Darrick J. Wong" To: djwong@kernel.org Cc: dan.carpenter@linaro.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <173258397974.4032920.351176801232799495.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong smatch reported that we screwed up the error cleanup in this function. Fix it. Fixes: ae897e0bed0f54 ("xfs: support creating per-RTG files in growfs") Reported-by: Dan Carpenter Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rtgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index e74bb059f24fa1..4f3bfc884aff29 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -496,7 +496,7 @@ xfs_rtginode_create( error = xfs_metadir_create(&upd, S_IFREG); if (error) - return error; + goto out_cancel; xfs_rtginode_lockdep_setup(upd.ip, rtg_rgno(rtg), type); From patchwork Tue Nov 26 01:27:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885416 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 930BBBE46; Tue, 26 Nov 2024 01:27:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584455; cv=none; b=RrnFMQlgEOYF4gdmQdhE71ca2ZvBi+kc2rSOyycOb3ymMpWKyGMCQuKoKfCXkqcippTSHLeuIVuhbb8vkD8MUdoTVerCEFtOfW2/H+XfBgL5LXXKjC5BOoFdMlUVy5c05pGWOu2q8ruZ4veCauYWxbXA2/63qDbax9w9WOmLvmo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584455; c=relaxed/simple; bh=aCJvKNRheCLyJUL7KlMcJui60wDCGNq4cMyG98vTi00=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DxYHFVl4rsFs3Vr4pwmUxbQAxL2jjv9SqR6PTQu7fKaW0vUMRhel49hwZK/5FWJK9TGDhOjhDQazpMWrvbvx04ho2n6QoCRZFPDcsENJinQdrGoqYyDIAiQEEu0FvbCcmoMP/EYYxelaIdguXo3zp3dAqbFR+c2bI1NOnQtPAaU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QCKSPyLl; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QCKSPyLl" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64F3FC4CECE; Tue, 26 Nov 2024 01:27:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584455; bh=aCJvKNRheCLyJUL7KlMcJui60wDCGNq4cMyG98vTi00=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=QCKSPyLln/pqYM2t2zWbm0XKDnYrKmihSaX5JbA+PSbaqa5Wcftzpm9tOiTgoBA/a 5ZqOSiQw7obDZr2TCMl5cY1A+Ksqtejy6ZdzMwyXiUz+nKYYFdLp6rcPIF0exIIkGZ ctrGnX76T0tfNxzwrisDG1ShaGWKDd1GonySsTw2oughC9+FFNE2AEERGE1XroeT0z 
fvftmsxx9ym9l21TqAq92uKNxccaSXQfoMRu0OyZVi4foSjXwYTcW+I1ulTTUbNcKO Ea9yQOq5b6RUK0KNnJnixSLsL8dmGG88OkxjyVhDHS1qR04XUSQv3QCIZsEarVkgLg ZIPjOFDnNIquw== Date: Mon, 25 Nov 2024 17:27:34 -0800 Subject: [PATCH 11/21] xfs: update btree keys correctly when _insrec splits an inode root block From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258397991.4032920.4586526854197814179.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec would erroneously try to update the parent's key for a block that had been split if we decided to insert the new record into the new block. The solution was to detect this situation and update the in-core key value that we pass up to the caller so that the caller will (eventually) add the new block to the parent level of the tree with the correct key. However, I missed a subtlety about the way inode-rooted btrees work. If the full block was a maximally sized inode root block, we'll solve that fullness by moving the root block's records to a new block, resizing the root block, and updating the root to point to the new block. We don't pass a pointer to the new block to the caller because that work has already been done. The new record will /always/ land in the new block, so in this case we need to use xfs_btree_update_keys to update the keys. This bug can theoretically manifest itself in the very rare case that we split a bmbt root block and the new record lands in the very first slot of the new block, though I've never managed to trigger it in practice. 
However, it is very easy to reproduce by running generic/522 with the realtime rmapbt patchset if rtinherit=1. Cc: # v4.8 Fixes: 2c813ad66a7218 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_btree.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index c748866ef92368..68ee1c299c25fd 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -3557,14 +3557,31 @@ xfs_btree_insrec( xfs_btree_log_block(cur, bp, XFS_BB_NUMRECS); /* - * If we just inserted into a new tree block, we have to - * recalculate nkey here because nkey is out of date. + * Update btree keys to reflect the newly added record or keyptr. + * There are three cases here to be aware of. Normally, all we have to + * do is walk towards the root, updating keys as necessary. * - * Otherwise we're just updating an existing block (having shoved - * some records into the new tree block), so use the regular key - * update mechanism. + * If the caller had us target a full block for the insertion, we dealt + * with that by calling the _make_block_unfull function. If the + * "make unfull" function splits the block, it'll hand us back the key + * and pointer of the new block. We haven't yet added the new block to + * the next level up, so if we decide to add the new record to the new + * block (bp->b_bn != old_bn), we have to update the caller's pointer + * so that the caller adds the new block with the correct key. + * + * However, there is a third possibility-- if the selected block is the + * root block of an inode-rooted btree and cannot be expanded further, + * the "make unfull" function moves the root block contents to a new + * block and updates the root block to point to the new block. In this + * case, no block pointer is passed back because the block has already + * been added to the btree. 
In this case, we need to use the regular + * key update function, just like the first case. This is critical for + * overlapping btrees, because the high key must be updated to reflect + * the entire tree, not just the subtree accessible through the first + * child of the root (which is now two levels down from the root). */ - if (bp && xfs_buf_daddr(bp) != old_bn) { + if (!xfs_btree_ptr_is_null(cur, &nptr) && + bp && xfs_buf_daddr(bp) != old_bn) { xfs_btree_get_keys(cur, block, lkey); } else if (xfs_btree_needs_key_update(cur, optr)) { error = xfs_btree_update_keys(cur, level); From patchwork Tue Nov 26 01:27:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885417 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 88806C2C6; Tue, 26 Nov 2024 01:27:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584471; cv=none; b=jvZjmMqEg0qIB+pr4gzt0a5zZ483z7QlYi+pkXhb9zQTSf7cMtdF7CBOMdJPlEmXMwXKASm2r254dJpk+uIimdyHGMN+NzH+W894VloWpHVrZgRTer3MgWHKDkCb+pTN4nKvig2MrMvPXYc0e/3Os75Of/kMsMBpDkEfYMh0n8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584471; c=relaxed/simple; bh=4vgcFib9AR65eqjVPmhbGa4SZu/AvWK6PBa6PPpGwV4=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qJmNeNdf2lN/0+OpDwDZGEIZ8Fv65l9XKAxp7MzjrZfHEf7u/8zSA6X8/W11JHgqruoLS89G61DkMH+mUPKoPqyl4gtCgddvjYT9Q18ewgvFcSczVo0zb+bUgkr8KxqOtV8tNjq6Kvk1CKXtGasp3BVdvvst9mWURXi9wgVOU0o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b=jak8Osc6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jak8Osc6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 07592C4CECF; Tue, 26 Nov 2024 01:27:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584471; bh=4vgcFib9AR65eqjVPmhbGa4SZu/AvWK6PBa6PPpGwV4=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=jak8Osc6Y7LOUrP4W/y4cm149gp9w4FLe5Ktnb+r+5Q6DvBZo0wsyl6sErtQa+vnr d3niDeHbAM4nXE4PrXsZXVinRV0nDAmVztwsRjS30qIgMhDwurpKdRTv4fTrvPFu3s IX6Nkc7lmlvMRWlCodV2CjIwohmnkcTgtssKDKO3xRZSwGp47EiCgvOfXDLj+AWuv9 K4goMHt+48LXsCQPGZUl15fmjITC3IVwFhNQhI2tCeAIpJnnto41KlMINnY7UrAidc jnhp6jnlrseLkRoACKrGH6tZrdN2SHJ6eAjMiu/KvKzBy/LMfHAcg4/TPRKN/eaQ4Y tzBODDHcb01+A== Date: Mon, 25 Nov 2024 17:27:50 -0800 Subject: [PATCH 12/21] xfs: fix scrub tracepoints when inode-rooted btrees are involved From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398008.4032920.2214591217065414920.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Fix a minor mistake in the scrub tracepoints that can manifest when inode-rooted btrees are enabled. The existing code worked fine for bmap btrees, but we should tighten the code up to be less sloppy. Cc: # v5.7 Fixes: 92219c292af8dd ("xfs: convert btree cursor inode-private member names") Signed-off-by: "Darrick J.
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 9b38f5ad1eaf07..d2ae7e93acb08e 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -605,7 +605,7 @@ TRACE_EVENT(xchk_ifork_btree_op_error, TP_fast_assign( xfs_fsblock_t fsbno = xchk_btree_cur_fsbno(cur, level); __entry->dev = sc->mp->m_super->s_dev; - __entry->ino = sc->ip->i_ino; + __entry->ino = cur->bc_ino.ip->i_ino; __entry->whichfork = cur->bc_ino.whichfork; __entry->type = sc->sm->sm_type; __assign_str(name); From patchwork Tue Nov 26 01:28:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885418 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 298B5C2C6; Tue, 26 Nov 2024 01:28:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584487; cv=none; b=TKHxhcKhHf/CbTanYd85PAHjDGjD1i2nNk0PVoBemCklPYqge8wPKhjDAEh+Z3p9VayBFUGFSyoHpFj9ZxuS5O2Eyv6Eid4N0Ixx8U2GO58E6h4Bs8drwjNv4N0D8VDSvrVfr6Q/ze+AFqHK0SIELbwYblM/zTfiRyn3yDfpuhE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584487; c=relaxed/simple; bh=c7PzyRcLjLqL1PLXx/V6AmTTP88j/FVN/S8Y2Gt4J5o=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Qv7+xOGHuVNUoI5l5SJ47oYQeKRHYtbCgKME5mOkUbCYrD+giE3SmH4fuqjBaGNSJ6sEyv/4rPAs2vrYagtOjAVtFOKSS4YqIkf7I+soPtNJrXztXbQmlLgeaOkXqD882mrurWZYYNnhMowYsBkWV0zu4nUEeu4cdixaDZlVPTY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b=PhqfGRvS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PhqfGRvS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 99F55C4CECE; Tue, 26 Nov 2024 01:28:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584486; bh=c7PzyRcLjLqL1PLXx/V6AmTTP88j/FVN/S8Y2Gt4J5o=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=PhqfGRvSsyGkCjFFWZds13BSu7x4FwneBmC1d6eFk4q07Yf2I6D+hqrz29kfzYAjE UVW4CF3ImykzOZHg5mTXTtbEaiqa11Tsu0NaW+61Kxi2zwDfmnNF21/WZcS5paLnFl A7IzcM5UtqoOqw+yv7KQjnzhi3gnqS3CT4F2vJDjFdNgaWS7dYQGQg9Q7nM+LcNQAq 9aNHJdHnJ8hM8M5F1NY+G8Evpkcz7Yz8u/7wJO9zXd7e1FfBHY5/3BmrqZHjRQhHyx VpOrRf9dvfmETnqSRlBMjBPVClJm4ytrPGKkcayS0ExqJ13nFnFcdkqeXUYwD+OlZh Y/nmcyHXvIIow== Date: Mon, 25 Nov 2024 17:28:06 -0800 Subject: [PATCH 13/21] xfs: unlock inodes when erroring out of xfs_trans_alloc_dir From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398025.4032920.17639399507003367709.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. 
Wong

Debugging a filesystem patch with generic/475 caused the system to hang
after observing the following sequences in dmesg:

 XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x491520 len 32 error 5
 XFS (dm-0): metadata I/O error in "xfs_btree_read_buf_block+0xba/0x160 [xfs]" at daddr 0x3445608 len 8 error 5
 XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x138e1c0 len 32 error 5
 XFS (dm-0): log I/O error -5
 XFS (dm-0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x1ea/0x4b0 [xfs] (fs/xfs/xfs_trans_buf.c:311). Shutting down filesystem.
 XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
 XFS (dm-0): Internal error dqp->q_ino.reserved < dqp->q_ino.count at line 869 of file fs/xfs/xfs_trans_dquot.c. Caller xfs_trans_dqresv+0x236/0x440 [xfs]
 XFS (dm-0): Corruption detected. Unmount and run xfs_repair
 XFS (dm-0): Unmounting Filesystem be6bcbcc-9921-4deb-8d16-7cc94e335fa7

The system is stuck in unmount trying to lock a couple of inodes so that
they can be purged. The dquot corruption notice above is a clue to what
happened -- a link() call tried to set up a transaction to link a child
into a directory. Quota reservation for the transaction failed after IO
errors shut down the filesystem, but then we forgot to unlock the inodes
on our way out. Fix that.

Cc: # v6.10
Fixes: bd5562111d5839 ("xfs: Hold inode locks in xfs_trans_alloc_dir")
Signed-off-by: "Darrick J.
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_trans.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index 30fbed27cf05cc..05b18e30368e4b 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -1435,5 +1435,8 @@ xfs_trans_alloc_dir( out_cancel: xfs_trans_cancel(tp); + xfs_iunlock(dp, XFS_ILOCK_EXCL); + if (dp != ip) + xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; } From patchwork Tue Nov 26 01:28:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885419 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 70989C2C6; Tue, 26 Nov 2024 01:28:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584502; cv=none; b=FGnN8VRw6b8RTmsRcpfH4raSWEwEUgRj1Hc+ljpBWpo6FBNgbjGmvaOLyiAniMqn1HM9WbSMj/awOe2vm+wYdLRR6S3dHXyRRVc33wm1sBxWGxk1j/DG2Y05yS+QVlTWH7WCEj1P/CRH58gvTQviygt3gYps8fBN5BegKVUd1xQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584502; c=relaxed/simple; bh=UjeDlYDq8ScPR+IUo7fNtTSfVftf2MN/U7ORrCmVgV8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=PeibGH9kRGVVNIFJZOxGqkkB/3nXRtSsP8RsEsA3+6ede4PujAnj8Pz4cDQK08QuNn8uCcNsVVHJOF1pT6cvEWkNtVIxW1RPWtXMG9ePhrei7a9fldCvH6i1IuEtw3p5FYRmPA/QryNu66+glr9Xc+GoZayKAnDROffrjxUC9b8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ATMNzXsY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b="ATMNzXsY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40BE8C4CECE; Tue, 26 Nov 2024 01:28:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584502; bh=UjeDlYDq8ScPR+IUo7fNtTSfVftf2MN/U7ORrCmVgV8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=ATMNzXsYLCkqqu1XBp5sWAwwyRl1SzXNQWIQMPah3SHrP5Q8TSyNAiXIVaA6HXPcN NA1VxfBiNTJK6J4+/TSyldw19qGL5nmYYF1IVTiGRuMwoinQ3kIP7OcDrJGL3eTXhX ESW0kfdkdeePUIE4jVVuKNTfLHrHwQJIxErlRC4ZJiUYtC1q0Ah1qApxxshmsLnqUT sdg4SBbt7EkRYx5cNM7kzyU7NKazr0yim/G1csu6kEHppT7/8gn5s3C4FlQ01dHYLx dXZRG5vnBhX7lIkQPGnZ0m2lPhQvvXdwNhAOMVoO5MIlFC+dbf6H9hpl5dkVmKqSMR fwgozT8Q1SXZQ== Date: Mon, 25 Nov 2024 17:28:21 -0800 Subject: [PATCH 14/21] xfs: only run precommits once per transaction object From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398042.4032920.1346072051908401243.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Committing a transaction tx0 with a defer ops chain of (A, B, C) creates a chain of transactions that looks like this: tx0 -> txA -> txB -> txC Prior to commit cb042117488dbf, __xfs_trans_commit would run precommits on tx0, then call xfs_defer_finish_noroll to convert A-C to tx[A-C]. Unfortunately, after the finish_noroll loop we forgot to run precommits on txC. That was fixed by adding the second precommit call. Unfortunately, none of us remembered that xfs_defer_finish_noroll calls __xfs_trans_commit a second time to commit tx0 before finishing work A in txA and committing that. 
In other words, we run precommits twice on tx0:

xfs_trans_commit(tx0)
  __xfs_trans_commit(tx0, false)
    xfs_trans_run_precommits(tx0)
    xfs_defer_finish_noroll(tx0)
      xfs_trans_roll(tx0)
        txA = xfs_trans_dup(tx0)
        __xfs_trans_commit(tx0, true)
          xfs_trans_run_precommits(tx0)

This currently isn't an issue because the inode item precommit is
idempotent; the iunlink item precommit deletes itself so it can't be
called again; and the buffer/dquot item precommits only check the incore
objects for corruption. However, it doesn't make sense to run
precommits twice.

Fix this situation by only running precommits after finish_noroll.

Cc: # v6.4
Fixes: cb042117488dbf ("xfs: defered work could create precommits")
Signed-off-by: "Darrick J. Wong"
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_trans.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 05b18e30368e4b..4a517250efc911 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -860,13 +860,6 @@ __xfs_trans_commit(
 
 	trace_xfs_trans_commit(tp, _RET_IP_);
 
-	error = xfs_trans_run_precommits(tp);
-	if (error) {
-		if (tp->t_flags & XFS_TRANS_PERM_LOG_RES)
-			xfs_defer_cancel(tp);
-		goto out_unreserve;
-	}
-
 	/*
 	 * Finish deferred items on final commit. Only permanent transactions
 	 * should ever have deferred ops.
@@ -877,13 +870,12 @@ __xfs_trans_commit(
 		error = xfs_defer_finish_noroll(&tp);
 		if (error)
 			goto out_unreserve;
-
-		/* Run precommits from final tx in defer chain. */
-		error = xfs_trans_run_precommits(tp);
-		if (error)
-			goto out_unreserve;
 	}
 
+	error = xfs_trans_run_precommits(tp);
+	if (error)
+		goto out_unreserve;
+
 	/*
 	 * If there is nothing to be logged by the transaction,
 	 * then unlock all of the items associated with the

From patchwork Tue Nov 26 01:28:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13885420 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A075C2C6 for ; Tue, 26 Nov 2024 01:28:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584518; cv=none; b=YzijCO5smnA/Do80RASTV20LSODKEK07KLzIxXZ/vllNVk4g2iuv3n+Wkgb+uB7/OPfMayCu6uRKffxN6h481PJxiul5HUhB323IvCl8NbQk8bTlqtKYflVMNtiihpFL6n1iV9yG2tTCDX3BAtTD6mqADylFkrKM0bDyQ7YL66c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584518; c=relaxed/simple; bh=ckZ3RAz3S7mV5m/xGTWZGQ1tVjF9sdvxF+ZdCjAr4Z4=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=FQnyVE8rbLC8TW3dRyKdWTm519tRoHvIIWF+7w/27o6axyNEzCGSFuQveDgaJVshp2GNzR0FROSk4dZd3lsbiL7La9X0befAK0wdOxUm5kxkX10eF3QsdZKmmb1iMc4WP93KsGtoUEnHuiIHFJ740KlgWzbqqFecbyII/2U/DEM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hlnpnwqF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hlnpnwqF" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D8E7EC4CECE; Tue, 26 Nov 2024 01:28:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584517; bh=ckZ3RAz3S7mV5m/xGTWZGQ1tVjF9sdvxF+ZdCjAr4Z4=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=hlnpnwqF2YmaCF51JCzHXjPmslDk4Imi9mkICMSfqlz2ICzmb6fD39I09ehCj7UsH t9WLeUi53UFYdqDr0jcP3xjYJ3LR3ykGVCG0J9Y+NwZYl9pBro0bcD/DrDC7p0A8Jl KKxIgr+TgA/LV/LmJrajSZXr1o7NlA2igEwDUHL7RkRY6ULuifaKXDUR4Fh9NGK5LU 
nNdc4hkkYGQ/iRPm+oTJaqt4h1Xn5IBJlrc/YoiZ2oC61rEPWm3X15SQMlPJM6vAFy +sjhTWMMCSlruEjlcMMuTWpZwJWsdxoPeAJhS+9WDUZRugwm//NHMW+qx9rk63O7uE goNsoKoj6dRkg== Date: Mon, 25 Nov 2024 17:28:37 -0800 Subject: [PATCH 15/21] xfs: remove recursion in __xfs_trans_commit From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173258398059.4032920.3998675004204277948.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls __xfs_trans_commit again on the same transaction. In other words, there's function recursion that has caused minor amounts of confusion in the past. There's no reason to keep this around, since there's only one place where we actually want the xfs_defer_finish_noroll, and that is in the top level xfs_trans_commit call. Fixes: 98719051e75ccf ("xfs: refactor internal dfops initialization") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_trans.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index 4a517250efc911..26bb2343082af4 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -860,18 +860,6 @@ __xfs_trans_commit( trace_xfs_trans_commit(tp, _RET_IP_); - /* - * Finish deferred items on final commit. Only permanent transactions - * should ever have deferred ops. 
- */ - WARN_ON_ONCE(!list_empty(&tp->t_dfops) && - !(tp->t_flags & XFS_TRANS_PERM_LOG_RES)); - if (!regrant && (tp->t_flags & XFS_TRANS_PERM_LOG_RES)) { - error = xfs_defer_finish_noroll(&tp); - if (error) - goto out_unreserve; - } - error = xfs_trans_run_precommits(tp); if (error) goto out_unreserve; @@ -950,6 +938,20 @@ int xfs_trans_commit( struct xfs_trans *tp) { + /* + * Finish deferred items on final commit. Only permanent transactions + * should ever have deferred ops. + */ + WARN_ON_ONCE(!list_empty(&tp->t_dfops) && + !(tp->t_flags & XFS_TRANS_PERM_LOG_RES)); + if (tp->t_flags & XFS_TRANS_PERM_LOG_RES) { + int error = xfs_defer_finish_noroll(&tp); + if (error) { + xfs_trans_cancel(tp); + return error; + } + } + return __xfs_trans_commit(tp, false); } From patchwork Tue Nov 26 01:28:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885424 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 052CAD26D; Tue, 26 Nov 2024 01:28:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584534; cv=none; b=APq2tmWN2I8vwJDDpU2RwrVLGLtKIkR1owB52CQsmsfoyUaPFdO7L5w+Wcczw+g1Yo98HE56eYJyfSEwC9MCmJlFlp4+4fI32rra67znj3YbtI2+S8xuPnHF2Dk7rIUiXNLaPcrgAT+hxMUVKCgklDKT/JPzKfwSUw3xqvTepow= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584534; c=relaxed/simple; bh=L/sfa2zdR7gn6tJWvXD9UNI+3OCZX2bx0Yt5/ZsmraM=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=HB6GqMRvSgkdj72xKnSVzZUqROgrt4tG/HelRCccQ3n50tKNRAS14pgFa2DBA0l8SQOLZs0GZEovZ/3tL2tQkTDYbzRtGAVXgJUDjD7JBzWYCj8GLkTU0rFpytQQdQ5B62m03SpAQ09anXSPlHCXlE7Z4xqcN0WQ+ipEJId4uz8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rUbRbL2G; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rUbRbL2G" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 77248C4CECE; Tue, 26 Nov 2024 01:28:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584533; bh=L/sfa2zdR7gn6tJWvXD9UNI+3OCZX2bx0Yt5/ZsmraM=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=rUbRbL2GccPRgbTG0MTyDOiGF+RzvdBoQkK8GyqYI+9Ryh/KGBkarCGnGf2TB2yuX j/kXnvse5n4Eh8qbWM8sGgQNBC0l6T+DEQqIe1OSoGOUBbI1Mz0VhfwQqIwYlsAFGr LBbLRObKPAT8Gow/gzRF06B02Vwi33If31VbwiOS7oADY+OYKzziLXLhCHqdLvUUUh SQG27RzPW8VtHx7ybmt67oPDfItYkUUwWFHma98s4VKLvTm0yhfVrgNv1AtUx/aH2S 69BHEGzDQRm+r1/yw2chJHsHLpSa2/u0Cx55u6ynH83jTQGn9RWE2ibChUX7gd9Pxe wwY8eMkBQ0OOw== Date: Mon, 25 Nov 2024 17:28:53 -0800 Subject: [PATCH 16/21] xfs: don't lose solo superblock counter update transactions From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398074.4032920.16314140758572044747.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Superblock counter updates are tracked via per-transaction counters in the xfs_trans object. These changes are then turned into dirty log items in xfs_trans_apply_sb_deltas just prior to committing the log items to the CIL.
However, updating the per-transaction counter deltas does not cause
XFS_TRANS_DIRTY to be set on the transaction. In other words, a pure sb
counter update will be silently discarded if there are no other dirty
log items attached to the transaction.

This is currently not the case anywhere in the filesystem because sb
counter updates always dirty at least one other metadata item, but let's
not leave a logic bomb.

Cc: # v2.6.35
Fixes: 0924378a689ccb ("xfs: split out iclog writing from xfs_trans_commit()")
Signed-off-by: "Darrick J. Wong"
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_trans.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 26bb2343082af4..427a8ba0ab99e2 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -860,6 +860,13 @@ __xfs_trans_commit(
 
 	trace_xfs_trans_commit(tp, _RET_IP_);
 
+	/*
+	 * Commit per-transaction changes that are not already tracked through
+	 * log items. This can add dirty log items to the transaction.
+	 */
+	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
+		xfs_trans_apply_sb_deltas(tp);
+
 	error = xfs_trans_run_precommits(tp);
 	if (error)
 		goto out_unreserve;
@@ -890,8 +897,6 @@ __xfs_trans_commit(
 	/*
 	 * If we need to update the superblock, then do it now.
 	 */
-	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
-		xfs_trans_apply_sb_deltas(tp);
 	xfs_trans_apply_dquot_deltas(tp);
 
 	xlog_cil_commit(log, tp, &commit_seq, regrant);

From patchwork Tue Nov 26 01:29:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13885425 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74559C2C6; Tue, 26 Nov 2024 01:29:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584549; cv=none; b=AAQzC1TusFF+vvyd2CWDZfPZxVfyqV0xT3DUE+NcMZjxwXU6534FtmJQ9SG9G/Dqjz2shJV9SnomUlWC9BrGtOCpz9fJrxTT2Izr5pMkkR7vcwpTf8MzuBO+Cb02BZhA0YXH2EVgzKWb505FnSpmRpr9bORZoXEmRQ/p6wd3SDQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584549; c=relaxed/simple; bh=DtR02jdxVUubWxzt6q69az9tmCnyzg7sQr8bK9u+0no=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=e0oNIoluUzjmS90BEJop0S94cdwtuZ83968MUDS7FQiN5aieON2ucmxXhHClUzygbhr6mO+iQ5d2DcI0HvLQ4+G3oA4/69qiyVz9oGdGytvCgtuCLT3oG8MaOJ2c9CdWlQ6qwVrlWQuFfhLDfV681kYXY5diFLUxsjr1yyAaCbY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oi9Dq4gi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oi9Dq4gi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 145F5C4CECE; Tue, 26 Nov 2024 01:29:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584549; bh=DtR02jdxVUubWxzt6q69az9tmCnyzg7sQr8bK9u+0no=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=oi9Dq4gibPx4Jwyv3lx9p9DbuIbJocgt8vwo1E6vkpnIQbse0/eTo+cQO7xe1L0Pn RGWNkz3/jBqhuy5pNA3fZnRUH2UYEi2njj877myvQEGg29GiqZCKjrdolNYRhFBak5 I85AAkZC2qb67SY2fuhs2HTBFNUC07EafKY6n+wHdiMcf8/WLkylzFrm732IjXijRT 
UOotWI3v65KIi34dbv3VcU+gbstf3sKQuZf55kDjZoAYmbzF/LA4e//zp4kMopgb0s fWgNwS1tJHiu8nKpAwYHYf/qK7YKcfeCVsKCrzwc8j4UnAEai3wvk/X+Ou5fagEyZR 7mbmty9+D2yig== Date: Mon, 25 Nov 2024 17:29:08 -0800 Subject: [PATCH 17/21] xfs: don't lose solo dquot update transactions From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398090.4032920.6440798067032580972.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Quota counter updates are tracked via incore objects which hang off the xfs_trans object. These changes are then turned into dirty log items in xfs_trans_apply_dquot_deltas just prior to committing the log items to the CIL. However, updating the incore deltas does not cause XFS_TRANS_DIRTY to be set on the transaction. In other words, a pure quota counter update will be silently discarded if there are no other dirty log items attached to the transaction. This is currently not the case anywhere in the filesystem because quota updates always dirty at least one other metadata item, but a subsequent bug fix will add dquot log item precommits, so we actually need a dirty dquot log item prior to xfs_trans_run_precommits. Also let's not leave a logic bomb. Cc: # v2.6.35 Fixes: 0924378a689ccb ("xfs: split out iclog writing from xfs_trans_commit()") Signed-off-by: "Darrick J.
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_quota.h | 7 ++++--- fs/xfs/xfs_trans.c | 10 +++------- fs/xfs/xfs_trans_dquot.c | 31 ++++++++++++++++++++++++++----- 3 files changed, 33 insertions(+), 15 deletions(-) diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h index fa1317cc396c96..b864ed59787780 100644 --- a/fs/xfs/xfs_quota.h +++ b/fs/xfs/xfs_quota.h @@ -101,7 +101,8 @@ extern void xfs_trans_free_dqinfo(struct xfs_trans *); extern void xfs_trans_mod_dquot_byino(struct xfs_trans *, struct xfs_inode *, uint, int64_t); extern void xfs_trans_apply_dquot_deltas(struct xfs_trans *); -extern void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *); +void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *tp, + bool already_locked); int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp, struct xfs_inode *ip, int64_t dblocks, int64_t rblocks, bool force); extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *, @@ -172,8 +173,8 @@ static inline void xfs_trans_mod_dquot_byino(struct xfs_trans *tp, struct xfs_inode *ip, uint field, int64_t delta) { } -#define xfs_trans_apply_dquot_deltas(tp) -#define xfs_trans_unreserve_and_mod_dquots(tp) +#define xfs_trans_apply_dquot_deltas(tp, a) +#define xfs_trans_unreserve_and_mod_dquots(tp, a) static inline int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp, struct xfs_inode *ip, int64_t dblocks, int64_t rblocks, bool force) diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index 427a8ba0ab99e2..4cd25717c9d130 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -866,6 +866,7 @@ __xfs_trans_commit( */ if (tp->t_flags & XFS_TRANS_SB_DIRTY) xfs_trans_apply_sb_deltas(tp); + xfs_trans_apply_dquot_deltas(tp); error = xfs_trans_run_precommits(tp); if (error) @@ -894,11 +895,6 @@ __xfs_trans_commit( ASSERT(tp->t_ticket != NULL); - /* - * If we need to update the superblock, then do it now. 
- */ - xfs_trans_apply_dquot_deltas(tp); - xlog_cil_commit(log, tp, &commit_seq, regrant); xfs_trans_free(tp); @@ -924,7 +920,7 @@ __xfs_trans_commit( * the dqinfo portion to be. All that means is that we have some * (non-persistent) quota reservations that need to be unreserved. */ - xfs_trans_unreserve_and_mod_dquots(tp); + xfs_trans_unreserve_and_mod_dquots(tp, true); if (tp->t_ticket) { if (regrant && !xlog_is_shutdown(log)) xfs_log_ticket_regrant(log, tp->t_ticket); @@ -1018,7 +1014,7 @@ xfs_trans_cancel( } #endif xfs_trans_unreserve_and_mod_sb(tp); - xfs_trans_unreserve_and_mod_dquots(tp); + xfs_trans_unreserve_and_mod_dquots(tp, false); if (tp->t_ticket) { xfs_log_ticket_ungrant(log, tp->t_ticket); diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c index 481ba3dc9f190d..713b6d243e5631 100644 --- a/fs/xfs/xfs_trans_dquot.c +++ b/fs/xfs/xfs_trans_dquot.c @@ -606,6 +606,24 @@ xfs_trans_apply_dquot_deltas( ASSERT(dqp->q_blk.reserved >= dqp->q_blk.count); ASSERT(dqp->q_ino.reserved >= dqp->q_ino.count); ASSERT(dqp->q_rtb.reserved >= dqp->q_rtb.count); + + /* + * We've applied the count changes and given back + * whatever reservation we didn't use. Zero out the + * dqtrx fields. + */ + qtrx->qt_blk_res = 0; + qtrx->qt_bcount_delta = 0; + qtrx->qt_delbcnt_delta = 0; + + qtrx->qt_rtblk_res = 0; + qtrx->qt_rtblk_res_used = 0; + qtrx->qt_rtbcount_delta = 0; + qtrx->qt_delrtb_delta = 0; + + qtrx->qt_ino_res = 0; + qtrx->qt_ino_res_used = 0; + qtrx->qt_icount_delta = 0; } } } @@ -642,7 +660,8 @@ xfs_trans_unreserve_and_mod_dquots_hook( */ void xfs_trans_unreserve_and_mod_dquots( - struct xfs_trans *tp) + struct xfs_trans *tp, + bool already_locked) { int i, j; struct xfs_dquot *dqp; @@ -671,10 +690,12 @@ xfs_trans_unreserve_and_mod_dquots( * about the number of blocks used field, or deltas. * Also we don't bother to zero the fields. 
*/ - locked = false; + locked = already_locked; if (qtrx->qt_blk_res) { - xfs_dqlock(dqp); - locked = true; + if (!locked) { + xfs_dqlock(dqp); + locked = true; + } dqp->q_blk.reserved -= (xfs_qcnt_t)qtrx->qt_blk_res; } @@ -695,7 +716,7 @@ xfs_trans_unreserve_and_mod_dquots( dqp->q_rtb.reserved -= (xfs_qcnt_t)qtrx->qt_rtblk_res; } - if (locked) + if (locked && !already_locked) xfs_dqunlock(dqp); } From patchwork Tue Nov 26 01:29:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13885426 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D0FCBC2C6 for ; Tue, 26 Nov 2024 01:29:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584564; cv=none; b=cLy1qv6NJeJAmY2PbzcphRDwzEJDp4BW0G9g/Kg87REnHwdrqvINVUeWJrfZy0wVTeY0v4KMQ3pBiPYQ6n4GTOEIB0iIjkwfC8+HM2MMnN+kYQoCwfx/g4r5kTRhuTpGSRxvXx7mJHsmOjVzSF+9xPBj6p2WpbzzZrwXx2+aRQ8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584564; c=relaxed/simple; bh=OEpbbLhd9xGpUqvgxIdzfNl8VMCax2QtZGTLNzrrYGI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cNnI/TWc8ENxW+mo6bhtXBK4xHphvS/S5zJbppvJmdSgsj+JOvsuoUxr3GjFwc2wu4AWgJ/NB+2BtZQZUgpt7CxBQPMpR08/Tk8vclgCf9NoU31DPaH34uzPNm1XPg+hwQ3sUpUSzbahLkJo84lPMJHl9k8gwHQMLWyz9C1sWFo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Kacscxk7; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b="Kacscxk7" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AA549C4CECE; Tue, 26 Nov 2024 01:29:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584564; bh=OEpbbLhd9xGpUqvgxIdzfNl8VMCax2QtZGTLNzrrYGI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Kacscxk7bxK76nUcQHEwT7S3rSYYXPV1bGzsl7Sf4sOBN+8JWZnwysvGzkjZU9DEY SZx7+7U+1psUgRf9nopYxHzS2VA0/mnHrdB9hiSQch4ZFRk2UOpsebHNdTvyuxJ1wK Z/irpNi3zAUtGsqIaZWFDXiiJCyE3+sehOsyWxbpP/dT09KFSWKE6MvXBeHg6M2p7b aCE8D2mQMCFIo8617/4+Ja8rCLRwMMHeQIL05miTMYGiBP0VRVOT3dSjlTOzabS5LH orr+OS0NLwTnESMmcYuJiPQD5ALmstgT9+PhI9J4iwuujxU7nyhdG1UB8AjYfKtjCz ja4N/QD8MPHkA== Date: Mon, 25 Nov 2024 17:29:24 -0800 Subject: [PATCH 18/21] xfs: separate dquot buffer reads from xfs_dqflush From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173258398108.4032920.1511154808709795549.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong The first step towards holding the dquot buffer in the li_buf instead of reading it in the AIL is to separate the part that reads the buffer from the actual flush code. There should be no functional changes. Signed-off-by: "Darrick J. 
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_dquot.c | 57 +++++++++++++++++++++++++++++++---------------- fs/xfs/xfs_dquot.h | 4 ++- fs/xfs/xfs_dquot_item.c | 20 +++++++++++++--- fs/xfs/xfs_qm.c | 37 +++++++++++++++++++++++++------ 4 files changed, 86 insertions(+), 32 deletions(-) diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index ff982d983989b0..6ec4087e38dfc8 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -1238,6 +1238,42 @@ xfs_qm_dqflush_check( return NULL; } +/* + * Get the buffer containing the on-disk dquot. + * + * Requires dquot flush lock, will clear the dirty flag, delete the quota log + * item from the AIL, and shut down the system if something goes wrong. + */ +int +xfs_dquot_read_buf( + struct xfs_trans *tp, + struct xfs_dquot *dqp, + struct xfs_buf **bpp) +{ + struct xfs_mount *mp = dqp->q_mount; + struct xfs_buf *bp = NULL; + int error; + + error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno, + mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK, + &bp, &xfs_dquot_buf_ops); + if (error == -EAGAIN) + return error; + if (xfs_metadata_is_sick(error)) + xfs_dquot_mark_sick(dqp); + if (error) + goto out_abort; + + *bpp = bp; + return 0; + +out_abort: + dqp->q_flags &= ~XFS_DQFLAG_DIRTY; + xfs_trans_ail_delete(&dqp->q_logitem.qli_item, 0); + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + return error; +} + /* * Write a modified dquot to disk. * The dquot must be locked and the flush lock too taken by caller. 
@@ -1249,11 +1285,10 @@ xfs_qm_dqflush_check( int xfs_qm_dqflush( struct xfs_dquot *dqp, - struct xfs_buf **bpp) + struct xfs_buf *bp) { struct xfs_mount *mp = dqp->q_mount; struct xfs_log_item *lip = &dqp->q_logitem.qli_item; - struct xfs_buf *bp; struct xfs_dqblk *dqblk; xfs_failaddr_t fa; int error; @@ -1263,28 +1298,12 @@ xfs_qm_dqflush( trace_xfs_dqflush(dqp); - *bpp = NULL; - xfs_qm_dqunpin_wait(dqp); - /* - * Get the buffer containing the on-disk dquot - */ - error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno, - mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK, - &bp, &xfs_dquot_buf_ops); - if (error == -EAGAIN) - goto out_unlock; - if (xfs_metadata_is_sick(error)) - xfs_dquot_mark_sick(dqp); - if (error) - goto out_abort; - fa = xfs_qm_dqflush_check(dqp); if (fa) { xfs_alert(mp, "corrupt dquot ID 0x%x in memory at %pS", dqp->q_id, fa); - xfs_buf_relse(bp); xfs_dquot_mark_sick(dqp); error = -EFSCORRUPTED; goto out_abort; @@ -1334,14 +1353,12 @@ xfs_qm_dqflush( } trace_xfs_dqflush_done(dqp); - *bpp = bp; return 0; out_abort: dqp->q_flags &= ~XFS_DQFLAG_DIRTY; xfs_trans_ail_delete(lip, 0); xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); -out_unlock: xfs_dqfunlock(dqp); return error; } diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h index d73d179df00958..50f8404c41176c 100644 --- a/fs/xfs/xfs_dquot.h +++ b/fs/xfs/xfs_dquot.h @@ -214,7 +214,9 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp); #define XFS_DQ_IS_DIRTY(dqp) ((dqp)->q_flags & XFS_DQFLAG_DIRTY) void xfs_qm_dqdestroy(struct xfs_dquot *dqp); -int xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf **bpp); +int xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp, + struct xfs_buf **bpp); +int xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp); void xfs_qm_dqunpin_wait(struct xfs_dquot *dqp); void xfs_qm_adjust_dqtimers(struct xfs_dquot *d); void xfs_qm_adjust_dqlimits(struct xfs_dquot *d); diff --git a/fs/xfs/xfs_dquot_item.c 
b/fs/xfs/xfs_dquot_item.c index 7d19091215b080..56ecc5ed01934d 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -155,14 +155,26 @@ xfs_qm_dquot_logitem_push( spin_unlock(&lip->li_ailp->ail_lock); - error = xfs_qm_dqflush(dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, &bp); + if (error) { + if (error == -EAGAIN) + rval = XFS_ITEM_LOCKED; + xfs_dqfunlock(dqp); + goto out_relock_ail; + } + + /* + * dqflush completes dqflock on error, and the delwri ioend does it on + * success. + */ + error = xfs_qm_dqflush(dqp, bp); if (!error) { if (!xfs_buf_delwri_queue(bp, buffer_list)) rval = XFS_ITEM_FLUSHING; - xfs_buf_relse(bp); - } else if (error == -EAGAIN) - rval = XFS_ITEM_LOCKED; + } + xfs_buf_relse(bp); +out_relock_ail: spin_lock(&lip->li_ailp->ail_lock); out_unlock: xfs_dqunlock(dqp); diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index a4fa21dfd6b4ad..341fe4821c2d77 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -148,17 +148,28 @@ xfs_qm_dqpurge( * We don't care about getting disk errors here. We need * to purge this dquot anyway, so we go ahead regardless. */ - error = xfs_qm_dqflush(dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, &bp); + if (error == -EAGAIN) { + xfs_dqfunlock(dqp); + dqp->q_flags &= ~XFS_DQFLAG_FREEING; + goto out_unlock; + } + if (error) + goto out_funlock; + + /* + * dqflush completes dqflock on error, and the bwrite ioend + * does it on success. 
+ */ + error = xfs_qm_dqflush(dqp, bp); if (!error) { error = xfs_bwrite(bp); xfs_buf_relse(bp); - } else if (error == -EAGAIN) { - dqp->q_flags &= ~XFS_DQFLAG_FREEING; - goto out_unlock; } xfs_dqflock(dqp); } +out_funlock: ASSERT(atomic_read(&dqp->q_pincount) == 0); ASSERT(xlog_is_shutdown(dqp->q_logitem.qli_item.li_log) || !test_bit(XFS_LI_IN_AIL, &dqp->q_logitem.qli_item.li_flags)); @@ -495,7 +506,17 @@ xfs_qm_dquot_isolate( /* we have to drop the LRU lock to flush the dquot */ spin_unlock(lru_lock); - error = xfs_qm_dqflush(dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, &bp); + if (error) { + xfs_dqfunlock(dqp); + goto out_unlock_dirty; + } + + /* + * dqflush completes dqflock on error, and the delwri ioend + * does it on success. + */ + error = xfs_qm_dqflush(dqp, bp); if (error) goto out_unlock_dirty; @@ -1491,11 +1512,13 @@ xfs_qm_flush_one( goto out_unlock; } - error = xfs_qm_dqflush(dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, &bp); if (error) goto out_unlock; - xfs_buf_delwri_queue(bp, buffer_list); + error = xfs_qm_dqflush(dqp, bp); + if (!error) + xfs_buf_delwri_queue(bp, buffer_list); xfs_buf_relse(bp); out_unlock: xfs_dqunlock(dqp); From patchwork Tue Nov 26 01:29:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885427 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D533D531 for ; Tue, 26 Nov 2024 01:29:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584580; cv=none; b=T2wrdGRGiYcXarbsg6EQEMpGg08sclfBScCpokuSJsgAZ+3P+0AgxNIh12i4K2o9S7sUdHOAt9zhpQqIy59DeDj9VdITTHhAuwdnRskwFC2wr/Mvaqc1Wx1f6bK36Q8kOkALFEd4f0nGUWZH0r34q1UGoyYg/W1b6RLRZOqgU9E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584580; c=relaxed/simple; bh=9XEqeucVOzgibdufI0pmJr9ghGoD5zeYFVoGtKEkOuc=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=a0NaqoASF2sPT27219PdoVnfkVWSKPoECSPAv/DWFeuhh55fJr3Sag9CCH1G3Ai6bPR8XhbHpTMIokmHzMK9sFdXMjZamHXoeBdSu1kxtN6it595AKcjs93GXRIdQSgvyMpslcpP8b5sZDKTl1ZzYD4jz1DxDo5X5i7bfyCSiPg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=P+mIVVkO; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="P+mIVVkO" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 574FBC4CECE; Tue, 26 Nov 2024 01:29:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584580; bh=9XEqeucVOzgibdufI0pmJr9ghGoD5zeYFVoGtKEkOuc=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=P+mIVVkOmERnvj2U+o8SxL/5wnpap/W0SRIaknGuz12P963P3667VF00/B5XbLXmU dTE+JIWVgm7g0a4IkqRsAjmZCrACfrZIGSPSDX9sZ2sVkpmzWPuvegkXH4Z6y2ig9x S4mUraLePirT7dG9chLIbm/P9bkVCoT/XIyhvhSBPIMXhOZ4oy7be58FLzRK7cYUOE 
THsLQsQJu1tZeoVWzsf7pc8InswHXm+L1PdHfC2Cm/l4+9L5/qJzPCfNSWSYXiBzTh t2JWXWUocgSds9nbpXhd70lNCiToVGXiu90V1LKDwlOQLF/Jr3zUGhmaC4grsuszD2 HdLwYvdaGJ7/g== Date: Mon, 25 Nov 2024 17:29:39 -0800 Subject: [PATCH 19/21] xfs: clean up log item accesses in xfs_qm_dqflush{,_done} From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173258398125.4032920.10688788085648644743.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Clean up these functions a little bit before we move on to the real modifications, and make the variable naming consistent for dquot log items. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_dquot.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index 6ec4087e38dfc8..4ba042786cfb7b 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -1142,8 +1142,9 @@ static void xfs_qm_dqflush_done( struct xfs_log_item *lip) { - struct xfs_dq_logitem *qip = (struct xfs_dq_logitem *)lip; - struct xfs_dquot *dqp = qip->qli_dquot; + struct xfs_dq_logitem *qlip = + container_of(lip, struct xfs_dq_logitem, qli_item); + struct xfs_dquot *dqp = qlip->qli_dquot; struct xfs_ail *ailp = lip->li_ailp; xfs_lsn_t tail_lsn; @@ -1156,12 +1157,12 @@ xfs_qm_dqflush_done( * holding the lock before removing the dquot from the AIL. 
*/ if (test_bit(XFS_LI_IN_AIL, &lip->li_flags) && - ((lip->li_lsn == qip->qli_flush_lsn) || + ((lip->li_lsn == qlip->qli_flush_lsn) || test_bit(XFS_LI_FAILED, &lip->li_flags))) { spin_lock(&ailp->ail_lock); xfs_clear_li_failed(lip); - if (lip->li_lsn == qip->qli_flush_lsn) { + if (lip->li_lsn == qlip->qli_flush_lsn) { /* xfs_ail_update_finish() drops the AIL lock */ tail_lsn = xfs_ail_delete_one(ailp, lip); xfs_ail_update_finish(ailp, tail_lsn); @@ -1319,7 +1320,7 @@ xfs_qm_dqflush( dqp->q_flags &= ~XFS_DQFLAG_DIRTY; xfs_trans_ail_copy_lsn(mp->m_ail, &dqp->q_logitem.qli_flush_lsn, - &dqp->q_logitem.qli_item.li_lsn); + &lip->li_lsn); /* * copy the lsn into the on-disk dquot now while we have the in memory @@ -1331,7 +1332,7 @@ xfs_qm_dqflush( * of a dquot without an up-to-date CRC getting to disk. */ if (xfs_has_crc(mp)) { - dqblk->dd_lsn = cpu_to_be64(dqp->q_logitem.qli_item.li_lsn); + dqblk->dd_lsn = cpu_to_be64(lip->li_lsn); xfs_update_cksum((char *)dqblk, sizeof(struct xfs_dqblk), XFS_DQUOT_CRC_OFF); } @@ -1341,7 +1342,7 @@ xfs_qm_dqflush( * the AIL and release the flush lock once the dquot is synced to disk. */ bp->b_flags |= _XBF_DQUOTS; - list_add_tail(&dqp->q_logitem.qli_item.li_bio_list, &bp->b_li_list); + list_add_tail(&lip->li_bio_list, &bp->b_li_list); /* * If the buffer is pinned then push on the log so we won't From patchwork Tue Nov 26 01:29:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885428 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E065101DE; Tue, 26 Nov 2024 01:29:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584596; cv=none; b=OIE9ld/PA1XTFPB+vUTPUoUgVhUqJxXZ38F4rk94zC8O1dBVPkDMt6NHZDDI+9JfQ1RXe8i7GjzVk4ZByxFgIXHRqCJqy/EUjgo5byxGcCpS0uc15OALtqJnutYHxFDfWdLVAa6EvvNGXHVw7I8uZ2h1KvGANd9tSNtQuBTrrec= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584596; c=relaxed/simple; bh=/7ryQ8LaCtzZEF0YvtYsthcqBolx/BPLDskoe3P6FrA=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=XYJ9dt7hJElAOgfw+kQ85G1MTo+qnNAk7+nRjmS0FH4DT4tF9Kw7kLURUsET7HT/712a+s3fNaV+yxNSJhns5fFP33J998uogPMQfoNVqeLwpqAGs8TBoZeVTg5OLFnij5bnso0DOs3ap2lL1QAsAduszTaEqBlSIiKVAN5aJAM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=G+aCl8Ac; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="G+aCl8Ac" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E8A80C4CECE; Tue, 26 Nov 2024 01:29:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584596; bh=/7ryQ8LaCtzZEF0YvtYsthcqBolx/BPLDskoe3P6FrA=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=G+aCl8AcUWpiRqHBXsIRjSParyaG75KeKy2STyeIcVIWZR+rR9gzkwR46AxCmv3Yl 2OgavsJAy+W7hNMvZkcQTyL2u7FLuC1tLUwX1VnDOgL3UC4S0g0TKYcW2fBgfuA0r3 r3u4vyLUbwnTD72UMkz9gfBWV+rs0x0risvT1kQtzV9AacIOPIEjAcn/COMPZk8bbb 
/5q+/wPZg0S3V33uzFo41LGDL3GUvX/ScLPMgm4ps+IO9LIKi786oKR3M640TdB/C6 TB17OuvL+AcF0najQrA2U7VGNhKMxiA3AuyJnApg8o2FzKNxdZxeXZpDrM1mOtSJ9k +2R/CpMJDyxPA==
Date: Mon, 25 Nov 2024 17:29:55 -0800
Subject: [PATCH 20/21] xfs: attach dquot buffer to dquot log item buffer
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org
Message-ID: <173258398142.4032920.11501045442848686733.stgit@frogsfrogsfrogs>
In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs>
References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Ever since 6.12-rc1, I've observed a pile of warnings from the kernel when running fstests with quotas enabled:

WARNING: CPU: 1 PID: 458580 at mm/page_alloc.c:4221 __alloc_pages_noprof+0xc9c/0xf18
CPU: 1 UID: 0 PID: 458580 Comm: xfsaild/sda3 Tainted: G W 6.12.0-rc6-djwa #rc6 6ee3e0e531f6457e2d26aa008a3b65ff184b377c
Call trace:
 __alloc_pages_noprof+0xc9c/0xf18
 alloc_pages_mpol_noprof+0x94/0x240
 alloc_pages_noprof+0x68/0xf8
 new_slab+0x3e0/0x568
 ___slab_alloc+0x5a0/0xb88
 __slab_alloc.constprop.0+0x7c/0xf8
 __kmalloc_noprof+0x404/0x4d0
 xfs_buf_get_map+0x594/0xde0 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_buf_read_map+0x64/0x2e0 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_trans_read_buf_map+0x1dc/0x518 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_qm_dqflush+0xac/0x468 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfs_qm_dquot_logitem_push+0xe4/0x148 [xfs 384cb02810558b4c490343c164e9407332118f88]
 xfsaild+0x3f4/0xde8 [xfs 384cb02810558b4c490343c164e9407332118f88]
 kthread+0x110/0x128
 ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---

This corresponds to the line:

	WARN_ON_ONCE(current->flags & PF_MEMALLOC);

within the NOFAIL checks.
What's happening here is that the XFS AIL is trying to write a disk quota update back into the filesystem, but for that it needs to read the ondisk buffer for the dquot. The buffer is not in memory anymore, probably because it was evicted. Regardless, the buffer cache tries to allocate a new buffer, but those allocations are NOFAIL.

The AIL thread has marked itself PF_MEMALLOC (aka noreclaim) since commit 43ff2122e6492b ("xfs: on-stack delayed write buffer lists") presumably because reclaim can push on XFS to push on the AIL.

An easy way to fix this probably would have been to drop the NOFAIL flag from the xfs_buf allocation and open code a retry loop, but then there's still the problem that for bs>ps filesystems, the buffer itself could require up to 64k worth of pages.

Inode items had similar behavior (multi-page cluster buffers that we don't want to allocate in the AIL) which we solved by making transaction precommit attach the inode cluster buffers to the dirty log item. Let's solve the dquot problem in the same way.

So: Make a real precommit handler to read the dquot buffer and attach it to the log item; pass it to dqflush in the push method; and have the iodone function detach the buffer once we've flushed everything. Add a state flag to the log item to track when a thread has entered the precommit -> push mechanism to skip the detaching if it turns out that the dquot is very busy, as we don't hold the dquot lock between log item commit and AIL push.

Reading and attaching the dquot buffer in the precommit hook is inspired by the work done for inode cluster buffers some time ago.

Cc: # v6.12
Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner")
Signed-off-by: "Darrick J.
Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_dquot.c | 120 +++++++++++++++++++++++++++++++++++++++++++++-- fs/xfs/xfs_dquot.h | 5 ++ fs/xfs/xfs_dquot_item.c | 39 ++++++++++----- fs/xfs/xfs_dquot_item.h | 7 +++ fs/xfs/xfs_qm.c | 6 +- fs/xfs/xfs_trans_ail.c | 2 - 6 files changed, 155 insertions(+), 24 deletions(-) diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index 4ba042786cfb7b..c495f7ad80018f 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -75,8 +75,24 @@ void xfs_qm_dqdestroy( struct xfs_dquot *dqp) { + struct xfs_dq_logitem *qlip = &dqp->q_logitem; + struct xfs_buf *bp = NULL; + ASSERT(list_empty(&dqp->q_lru)); + /* + * Detach the dquot buffer if it's still attached, because we can get + * called through dqpurge after a log shutdown. + */ + spin_lock(&qlip->qli_lock); + if (qlip->qli_item.li_buf) { + bp = qlip->qli_item.li_buf; + qlip->qli_item.li_buf = NULL; + } + spin_unlock(&qlip->qli_lock); + if (bp) + xfs_buf_rele(bp); + kvfree(dqp->q_logitem.qli_item.li_lv_shadow); mutex_destroy(&dqp->q_qlock); @@ -1146,6 +1162,7 @@ xfs_qm_dqflush_done( container_of(lip, struct xfs_dq_logitem, qli_item); struct xfs_dquot *dqp = qlip->qli_dquot; struct xfs_ail *ailp = lip->li_ailp; + struct xfs_buf *bp = NULL; xfs_lsn_t tail_lsn; /* @@ -1175,6 +1192,19 @@ xfs_qm_dqflush_done( * Release the dq's flush lock since we're done with it. */ xfs_dqfunlock(dqp); + + /* + * If this dquot hasn't been dirtied since initiating the last dqflush, + * release the buffer reference. 
+ */ + spin_lock(&qlip->qli_lock); + if (!qlip->qli_dirty) { + bp = lip->li_buf; + lip->li_buf = NULL; + } + spin_unlock(&qlip->qli_lock); + if (bp) + xfs_buf_rele(bp); } void @@ -1197,7 +1227,7 @@ xfs_buf_dquot_io_fail( spin_lock(&bp->b_mount->m_ail->ail_lock); list_for_each_entry(lip, &bp->b_li_list, li_bio_list) - xfs_set_li_failed(lip, bp); + set_bit(XFS_LI_FAILED, &lip->li_flags); spin_unlock(&bp->b_mount->m_ail->ail_lock); } @@ -1249,6 +1279,7 @@ int xfs_dquot_read_buf( struct xfs_trans *tp, struct xfs_dquot *dqp, + xfs_buf_flags_t xbf_flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dqp->q_mount; @@ -1256,7 +1287,7 @@ xfs_dquot_read_buf( int error; error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno, - mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK, + mp->m_quotainfo->qi_dqchunklen, xbf_flags, &bp, &xfs_dquot_buf_ops); if (error == -EAGAIN) return error; @@ -1275,6 +1306,77 @@ xfs_dquot_read_buf( return error; } +/* + * Attach a dquot buffer to this dquot to avoid allocating a buffer during a + * dqflush, since dqflush can be called from reclaim context. + */ +int +xfs_dquot_attach_buf( + struct xfs_trans *tp, + struct xfs_dquot *dqp) +{ + struct xfs_dq_logitem *qlip = &dqp->q_logitem; + struct xfs_log_item *lip = &qlip->qli_item; + int error; + + spin_lock(&qlip->qli_lock); + if (!lip->li_buf) { + struct xfs_buf *bp = NULL; + + spin_unlock(&qlip->qli_lock); + error = xfs_dquot_read_buf(tp, dqp, 0, &bp); + if (error) + return error; + + /* + * Attach the dquot to the buffer so that the AIL does not have + * to read the dquot buffer to push this item. + */ + xfs_buf_hold(bp); + spin_lock(&qlip->qli_lock); + lip->li_buf = bp; + xfs_trans_brelse(tp, bp); + } + qlip->qli_dirty = true; + spin_unlock(&qlip->qli_lock); + + return 0; +} + +/* + * Get a new reference the dquot buffer attached to this dquot for a dqflush + * operation. 
+ * + * Returns 0 and a NULL bp if none was attached to the dquot; 0 and a locked + * bp; or -EAGAIN if the buffer could not be locked. + */ +int +xfs_dquot_use_attached_buf( + struct xfs_dquot *dqp, + struct xfs_buf **bpp) +{ + struct xfs_buf *bp = dqp->q_logitem.qli_item.li_buf; + + /* + * A NULL buffer can happen if the dquot dirty flag was set but the + * filesystem shut down before transaction commit happened. In that + * case we're not going to flush anyway. + */ + if (!bp) { + ASSERT(xfs_is_shutdown(dqp->q_mount)); + + *bpp = NULL; + return 0; + } + + if (!xfs_buf_trylock(bp)) + return -EAGAIN; + + xfs_buf_hold(bp); + *bpp = bp; + return 0; +} + /* * Write a modified dquot to disk. * The dquot must be locked and the flush lock too taken by caller. @@ -1289,7 +1391,8 @@ xfs_qm_dqflush( struct xfs_buf *bp) { struct xfs_mount *mp = dqp->q_mount; - struct xfs_log_item *lip = &dqp->q_logitem.qli_item; + struct xfs_dq_logitem *qlip = &dqp->q_logitem; + struct xfs_log_item *lip = &qlip->qli_item; struct xfs_dqblk *dqblk; xfs_failaddr_t fa; int error; @@ -1319,8 +1422,15 @@ xfs_qm_dqflush( */ dqp->q_flags &= ~XFS_DQFLAG_DIRTY; - xfs_trans_ail_copy_lsn(mp->m_ail, &dqp->q_logitem.qli_flush_lsn, - &lip->li_lsn); + /* + * We hold the dquot lock, so nobody can dirty it while we're + * scheduling the write out. Clear the dirty-since-flush flag. 
+ */ + spin_lock(&qlip->qli_lock); + qlip->qli_dirty = false; + spin_unlock(&qlip->qli_lock); + + xfs_trans_ail_copy_lsn(mp->m_ail, &qlip->qli_flush_lsn, &lip->li_lsn); /* * copy the lsn into the on-disk dquot now while we have the in memory diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h index 50f8404c41176c..362ca34f7c248b 100644 --- a/fs/xfs/xfs_dquot.h +++ b/fs/xfs/xfs_dquot.h @@ -215,7 +215,7 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp); void xfs_qm_dqdestroy(struct xfs_dquot *dqp); int xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp, - struct xfs_buf **bpp); + xfs_buf_flags_t flags, struct xfs_buf **bpp); int xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp); void xfs_qm_dqunpin_wait(struct xfs_dquot *dqp); void xfs_qm_adjust_dqtimers(struct xfs_dquot *d); @@ -239,6 +239,9 @@ void xfs_dqlockn(struct xfs_dqtrx *q); void xfs_dquot_set_prealloc_limits(struct xfs_dquot *); +int xfs_dquot_attach_buf(struct xfs_trans *tp, struct xfs_dquot *dqp); +int xfs_dquot_use_attached_buf(struct xfs_dquot *dqp, struct xfs_buf **bpp); + static inline struct xfs_dquot *xfs_qm_dqhold(struct xfs_dquot *dqp) { xfs_dqlock(dqp); diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 56ecc5ed01934d..271b195ebb9326 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -123,8 +123,9 @@ xfs_qm_dquot_logitem_push( __releases(&lip->li_ailp->ail_lock) __acquires(&lip->li_ailp->ail_lock) { - struct xfs_dquot *dqp = DQUOT_ITEM(lip)->qli_dquot; - struct xfs_buf *bp = lip->li_buf; + struct xfs_dq_logitem *qlip = DQUOT_ITEM(lip); + struct xfs_dquot *dqp = qlip->qli_dquot; + struct xfs_buf *bp; uint rval = XFS_ITEM_SUCCESS; int error; @@ -155,11 +156,10 @@ xfs_qm_dquot_logitem_push( spin_unlock(&lip->li_ailp->ail_lock); - error = xfs_dquot_read_buf(NULL, dqp, &bp); - if (error) { - if (error == -EAGAIN) - rval = XFS_ITEM_LOCKED; + error = xfs_dquot_use_attached_buf(dqp, &bp); + if (error == -EAGAIN) { 
xfs_dqfunlock(dqp); + rval = XFS_ITEM_LOCKED; goto out_relock_ail; } @@ -207,12 +207,10 @@ xfs_qm_dquot_logitem_committing( } #ifdef DEBUG_EXPENSIVE -static int -xfs_qm_dquot_logitem_precommit( - struct xfs_trans *tp, - struct xfs_log_item *lip) +static void +xfs_qm_dquot_logitem_precommit_check( + struct xfs_dquot *dqp) { - struct xfs_dquot *dqp = DQUOT_ITEM(lip)->qli_dquot; struct xfs_mount *mp = dqp->q_mount; struct xfs_disk_dquot ddq = { }; xfs_failaddr_t fa; @@ -228,13 +226,24 @@ xfs_qm_dquot_logitem_precommit( xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); ASSERT(fa == NULL); } - - return 0; } #else -# define xfs_qm_dquot_logitem_precommit NULL +# define xfs_qm_dquot_logitem_precommit_check(...) ((void)0) #endif +static int +xfs_qm_dquot_logitem_precommit( + struct xfs_trans *tp, + struct xfs_log_item *lip) +{ + struct xfs_dq_logitem *qlip = DQUOT_ITEM(lip); + struct xfs_dquot *dqp = qlip->qli_dquot; + + xfs_qm_dquot_logitem_precommit_check(dqp); + + return xfs_dquot_attach_buf(tp, dqp); +} + static const struct xfs_item_ops xfs_dquot_item_ops = { .iop_size = xfs_qm_dquot_logitem_size, .iop_precommit = xfs_qm_dquot_logitem_precommit, @@ -259,5 +268,7 @@ xfs_qm_dquot_logitem_init( xfs_log_item_init(dqp->q_mount, &lp->qli_item, XFS_LI_DQUOT, &xfs_dquot_item_ops); + spin_lock_init(&lp->qli_lock); lp->qli_dquot = dqp; + lp->qli_dirty = false; } diff --git a/fs/xfs/xfs_dquot_item.h b/fs/xfs/xfs_dquot_item.h index 794710c2447493..d66e52807d76d5 100644 --- a/fs/xfs/xfs_dquot_item.h +++ b/fs/xfs/xfs_dquot_item.h @@ -14,6 +14,13 @@ struct xfs_dq_logitem { struct xfs_log_item qli_item; /* common portion */ struct xfs_dquot *qli_dquot; /* dquot ptr */ xfs_lsn_t qli_flush_lsn; /* lsn at last flush */ + + /* + * We use this spinlock to coordinate access to the li_buf pointer in + * the log item and the qli_dirty flag. + */ + spinlock_t qli_lock; + bool qli_dirty; /* dirtied since last flush? 
*/ }; void xfs_qm_dquot_logitem_init(struct xfs_dquot *dqp); diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index 341fe4821c2d77..a79c4a1bf27fab 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -148,7 +148,7 @@ xfs_qm_dqpurge( * We don't care about getting disk errors here. We need * to purge this dquot anyway, so we go ahead regardless. */ - error = xfs_dquot_read_buf(NULL, dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp); if (error == -EAGAIN) { xfs_dqfunlock(dqp); dqp->q_flags &= ~XFS_DQFLAG_FREEING; @@ -506,7 +506,7 @@ xfs_qm_dquot_isolate( /* we have to drop the LRU lock to flush the dquot */ spin_unlock(lru_lock); - error = xfs_dquot_read_buf(NULL, dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp); if (error) { xfs_dqfunlock(dqp); goto out_unlock_dirty; @@ -1512,7 +1512,7 @@ xfs_qm_flush_one( goto out_unlock; } - error = xfs_dquot_read_buf(NULL, dqp, &bp); + error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp); if (error) goto out_unlock; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 8ede9d099d1fea..f56d62dced97b1 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -360,7 +360,7 @@ xfsaild_resubmit_item( /* protected by ail_lock */ list_for_each_entry(lip, &bp->b_li_list, li_bio_list) { - if (bp->b_flags & _XBF_INODES) + if (bp->b_flags & (_XBF_INODES | _XBF_DQUOTS)) clear_bit(XFS_LI_FAILED, &lip->li_flags); else xfs_clear_li_failed(lip); From patchwork Tue Nov 26 01:30:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13885429 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AF52A8F7D; Tue, 26 Nov 2024 01:30:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584611; cv=none; b=g6UglF/y6sGcfi4NKwj+VfKhi2E2CgiHOrJ75lFRVtf0GMjQ0HR9m3atL0UDGyGCNIRDnV/ulQEQIto3sQqg51uR751263M+jqC6cIE8iOKxtwjY+B4kiSZbAwvNFYQiASFtPZmIpNNoJ8quwPM4XpXCLeZqxKcmZGPV1k//fpY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732584611; c=relaxed/simple; bh=HK5b8pvSZdRSIviaWqNiHWzdphi33f6n7p6eh1jC4o8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=hcfiCFZJvJ+umLdel+cgPGP2k9adS3Ej4osp/a9BYo7Szhgn3G3AmYyh7kSGF6+tJKUZuy/3+Fp9HtwYfzJb+Zjqpi25/+w5pq946wmYkTWD7ljAI/QU7m7v9uFGvngLOXXe5OPltfZT6bnoxirWTaUlP4yGCFeeFRZgnRM/Qd8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gEqDbxkB; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gEqDbxkB" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 89923C4CECE; Tue, 26 Nov 2024 01:30:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732584611; bh=HK5b8pvSZdRSIviaWqNiHWzdphi33f6n7p6eh1jC4o8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=gEqDbxkBaR/ldjuFZ0/6thiUPyST/s2+hzLoRO1pVJFEQw+1guaTZ5Oz81SC4YNHo 9ZXk41onAqlZ1b10v0Omfisqib0Z1WpHnWBN23rzERmnO2akv9LYWjlLbCoo5OiC2M LsdGKqdOh1WS8E2Y6N27Ll0P8Aog6Pix+gTus2x/gePLcatqTGXBQXqbCdwpZUuHz5 
YXVn0/38I+ECXbLqjCdsGUTWx/aoMz52Ysjg3So5cMCQoZIQ96YFYgIc7mUGoblJ4j 77WE5qPKTW7wamr9N4oBxo0Lltk25MVd589WagBEa95cTSjRXTj30NgTvr/Eospytk si4b8xCtwusiQ== Date: Mon, 25 Nov 2024 17:30:11 -0800 Subject: [PATCH 21/21] xfs: convert quotacheck to attach dquot buffers From: "Darrick J. Wong" To: djwong@kernel.org Cc: stable@vger.kernel.org, linux-xfs@vger.kernel.org Message-ID: <173258398160.4032920.3728172117282478382.stgit@frogsfrogsfrogs> In-Reply-To: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> References: <173258397748.4032920.4159079744952779287.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Now that we've converted the dquot logging machinery to attach the dquot buffer to the li_buf pointer so that the AIL dqflush doesn't have to allocate or read buffers in a reclaim path, do the same for the quotacheck code so that the reclaim shrinker dqflush call doesn't have to do that either. Cc: # v6.12 Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner") Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_dquot.c | 9 +++------ fs/xfs/xfs_dquot.h | 2 -- fs/xfs/xfs_qm.c | 18 +++++++++++++----- 3 files changed, 16 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index c495f7ad80018f..c47f95c96fe0cf 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -1275,11 +1275,10 @@ xfs_qm_dqflush_check( * Requires dquot flush lock, will clear the dirty flag, delete the quota log * item from the AIL, and shut down the system if something goes wrong. 
  */
-int
+static int
 xfs_dquot_read_buf(
 	struct xfs_trans	*tp,
 	struct xfs_dquot	*dqp,
-	xfs_buf_flags_t		xbf_flags,
 	struct xfs_buf		**bpp)
 {
 	struct xfs_mount	*mp = dqp->q_mount;
@@ -1287,10 +1286,8 @@ xfs_dquot_read_buf(
 	int			error;
 
 	error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, xbf_flags,
+				   mp->m_quotainfo->qi_dqchunklen, 0,
 				   &bp, &xfs_dquot_buf_ops);
-	if (error == -EAGAIN)
-		return error;
 	if (xfs_metadata_is_sick(error))
 		xfs_dquot_mark_sick(dqp);
 	if (error)
@@ -1324,7 +1321,7 @@ xfs_dquot_attach_buf(
 		struct xfs_buf	*bp = NULL;
 
 		spin_unlock(&qlip->qli_lock);
-		error = xfs_dquot_read_buf(tp, dqp, 0, &bp);
+		error = xfs_dquot_read_buf(tp, dqp, &bp);
 		if (error)
 			return error;
 
diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h
index 362ca34f7c248b..1c5c911615bf7f 100644
--- a/fs/xfs/xfs_dquot.h
+++ b/fs/xfs/xfs_dquot.h
@@ -214,8 +214,6 @@ void xfs_dquot_to_disk(struct xfs_disk_dquot *ddqp, struct xfs_dquot *dqp);
 #define XFS_DQ_IS_DIRTY(dqp)	((dqp)->q_flags & XFS_DQFLAG_DIRTY)
 
 void		xfs_qm_dqdestroy(struct xfs_dquot *dqp);
-int		xfs_dquot_read_buf(struct xfs_trans *tp, struct xfs_dquot *dqp,
-				xfs_buf_flags_t flags, struct xfs_buf **bpp);
 int		xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf *bp);
 void		xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
 void		xfs_qm_adjust_dqtimers(struct xfs_dquot *d);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index a79c4a1bf27fab..e073ad51af1a3d 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -148,13 +148,13 @@ xfs_qm_dqpurge(
 		 * We don't care about getting disk errors here. We need
 		 * to purge this dquot anyway, so we go ahead regardless.
 		 */
-		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
+		error = xfs_dquot_use_attached_buf(dqp, &bp);
 		if (error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
 			goto out_unlock;
 		}
-		if (error)
+		if (!bp)
 			goto out_funlock;
 
 		/*
@@ -506,8 +506,8 @@ xfs_qm_dquot_isolate(
 		/* we have to drop the LRU lock to flush the dquot */
 		spin_unlock(lru_lock);
 
-		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
-		if (error) {
+		error = xfs_dquot_use_attached_buf(dqp, &bp);
+		if (!bp || error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			goto out_unlock_dirty;
 		}
@@ -1330,6 +1330,10 @@ xfs_qm_quotacheck_dqadjust(
 		return error;
 	}
 
+	error = xfs_dquot_attach_buf(NULL, dqp);
+	if (error)
+		return error;
+
 	trace_xfs_dqadjust(dqp);
 
 	/*
@@ -1512,9 +1516,13 @@ xfs_qm_flush_one(
 		goto out_unlock;
 	}
 
-	error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
+	error = xfs_dquot_use_attached_buf(dqp, &bp);
 	if (error)
 		goto out_unlock;
+	if (!bp) {
+		error = -EFSCORRUPTED;
+		goto out_unlock;
+	}
 
 	error = xfs_qm_dqflush(dqp, bp);
 	if (!error)