
[1/4] xfs: force all buffers to be written during btree bulk load

Message ID 170086926593.2770816.5504104328549141972.stgit@frogsfrogsfrogs (mailing list archive)
State Superseded
Series xfs: prepare repair for bulk loading

Commit Message

Darrick J. Wong Nov. 24, 2023, 11:48 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

While stress-testing online repair of btrees, I noticed periodic
assertion failures from the buffer cache about buffer readers
encountering buffers with DELWRI_Q set, even though the btree bulk load
had already committed and the buffer itself wasn't on any delwri list.

I traced this to a misunderstanding of how the delwri lists work,
particularly with regards to the AIL's buffer list.  If a buffer is
logged and committed, the buffer can end up on that AIL buffer list.  If
btree repairs are run twice in rapid succession, it's possible that the
first repair will invalidate the buffer and free it before the next time
the AIL wakes up.  This clears DELWRI_Q from the buffer state.

If the second repair allocates the same block, it will then recycle the
buffer to start writing the new btree block.  Meanwhile, if the AIL
wakes up and walks the buffer list, it will ignore the buffer because it
can't lock it, and go back to sleep.

When the second repair calls delwri_queue to put the buffer on the
list of buffers to write before committing the new btree, it will set
DELWRI_Q again, but since the buffer hasn't been removed from the AIL's
buffer list, it won't add it to the bulkload's buffer list.

This is incorrect, because the bulkload caller relies on delwri_submit
to ensure that all the buffers have been sent to disk /before/
committing the new btree root pointer.  This ordering is required for
data consistency.

Worse, the AIL won't clear DELWRI_Q from the buffer when it does finally
drop it, so the next thread to walk through the btree will trip over a
debug assertion on that flag.

To fix this, create a new function that waits for the buffer to be
removed from any other delwri lists before adding the buffer to the
caller's delwri list.  By waiting for the buffer to clear both the
delwri list and any potential delwri wait list, we can be sure that
repair will initiate writes of all buffers and report all write errors
back to userspace instead of committing the new structure.
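
For illustration, the calling pattern this guarantees looks roughly like
the following sketch.  Everything here except xfs_buf_delwri_submit() is
a hypothetical placeholder, not the actual repair code:

	int
	xrep_bulkload_commit_sketch(
		struct list_head	*buffers_list)
	{
		int			error;

		/*
		 * Write all of the new btree blocks and collect any IO
		 * errors here, while the repair can still be cancelled...
		 */
		error = xfs_buf_delwri_submit(buffers_list);
		if (error)
			return error;

		/* ...and only then commit the new btree root pointer. */
		return xrep_commit_new_root_sketch();
	}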

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_btree_staging.c |    4 +--
 fs/xfs/xfs_buf.c                  |   47 ++++++++++++++++++++++++++++++++++---
 fs/xfs/xfs_buf.h                  |    1 +
 3 files changed, 45 insertions(+), 7 deletions(-)

Comments

Christoph Hellwig Nov. 25, 2023, 5:49 a.m. UTC | #1
The code makes me feel we're fixing the wrong thing, but I have to
admit I don't fully understand it. Let me go step by step.

> While stress-testing online repair of btrees, I noticed periodic
> assertion failures from the buffer cache about buffer readers
> encountering buffers with DELWRI_Q set, even though the btree bulk load
> had already committed and the buffer itself wasn't on any delwri list.

Which assert do these buffer readers hit?  The two I can spot that are in
the read path in the broader sense are:

  1) in xfs_buf_find_lock for stale buffers.
  2) in __xfs_buf_submit just before I/O submission
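
(From memory, both reduce to the same check, something like
ASSERT(!(bp->b_flags & _XBF_DELWRI_Q)).)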

> I traced this to a misunderstanding of how the delwri lists work,
> particularly with regards to the AIL's buffer list.  If a buffer is
> logged and committed, the buffer can end up on that AIL buffer list.  If
> btree repairs are run twice in rapid succession, it's possible that the
> first repair will invalidate the buffer and free it before the next time
> the AIL wakes up.  This clears DELWRI_Q from the buffer state.

Where "this clears" is xfs_buf_stale called from xfs_btree_free_block
via xfs_trans_binval?

> If the second repair allocates the same block, it will then recycle the
> buffer to start writing the new btree block.

If my above theory is correct: how do we end up reusing a stale buffer?
If not, what am I misunderstanding above?

> Meanwhile, if the AIL
> wakes up and walks the buffer list, it will ignore the buffer because it
> can't lock it, and go back to sleep.

And I think this is where the trouble starts - we have a buffer that
is left on some delwri list, but with the _XBF_DELWRI_Q flag cleared,
it is stale and we then reuse it.  I think we need to kick it off the
delwri list not just for btree staging, but in general.

> 
> When the second repair calls delwri_queue to put the buffer on the
> list of buffers to write before committing the new btree, it will set
> DELWRI_Q again, but since the buffer hasn't been removed from the AIL's
> buffer list, it won't add it to the bulkload's buffer list.
>
> This is incorrect, because the bulkload caller relies on delwri_submit
> to ensure that all the buffers have been sent to disk /before/
> committing the new btree root pointer.  This ordering is required for
> data consistency.
>
> Worse, the AIL won't clear DELWRI_Q from the buffer when it does finally
> drop it, so the next thread to walk through the btree will trip over a
> debug assertion on that flag.

Where does it finally drop it?

> To fix this, create a new function that waits for the buffer to be
> removed from any other delwri lists before adding the buffer to the
> caller's delwri list.  By waiting for the buffer to clear both the
> delwri list and any potential delwri wait list, we can be sure that
> repair will initiate writes of all buffers and report all write errors
> back to userspace instead of committing the new structure.

If my understanding above is correct this just papers over the bug
that a buffer that is marked stale and can be reused for something
else is left on a delwri list.  I haven't entirely thought through all
the consequences, but here is what I'd try:

 - if xfs_buf_find_lock finds a stale buffer with _XBF_DELWRI_Q
   call your new wait code instead of asserting (probably only
   for the !trylock case)
 - make sure we don't leak DELWRI_Q
Darrick J. Wong Nov. 28, 2023, 1:50 a.m. UTC | #2
On Fri, Nov 24, 2023 at 09:49:28PM -0800, Christoph Hellwig wrote:
> The code makes me feel we're fixing the wrong thing, but I have to
> admit I don't fully understand it. Let me go step by step.
> 
> > While stress-testing online repair of btrees, I noticed periodic
> > assertion failures from the buffer cache about buffer readers
> > encountering buffers with DELWRI_Q set, even though the btree bulk load
> > had already committed and the buffer itself wasn't on any delwri list.
> 
> Which assert do these buffer readers hit?  The two I can spot that are in
> the read path in the broader sense are:
> 
>   1) in xfs_buf_find_lock for stale buffers.
>   2) in __xfs_buf_submit just before I/O submission

The second assert:

AIL:	Repair0:	Repair1:

pin buffer X
delwri_queue:
set DELWRI_Q
add to delwri list

	stale buf X:
	clear DELWRI_Q
	does not clear b_list
	free space X
	commit

delwri_submit	# oops

Though there's more to this.

> > I traced this to a misunderstanding of how the delwri lists work,
> > particularly with regards to the AIL's buffer list.  If a buffer is
> > logged and committed, the buffer can end up on that AIL buffer list.  If
> > btree repairs are run twice in rapid succession, it's possible that the
> > first repair will invalidate the buffer and free it before the next time
> > the AIL wakes up.  This clears DELWRI_Q from the buffer state.
> 
> Where "this clears" is xfs_buf_stale called from xfs_btree_free_block
> via xfs_trans_binval?

Yes.  I'll rework the last sentence to read: "Marking the buffer stale
clears DELWRI_Q from the buffer state without removing the buffer from
its delwri list."

> > If the second repair allocates the same block, it will then recycle the
> > buffer to start writing the new btree block.
> 
> If my above theory is correct: how do we end up reusing a stale buffer?
> If not, what am I misunderstanding above?

If we free a metadata buffer, we'll mark it stale, update the bnobt, and
add an entry to the extent busy list.  If a later repair finds:

1. The same free space in the bnobt;
2. The free space exactly coincides with one extent busy list entry;
3. The entry isn't in the middle of discarding the block;

Then the allocator will remove the extent busy list entry and let us
have the space.  At that point we could have a stale buffer that's also
on one of the AIL's lists:

AIL:	Repair0:	Repair1:

pin buffer X
delwri_queue:
set DELWRI_Q
add to delwri list

	stale buf X:
	clear DELWRI_Q
	does not clear b_list
	free space X
	commit

			find free space X
			get buffer
			rewrite buffer
			delwri_queue:
			set DELWRI_Q
			already on a list, do not add
			commit

			BAD: committed tree root before all blocks written

delwri_submit	# too late now

I could demonstrate this by injecting dmerror while delwri(ting) the new
btree blocks, and observing that some of the EIOs would not trigger the
rollback but instead would trip IO errors in the AIL.

> > wakes up and walks the buffer list, it will ignore the buffer because it
> > can't lock it, and go back to sleep.
> 
> And I think this is where the trouble starts - we have a buffer that
> is left on some delwri list, but with the _XBF_DELWRI_Q flag cleared,
> it is stale and we then reuse it.  I think we need to kick it off the
> delwri list not just for btree staging, but in general.

...but how to do that?  We don't know whose delwri list the buffer's on,
let alone how to lock the list so that we can remove the buffer from
that list.
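
The structural reason, as a sketch: the only delwri linkage a buffer
carries is the bare list node in struct xfs_buf, so nothing records
which list head it hangs off or what lock serializes that list:

	struct xfs_buf {
		/* ... */
		struct list_head	b_list;	/* delwri linkage only */
		/* ... */
	};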

(Oh, you have some suggestions below.)

> > When the second repair calls delwri_queue to put the buffer on the
> > list of buffers to write before committing the new btree, it will set
> > DELWRI_Q again, but since the buffer hasn't been removed from the AIL's
> > buffer list, it won't add it to the bulkload's buffer list.
> >
> > This is incorrect, because the bulkload caller relies on delwri_submit
> > to ensure that all the buffers have been sent to disk /before/
> > committing the new btree root pointer.  This ordering is required for
> > data consistency.
> >
> > Worse, the AIL won't clear DELWRI_Q from the buffer when it does finally
> > drop it, so the next thread to walk through the btree will trip over a
> > debug assertion on that flag.
> 
> Where does it finally drop it?

TBH, it's been so long that I can't remember anymore. :(

> > To fix this, create a new function that waits for the buffer to be
> > removed from any other delwri lists before adding the buffer to the
> > caller's delwri list.  By waiting for the buffer to clear both the
> > delwri list and any potential delwri wait list, we can be sure that
> > repair will initiate writes of all buffers and report all write errors
> > back to userspace instead of committing the new structure.
> 
> If my understanding above is correct this just papers over the bug
> that a buffer that is marked stale and can be reused for something
> else is left on a delwri list.

I'm not sure it's a bug for *most* code that encounters it.  Regular
transactions don't directly use the delwri lists, so they will never see
that at all.  If the buffer gets rewritten and logged, then it'll just
end up on the AIL's delwri buffer list again.

The other delwri_submit users don't seem to care if the buffer gets
written directly by delwri_submit or by the AIL.  In the first case, the
caller will get the EIO and force a shutdown; in the second case, the
AIL will shut down the log.

Weirdly, xfs_bwrite clears DELWRI_Q but doesn't necessarily shut down the
fs if the write fails.

Every time I revisit this patch I wonder if DELWRI_Q is misnamed -- does
it mean "b_list is active"?  Or does it merely mean "another thread will
write this buffer via b_list if nobody gets there first"?  I think it's
the second, since it's easy enough to list_empty().

It's only repair that requires this new behavior that all new buffers
have to be written through the delwri list, and only because it actually
/can/ cancel the transaction by finishing the autoreap EFIs.

> I haven't entirely thought through all the
> consequences, but here is what I'd try:
> 
>  - if xfs_buf_find_lock finds a stale buffer with _XBF_DELWRI_Q
>    call your new wait code instead of asserting (probably only
>    for the !trylock case)
>  - make sure we don't leak DELWRI_Q 

Way back when I first discovered this, my first impulse was to make
xfs_buf_stale wait for DELWRI_Q to clear.  That IIRC didn't work because
a transaction could hold an inode that the AIL will need to lock.  I
think xfs_buf_find_lock would have the same problem.

Seeing as repair is the only user with the requirement that it alone can
issue writes for the delwri list buffers, that's why I went with what's
in this patch.

Thank you for persevering so far! :D

--D
Christoph Hellwig Nov. 28, 2023, 7:13 a.m. UTC | #3
> > If my understanding above is correct this just papers over the bug
> > that a buffer that is marked stale and can be reused for something
> > else is left on a delwri list.
> 
> I'm not sure it's a bug for *most* code that encounters it.  Regular
> transactions don't directly use the delwri lists, so they will never see
> that at all. If the buffer gets rewritten and logged, then it'll just
> end up on the AIL's delwri buffer list again.

Where "regular transactions" means data I/O, yes.  All normal metadata
I/O eventually ends up on a delwri list.  And we want it on that delwri
list after actually updating the contents.

> Every time I revisit this patch I wonder if DELWRI_Q is misnamed -- does
> it mean "b_list is active"?  Or does it merely mean "another thread will
> write this buffer via b_list if nobody gets there first"?  I think it's
> the second, since it's easy enough to list_empty().

I think it is misnamed and not clearly defined, and we should probably
fix that.  Note that at least the current list_empty() usage isn't quite
correct either without a lock held by the delwri list owner.  We at
least need list_empty_careful() for a racy but still safe check.
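
For illustration, the shape of that check (a sketch, not proposed code;
list_empty_careful() is the stock helper from <linux/list.h>):

	/*
	 * Unlocked peek at whether a buffer sits on some delwri list.
	 * list_empty_careful() tests both ->next and ->prev, so it stays
	 * sane against a racing list_del_init(), which is the only
	 * mutation expected here; a plain list_empty() would not.
	 */
	static inline bool
	xfs_buf_on_delwri_list_sketch(struct xfs_buf *bp)
	{
		return !list_empty_careful(&bp->b_list);
	}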

Thinking out loud I wonder if we can just kill XBF_DELWRI_Q entirely.
Except for various asserts we mostly use it in reverse polarity, that is
to cancel delwri writeout for stale buffers.  How about just skipping
XBF_STALE buffers on the delwri list directly and not bother with
DELWRI_Q?  With that we can then support multiple delwri lists that
don't get into each other's way using, say, an allocating xarray instead
of the invasive list and avoid this whole mess.

Let me try to prototype this..
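
The shape of the idea (a hypothetical sketch of the submit-side loop,
not the actual prototype; locking and reference counting elided):

	struct xfs_buf		*bp, *n;

	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
		/* Skip stale buffers here instead of depending on
		 * _XBF_DELWRI_Q having been cleared at stale time. */
		if (bp->b_flags & XBF_STALE) {
			list_del_init(&bp->b_list);
			continue;
		}
		/* ... lock and __xfs_buf_submit() as before ... */
	}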

> It's only repair that requires this new behavior that all new buffers
> have to be written through the delwri list, and only because it actually
> /can/ cancel the transaction by finishing the autoreap EFIs.

Right now yes.  But I think the delwri behavior is a land mine, and
this just happens to be the first user to trigger it.  Edit: while
looking through the DELWRI_Q usage I noticed xfs_qm_flush_one, which
seems to deal with what is at least a somewhat related issue based
on the comments there.

> Way back when I first discovered this, my first impulse was to make
> xfs_buf_stale wait for DELWRI_Q to clear.  That IIRC didn't work because
> a transaction could hold an inode that the AIL will need to lock.  I
> think xfs_buf_find_lock would have the same problem.

Yes, that makes sense.  Would be great to document such details in the
commit message..
Christoph Hellwig Nov. 28, 2023, 3:18 p.m. UTC | #4
On Mon, Nov 27, 2023 at 11:13:40PM -0800, Christoph Hellwig wrote:
> > Every time I revisit this patch I wonder if DELWRI_Q is misnamed -- does
> > it mean "b_list is active"?  Or does it merely mean "another thread will
> > write this buffer via b_list if nobody gets there first"?  I think it's
> > the second, since it's easy enough to list_empty().
> 
> I think it is misnamed and not clearly defined, and we should probably
> fix that.  Note that at least the current list_empty() usage isn't quite
> correct either without a lock held by the delwri list owner.  We at
> least need list_empty_careful() for a racy but still safe check.
> 
> Thinking out loud I wonder if we can just kill XBF_DELWRI_Q entirely.
> Except for various asserts we mostly use it in reverse polarity, that is
> to cancel delwri writeout for stale buffers.  How about just skipping
> XBF_STALE buffers on the delwri list directly and not bother with
> DELWRI_Q?  With that we can then support multiple delwri lists that
> don't get into each other's way using, say, an allocating xarray instead
> of the invasive list and avoid this whole mess.
> 
> Let me try to prototype this..

Ok, I spent half the day prototyping this and it passes basic sanity
checks:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/xfs-multiple-delwri-lists

My conclusions from that is:

 1) it works
 2) I think it is the right thing to do
 3) it needs a lot more work
 4) we can't block the online repair work on it

so I guess we'll need to go with the approach in this patch for now,
maybe with a better commit log, and I'll look into finishing this work
some time in the future.
Darrick J. Wong Nov. 28, 2023, 5:07 p.m. UTC | #5
On Tue, Nov 28, 2023 at 07:18:22AM -0800, Christoph Hellwig wrote:
> On Mon, Nov 27, 2023 at 11:13:40PM -0800, Christoph Hellwig wrote:
> > > Every time I revisit this patch I wonder if DELWRI_Q is misnamed -- does
> > > it mean "b_list is active"?  Or does it merely mean "another thread will
> > > write this buffer via b_list if nobody gets there first"?  I think it's
> > > the second, since it's easy enough to list_empty().
> > 
> > I think it is misnamed and not clearly defined, and we should probably
> > fix that.  Note that at least the current list_empty() usage isn't quite
> > correct either without a lock held by the delwri list owner.  We at
> > least need list_empty_careful() for a racy but still safe check.
> > 
> > Thinking out loud I wonder if we can just kill XBF_DELWRI_Q entirely.
> > Except for various asserts we mostly use it in reverse polarity, that is
> > to cancel delwri writeout for stale buffers.  How about just skipping
> > XBF_STALE buffers on the delwri list directly and not bother with
> > DELWRI_Q?  With that we can then support multiple delwri lists that
> > don't get into each other's way using, say, an allocating xarray instead
> > of the invasive list and avoid this whole mess.
> > 
> > Let me try to prototype this..
> 
> Ok, I spent half the day prototyping this and it passes basic sanity
> checks:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/xfs-multiple-delwri-lists
> 
> My conclusions from that is:
> 
>  1) it works
>  2) I think it is the right thing to do
>  3) it needs a lot more work
>  4) we can't block the online repair work on it
> 
> so I guess we'll need to go with the approach in this patch for now,
> maybe with a better commit log, and I'll look into finishing this work
> some time in the future.

<nod> I think an xarray version of this would be less clunky than
xfs_delwri_buf with three pointers.

Also note Dave's comments on the v25 posting of this series:
https://lore.kernel.org/linux-xfs/ZJJa7Cnni0mb%2F9sN@dread.disaster.area/

--D
Christoph Hellwig Nov. 30, 2023, 4:33 a.m. UTC | #6
On Tue, Nov 28, 2023 at 09:07:34AM -0800, Darrick J. Wong wrote:
> > so I guess we'll need to go with the approach in this patch for now,
> > maybe with a better commit log, and I'll look into finishing this work
> > some time in the future.
> 
> <nod> I think an xarray version of this would be less clunky than
> xfs_delwri_buf with three pointers.

It would, but xarray indices are unsigned longs, so we couldn't simply
use the block number for it.  So unless we decided to give up on
building XFS on 32-bit kernels (which would be nice for other reasons),
it would need something more complicated.
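
Concretely (a sketch of the mismatch; sketch_track_buf() is a
hypothetical name):

	#include <linux/xarray.h>

	/*
	 * xarray indices are unsigned long -- 32 bits on a 32-bit kernel --
	 * while XFS disk addresses (xfs_daddr_t) are always 64 bits, so the
	 * cast below would truncate large block numbers.
	 */
	static int
	sketch_track_buf(struct xarray *xa, xfs_daddr_t daddr,
			struct xfs_buf *bp)
	{
		return xa_err(xa_store(xa, (unsigned long)daddr, bp,
				GFP_KERNEL));
	}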

Patch

diff --git a/fs/xfs/libxfs/xfs_btree_staging.c b/fs/xfs/libxfs/xfs_btree_staging.c
index dd75e208b543e..29e3f8ccb1852 100644
--- a/fs/xfs/libxfs/xfs_btree_staging.c
+++ b/fs/xfs/libxfs/xfs_btree_staging.c
@@ -342,9 +342,7 @@  xfs_btree_bload_drop_buf(
 	if (*bpp == NULL)
 		return;
 
-	if (!xfs_buf_delwri_queue(*bpp, buffers_list))
-		ASSERT(0);
-
+	xfs_buf_delwri_queue_here(*bpp, buffers_list);
 	xfs_buf_relse(*bpp);
 	*bpp = NULL;
 }
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 545c7991b9b58..f88b1334b420c 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -2049,6 +2049,14 @@  xfs_alloc_buftarg(
 	return NULL;
 }
 
+static inline void
+xfs_buf_list_del(
+	struct xfs_buf		*bp)
+{
+	list_del_init(&bp->b_list);
+	wake_up_var(&bp->b_list);
+}
+
 /*
  * Cancel a delayed write list.
  *
@@ -2066,7 +2074,7 @@  xfs_buf_delwri_cancel(
 
 		xfs_buf_lock(bp);
 		bp->b_flags &= ~_XBF_DELWRI_Q;
-		list_del_init(&bp->b_list);
+		xfs_buf_list_del(bp);
 		xfs_buf_relse(bp);
 	}
 }
@@ -2119,6 +2127,37 @@  xfs_buf_delwri_queue(
 	return true;
 }
 
+/*
+ * Queue a buffer to this delwri list as part of a data integrity operation.
+ * If the buffer is on any other delwri list, we'll wait for that to clear
+ * so that the caller can submit the buffer for IO and wait for the result.
+ * Callers must ensure the buffer is not already on the list.
+ */
+void
+xfs_buf_delwri_queue_here(
+	struct xfs_buf		*bp,
+	struct list_head	*buffer_list)
+{
+	/*
+	 * We need this buffer to end up on the /caller's/ delwri list, not any
+	 * old list.  This can happen if the buffer is marked stale (which
+	 * clears DELWRI_Q) after the AIL queues the buffer to its list but
+	 * before the AIL has a chance to submit the list.
+	 */
+	while (!list_empty(&bp->b_list)) {
+		xfs_buf_unlock(bp);
+		wait_var_event(&bp->b_list, list_empty(&bp->b_list));
+		xfs_buf_lock(bp);
+	}
+
+	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
+
+	/* This buffer is uptodate; don't let it get reread. */
+	bp->b_flags |= XBF_DONE;
+
+	xfs_buf_delwri_queue(bp, buffer_list);
+}
+
 /*
  * Compare function is more complex than it needs to be because
  * the return value is only 32 bits and we are doing comparisons
@@ -2181,7 +2220,7 @@  xfs_buf_delwri_submit_buffers(
 		 * reference and remove it from the list here.
 		 */
 		if (!(bp->b_flags & _XBF_DELWRI_Q)) {
-			list_del_init(&bp->b_list);
+			xfs_buf_list_del(bp);
 			xfs_buf_relse(bp);
 			continue;
 		}
@@ -2201,7 +2240,7 @@  xfs_buf_delwri_submit_buffers(
 			list_move_tail(&bp->b_list, wait_list);
 		} else {
 			bp->b_flags |= XBF_ASYNC;
-			list_del_init(&bp->b_list);
+			xfs_buf_list_del(bp);
 		}
 		__xfs_buf_submit(bp, false);
 	}
@@ -2255,7 +2294,7 @@  xfs_buf_delwri_submit(
 	while (!list_empty(&wait_list)) {
 		bp = list_first_entry(&wait_list, struct xfs_buf, b_list);
 
-		list_del_init(&bp->b_list);
+		xfs_buf_list_del(bp);
 
 		/*
 		 * Wait on the locked buffer, check for errors and unlock and
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index c86e164196568..b470de08a46ca 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -319,6 +319,7 @@  extern void xfs_buf_stale(struct xfs_buf *bp);
 /* Delayed Write Buffer Routines */
 extern void xfs_buf_delwri_cancel(struct list_head *);
 extern bool xfs_buf_delwri_queue(struct xfs_buf *, struct list_head *);
+void xfs_buf_delwri_queue_here(struct xfs_buf *bp, struct list_head *bl);
 extern int xfs_buf_delwri_submit(struct list_head *);
 extern int xfs_buf_delwri_submit_nowait(struct list_head *);
 extern int xfs_buf_delwri_pushbuf(struct xfs_buf *, struct list_head *);