From patchwork Wed Feb 19 00:49:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13981250
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E91BB186294;
	Wed, 19 Feb 2025 00:49:48 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739926189; cv=none;
	b=kJG9zrsE2+vDLvciJpitfdB2vNlb4d84rtBm3DiVFzHU9rz6veQuAXLjzGXW02Q80jWC7/XlRUmuvLGk81Fbffg4z9TEehSKk9XOdzoubO25yVdT7gHVsU63ShpNvQ1hDQB7Zg0dHdaQJy8gzV/GVyTWloPhHfj0YmyB0MThry4=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739926189; c=relaxed/simple;
	bh=vY1gfLO5bZF2Cl00vzdm0c5onlHSrw/mq5Hnh/8rccw=;
	h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
	b=b1PiEwClmg74OMQRkq0vbF2HsIqvfSFHl8dTGZCwKn6MvEs5OwCAQKQ8ne8MHRKyx1NSlMvYlSu3WnmY1E4igP4zgO/ClSDh1ZD4JEnRN5ZsX59HMlI7+WPs5AyyFwbdD2H7bkz5EvEvCURQJMW+uQfaVb3piVogtwFv35yt2yo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Y4DMfgPa;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Y4DMfgPa"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id B5754C4CEE2;
	Wed, 19 Feb 2025 00:49:48 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739926188;
	bh=vY1gfLO5bZF2Cl00vzdm0c5onlHSrw/mq5Hnh/8rccw=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=Y4DMfgPacZNp3HX5/3lO0rlCYpjvN9icZGHcqwbHECeCkZh/+BG+FrHBZGjWwY16a
	 j8T9loImTzp8FHyqW/wCPiaJfXtg5wPtpY0JUlsyhyfJG2LM3bolYXOtfIWNrm1Hsa
	 j0vQMLThx2HQlqov3bmH9/DGf1DYuQzKXHuEEppi/M9LxXmoq0ATo5mcNDa4a1xFVE
	 yIL/P9nqkZFD8n0TGnxRco5f/QuItYUKc08bOWhPH7g+vCOfpwdZMYAIOJtND9szSk
	 gpv6WxPA49frM5dY5M0MLHZc16K4t/ZGp1+EEel8ciYaodvX998XiDjyfKh6l2aLMF
	 8OdhkQcHzlUqw==
Date: Tue, 18 Feb 2025 16:49:48 -0800
Subject: [PATCH 1/3] logwrites: warn if we don't think read after discard
 returns zeroes
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: hch@lst.de, linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Message-ID: <173992586985.4078081.2930161551208349551.stgit@frogsfrogsfrogs>
In-Reply-To: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
References: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

The logwrites replay program expects that it can issue a DISCARD against
the block device passed to _log_writes_init and that will cause all
subsequent reads to return zeroes. This is required for correct log
recovery on filesystems such as XFS that skip recovering buffers if
newer ones are found on disk.

Unfortunately, there's no way to discover if a device's discard
implementation actually guarantees zeroes. There used to be a sysfs
knob keyed to an allowlist, but it is now hardwired to return 0. So
either we need a magic device that does discard-and-zero, or we need to
do the zeroing ourselves. The logwrites program does its own zeroing if
there is no discard support, and some tests do their own zeroing.

The only devices we know to work reliably are the software defined ones
that are provided by the kernel itself -- which means dm-thinp. Warn if
we have a device that supports discard that isn't thinp and the test
fails.

Signed-off-by: "Darrick J. Wong"
---
 common/dmlogwrites | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/common/dmlogwrites b/common/dmlogwrites
index a27e1966a933a6..96101d53c38b4a 100644
--- a/common/dmlogwrites
+++ b/common/dmlogwrites
@@ -59,6 +59,35 @@ _require_log_writes_dax_mountopt()
 	fi
 }
 
+_log_writes_check_bdev()
+{
+	local sysfs="/sys/block/$(_short_dev $1)"
+
+	# Some filesystems (e.g. XFS) optimize log recovery by assuming that
+	# they can elide replay of metadata blocks if the block has a higher
+	# log serial number than the transaction being recovered. This is a
+	# problem if the filesystem log contents can go back in time, which is
+	# what the logwrites replay program does.
+	#
+	# The logwrites replay program begins by erasing the block device's
+	# contents. This can be done very quickly with DISCARD provided the
+	# device guarantees that all reads after a DISCARD return zeroes, or
+	# very slowly by writing zeroes to the device. Fast is preferable, but
+	# there's no longer any way to detect that DISCARD actually unmaps
+	# zeroes, so warn the user about this requirement if the test happens
+	# to fail.
+
+	# No discard support means the logwrites will do its own zeroing
+	test "$(cat "$sysfs/queue/discard_max_bytes")" -eq 0 && return
+
+	# dm-thinp guarantees that reads after discards return zeroes
+	dmsetup status "$blkdev" 2>/dev/null | grep -q '^0.* thin ' && return
+
+	echo "HINT: $blkdev doesn't guarantee that reads after DISCARD will return zeroes" >> $seqres.hints
+	echo " This is required for correct journal replay on some filesystems (e.g. xfs)" >> $seqres.hints
+	echo >> $seqres.hints
+}
+
 # Set up a dm-log-writes device
 #
 # blkdev: the specified target device
@@ -84,6 +113,8 @@ _log_writes_init()
 	LOGWRITES_NAME=logwrites-test
 	LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME
 	LOGWRITES_TABLE="0 $BLK_DEV_SIZE log-writes $blkdev $LOGWRITES_DEV"
+
+	_log_writes_check_bdev "$blkdev"
 	_dmsetup_create $LOGWRITES_NAME --table "$LOGWRITES_TABLE" || \
 		_fail "failed to create log-writes device"
 }

From patchwork Wed Feb 19 00:50:03 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13981251
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA1CB180A80;
	Wed, 19 Feb 2025 00:50:04 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739926205; cv=none;
	b=T/fp1HmBFrQXwF3L8sQALzJdQVmmNCkb92EpuoDODnKFxItUINZynMBMh0bwNRaZFrEh2UYGkEOt+JC9k2fm6Ll0N5YKPcBLXv8JAa7b302Rfd8SC6XL0d4Ng9sZxopAqnj6fZpiTuICwzdnfgs2NLPUh22wdk0NTT8FFb8qqls=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739926205; c=relaxed/simple;
	bh=1gZot46ERbQd78/irKGOKdgoQtlhQZQdfk1dh7VJHeI=;
	h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
	b=lpWUVK1ylJncqGJ54q7B0m7MHOWZ84CcpPMx+4P+hDpMFQbYVpvZTE5eLyqHHOvW/qYnCkntFveCnFXRKg9tiWf9a5s5Evm9pY5v9SqQTIRhltpxMxy0nDj2BDBVMNXfj7XXLSqI+gDNt+hPde6n7gynlLjsACu7GPGftRn4V5M=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=POfdrRQX;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="POfdrRQX"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 550E8C4CEE2;
	Wed, 19 Feb 2025 00:50:04 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739926204;
	bh=1gZot46ERbQd78/irKGOKdgoQtlhQZQdfk1dh7VJHeI=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=POfdrRQXo+2nx4I823dr5rDDiXSKw3zjMeD5zRZDiQ7Qyo6Q2cscZi2mg9/cNW5R8
	 kJSuMh/nWBBleHFzCOuygzxAcqmEj3ZuFJXTKgVlzQmVyvcuYy3wYXapuQdweanPAj
	 ZNLBnkoNBO/pdDzkEd8BJzCxl+yapBLwA4ZJeKR2+A6WGy2ZTdfCM5+TaFkxnW74AL
	 HopHlVMxXjNh8cSPf6v3VSsDM3uGmaCRCqhZ6oAEYw9FdtY2D2H2JfM8xTi6Eewl/f
	 /eg0P515N1DrD/8wBKo0LwxlmwU4xDfaXDqH/vNij68YWfsXyoRP/KGmxzY63hXNbg
	 wl3GQJvEEg3ww==
Date: Tue, 18 Feb 2025 16:50:03 -0800
Subject: [PATCH 2/3] logwrites: use BLKZEROOUT if it's available
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: hch@lst.de, linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Message-ID: <173992587004.4078081.9861006903580046263.stgit@frogsfrogsfrogs>
In-Reply-To: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
References: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Use the BLKZEROOUT ioctl instead of writing zeroed buffers if the kernel
supports it.

Signed-off-by: "Darrick J. Wong"
---
 src/log-writes/log-writes.h | 1 +
 src/log-writes/log-writes.c | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h
index b9f571ac3b2384..f659931634e64a 100644
--- a/src/log-writes/log-writes.h
+++ b/src/log-writes/log-writes.h
@@ -63,6 +63,7 @@ struct log_write_entry {
 
 #define LOG_IGNORE_DISCARD (1 << 0)
 #define LOG_DISCARD_NOT_SUPP (1 << 1)
+#define LOG_ZEROOUT_NOT_SUPP (1 << 2)
 
 struct log {
 	int logfd;
diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c
index aa53473974d9e8..8f94ae5629e085 100644
--- a/src/log-writes/log-writes.c
+++ b/src/log-writes/log-writes.c
@@ -42,6 +42,7 @@ static int discard_range(struct log *log, u64 start, u64 len)
 
 static int zero_range(struct log *log, u64 start, u64 len)
 {
+	u64 range[2] = { start, len };
 	u64 bufsize = len;
 	ssize_t ret;
 	char *buf = NULL;
@@ -54,6 +55,15 @@ static int zero_range(struct log *log, u64 start, u64 len)
 		return 0;
 	}
 
+	if (!(log->flags & LOG_ZEROOUT_NOT_SUPP)) {
+		if (ioctl(log->replayfd, BLKZEROOUT, &range) < 0) {
+			if (log_writes_verbose)
+				printf(
+ "replay device doesn't support zeroout, switching to writing zeros\n");
+			log->flags |= LOG_ZEROOUT_NOT_SUPP;
+		}
+	}
+
 	while (!buf) {
 		buf = malloc(bufsize);
 		if (!buf)

From patchwork Wed Feb 19 00:50:19 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13981252
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BE0F1586C8;
	Wed, 19 Feb 2025 00:50:20 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739926220; cv=none;
	b=goF8ByO5qxILATiYlGB/kyRvNGlWx+5RE6Oz7J02QAvNjdnVlcc6916INRyiq+GtNbo0n/ksHewHPHs7u1cUJPYHkx5Cc7vensLzp2YnEVLplAMnsVSYJpMGY4r2ra3Wr95cQuQz5xUq9KG11zcoV5ZRBPjezJINmeWRnp11mjU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739926220; c=relaxed/simple;
	bh=TsSMnjKtE+SIRtkHK/xoJs0IUyUvd3znhcLf9lWRSY0=;
	h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
	b=MvLWYIXl8sutkIgMjHCweYpMxnAC2mgAQ/JvJr+CTsKixw7Cy+QP4L2YNXGH03P8xr1z5wVBwylV6PdHS+aPAswHMjSwsyHkx4RbrW2s0eWU2+zfPVk6Uu1UYRPGExPQHpySJ0WB0iSC6FNFds+GJShCiEIS3spzz+5MuJd8qYc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=osEj5qcf;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="osEj5qcf"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 03F6FC4CEE2;
	Wed, 19 Feb 2025 00:50:19 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739926220;
	bh=TsSMnjKtE+SIRtkHK/xoJs0IUyUvd3znhcLf9lWRSY0=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=osEj5qcfW7ItjirzaorGAcw6hPBM7db3Ph+NKhQXHvZMdq4jx3etWfyvo8eUtmbNS
	 xbzRC9N6zzTKLFThRpq6jQjt46nUeeLn8PMjPpVVVlJKouBBuFwKV233dHyT9YGqw3
	 SBACNWDzOy5Y7bybu2T4/QaaQ/m8GgCLTOrLtZRygKaCDSQLkw7LBJgvLi7kqYf5LS
	 7nLt6hphWi0FRxZYdwArmzRLxsFuVQA0Q/n3WRYZ6jjHBlv8d8wXQ2z6SBW8yVYH2W
	 O/leP0S7bBnE/hk+NenJ281geKvvtZ0e/jQzXU02KUJ1EM/B5TzUReyY7jO/FBz1MZ
	 MZrvJX1Aam2lQ==
Date: Tue, 18 Feb 2025 16:50:19 -0800
Subject: [PATCH 3/3] logwrites: only use BLKDISCARD if we know discard zeroes
 data
From: "Darrick J. Wong"
To: zlang@redhat.com, djwong@kernel.org
Cc: hch@lst.de, linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Message-ID: <173992587022.4078081.2598103027724572042.stgit@frogsfrogsfrogs>
In-Reply-To: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
References: <173992586956.4078081.15131555531444924972.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Building off the checks established in the previous patch, only enable
the use of BLKDISCARD if we know that the logwrites device guarantees
that reads after a discard return zeroes.

Signed-off-by: "Darrick J. Wong"
---
 common/dmlogwrites | 10 ++++++++--
 src/log-writes/replay-log.c | 8 ++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/common/dmlogwrites b/common/dmlogwrites
index 96101d53c38b4a..fbc8beb5ce597e 100644
--- a/common/dmlogwrites
+++ b/common/dmlogwrites
@@ -81,7 +81,10 @@ _log_writes_check_bdev()
 	test "$(cat "$sysfs/queue/discard_max_bytes")" -eq 0 && return
 
 	# dm-thinp guarantees that reads after discards return zeroes
-	dmsetup status "$blkdev" 2>/dev/null | grep -q '^0.* thin ' && return
+	if dmsetup status "$blkdev" 2>/dev/null | grep -q '^0.* thin '; then
+		LOGWRITES_REPLAY_ARGS+=(--discard-zeroes-data)
+		return
+	fi
 
 	echo "HINT: $blkdev doesn't guarantee that reads after DISCARD will return zeroes" >> $seqres.hints
 	echo " This is required for correct journal replay on some filesystems (e.g. xfs)" >> $seqres.hints
@@ -110,6 +113,7 @@ _log_writes_init()
 		BLK_DEV_SIZE=$((length / blksz))
 	fi
 
+	LOGWRITES_REPLAY_ARGS=()
 	LOGWRITES_NAME=logwrites-test
 	LOGWRITES_DMDEV=/dev/mapper/$LOGWRITES_NAME
 	LOGWRITES_TABLE="0 $BLK_DEV_SIZE log-writes $blkdev $LOGWRITES_DEV"
@@ -161,7 +165,8 @@ _log_writes_replay_log()
 	[ $? -ne 0 ] && _fail "mark '$_mark' does not exist"
 
 	$here/src/log-writes/replay-log --log $LOGWRITES_DEV --replay $_blkdev \
-		--end-mark $_mark >> $seqres.full 2>&1
+		--end-mark $_mark "${LOGWRITES_REPLAY_ARGS[@]}" \
+		>> $seqres.full 2>&1
 	[ $? -ne 0 ] && _fail "replay failed"
 }
 
@@ -231,6 +236,7 @@ _log_writes_replay_log_range()
 	echo "=== replay to $end ===" >> $seqres.full
 	$here/src/log-writes/replay-log -vv --log $LOGWRITES_DEV \
 		--replay $blkdev --limit $(($end + 1)) \
+		"${LOGWRITES_REPLAY_ARGS[@]}" \
 		>> $seqres.full 2>&1
 	[ $? -ne 0 ] && _fail "replay failed"
 }
diff --git a/src/log-writes/replay-log.c b/src/log-writes/replay-log.c
index 968c82ab64a9ad..e07401f63af573 100644
--- a/src/log-writes/replay-log.c
+++ b/src/log-writes/replay-log.c
@@ -18,6 +18,7 @@ enum option_indexes {
 	FIND,
 	NUM_ENTRIES,
 	NO_DISCARD,
+	DISCARD_ZEROES_DATA,
 	FSCK,
 	CHECK,
 	START_MARK,
@@ -37,6 +38,7 @@ static struct option long_options[] = {
 	{"find", no_argument, NULL, 0},
 	{"num-entries", no_argument, NULL, 0},
 	{"no-discard", no_argument, NULL, 0},
+	{"discard-zeroes-data", no_argument, NULL, 0},
 	{"fsck", required_argument, NULL, 0},
 	{"check", required_argument, NULL, 0},
 	{"start-mark", required_argument, NULL, 0},
@@ -155,6 +157,7 @@ int main(int argc, char **argv)
 	int ret;
 	int print_num_entries = 0;
 	int discard = 1;
+	int use_kernel_discard = 0;
 	enum log_replay_check_mode check_mode = 0;
 
 	while ((c = getopt_long(argc, argv, "v", long_options,
@@ -242,6 +245,9 @@ int main(int argc, char **argv)
 		case NO_DISCARD:
 			discard = 0;
 			break;
+		case DISCARD_ZEROES_DATA:
+			use_kernel_discard = 1;
+			break;
 		case FSCK:
 			fsck_command = strdup(optarg);
 			if (!fsck_command) {
@@ -299,6 +305,8 @@ int main(int argc, char **argv)
 
 	if (!discard)
 		log->flags |= LOG_IGNORE_DISCARD;
+	if (!use_kernel_discard)
+		log->flags |= LOG_DISCARD_NOT_SUPP;
 
 	log->start_sector = start_sector;
 	log->end_sector = end_sector;
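
For readers who want to try the decision logic without a block device, the
following is a minimal sketch of how the series, with all three patches
applied, appears to choose an erase strategy. It is not part of the patches.
The function names classify_erase_strategy and replay_extra_args are
hypothetical, and the sketch takes the discard_max_bytes value and the
"dmsetup status" output as plain arguments instead of reading sysfs, so it
can be run anywhere. Only the --discard-zeroes-data flag and the grep
pattern are taken from the diffs above.

```shell
# Hypothetical helpers, not part of fstests.  They mirror the checks in
# _log_writes_check_bdev but operate on values passed in by the caller.
#
#   $1: contents of /sys/block/<dev>/queue/discard_max_bytes
#   $2: output of "dmsetup status <dev>" (empty for a non-dm device)
#
# Prints one of:
#   zero     no discard support, the replay tool zeroes the device itself
#   discard  dm-thinp target, discard is trusted to return zeroes
#   hint     discard exists but is not trusted, so a hint is warranted
classify_erase_strategy()
{
	local discard_max_bytes="$1"
	local dm_status="$2"

	# No discard support at all.
	if [ "$discard_max_bytes" -eq 0 ]; then
		echo zero
		return
	fi

	# Same pattern the patches use to recognize a thin target.
	if printf '%s\n' "$dm_status" | grep -q '^0.* thin '; then
		echo discard
		return
	fi

	echo hint
}

# Print the extra replay-log argument patch 3 adds for trusted devices,
# or nothing for every other device.
replay_extra_args()
{
	if [ "$(classify_erase_strategy "$1" "$2")" = discard ]; then
		echo --discard-zeroes-data
	fi
}
```

The status strings used to exercise the sketch, such as "0 2097152 thin 0 -"
for a thin volume and "0 2097152 linear" for a linear target, are
illustrative values rather than captured output. Keeping the classification
separate from the sysfs and dmsetup reads is only a convenience for testing;
the patches do both in one function.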