diff mbox series

[1/3] btrfs: never defer I/O submission for fast CRC implementations

Message ID 20230503070615.1029820-2-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [1/3] btrfs: never defer I/O submission for fast CRC implementations | expand

Commit Message

Christoph Hellwig May 3, 2023, 7:06 a.m. UTC
Most modern hardware supports very fast accelerated crc32c calculation.
If that is supported the CPU overhead of the checksum calculation is
very limited, and offloading the calculation to special worker threads
has a lot of overhead for no gain.

E.g. on an Intel Optane device is actually very much slows down even
1M buffered writes with fio:

Unpatched:

write: IOPS=3316, BW=3316MiB/s (3477MB/s)(200GiB/61757msec); 0 zone resets

With synchronous CRCs:

write: IOPS=4882, BW=4882MiB/s (5119MB/s)(200GiB/41948msec); 0 zone resets

With a lot of variation during the unpatch run going down as low as
1100MB/s, while the synchronous CRC version has about the same peak write
speed but much lower dips, and fewer kworkers churning around.
Both tests had fio saturated at 100% CPU.

(thanks to Jens Axboe via Chris Mason for the benchmarking)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chris Mason <clm@fb.com>
---
 fs/btrfs/bio.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

Comments

Johannes Thumshirn May 3, 2023, 7:44 a.m. UTC | #1
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
diff mbox series

Patch

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 4b5220509186a8..e8a55605ce22fa 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -574,6 +574,12 @@  static void run_one_async_free(struct btrfs_work *work)
 
 static bool should_async_write(struct btrfs_bio *bbio)
 {
+	/*
+	 * Submit synchronously if the checksum implementation is fast.
+	 */
+	if (test_bit(BTRFS_FS_CSUM_IMPL_FAST, &bbio->fs_info->flags))
+		return false;
+
 	/*
 	 * If the I/O is not issued by fsync and friends, (->sync_writers != 0),
 	 * then try to defer the submission to a workqueue to parallelize the
@@ -583,18 +589,10 @@  static bool should_async_write(struct btrfs_bio *bbio)
 		return false;
 
 	/*
-	 * Submit metadata writes synchronously if the checksum implementation
-	 * is fast, or we are on a zoned device that wants I/O to be submitted
-	 * in order.
+	 * Zoned devices require I/O to be submitted in order.
 	 */
-	if (bbio->bio.bi_opf & REQ_META) {
-		struct btrfs_fs_info *fs_info = bbio->fs_info;
-
-		if (btrfs_is_zoned(fs_info))
-			return false;
-		if (test_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags))
-			return false;
-	}
+	if ((bbio->bio.bi_opf & REQ_META) && btrfs_is_zoned(bbio->fs_info))
+		return false;
 
 	return true;
 }