btrfs-progs: docs: extra notes about read-only scrub on read-write fs

Message ID 077b7ef71095dfaf92e605c515e384033fdc0dd5.1734826227.git.wqu@suse.com (mailing list archive)

Commit Message

Qu Wenruo Dec. 22, 2024, 12:10 a.m. UTC
[BUG]
There is a bug report that a read-only scrub on a read-write fs still
causes writes to the fs, and those writes will be caught if there is a
read-only block device in the storage stack.

This causes kernel warnings and errors when the transaction commit fails:

 BTRFS info (device dm-3): first mount of filesystem e18f0c40-88de-413f-9d7e-dcc8136ad6dd
 BTRFS info (device dm-3): using crc32c (crc32c-intel) checksum algorithm
 BTRFS info (device dm-3): using free-space-tree
 BTRFS info (device dm-3): scrub: started on devid 1
 Trying to write to read-only block-device md127
 btrfs_dev_stat_inc_and_print: 362 callbacks suppressed
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
 BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
 BTRFS: error (device dm-3) in btrfs_commit_transaction:2523: errno=-5 IO failure (Error while writing out transaction)
 BTRFS info (device dm-3 state E): forced readonly
 BTRFS warning (device dm-3 state E): Skipping commit of aborted transaction.
 BTRFS error (device dm-3 state EA): Transaction aborted (error -5)
 BTRFS: error (device dm-3 state EA) in cleanup_transaction:2017: errno=-5 IO failure
 BTRFS warning (device dm-3 state EA): failed setting block group ro: -5
 BTRFS info (device dm-3 state EA): scrub: not finished on devid 1 with status: -5

[CAUSE]
The root cause is inside btrfs_inc_block_group_ro(), where we need to
hold a transaction handle to prevent the transaction from being
committed until we hold ro_block_group_mutex.

Taking that handle starts an empty transaction by itself, so even if we
can mark the block group read-only without any extra work, we still need
to commit the new, empty transaction.

Unfortunately this means a RO scrub on a RW filesystem will always cause
the fs to be updated.
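
The effect can be illustrated with a toy model. This is only a sketch
under stated assumptions: ToyFs, join_transaction(), commit_transaction()
and mark_block_group_ro() are hypothetical names for illustration, not
the kernel implementation, and the toy commits right away for simplicity.

```python
# Toy model only: hypothetical names, not kernel code.
class ToyFs:
    def __init__(self):
        self.running_transaction = None  # None means no transaction is open
        self.commits = 0                 # every commit writes to the device

    def join_transaction(self):
        # Taking a handle opens a new transaction if none is running.
        if self.running_transaction is None:
            self.running_transaction = {"dirty_items": 0}
        return self.running_transaction

    def commit_transaction(self):
        # An open transaction is committed even when it is empty.
        if self.running_transaction is not None:
            self.commits += 1
            self.running_transaction = None


def mark_block_group_ro(fs):
    # The handle is held so the transaction cannot commit underneath us.
    handle = fs.join_transaction()
    # ... take the mutex and flip the in-memory read-only flag here;
    # nothing is dirtied through the handle ...
    assert handle["dirty_items"] == 0
    # The new, empty transaction still has to be committed.
    fs.commit_transaction()


fs = ToyFs()
mark_block_group_ro(fs)
print(fs.commits)
```

In the model, marking a single block group read-only is enough to
trigger one commit even though nothing was modified.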

[FIX]
The best fix is to make btrfs avoid committing empty transactions, but
even with that done, a read-only scrub on a rw mount can still cause
real metadata updates (e.g. allocating new chunks and updating device
error statistics).

Making a read-only scrub fully read-only on a read-write btrfs would be
very complex.

Thankfully, a read-only scrub on a read-write mount with a read-only
device in the storage stack is pretty rare, so a documentation update
should be enough.
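
For users who do need a scrub without writes, a pre-check along these
lines could confirm the mount is read-only before starting the scrub.
It is a minimal sketch, assuming the mount table uses the
/proc/self/mounts column layout; mount_is_read_only() and the sample
paths are hypothetical and not part of btrfs-progs.

```python
def mount_is_read_only(mounts_text, mount_point):
    """Return True or False for the last matching entry, None if absent.

    Expects /proc/self/mounts style lines:
    device mount_point fstype options dump pass
    Mount points containing spaces appear escaped there (e.g. \\040) and
    are not unescaped by this sketch.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # Options are comma separated; "ro" marks a read-only mount.
        result = "ro" in fields[3].split(",")
    return result


# Hypothetical sample data, not taken from the bug report.
SAMPLE = """\
/dev/mapper/data /mnt/data btrfs rw,relatime 0 0
/dev/sdb1 /mnt/backup btrfs ro,relatime 0 0
"""

print(mount_is_read_only(SAMPLE, "/mnt/data"))
print(mount_is_read_only(SAMPLE, "/mnt/backup"))
```

On a real system the text would come from reading /proc/self/mounts, and
the scrub (btrfs scrub start -r) would only be started when the function
returns True. The check looks at mount options only, so it is a
pre-check rather than a guarantee.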

Issue: #934
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 Documentation/btrfs-scrub.rst    |  6 ++++++
 Documentation/ch-scrub-intro.rst | 10 ++++++++++
 2 files changed, 16 insertions(+)
Patch

diff --git a/Documentation/btrfs-scrub.rst b/Documentation/btrfs-scrub.rst
index 86c39b20cbde..539b67924b39 100644
--- a/Documentation/btrfs-scrub.rst
+++ b/Documentation/btrfs-scrub.rst
@@ -89,6 +89,12 @@  start [-BdrRf] <path>|<device>
         -r
                 run in read-only mode, do not attempt to correct anything, can
                 be run on a read-only filesystem
+
+		Note that a read-only scrub on a read-write filesystem can
+		still cause writes to the filesystem due to some internal
+		limitations.
+		Only a read-only scrub on a read-only filesystem can avoid
+		writes from scrub.
         -R
                 raw print mode, print full data instead of summary
         -f
diff --git a/Documentation/ch-scrub-intro.rst b/Documentation/ch-scrub-intro.rst
index 2276bc263223..58afb466eb12 100644
--- a/Documentation/ch-scrub-intro.rst
+++ b/Documentation/ch-scrub-intro.rst
@@ -46,6 +46,16 @@  read-write mounted filesystem.
    used, with expert guidance, to rebuild certain corrupted filesystem structures
    in the absence of any good replica.
 
+.. note::
+   A read-only scrub on a read-write filesystem will still cause some writes
+   to the filesystem.
+
+   This is due to a design limitation needed to prevent a race between marking
+   a block group read-only and writing back block group items.
+
+   To avoid any writes from scrub, run a read-only scrub on a read-only
+   filesystem.
+
 The user is supposed to run it manually or via a periodic system service. The
 recommended period is a month but it could be less. The estimated device bandwidth
 utilization is about 80% on an idle filesystem.