Message ID | ed62dbe7094fb5e80cb5b6df8590ff621b218ae5.1671252987.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: don't clear EXTENT_BUFFER_DIRTY bit if the tree block is not dirtied in our transaction | expand |
On Sat, Dec 17, 2022 at 12:57:10PM +0800, Qu Wenruo wrote: > [BUG] > Since patch "btrfs: do not check header generation in > btrfs_clean_tree_block", a lot of scrub tests will report errors like > btrfs/072: > > QA output created by 072 > Silence is golden > Scrub find errors in "-m single -d single" test > > With newer scrub metadata error reports, we can see the bad tree blocks > are having older fsid from previous runs: > > BTRFS warning (device dm-2): tree block 24117248 mirror 1 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc > BTRFS warning (device dm-2): tree block 24117248 mirror 0 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc > BTRFS error (device dm-2): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0 > > This means, some tree blocks of commit roots are not properly written > back to disks. > > [CAUSE] > Patch "btrfs: do not check header generation in btrfs_clean_tree_block" > will unconditionally clear the DIRTY flag, but there is a race that we > can clear the DIRTY flag for extent buffers which is under writeback: > > Transaction A | Transaction A + 1 > --------------------------------------+---------------------------------- > btrfs_cow_block() | > |- __btrfs_cow_block() | > |- btrfs_alloc_tree_block() | > | Tree block X get allocated | > | Generation is A, and marked | > | dirty. | > | > btrfs_commit_transaction() | > | Tree block X should be written | > | back. As it's dirtied in | > | transaction A. | > | | > |- cur_trans->state = | > | TRANS_STATE_UNBLOCKED | > | Now new transaction can be | > | started. | > | | Transaction A + 1 started > | | > | | btrfs_cow_block() > | | |- __btrfs_cow_block() > | | |- update_ref_for_cow() > | | |- btrfs_clear_buffer_dirty() > | | | Tree block X get freed, thus > | | its DIRTY flag get cleared. > | | And its pages are no > | | longer dirty. > | | > |- btrfs_write_and_wait_transaction()| > |- btrfs_write_marked_extents() | > Here we will only writeback | > dirty pages, and tree block X | > no longer has its pages dirty, | > it will no be written back. | > > Thus those tree blocks don't get written back and cause above fsid > mismatch during scrub. > > [FIX] > Just bring back the check of the old code, with added commt on why we > should not clear dirty flag unconditionally. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > This patch should just be used for testing. > > The proper fix would be an update of the series "extent buffer dirty > cleanups", which should remove the patch "btrfs: do not check header > generation in btrfs_clean_tree_block" completely. I've removed the sereis from misc-next and will add it to for-next later so we have fewer errors in the testing runs until the end of the year.
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 4f96f9603ec2..3e793d92d386 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -36,6 +36,7 @@ #include "file.h" #include "dev-replace.h" #include "super.h" +#include "transaction.h" static struct kmem_cache *extent_buffer_cache; @@ -4704,6 +4705,19 @@ void btrfs_clear_buffer_dirty(struct extent_buffer *eb) btrfs_assert_tree_write_locked(eb); + /* + * Do not clear the dirty flag for an extent buffer which is not + * modified in the current running transaction. + * + * The tree block may belong to previous transaction which is still + * waiting for writeback. + * Clearing the dirity bit unconditionally will make the previous + * transaction to miss its tree blocks, screwing up anything relying + * on commit root, from scrub to powerloss protection. + */ + if (btrfs_header_generation(eb) != fs_info->running_transaction->transid) + return; + if (!test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)) return;
[BUG] Since patch "btrfs: do not check header generation in btrfs_clean_tree_block", a lot of scrub tests will report errors like btrfs/072: QA output created by 072 Silence is golden Scrub find errors in "-m single -d single" test With newer scrub metadata error reports, we can see the bad tree blocks are having older fsid from previous runs: BTRFS warning (device dm-2): tree block 24117248 mirror 1 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc BTRFS warning (device dm-2): tree block 24117248 mirror 0 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc BTRFS error (device dm-2): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0 This means, some tree blocks of commit roots are not properly written back to disks. [CAUSE] Patch "btrfs: do not check header generation in btrfs_clean_tree_block" will unconditionally clear the DIRTY flag, but there is a race that we can clear the DIRTY flag for extent buffers which is under writeback: Transaction A | Transaction A + 1 --------------------------------------+---------------------------------- btrfs_cow_block() | |- __btrfs_cow_block() | |- btrfs_alloc_tree_block() | | Tree block X get allocated | | Generation is A, and marked | | dirty. | | btrfs_commit_transaction() | | Tree block X should be written | | back. As it's dirtied in | | transaction A. | | | |- cur_trans->state = | | TRANS_STATE_UNBLOCKED | | Now new transaction can be | | started. | | | Transaction A + 1 started | | | | btrfs_cow_block() | | |- __btrfs_cow_block() | | |- update_ref_for_cow() | | |- btrfs_clear_buffer_dirty() | | | Tree block X get freed, thus | | its DIRTY flag get cleared. | | And its pages are no | | longer dirty. | | |- btrfs_write_and_wait_transaction()| |- btrfs_write_marked_extents() | Here we will only writeback | dirty pages, and tree block X | no longer has its pages dirty, | it will no be written back. | Thus those tree blocks don't get written back and cause above fsid mismatch during scrub. [FIX] Just bring back the check of the old code, with added commt on why we should not clear dirty flag unconditionally. Signed-off-by: Qu Wenruo <wqu@suse.com> --- This patch should just be used for testing. The proper fix would be an update of the series "extent buffer dirty cleanups", which should remove the patch "btrfs: do not check header generation in btrfs_clean_tree_block" completely. But I really believe we need a new comment on the original check anyway. --- fs/btrfs/extent_io.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)