Message ID | 20200508100110.6965-1-fdmanana@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | [1/4] Btrfs: fix a race between scrub and block group removal/allocation |
Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.6.11, v5.4.39, v4.19.121, v4.14.179, v4.9.222, v4.4.222.

v5.6.11: Build OK!
v5.4.39: Build failed! Errors:
    fs/btrfs/scrub.c:3291:20: error: dereferencing pointer to incomplete type ‘struct btrfs_block_group’
    fs/btrfs/scrub.c:3472:31: error: passing argument 7 of ‘scrub_stripe’ from incompatible pointer type [-Werror=incompatible-pointer-types]

v4.19.121: Build failed! Errors:
    fs/btrfs/scrub.c:3289:20: error: dereferencing pointer to incomplete type ‘struct btrfs_block_group’
    fs/btrfs/scrub.c:3470:31: error: passing argument 7 of ‘scrub_stripe’ from incompatible pointer type [-Werror=incompatible-pointer-types]

v4.14.179: Failed to apply! Possible dependencies:
    32934280967d ("Btrfs: clean up scrub is_dev_replace parameter")
    c83488afc5a7 ("btrfs: Remove fs_info from btrfs_inc_block_group_ro")

v4.9.222: Failed to apply! Possible dependencies:
    0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience variables")
    32934280967d ("Btrfs: clean up scrub is_dev_replace parameter")
    5e00f1939f6e ("btrfs: convert btrfs_inc_block_group_ro to accept fs_info")
    62d1f9fe97dd ("btrfs: remove trivial helper btrfs_find_tree_block")
    c83488afc5a7 ("btrfs: Remove fs_info from btrfs_inc_block_group_ro")
    cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
    da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and into fs_info")
    de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
    fb456252d3d9 ("btrfs: root->fs_info cleanup, use fs_info->dev_root everywhere")

v4.4.222: Failed to apply! Possible dependencies:
    0132761017e0 ("btrfs: fix string and comment grammatical issues and typos")
    09cbfeaf1a5a ("mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros")
    0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience variables")
    0e749e54244e ("dax: increase granularity of dax_clear_blocks() operations")
    32934280967d ("Btrfs: clean up scrub is_dev_replace parameter")
    4420cfd3f51c ("staging: lustre: format properly all comment blocks for LNet core")
    52db400fcd50 ("pmem, dax: clean up clear_pmem()")
    5e00f1939f6e ("btrfs: convert btrfs_inc_block_group_ro to accept fs_info")
    5fd88337d209 ("staging: lustre: fix all conditional comparison to zero in LNet layer")
    b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()")
    bb7ab3b92e46 ("btrfs: Fix misspellings in comments.")
    c83488afc5a7 ("btrfs: Remove fs_info from btrfs_inc_block_group_ro")
    cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
    d1a5f2b4d8a1 ("block: use DAX for partition table reads")
    de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
    e10624f8c097 ("pmem: fail io-requests to known bad blocks")

NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?
On Fri, May 08, 2020 at 11:01:10AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

1-4 added to misc-next, thanks.
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index adaf8ab694d5..7c50ac5b6876 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3046,7 +3046,8 @@ static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
 static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 					   struct map_lookup *map,
 					   struct btrfs_device *scrub_dev,
-					   int num, u64 base, u64 length)
+					   int num, u64 base, u64 length,
+					   struct btrfs_block_group *cache)
 {
 	struct btrfs_path *path, *ppath;
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
@@ -3284,6 +3285,20 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 				break;
 			}
 
+			/*
+			 * If our block group was removed in the meanwhile, just
+			 * stop scrubbing since there is no point in continuing.
+			 * Continuing would prevent reusing its device extents
+			 * for new block groups for a long time.
+			 */
+			spin_lock(&cache->lock);
+			if (cache->removed) {
+				spin_unlock(&cache->lock);
+				ret = 0;
+				goto out;
+			}
+			spin_unlock(&cache->lock);
+
 			extent = btrfs_item_ptr(l, slot,
 						struct btrfs_extent_item);
 			flags = btrfs_extent_flags(l, extent);
@@ -3457,7 +3472,7 @@ static noinline_for_stack int scrub_chunk(struct scrub_ctx *sctx,
 		if (map->stripes[i].dev->bdev == scrub_dev->bdev &&
 		    map->stripes[i].physical == dev_offset) {
 			ret = scrub_stripe(sctx, map, scrub_dev, i,
-					   chunk_offset, length);
+					   chunk_offset, length, cache);
 			if (ret)
 				goto out;
 		}
@@ -3555,6 +3570,23 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 			goto skip;
 
 		/*
+		 * Make sure that while we are scrubbing the corresponding block
+		 * group doesn't get its logical address and its device extents
+		 * reused for another block group, which can possibly be of a
+		 * different type and different profile. We do this to prevent
+		 * false error detections and crashes due to bogus attempts to
+		 * repair extents.
+		 */
+		spin_lock(&cache->lock);
+		if (cache->removed) {
+			spin_unlock(&cache->lock);
+			btrfs_put_block_group(cache);
+			goto skip;
+		}
+		btrfs_get_block_group_trimming(cache);
+		spin_unlock(&cache->lock);
+
+		/*
 		 * we need call btrfs_inc_block_group_ro() with scrubs_paused,
 		 * to avoid deadlock caused by:
 		 * btrfs_inc_block_group_ro()
@@ -3609,6 +3641,7 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 		} else {
 			btrfs_warn(fs_info,
 				   "failed setting block group ro: %d", ret);
+			btrfs_put_block_group_trimming(cache);
 			btrfs_put_block_group(cache);
 			scrub_pause_off(fs_info);
 			break;
@@ -3695,6 +3728,7 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 			spin_unlock(&cache->lock);
 		}
 
+		btrfs_put_block_group_trimming(cache);
 		btrfs_put_block_group(cache);
 		if (ret)
 			break;
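
For readers who want to see the locking pattern from the two comment blocks in isolation, here is a rough userspace sketch. It is not the btrfs implementation: `struct demo_group` and every `demo_group_*` helper are hypothetical names, a pthread mutex stands in for the kernel spinlock, and a plain counter stands in for whatever `btrfs_get_block_group_trimming()` tracks. It only illustrates the idea of checking a `removed` flag and pinning the object under one lock, so that a remover can tell whether the object's space may be reused yet.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a block group; not the kernel struct. */
struct demo_group {
	pthread_mutex_t lock;
	bool removed;	/* set by the remover while holding the lock */
	int pins;	/* users that must delay reuse of the group's space */
};

static void demo_group_init(struct demo_group *g)
{
	pthread_mutex_init(&g->lock, NULL);
	g->removed = false;
	g->pins = 0;
}

/*
 * Check and pin in one critical section: returns true and takes a pin
 * only if the group has not been removed yet.
 */
static bool demo_group_try_pin(struct demo_group *g)
{
	bool ok;

	pthread_mutex_lock(&g->lock);
	ok = !g->removed;
	if (ok)
		g->pins++;
	pthread_mutex_unlock(&g->lock);
	return ok;
}

/* Drop a pin taken by demo_group_try_pin(). */
static void demo_group_unpin(struct demo_group *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->pins > 0);
	g->pins--;
	pthread_mutex_unlock(&g->lock);
}

/*
 * Mark the group removed. Returns true if nobody holds a pin, meaning the
 * caller could reuse the group's space right away in this toy model.
 */
static bool demo_group_remove(struct demo_group *g)
{
	bool reusable;

	pthread_mutex_lock(&g->lock);
	g->removed = true;
	reusable = (g->pins == 0);
	pthread_mutex_unlock(&g->lock);
	return reusable;
}

/* Re-check used by a long-running worker so it can bail out early. */
static bool demo_group_is_removed(struct demo_group *g)
{
	bool removed;

	pthread_mutex_lock(&g->lock);
	removed = g->removed;
	pthread_mutex_unlock(&g->lock);
	return removed;
}
```

In this model a worker would call `demo_group_try_pin()` before starting, poll `demo_group_is_removed()` inside its loop and stop early when it returns true, and call `demo_group_unpin()` on every exit path, which mirrors the get/put pairing the patch adds around the scrub of each block group.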