Message ID | 6c556ce0456303fb8ec23a454c3b5e18b874ae91.1696291742.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs-progs: properly reserve metadata space for bgt conversion | expand |
On Tue, Oct 03, 2023 at 10:39:18AM +1030, Qu Wenruo wrote: > There is a bug report that btrfstune failed to convert to block group > tree. > The bug report shows some very weird call trace, mostly fail at free > space tree code. If you mention a bug report please either add a link or describe the key details of the report. > One of the concern is the failed conversion may be caused by exhausted > metadata space. > In that case, we're doing quite some metadata operations (one > transaction to convert 64 block groups in one go). > > But for each transaction we have only reserved the metadata for one > block group conversion (1 block for extent tree and 1 block for block > group tree, this excludes free space and extent tree operations, as they > should go global rsv). > > Although we're not 100% sure about the failed conversion, at least we > should handle the metadata reservation correctly, by properly reserve > the needed space for the batch, and reduce the batch size just in case > there isn't much metadata space left. This is probably reasonable. > > Signed-off-by: Qu Wenruo <wqu@suse.com> Added to devel, thanks.
On 2023/10/7 02:05, David Sterba wrote: > On Tue, Oct 03, 2023 at 10:39:18AM +1030, Qu Wenruo wrote: >> There is a bug report that btrfstune failed to convert to block group >> tree. >> The bug report shows some very weird call trace, mostly fail at free >> space tree code. > > If you mention a bug report please either add a link or describe the key > details of the report. Here it is: https://lore.kernel.org/linux-btrfs/3c93d0b5-a8cb-ebe3-f8d6-76ea6340f23e@gmail.com/ I didn't put it in the first place is, it's really just an inspiration, not a concrete bug report. But for sure, in the future I can also put those inspirational reports into "Link:" tag. > >> One of the concern is the failed conversion may be caused by exhausted >> metadata space. >> In that case, we're doing quite some metadata operations (one >> transaction to convert 64 block groups in one go). >> >> But for each transaction we have only reserved the metadata for one >> block group conversion (1 block for extent tree and 1 block for block >> group tree, this excludes free space and extent tree operations, as they >> should go global rsv). >> >> Although we're not 100% sure about the failed conversion, at least we >> should handle the metadata reservation correctly, by properly reserve >> the needed space for the batch, and reduce the batch size just in case >> there isn't much metadata space left. > > This is probably reasonable. The change itself is reasonable, but don't be surprised that, btrfs-progs has no metadata reservation at all. It really goes allocate-as-we-can method, until it crashes at some critical path, and make the fs unable to be mounted/used by kernel due to -ENOSPC. The extra safenet for a simple metadata rsv check is introduced in this patch: https://lore.kernel.org/linux-btrfs/6081e57fe6f43b3ab44c883814c6a197513c66c0.1696489737.git.wqu@suse.com/ Thanks, Qu >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> > > Added to devel, thanks.
On Mon, Oct 09, 2023 at 09:55:58AM +1030, Qu Wenruo wrote: > > > On 2023/10/7 02:05, David Sterba wrote: > > On Tue, Oct 03, 2023 at 10:39:18AM +1030, Qu Wenruo wrote: > >> There is a bug report that btrfstune failed to convert to block group > >> tree. > >> The bug report shows some very weird call trace, mostly fail at free > >> space tree code. > > > > If you mention a bug report please either add a link or describe the key > > details of the report. > > Here it is: > > https://lore.kernel.org/linux-btrfs/3c93d0b5-a8cb-ebe3-f8d6-76ea6340f23e@gmail.com/ > > I didn't put it in the first place is, it's really just an inspiration, > not a concrete bug report. That's also good reason to mention it in the changelog as it's relevant for the patch and can add more context e.g. about the use case. > But for sure, in the future I can also put those inspirational reports > into "Link:" tag. Yeah, you can mention id like "This is based on discussion in [1]" or similar. > >> One of the concern is the failed conversion may be caused by exhausted > >> metadata space. > >> In that case, we're doing quite some metadata operations (one > >> transaction to convert 64 block groups in one go). > >> > >> But for each transaction we have only reserved the metadata for one > >> block group conversion (1 block for extent tree and 1 block for block > >> group tree, this excludes free space and extent tree operations, as they > >> should go global rsv). > >> > >> Although we're not 100% sure about the failed conversion, at least we > >> should handle the metadata reservation correctly, by properly reserve > >> the needed space for the batch, and reduce the batch size just in case > >> there isn't much metadata space left. > > > > This is probably reasonable. > > The change itself is reasonable, but don't be surprised that, > btrfs-progs has no metadata reservation at all. > > It really goes allocate-as-we-can method, until it crashes at some > critical path, and make the fs unable to be mounted/used by kernel due > to -ENOSPC. I am not surprised. > The extra safenet for a simple metadata rsv check is introduced in this > patch: > > https://lore.kernel.org/linux-btrfs/6081e57fe6f43b3ab44c883814c6a197513c66c0.1696489737.git.wqu@suse.com/ Thanks. I think historically there was not much need for that as the only offlline writing action was 'check' that relied on the single transaction to do everything. Now we do more conversion changes so the sanity checks will be needed.
diff --git a/tune/convert-bgt.c b/tune/convert-bgt.c index 77cba3930ae1..dd3a8c750604 100644 --- a/tune/convert-bgt.c +++ b/tune/convert-bgt.c @@ -28,7 +28,7 @@ #include "tune/tune.h" /* After this many block groups we need to commit transaction. */ -#define BLOCK_GROUP_BATCH 64 +#define BLOCK_GROUP_BATCH 16 int convert_to_bg_tree(struct btrfs_fs_info *fs_info) { @@ -71,7 +71,8 @@ int convert_to_bg_tree(struct btrfs_fs_info *fs_info) error_msg(ERROR_MSG_COMMIT_TRANS, "new bg root: %d", ret); goto error; } - trans = btrfs_start_transaction(fs_info->tree_root, 2); + trans = btrfs_start_transaction(fs_info->tree_root, + 2 * BLOCK_GROUP_BATCH); if (IS_ERR(trans)) { ret = PTR_ERR(trans); errno = -ret; @@ -119,7 +120,8 @@ iterate_bgs: error_msg(ERROR_MSG_COMMIT_TRANS, "%m"); return ret; } - trans = btrfs_start_transaction(fs_info->tree_root, 2); + trans = btrfs_start_transaction(fs_info->tree_root, + 2 * BLOCK_GROUP_BATCH); if (IS_ERR(trans)) { ret = PTR_ERR(trans); errno = -ret; @@ -181,7 +183,8 @@ int convert_to_extent_tree(struct btrfs_fs_info *fs_info) error_msg(ERROR_MSG_COMMIT_TRANS, "new extent tree root: %m"); goto error; } - trans = btrfs_start_transaction(fs_info->tree_root, 2); + trans = btrfs_start_transaction(fs_info->tree_root, + 2 * BLOCK_GROUP_BATCH); if (IS_ERR(trans)) { ret = PTR_ERR(trans); errno = -ret; @@ -227,7 +230,8 @@ iterate_bgs: error_msg(ERROR_MSG_COMMIT_TRANS, "%m"); return ret; } - trans = btrfs_start_transaction(fs_info->tree_root, 2); + trans = btrfs_start_transaction(fs_info->tree_root, + 2 * BLOCK_GROUP_BATCH); if (IS_ERR(trans)) { ret = PTR_ERR(trans); errno = -ret;
There is a bug report that btrfstune failed to convert to block group tree. The bug report shows some very weird call trace, mostly fail at free space tree code. One of the concern is the failed conversion may be caused by exhausted metadata space. In that case, we're doing quite some metadata operations (one transaction to convert 64 block groups in one go). But for each transaction we have only reserved the metadata for one block group conversion (1 block for extent tree and 1 block for block group tree, this excludes free space and extent tree operations, as they should go global rsv). Although we're not 100% sure about the failed conversion, at least we should handle the metadata reservation correctly, by properly reserve the needed space for the batch, and reduce the batch size just in case there isn't much metadata space left. Signed-off-by: Qu Wenruo <wqu@suse.com> --- tune/convert-bgt.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)