Message ID | 20190816152019.1962-2-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix global reserve size and can overcommit | expand |
On 16.08.19 г. 18:20 ч., Josef Bacik wrote: > It made sense to have the global reserve set at 16M in the past, but > since it is used less nowadays set the minimum size to the number of > items we'll need to update the main trees we update during a transaction > commit, plus some slop area so we can do unlinks if we need to. > > In practice this doesn't affect normal file systems, but for xfstests > where we do things like fill up a fs and then rm * it can fall over in > weird ways. This enables us for more sane behavior at extremely small > file system sizes. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > fs/btrfs/block-rsv.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c > index c64b460a4301..657675eef443 100644 > --- a/fs/btrfs/block-rsv.c > +++ b/fs/btrfs/block-rsv.c > @@ -258,6 +258,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) > struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv; > struct btrfs_space_info *sinfo = block_rsv->space_info; > u64 num_bytes; > + unsigned min_items; > > /* > * The global block rsv is based on the size of the extent tree, the > @@ -267,7 +268,26 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) > num_bytes = btrfs_root_used(&fs_info->extent_root->root_item) + > btrfs_root_used(&fs_info->csum_root->root_item) + > btrfs_root_used(&fs_info->tree_root->root_item); > - num_bytes = max_t(u64, num_bytes, SZ_16M); > + > + /* > + * We at a minimum are going to modify the csum root, the tree root, and > + * the extent root. > + */ > + min_items = 3; > + > + /* > + * But we also want to reserve enough space so we can do the fallback > + * global reserve for an unlink, which is an additional 5 items (see the > + * comment in __unlink_start_trans for what we're modifying.) > + * > + * But we also need space for the delayed ref updates from the unlink, > + * so its 10, 5 for the actual operation, and 5 for the delayed ref > + * updates. > + */ > + min_items += 10; > + > + num_bytes = max_t(u64, num_bytes, > + btrfs_calc_insert_metadata_size(fs_info, min_items)); For ordinary, 16k nodesize filesystem, btrfs_calc_insert_metadata_size will really return 3.4m -> 16 * 8 * 2 * 13 * 1024 = 3407872 bytes . In those cases I expect that the code will always be doing num_bytes = num_bytes. > > spin_lock(&sinfo->lock); > spin_lock(&block_rsv->lock); >
On Tue, Aug 20, 2019 at 04:45:15PM +0300, Nikolay Borisov wrote: > > > On 16.08.19 г. 18:20 ч., Josef Bacik wrote: > > It made sense to have the global reserve set at 16M in the past, but > > since it is used less nowadays set the minimum size to the number of > > items we'll need to update the main trees we update during a transaction > > commit, plus some slop area so we can do unlinks if we need to. > > > > In practice this doesn't affect normal file systems, but for xfstests > > where we do things like fill up a fs and then rm * it can fall over in > > weird ways. This enables us for more sane behavior at extremely small > > file system sizes. > > > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > > --- > > fs/btrfs/block-rsv.c | 22 +++++++++++++++++++++- > > 1 file changed, 21 insertions(+), 1 deletion(-) > > > > diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c > > index c64b460a4301..657675eef443 100644 > > --- a/fs/btrfs/block-rsv.c > > +++ b/fs/btrfs/block-rsv.c > > @@ -258,6 +258,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) > > struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv; > > struct btrfs_space_info *sinfo = block_rsv->space_info; > > u64 num_bytes; > > + unsigned min_items; > > > > /* > > * The global block rsv is based on the size of the extent tree, the > > @@ -267,7 +268,26 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) > > num_bytes = btrfs_root_used(&fs_info->extent_root->root_item) + > > btrfs_root_used(&fs_info->csum_root->root_item) + > > btrfs_root_used(&fs_info->tree_root->root_item); > > - num_bytes = max_t(u64, num_bytes, SZ_16M); > > + > > + /* > > + * We at a minimum are going to modify the csum root, the tree root, and > > + * the extent root. > > + */ > > + min_items = 3; > > + > > + /* > > + * But we also want to reserve enough space so we can do the fallback > > + * global reserve for an unlink, which is an additional 5 items (see the > > + * comment in __unlink_start_trans for what we're modifying.) > > + * > > + * But we also need space for the delayed ref updates from the unlink, > > + * so its 10, 5 for the actual operation, and 5 for the delayed ref > > + * updates. > > + */ > > + min_items += 10; > > + > > + num_bytes = max_t(u64, num_bytes, > > + btrfs_calc_insert_metadata_size(fs_info, min_items)); > > For ordinary, 16k nodesize filesystem, btrfs_calc_insert_metadata_size > will really return 3.4m -> 16 * 8 * 2 * 13 * 1024 = 3407872 bytes . In > those cases I expect that the code will always be doing num_bytes = > num_bytes. > Right, generally this will always be num_bytes = num_bytes. This is mostly for things like xfstests which start with empty fs's and then fill up the fs with data, so you only have like 200kib of metadata ever, so it ends up being the btrfs_calc_insert_metadat_size() amount. Thanks, Josef
diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c index c64b460a4301..657675eef443 100644 --- a/fs/btrfs/block-rsv.c +++ b/fs/btrfs/block-rsv.c @@ -258,6 +258,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv; struct btrfs_space_info *sinfo = block_rsv->space_info; u64 num_bytes; + unsigned min_items; /* * The global block rsv is based on the size of the extent tree, the @@ -267,7 +268,26 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) num_bytes = btrfs_root_used(&fs_info->extent_root->root_item) + btrfs_root_used(&fs_info->csum_root->root_item) + btrfs_root_used(&fs_info->tree_root->root_item); - num_bytes = max_t(u64, num_bytes, SZ_16M); + + /* + * We at a minimum are going to modify the csum root, the tree root, and + * the extent root. + */ + min_items = 3; + + /* + * But we also want to reserve enough space so we can do the fallback + * global reserve for an unlink, which is an additional 5 items (see the + * comment in __unlink_start_trans for what we're modifying.) + * + * But we also need space for the delayed ref updates from the unlink, + * so its 10, 5 for the actual operation, and 5 for the delayed ref + * updates. + */ + min_items += 10; + + num_bytes = max_t(u64, num_bytes, + btrfs_calc_insert_metadata_size(fs_info, min_items)); spin_lock(&sinfo->lock); spin_lock(&block_rsv->lock);
It made sense to have the global reserve set at 16M in the past, but since it is used less nowadays set the minimum size to the number of items we'll need to update the main trees we update during a transaction commit, plus some slop area so we can do unlinks if we need to. In practice this doesn't affect normal file systems, but for xfstests where we do things like fill up a fs and then rm * it can fall over in weird ways. This enables us for more sane behavior at extremely small file system sizes. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/block-rsv.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)