Message ID | 20190913015127.14953-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space |
On 13.09.19 at 4:51, Qu Wenruo wrote:
> [BUG]
> Under the following case with qgroup enabled, if some error happened after
> we have reserved delalloc space, then in error handling path, we could
> cause qgroup data space leakage:
>
> From btrfs_truncate_block() in inode.c:
>
> 	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
> 					   block_start, blocksize);
> 	if (ret)
> 		goto out;
>
> again:
> 	page = find_or_create_page(mapping, index, mask);
> 	if (!page) {
> 		btrfs_delalloc_release_space(inode, data_reserved,
> 					     block_start, blocksize, true);
> 		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
> 		ret = -ENOMEM;
> 		goto out;
> 	}
>
> [CAUSE]
> In above case, btrfs_delalloc_reserve_space() will call
> btrfs_qgroup_reserve_data() and mark the io_tree range with
> EXTENT_QGROUP_RESERVED flag.
>
> In the error handling path, btrfs_delalloc_release_space() calls
> btrfs_qgroup_free_data() which should clear EXTENT_QGROUP_RESERVED flag
> and reduce the reserved data space according to the cleared range.
>
> However due to a completion bug, btrfs_qgroup_free_data() will clear
> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
> than the correct BTRFS_I(inode)->io_tree.

This is a bit confusing because the error is actually in
qgroup_free_reserved_data, which is called from
__btrfs_qgroup_release_data. But in the latter function there is also a
call to clear_record_extent_bits with the correct tree. Just fix the
function name by using qgroup_free_reserved_data.

> Since io_failure_tree is never marked with that flag,
> btrfs_qgroup_free_data() will not free any data reserved space at all,
> causing a leakage.
>
> All of such error handling cases can only be triggered some errors not

I take it you meant:

This error handling can only be triggered by errors outside of qgroup
e.g. EDQUOT can't trigger the bug?

The first part of the sentence is hard to parse.

> from qgroup, so regular EDQUOT error won't trigger the bug.
> Normally we need error injection to trigger such bug.
>
> [FIX]
> Fix the wrong target io_tree.
>
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/qgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 2891b57b9e1e..64bdc3e3652d 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
>  		 * EXTENT_QGROUP_RESERVED, we won't double free.
>  		 * So not need to rush.
>  		 */
> -		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
> +		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
>  				free_start, free_start + free_len - 1,
>  				EXTENT_QGROUP_RESERVED, &changeset);
>  		if (ret < 0)
>
On 2019/9/13 8:57 PM, Nikolay Borisov wrote:
>
>
> On 13.09.19 at 4:51, Qu Wenruo wrote:
>> [BUG]
>> Under the following case with qgroup enabled, if some error happened after
>> we have reserved delalloc space, then in error handling path, we could
>> cause qgroup data space leakage:
>>
>> From btrfs_truncate_block() in inode.c:
>>
>> 	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
>> 					   block_start, blocksize);
>> 	if (ret)
>> 		goto out;
>>
>> again:
>> 	page = find_or_create_page(mapping, index, mask);
>> 	if (!page) {
>> 		btrfs_delalloc_release_space(inode, data_reserved,
>> 					     block_start, blocksize, true);
>> 		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
>> 		ret = -ENOMEM;
>> 		goto out;
>> 	}
>>
>> [CAUSE]
>> In above case, btrfs_delalloc_reserve_space() will call
>> btrfs_qgroup_reserve_data() and mark the io_tree range with
>> EXTENT_QGROUP_RESERVED flag.
>>
>> In the error handling path, btrfs_delalloc_release_space() calls
>> btrfs_qgroup_free_data() which should clear EXTENT_QGROUP_RESERVED flag
>> and reduce the reserved data space according to the cleared range.
>>
>> However due to a completion bug, btrfs_qgroup_free_data() will clear
>> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
>> than the correct BTRFS_I(inode)->io_tree.
>
> This is a bit confusing because the error is actually in
> qgroup_free_reserved_data, which is called from
> __btrfs_qgroup_release_data. But in the latter function there is also a
> call to clear_record_extent_bits with the correct tree. Just fix the
> function name by using qgroup_free_reserved_data.

Right, I ignored some callers here, as the caller chain is not only
dependent on btrfs_qgroup_free_data() but also on the parameter.

E.g. only when reserved is non-null do we go to
qgroup_free_reserved_data().

>
>> Since io_failure_tree is never marked with that flag,
>> btrfs_qgroup_free_data() will not free any data reserved space at all,
>> causing a leakage.
>>
>> All of such error handling cases can only be triggered some errors not
>
> I take it you meant:
>
> This error handling can only be triggered by errors outside of qgroup
> e.g. EDQUOT can't trigger the bug?

Right. I'll change it to something like "such leakage can only be
triggered by errors outside of qgroup."

Thanks,
Qu

>
> The first part of the sentence is hard to parse.
>
>> from qgroup, so regular EDQUOT error won't trigger the bug.
>> Normally we need error injection to trigger such bug.
>>
>> [FIX]
>> Fix the wrong target io_tree.
>>
>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>> Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/qgroup.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>> index 2891b57b9e1e..64bdc3e3652d 100644
>> --- a/fs/btrfs/qgroup.c
>> +++ b/fs/btrfs/qgroup.c
>> @@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
>>  		 * EXTENT_QGROUP_RESERVED, we won't double free.
>>  		 * So not need to rush.
>>  		 */
>> -		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
>> +		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
>>  				free_start, free_start + free_len - 1,
>>  				EXTENT_QGROUP_RESERVED, &changeset);
>>  		if (ret < 0)
>>
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2891b57b9e1e..64bdc3e3652d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3492,7 +3492,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
 		 * EXTENT_QGROUP_RESERVED, we won't double free.
 		 * So not need to rush.
 		 */
-		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
+		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
 				free_start, free_start + free_len - 1,
 				EXTENT_QGROUP_RESERVED, &changeset);
 		if (ret < 0)
[BUG]
Under the following case with qgroup enabled, if some error happened after
we have reserved delalloc space, then in error handling path, we could
cause qgroup data space leakage:

From btrfs_truncate_block() in inode.c:

	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
					   block_start, blocksize);
	if (ret)
		goto out;

again:
	page = find_or_create_page(mapping, index, mask);
	if (!page) {
		btrfs_delalloc_release_space(inode, data_reserved,
					     block_start, blocksize, true);
		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
		ret = -ENOMEM;
		goto out;
	}

[CAUSE]
In above case, btrfs_delalloc_reserve_space() will call
btrfs_qgroup_reserve_data() and mark the io_tree range with
EXTENT_QGROUP_RESERVED flag.

In the error handling path, btrfs_delalloc_release_space() calls
btrfs_qgroup_free_data() which should clear EXTENT_QGROUP_RESERVED flag
and reduce the reserved data space according to the cleared range.

However due to a completion bug, btrfs_qgroup_free_data() will clear
EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
than the correct BTRFS_I(inode)->io_tree.

Since io_failure_tree is never marked with that flag,
btrfs_qgroup_free_data() will not free any data reserved space at all,
causing a leakage.

All of such error handling cases can only be triggered some errors not
from qgroup, so regular EDQUOT error won't trigger the bug.
Normally we need error injection to trigger such bug.

[FIX]
Fix the wrong target io_tree.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
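
For readers who want to see why clearing the flag in the wrong tree leaks space, the mechanism described in [CAUSE] can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: every name prefixed with model_ or MODEL_ is hypothetical, each "tree" is reduced to one reserved bit per block, and the accounting simply subtracts however many bytes the clear operation reports as changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes for the model: a tiny inode of 16 blocks. */
#define MODEL_BLOCKS    16
#define MODEL_BLOCKSIZE 4096UL

/* Toy stand-in for an io tree: one "qgroup reserved" bit per block. */
struct model_tree {
	bool reserved[MODEL_BLOCKS];
};

/* Toy inode with the two trees from the patch and a reserved-bytes counter. */
struct model_inode {
	struct model_tree io_tree;
	struct model_tree io_failure_tree;
	unsigned long qgroup_reserved_bytes;
};

/* Clear the bit in [first, first + count) and return the bytes that changed. */
static unsigned long model_clear_bits(struct model_tree *tree,
				      size_t first, size_t count)
{
	unsigned long changed = 0;

	assert(first + count <= MODEL_BLOCKS);
	for (size_t i = first; i < first + count; i++) {
		if (tree->reserved[i]) {
			tree->reserved[i] = false;
			changed += MODEL_BLOCKSIZE;
		}
	}
	return changed;
}

/* Reserve: mark the range in io_tree and account for newly marked blocks. */
static void model_reserve(struct model_inode *inode, size_t first, size_t count)
{
	assert(first + count <= MODEL_BLOCKS);
	for (size_t i = first; i < first + count; i++) {
		if (!inode->io_tree.reserved[i]) {
			inode->io_tree.reserved[i] = true;
			inode->qgroup_reserved_bytes += MODEL_BLOCKSIZE;
		}
	}
}

/* Free: only the bytes actually cleared from the chosen tree are released. */
static void model_free(struct model_inode *inode, size_t first, size_t count,
		       bool use_wrong_tree)
{
	struct model_tree *tree = use_wrong_tree ? &inode->io_failure_tree
						 : &inode->io_tree;

	inode->qgroup_reserved_bytes -= model_clear_bits(tree, first, count);
}

/* Reserve one block, take the error path, and report what is still reserved. */
static unsigned long model_leak_after_error(bool use_wrong_tree)
{
	struct model_inode inode = { 0 };

	model_reserve(&inode, 2, 1);
	model_free(&inode, 2, 1, use_wrong_tree);
	return inode.qgroup_reserved_bytes;
}
```

In this model, model_leak_after_error(true) leaves one block (4096 bytes) accounted as reserved because the failure tree never had the bit set, while model_leak_after_error(false) returns 0, which mirrors the one-line change in the patch.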