Message ID | 1587361120-83160-1-git-send-email-xiyuyang19@fudan.edu.cn (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: Fix btrfs_block_group refcnt leak | expand |
On Mon, Apr 20, 2020 at 6:48 AM Xiyu Yang <xiyuyang19@fudan.edu.cn> wrote: > > btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which > returns a local reference of the blcok group that contains the given > bytenr to "block_group" with increased refcount. > > When btrfs_remove_block_group() returns, "block_group" becomes invalid, > so the refcount should be decreased to keep refcount balanced. > > The reference counting issue happens in several exception handling paths > of btrfs_remove_block_group(). When those error scenarios occur such as > btrfs_alloc_path() returns NULL, the function forgets to decrease its > refcnt increased by btrfs_lookup_block_group() and will cause a refcnt > leak. > > Fix this issue by jumping to "out_put_group" label and calling > btrfs_put_block_group() when those error scenarios occur. > > Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> > Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Seems correct to me. I would change the subject to something more clear like: "btrfs: fix block group leak after failure to remove it" One more suggestion below. > --- > fs/btrfs/block-group.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c > index 404e050ce8ee..d9f432bd3329 100644 > --- a/fs/btrfs/block-group.c > +++ b/fs/btrfs/block-group.c > @@ -916,7 +916,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > path = btrfs_alloc_path(); > if (!path) { > ret = -ENOMEM; > - goto out; > + goto out_put_group; > } > > /* > @@ -954,7 +954,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > ret = btrfs_orphan_add(trans, BTRFS_I(inode)); > if (ret) { > btrfs_add_delayed_iput(inode); > - goto out; > + goto out_put_group; > } > clear_nlink(inode); > /* One for the block groups ref */ > @@ -977,13 +977,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > > ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1); > if (ret < 0) > - goto out; > + goto out_put_group; > if (ret > 0) > btrfs_release_path(path); > if (ret == 0) { > ret = btrfs_del_item(trans, tree_root, path); > if (ret) > - goto out; > + goto out_put_group; > btrfs_release_path(path); > } > > @@ -1102,7 +1102,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > > ret = remove_block_group_free_space(trans, block_group); > if (ret) > - goto out; > + goto out_put_group; > > btrfs_put_block_group(block_group); > btrfs_put_block_group(block_group); > @@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > btrfs_delayed_refs_rsv_release(fs_info, 1); > btrfs_free_path(path); > return ret; > +out_put_group: > + btrfs_put_block_group(block_group); > + goto out; Instead of this double goto, which tends to be error prone and harder to follow, I suggest placing a call to btrfs_put_block_group() right after the 'out' label, with a comment above it saying something like "once for the lookup reference" and removing one of the btrfs_put_block_group() calls right after calling remove_block_group_free_space(), and leaving a comment above the other one saying "once for the block groups rbtree". Thanks. > } > > struct btrfs_trans_handle *btrfs_start_trans_remove_block_group( > -- > 2.7.4 >
On Mon, Apr 20, 2020 at 01:38:40PM +0800, Xiyu Yang wrote: > btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which > returns a local reference of the blcok group that contains the given > bytenr to "block_group" with increased refcount. > > When btrfs_remove_block_group() returns, "block_group" becomes invalid, > so the refcount should be decreased to keep refcount balanced. > > The reference counting issue happens in several exception handling paths > of btrfs_remove_block_group(). When those error scenarios occur such as > btrfs_alloc_path() returns NULL, the function forgets to decrease its > refcnt increased by btrfs_lookup_block_group() and will cause a refcnt > leak. > > Fix this issue by jumping to "out_put_group" label and calling > btrfs_put_block_group() when those error scenarios occur. > > Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> > Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Thanks for the fix. May I ask if this was found by code inspection or by some analysis tool? > @@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, > btrfs_delayed_refs_rsv_release(fs_info, 1); > btrfs_free_path(path); > return ret; > +out_put_group: > + btrfs_put_block_group(block_group); > + goto out; As Filipe noted, the trailing gotos are not a great pattern, what we'd like to do is a more linear sequence where the resource allocation/freeing nesting is obvious, like: x = allocate(X); if (!x) goto out; ... y = allocate(Y); if (!y) goto out_free_x; z = allocate(Z); if (!z) goto out_free_y; ... free(z); out_free_y: free(y); out_free_x: free(x); out: return; (where allocate/free can be refcount inc/dec and similar). Sometimes it's not that straightforward and the freeing block needs conditionals, but from code reading perspective this is still better than potentially wild gotos.
On Tue, Apr 21, 2020 at 12:43:15AM +0200, David Sterba wrote: > On Mon, Apr 20, 2020 at 01:38:40PM +0800, Xiyu Yang wrote: > > btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which > > returns a local reference of the blcok group that contains the given > > bytenr to "block_group" with increased refcount. > > > > When btrfs_remove_block_group() returns, "block_group" becomes invalid, > > so the refcount should be decreased to keep refcount balanced. > > > > The reference counting issue happens in several exception handling paths > > of btrfs_remove_block_group(). When those error scenarios occur such as > > btrfs_alloc_path() returns NULL, the function forgets to decrease its > > refcnt increased by btrfs_lookup_block_group() and will cause a refcnt > > leak. > > > > Fix this issue by jumping to "out_put_group" label and calling > > btrfs_put_block_group() when those error scenarios occur. > > > > Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> > > Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> > > Thanks for the fix. May I ask if this was found by code inspection or by > some analysis tool? Thanks for your advice about the patch! We are looking for some automated ways to find this kind of bug.
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 404e050ce8ee..d9f432bd3329 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -916,7 +916,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, path = btrfs_alloc_path(); if (!path) { ret = -ENOMEM; - goto out; + goto out_put_group; } /* @@ -954,7 +954,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, ret = btrfs_orphan_add(trans, BTRFS_I(inode)); if (ret) { btrfs_add_delayed_iput(inode); - goto out; + goto out_put_group; } clear_nlink(inode); /* One for the block groups ref */ @@ -977,13 +977,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1); if (ret < 0) - goto out; + goto out_put_group; if (ret > 0) btrfs_release_path(path); if (ret == 0) { ret = btrfs_del_item(trans, tree_root, path); if (ret) - goto out; + goto out_put_group; btrfs_release_path(path); } @@ -1102,7 +1102,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, ret = remove_block_group_free_space(trans, block_group); if (ret) - goto out; + goto out_put_group; btrfs_put_block_group(block_group); btrfs_put_block_group(block_group); @@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, btrfs_delayed_refs_rsv_release(fs_info, 1); btrfs_free_path(path); return ret; +out_put_group: + btrfs_put_block_group(block_group); + goto out; } struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(