Message ID | 60d12b7256e6877061eaa9df99ce2ed1f0f3d012.1606938211.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Cleanup error handling in relocation | expand |
On 2020/12/3 上午3:50, Josef Bacik wrote: > If we fail to setup a ->reloc_root in a different thread that path will > error out, however it still leaves root->reloc_root NULL but would still > appear set up in the transaction. Subsequent calls to > btrfs_record_root_in_transaction would succeed without attempting to > create the reloc root, as the transid has already been update. Handle > this case by making sure we have a root->reloc_root set after a > btrfs_record_root_in_transaction call so we don't end up deref'ing a > NULL pointer. > > Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> > Signed-off-by: Josef Bacik <josef@toxicpanda.com> The fix here is mostly based on the fact that pointer assignment is atomic. But I'm wondering if we can do it better by using something like spinlock to make it more explicit. Or is such root->reloc_lock too overkilled? Thanks, Qu > --- > fs/btrfs/relocation.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c > index cebf8e9d7d96..c9df05f02649 100644 > --- a/fs/btrfs/relocation.c > +++ b/fs/btrfs/relocation.c > @@ -2078,6 +2078,13 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans, > return ERR_PTR(ret); > root = root->reloc_root; > > + /* > + * We could have raced with another thread which failed, so > + * ->reloc_root may not be set, return -ENOENT in this case. > + */ > + if (!root) > + return ERR_PTR(-ENOENT); > + > if (next->new_bytenr != root->node->start) { > /* > * We just created the reloc root, so we shouldn't have > @@ -2579,6 +2586,14 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans, > ret = btrfs_record_root_in_trans(trans, root); > if (ret) > goto out; > + /* > + * Another thread could have failed, need to check if we > + * have ->reloc_root actually set. > + */ > + if (!root->reloc_root) { > + ret = -ENOENT; > + goto out; > + } > root = root->reloc_root; > node->new_bytenr = root->node->start; > btrfs_put_root(node->root); >
On 12/2/20 11:49 PM, Qu Wenruo wrote: > > > On 2020/12/3 上午3:50, Josef Bacik wrote: >> If we fail to setup a ->reloc_root in a different thread that path will >> error out, however it still leaves root->reloc_root NULL but would still >> appear set up in the transaction. Subsequent calls to >> btrfs_record_root_in_transaction would succeed without attempting to >> create the reloc root, as the transid has already been update. Handle >> this case by making sure we have a root->reloc_root set after a >> btrfs_record_root_in_transaction call so we don't end up deref'ing a >> NULL pointer. >> >> Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> >> Signed-off-by: Josef Bacik <josef@toxicpanda.com> > > The fix here is mostly based on the fact that pointer assignment is atomic. > > But I'm wondering if we can do it better by using something like > spinlock to make it more explicit. > Or is such root->reloc_lock too overkilled? We are essentially doing that already, as these checks are _after_ the btrfs_record_root_in_trans, which does the appropriate locking and such. The "race" is resolved inside of that function itself, so we will either have ->reloc_root set, or we won't, but it won't magically change between exiting btrfs_record_root_in_trans() and us checking root->reloc_root. Thanks, Josef
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index cebf8e9d7d96..c9df05f02649 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2078,6 +2078,13 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans, return ERR_PTR(ret); root = root->reloc_root; + /* + * We could have raced with another thread which failed, so + * ->reloc_root may not be set, return -ENOENT in this case. + */ + if (!root) + return ERR_PTR(-ENOENT); + if (next->new_bytenr != root->node->start) { /* * We just created the reloc root, so we shouldn't have @@ -2579,6 +2586,14 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans, ret = btrfs_record_root_in_trans(trans, root); if (ret) goto out; + /* + * Another thread could have failed, need to check if we + * have ->reloc_root actually set. + */ + if (!root->reloc_root) { + ret = -ENOENT; + goto out; + } root = root->reloc_root; node->new_bytenr = root->node->start; btrfs_put_root(node->root);
If we fail to setup a ->reloc_root in a different thread that path will error out, however it still leaves root->reloc_root NULL but would still appear set up in the transaction. Subsequent calls to btrfs_record_root_in_transaction would succeed without attempting to create the reloc root, as the transid has already been update. Handle this case by making sure we have a root->reloc_root set after a btrfs_record_root_in_transaction call so we don't end up deref'ing a NULL pointer. Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/relocation.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)