btrfs: Don't panic when we can't find a root key
diff mbox series

Message ID 20190225051106.22091-1-wqu@suse.com
State New
Headers show
Series
  • btrfs: Don't panic when we can't find a root key
Related show

Commit Message

Qu Wenruo Feb. 25, 2019, 5:11 a.m. UTC
When we failed to find a root key in btrfs_update_root(), we just panic.

That's definitely not cool, fix it by aborting current transaction and
return an error value.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/root-tree.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

Filipe Manana Feb. 25, 2019, 9:57 a.m. UTC | #1
On Mon, Feb 25, 2019 at 5:12 AM Qu Wenruo <wqu@suse.com> wrote:
>
> When we failed to find a root key in btrfs_update_root(), we just panic.
>
> That's definitely not cool, fix it by aborting current transaction and
> return an error value.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Looks good.
Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/root-tree.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index 65bda0682928..c48a3e18866d 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -137,11 +137,15 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root
>                 goto out;
>         }
>
> -       if (ret != 0) {
> +       if (ret > 0) {
>                 btrfs_print_leaf(path->nodes[0]);
> -               btrfs_crit(fs_info, "unable to update root key %llu %u %llu",
> -                          key->objectid, key->type, key->offset);
> -               BUG_ON(1);
> +               btrfs_crit(fs_info,
> +                       "unable to find root key (%llu %u %llu) in tree %llu",
> +                          key->objectid, key->type, key->offset,
> +                          root->root_key.objectid);
> +               ret = -ENOENT;
> +               btrfs_abort_transaction(trans, ret);
> +               goto out;
>         }
>
>         l = path->nodes[0];
> --
> 2.20.1
>
Johannes Thumshirn Feb. 25, 2019, 10:33 a.m. UTC | #2
Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
David Sterba Feb. 25, 2019, 12:26 p.m. UTC | #3
On Mon, Feb 25, 2019 at 01:11:06PM +0800, Qu Wenruo wrote:
> When we failed to find a root key in btrfs_update_root(), we just panic.
> 
> That's definitely not cool, fix it by aborting current transaction and
> return an error value.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/root-tree.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index 65bda0682928..c48a3e18866d 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -137,11 +137,15 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root
>  		goto out;
>  	}
>  
> -	if (ret != 0) {
> +	if (ret > 0) {
>  		btrfs_print_leaf(path->nodes[0]);

You leave the leaf dump here, which I believe was related to the BUG_ON
so there's information for later analysis. This is not usually done
where transaction is aborted and error returned. I'm not saying it's not
worth keeping it here, as the logic that would lead to this error is
in the 'never happens' category. But we know such things happen due to
the bitflips and random memory corruptions so a dump might make sense.

If you think this makes sense, please put it to the changelog and write
a comment before the leaf dump. This will be a good pattern to follow as
there are many instances of leaf_dump/BUG to clean up or review.
Qu Wenruo Feb. 25, 2019, 12:47 p.m. UTC | #4
On 2019/2/25 下午8:26, David Sterba wrote:
> On Mon, Feb 25, 2019 at 01:11:06PM +0800, Qu Wenruo wrote:
>> When we failed to find a root key in btrfs_update_root(), we just panic.
>>
>> That's definitely not cool, fix it by aborting current transaction and
>> return an error value.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/root-tree.c | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
>> index 65bda0682928..c48a3e18866d 100644
>> --- a/fs/btrfs/root-tree.c
>> +++ b/fs/btrfs/root-tree.c
>> @@ -137,11 +137,15 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root
>>  		goto out;
>>  	}
>>  
>> -	if (ret != 0) {
>> +	if (ret > 0) {
>>  		btrfs_print_leaf(path->nodes[0]);
> 
> You leave the leaf dump here, which I believe was related to the BUG_ON
> so there's information for later analysis. This is not usually done
> where transaction is aborted and error returned. I'm not saying it's not
> worth keeping it here, as the logic that would lead to this error is
> in the 'never happens' category. But we know such things happen due to
> the bitflips and random memory corruptions so a dump might make sense.
> 
> If you think this makes sense, please put it to the changelog and write
> a comment before the leaf dump. This will be a good pattern to follow as
> there are many instances of leaf_dump/BUG to clean up or review.

I think the dump makes sense, but didn't think so deep to make it as an
example for later cleanup.

If we're going to use this as an example for later cleanups.
I won't keep it.

The message itself should be enough to locate the cause, and the tree
dump doesn't help as much as we expected.

So I'm going to remove it in next update.

Thanks,
Qu

Patch
diff mbox series

diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index 65bda0682928..c48a3e18866d 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -137,11 +137,15 @@  int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root
 		goto out;
 	}
 
-	if (ret != 0) {
+	if (ret > 0) {
 		btrfs_print_leaf(path->nodes[0]);
-		btrfs_crit(fs_info, "unable to update root key %llu %u %llu",
-			   key->objectid, key->type, key->offset);
-		BUG_ON(1);
+		btrfs_crit(fs_info,
+			"unable to find root key (%llu %u %llu) in tree %llu",
+			   key->objectid, key->type, key->offset,
+			   root->root_key.objectid);
+		ret = -ENOENT;
+		btrfs_abort_transaction(trans, ret);
+		goto out;
 	}
 
 	l = path->nodes[0];