diff mbox series

[v2,4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() under repair

Message ID 20180912204924.10089-5-suy.fnst@cn.fujitsu.com (mailing list archive)
State New, archived
Headers show
Series btrfs-progs: lowmem: bug fixes and inode_extref repair | expand

Commit Message

Su Yue Sept. 12, 2018, 8:49 p.m. UTC
From: Su Yue <suy.fnst@cn.fujitsu.com>

In check_fs_roots_lowmem(), we do search and follow the resulted path
to call check_fs_root(), then call btrfs_next_item() to check next
root.
However, if repair is enabled, the root tree can be cowed, the
existed path can cause strange errors.

Solution:
  If repair, save the key before calling check_fs_root,
  search the saved key again before checking next root.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 check/mode-lowmem.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Qu Wenruo Sept. 13, 2018, 11:37 p.m. UTC | #1
On 2018/9/13 上午4:49, damenly.su@gmail.com wrote:
> From: Su Yue <suy.fnst@cn.fujitsu.com>
> 
> In check_fs_roots_lowmem(), we do search and follow the resulted path
> to call check_fs_root(), then call btrfs_next_item() to check next
> root.
> However, if repair is enabled, the root tree can be cowed, the
> existed path can cause strange errors.
> 
> Solution:
>   If repair, save the key before calling check_fs_root,
>   search the saved key again before checking next root.

Both reason and solution looks good.

> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  check/mode-lowmem.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 89a304bbdd69..8fc9edab1d66 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
>  	}
>  
>  	while (1) {
> +		struct btrfs_key saved_key;
> +
>  		node = path.nodes[0];
>  		slot = path.slots[0];
>  		btrfs_item_key_to_cpu(node, &key, slot);
> +		if (repair)
> +			saved_key = key;
>  		if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>  			goto out;
>  		if (key.type == BTRFS_ROOT_ITEM_KEY &&
> @@ -5000,6 +5004,17 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
>  			err |= ret;
>  		}
>  next:
> +		/*
> +		 * Since root tree can be cowed during repair,
> +		 * here search the saved key again.
> +		 */
> +		if (repair) {
> +			btrfs_release_path(&path);
> +			ret = btrfs_search_slot(NULL, fs_info->tree_root,
> +						&saved_key, &path, 0, 0);
> +			/* Repair never deletes trees, search must succeed. */
> +			BUG_ON(ret);

But this doesn't look good to me.

Your assumption here is valid (at least for now), but it's possible that
some tree blocks get corrupted in a large root tree, and in that case,
we could still read part of the root tree, but btrfs_search_slot() could
still return -EIO for certain search key.

So I still prefer to do some error handling other than BUG_ON(ret).

Thanks,
Qu

> +		}
>  		ret = btrfs_next_item(tree_root, &path);
>  		if (ret > 0)
>  			goto out;
>
Su Yue Sept. 14, 2018, 12:58 a.m. UTC | #2
On 09/14/2018 07:37 AM, Qu Wenruo wrote:
> 
> 
> On 2018/9/13 上午4:49, damenly.su@gmail.com wrote:
>> From: Su Yue <suy.fnst@cn.fujitsu.com>
>>
>> In check_fs_roots_lowmem(), we do search and follow the resulted path
>> to call check_fs_root(), then call btrfs_next_item() to check next
>> root.
>> However, if repair is enabled, the root tree can be cowed, the
>> existed path can cause strange errors.
>>
>> Solution:
>>    If repair, save the key before calling check_fs_root,
>>    search the saved key again before checking next root.
> 
> Both reason and solution looks good.
> 
>>
>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>> ---
>>   check/mode-lowmem.c | 15 +++++++++++++++
>>   1 file changed, 15 insertions(+)
>>
>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>> index 89a304bbdd69..8fc9edab1d66 100644
>> --- a/check/mode-lowmem.c
>> +++ b/check/mode-lowmem.c
>> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
>>   	}
>>   
>>   	while (1) {
>> +		struct btrfs_key saved_key;
>> +
>>   		node = path.nodes[0];
>>   		slot = path.slots[0];
>>   		btrfs_item_key_to_cpu(node, &key, slot);
>> +		if (repair)
>> +			saved_key = key;
>>   		if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>>   			goto out;
>>   		if (key.type == BTRFS_ROOT_ITEM_KEY &&
>> @@ -5000,6 +5004,17 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
>>   			err |= ret;
>>   		}
>>   next:
>> +		/*
>> +		 * Since root tree can be cowed during repair,
>> +		 * here search the saved key again.
>> +		 */
>> +		if (repair) {
>> +			btrfs_release_path(&path);
>> +			ret = btrfs_search_slot(NULL, fs_info->tree_root,
>> +						&saved_key, &path, 0, 0);
>> +			/* Repair never deletes trees, search must succeed. */
>> +			BUG_ON(ret);
> 
> But this doesn't look good to me.
> 
> Your assumption here is valid (at least for now), but it's possible that
> some tree blocks get corrupted in a large root tree, and in that case,
> we could still read part of the root tree, but btrfs_search_slot() could
> still return -EIO for certain search key.
> 
> So I still prefer to do some error handling other than BUG_ON(ret).
> 
Okay, will try it.

Thanks,
Su
> Thanks,
> Qu
> 
>> +		}
>>   		ret = btrfs_next_item(tree_root, &path);
>>   		if (ret > 0)
>>   			goto out;
>>
> 
>
Nikolay Borisov Sept. 14, 2018, 6:27 a.m. UTC | #3
On 14.09.2018 03:58, Su Yue wrote:
> 
> 
> On 09/14/2018 07:37 AM, Qu Wenruo wrote:
>>
>>
>> On 2018/9/13 上午4:49, damenly.su@gmail.com wrote:
>>> From: Su Yue <suy.fnst@cn.fujitsu.com>
>>>
>>> In check_fs_roots_lowmem(), we do search and follow the resulted path
>>> to call check_fs_root(), then call btrfs_next_item() to check next
>>> root.
>>> However, if repair is enabled, the root tree can be cowed, the
>>> existed path can cause strange errors.
>>>
>>> Solution:
>>>    If repair, save the key before calling check_fs_root,
>>>    search the saved key again before checking next root.
>>
>> Both reason and solution looks good.
>>
>>>
>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>> ---
>>>   check/mode-lowmem.c | 15 +++++++++++++++
>>>   1 file changed, 15 insertions(+)
>>>
>>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>>> index 89a304bbdd69..8fc9edab1d66 100644
>>> --- a/check/mode-lowmem.c
>>> +++ b/check/mode-lowmem.c
>>> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>> *fs_info)
>>>       }
>>>         while (1) {
>>> +        struct btrfs_key saved_key;
>>> +
>>>           node = path.nodes[0];
>>>           slot = path.slots[0];
>>>           btrfs_item_key_to_cpu(node, &key, slot);
>>> +        if (repair)
>>> +            saved_key = key;
>>>           if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>>>               goto out;
>>>           if (key.type == BTRFS_ROOT_ITEM_KEY &&
>>> @@ -5000,6 +5004,17 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>> *fs_info)
>>>               err |= ret;
>>>           }
>>>   next:
>>> +        /*
>>> +         * Since root tree can be cowed during repair,
>>> +         * here search the saved key again.
>>> +         */
>>> +        if (repair) {
>>> +            btrfs_release_path(&path);
>>> +            ret = btrfs_search_slot(NULL, fs_info->tree_root,
>>> +                        &saved_key, &path, 0, 0);
>>> +            /* Repair never deletes trees, search must succeed. */
>>> +            BUG_ON(ret);
>>
>> But this doesn't look good to me.
>>
>> Your assumption here is valid (at least for now), but it's possible that
>> some tree blocks get corrupted in a large root tree, and in that case,
>> we could still read part of the root tree, but btrfs_search_slot() could
>> still return -EIO for certain search key.
>>
>> So I still prefer to do some error handling other than BUG_ON(ret).
>>
> Okay, will try it.

Just to emphasize Qu's point - we should strive to remove existing
BUG_ON and should never introduce new ones. btrfs-progs is already quite
messy and we should be improving that.

> 
> Thanks,
> Su
>> Thanks,
>> Qu
>>
>>> +        }
>>>           ret = btrfs_next_item(tree_root, &path);
>>>           if (ret > 0)
>>>               goto out;
>>>
>>
>>
> 
> 
>
Su Yue Sept. 14, 2018, 7:13 a.m. UTC | #4
On 09/14/2018 02:27 PM, Nikolay Borisov wrote:
> 
> 
> On 14.09.2018 03:58, Su Yue wrote:
>>
>>
>> On 09/14/2018 07:37 AM, Qu Wenruo wrote:
>>>
>>>
>>> On 2018/9/13 上午4:49, damenly.su@gmail.com wrote:
>>>> From: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>
>>>> In check_fs_roots_lowmem(), we do search and follow the resulted path
>>>> to call check_fs_root(), then call btrfs_next_item() to check next
>>>> root.
>>>> However, if repair is enabled, the root tree can be cowed, the
>>>> existed path can cause strange errors.
>>>>
>>>> Solution:
>>>>     If repair, save the key before calling check_fs_root,
>>>>     search the saved key again before checking next root.
>>>
>>> Both reason and solution looks good.
>>>
>>>>
>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>> ---
>>>>    check/mode-lowmem.c | 15 +++++++++++++++
>>>>    1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>>>> index 89a304bbdd69..8fc9edab1d66 100644
>>>> --- a/check/mode-lowmem.c
>>>> +++ b/check/mode-lowmem.c
>>>> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>>> *fs_info)
>>>>        }
>>>>          while (1) {
>>>> +        struct btrfs_key saved_key;
>>>> +
>>>>            node = path.nodes[0];
>>>>            slot = path.slots[0];
>>>>            btrfs_item_key_to_cpu(node, &key, slot);
>>>> +        if (repair)
>>>> +            saved_key = key;
>>>>            if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>>>>                goto out;
>>>>            if (key.type == BTRFS_ROOT_ITEM_KEY &&
>>>> @@ -5000,6 +5004,17 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>>> *fs_info)
>>>>                err |= ret;
>>>>            }
>>>>    next:
>>>> +        /*
>>>> +         * Since root tree can be cowed during repair,
>>>> +         * here search the saved key again.
>>>> +         */
>>>> +        if (repair) {
>>>> +            btrfs_release_path(&path);
>>>> +            ret = btrfs_search_slot(NULL, fs_info->tree_root,
>>>> +                        &saved_key, &path, 0, 0);
>>>> +            /* Repair never deletes trees, search must succeed. */
>>>> +            BUG_ON(ret);
>>>
>>> But this doesn't look good to me.
>>>
>>> Your assumption here is valid (at least for now), but it's possible that
>>> some tree blocks get corrupted in a large root tree, and in that case,
>>> we could still read part of the root tree, but btrfs_search_slot() could
>>> still return -EIO for certain search key.
>>>
>>> So I still prefer to do some error handling other than BUG_ON(ret).
>>>
>> Okay, will try it.
> 
> Just to emphasize Qu's point - we should strive to remove existing
> BUG_ON and should never introduce new ones. btrfs-progs is already quite
> messy and we should be improving that.
> 
Understood, thanks for your emphasis.

Su
>>
>> Thanks,
>> Su
>>> Thanks,
>>> Qu
>>>
>>>> +        }
>>>>            ret = btrfs_next_item(tree_root, &path);
>>>>            if (ret > 0)
>>>>                goto out;
>>>>
>>>
>>>
>>
>>
>>
> 
>
diff mbox series

Patch

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 89a304bbdd69..8fc9edab1d66 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4967,9 +4967,13 @@  int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
 	}
 
 	while (1) {
+		struct btrfs_key saved_key;
+
 		node = path.nodes[0];
 		slot = path.slots[0];
 		btrfs_item_key_to_cpu(node, &key, slot);
+		if (repair)
+			saved_key = key;
 		if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
 			goto out;
 		if (key.type == BTRFS_ROOT_ITEM_KEY &&
@@ -5000,6 +5004,17 @@  int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
 			err |= ret;
 		}
 next:
+		/*
+		 * Since root tree can be cowed during repair,
+		 * here search the saved key again.
+		 */
+		if (repair) {
+			btrfs_release_path(&path);
+			ret = btrfs_search_slot(NULL, fs_info->tree_root,
+						&saved_key, &path, 0, 0);
+			/* Repair never deletes trees, search must succeed. */
+			BUG_ON(ret);
+		}
 		ret = btrfs_next_item(tree_root, &path);
 		if (ret > 0)
 			goto out;