diff mbox series

btrfs: fix warning in create_pending_snapshot

Message ID tencent_DB6BA6C1B369A367C96C83A36457D7735705@qq.com (mailing list archive)
State New
Headers show
Series btrfs: fix warning in create_pending_snapshot | expand

Commit Message

Edward Adam Davis Nov. 11, 2023, 5:06 a.m. UTC
The create_snapshot will use the objectid that already exists in the qgroup_tree
tree, so when calculating the free_ojectid, it is added to determine whether it
exists in the qgroup_tree tree.

Reported-and-tested-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/btrfs/disk-io.c | 3 ++-
 fs/btrfs/qgroup.c  | 2 +-
 fs/btrfs/qgroup.h  | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox Nov. 11, 2023, 6:20 a.m. UTC | #1
On Sat, Nov 11, 2023 at 01:06:01PM +0800, Edward Adam Davis wrote:
> +++ b/fs/btrfs/disk-io.c
> @@ -4931,7 +4931,8 @@ int btrfs_get_free_objectid(struct btrfs_root *root, u64 *objectid)
>  		goto out;
>  	}
>  
> -	*objectid = root->free_objectid++;
> +	while (find_qgroup_rb(root->fs_info, root->free_objectid++));
> +	*objectid = root->free_objectid;

This looks buggy to me.  Let's say that free_objectid is currently 3.

Before, it would assign 3 to *objectid, and increment free_objectid to
4.  After (assuming the loop terminates on first iteration), it will
increment free_objectid to 4, then assign 4 to *objectid.

I think you meant to write:

	while (find_qgroup_rb(root->fs_info, root->free_objectid))
		root->free_objectid++;
	*objectid = root->free_objectid++;

And the lesson here is that more compact code is not necessarily more
correct code.

(I'm not making any judgement about whether this is the correct fix;
I don't understand btrfs well enough to have an opinion.  Just that
this is not an equivalent transformation)
Qu Wenruo Nov. 11, 2023, 6:54 a.m. UTC | #2
On 2023/11/11 15:36, Edward Adam Davis wrote:
> The create_snapshot will use the objectid that already exists in the qgroup_tree
> tree, so when calculating the free_ojectid, it is added to determine whether it
> exists in the qgroup_tree tree.
>
> Reported-and-tested-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
> Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>   fs/btrfs/disk-io.c | 3 ++-
>   fs/btrfs/qgroup.c  | 2 +-
>   fs/btrfs/qgroup.h  | 2 ++
>   3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 401ea09ae4b8..97050a3edc32 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -4931,7 +4931,8 @@ int btrfs_get_free_objectid(struct btrfs_root *root, u64 *objectid)
>   		goto out;
>   	}
>
> -	*objectid = root->free_objectid++;
> +	while (find_qgroup_rb(root->fs_info, root->free_objectid++));

I don't think this is correct.

Firstly you didn't take qgroup_ioctl_lock.

Secondly, please explain why you believe the free objectid of a
subvolume is related to the qgroup id?


For any one who really wants to fix the syzbot bug, please explain the
bug clearly before doing any fix.

If you can not explain the bug clearly, then you're doing it wrong.

Thanks,
Qu

> +	*objectid = root->free_objectid;
>   	ret = 0;
>   out:
>   	mutex_unlock(&root->objectid_mutex);
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index edb84cc03237..3705e7d57057 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -171,7 +171,7 @@ qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
>   static void qgroup_rescan_zero_tracking(struct btrfs_fs_info *fs_info);
>
>   /* must be called with qgroup_ioctl_lock held */
> -static struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
> +struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
>   					   u64 qgroupid)
>   {
>   	struct rb_node *n = fs_info->qgroup_tree.rb_node;
> diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
> index 855a4f978761..96c6aa31ca91 100644
> --- a/fs/btrfs/qgroup.h
> +++ b/fs/btrfs/qgroup.h
> @@ -425,4 +425,6 @@ bool btrfs_check_quota_leak(struct btrfs_fs_info *fs_info);
>   int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
>   			      struct btrfs_squota_delta *delta);
>
> +struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
> +                                            u64 qgroupid);
>   #endif
Edward Adam Davis Nov. 11, 2023, 8:13 a.m. UTC | #3
On Sat, 11 Nov 2023 06:20:59 +0000, Matthew Wilcox wrote:
> > +++ b/fs/btrfs/disk-io.c
> > @@ -4931,7 +4931,8 @@ int btrfs_get_free_objectid(struct btrfs_root *root, u64 *objectid)
> >  		goto out;
> >  	}
> >  
> > -	*objectid = root->free_objectid++;
> > +	while (find_qgroup_rb(root->fs_info, root->free_objectid++));
> > +	*objectid = root->free_objectid;
> 
> This looks buggy to me.  Let's say that free_objectid is currently 3.
> 
> Before, it would assign 3 to *objectid, and increment free_objectid to
> 4.  After (assuming the loop terminates on first iteration), it will
> increment free_objectid to 4, then assign 4 to *objectid.
> 
> I think you meant to write:
> 
> 	while (find_qgroup_rb(root->fs_info, root->free_objectid))
> 		root->free_objectid++;
> 	*objectid = root->free_objectid++;
Yes, your guess is correct.
> 
> And the lesson here is that more compact code is not necessarily more
> correct code.
> 
> (I'm not making any judgement about whether this is the correct fix;
> I don't understand btrfs well enough to have an opinion.  Just that
> this is not an equivalent transformation)
I don't have much knowledge about btrfs too, but one thing is clear: the qgroupid 
taken by create_snapshot() is calculated from btrfs_get_free_ojectid(). 
At the same time, when calculating the new value in btrfs_get_free_ojectid(), 
it is clearly unreasonable to not determine whether the new value exists in the
qgroup_tree tree.
Perhaps there are other methods to obtain a new qgroupid, but before obtaining 
a new value, it is necessary to perform a duplicate value judgment on qgroup_tree,
otherwise similar problems may still occur.

edward
Qu Wenruo Nov. 11, 2023, 8:48 p.m. UTC | #4
On 2023/11/11 18:43, Edward Adam Davis wrote:
> On Sat, 11 Nov 2023 06:20:59 +0000, Matthew Wilcox wrote:
>>> +++ b/fs/btrfs/disk-io.c
>>> @@ -4931,7 +4931,8 @@ int btrfs_get_free_objectid(struct btrfs_root *root, u64 *objectid)
>>>   		goto out;
>>>   	}
>>>
>>> -	*objectid = root->free_objectid++;
>>> +	while (find_qgroup_rb(root->fs_info, root->free_objectid++));
>>> +	*objectid = root->free_objectid;
>>
>> This looks buggy to me.  Let's say that free_objectid is currently 3.
>>
>> Before, it would assign 3 to *objectid, and increment free_objectid to
>> 4.  After (assuming the loop terminates on first iteration), it will
>> increment free_objectid to 4, then assign 4 to *objectid.
>>
>> I think you meant to write:
>>
>> 	while (find_qgroup_rb(root->fs_info, root->free_objectid))
>> 		root->free_objectid++;
>> 	*objectid = root->free_objectid++;
> Yes, your guess is correct.
>>
>> And the lesson here is that more compact code is not necessarily more
>> correct code.
>>
>> (I'm not making any judgement about whether this is the correct fix;
>> I don't understand btrfs well enough to have an opinion.  Just that
>> this is not an equivalent transformation)
> I don't have much knowledge about btrfs too, but one thing is clear: the qgroupid
> taken by create_snapshot() is calculated from btrfs_get_free_ojectid().
> At the same time, when calculating the new value in btrfs_get_free_ojectid(),
> it is clearly unreasonable to not determine whether the new value exists in the
> qgroup_tree tree.

Nope, it's totally wrong.

Qgroupid is bound to subvolumeid, thus getting a different id for
qgroupid is going to screw the whole thing up.

> Perhaps there are other methods to obtain a new qgroupid, but before obtaining
> a new value, it is necessary to perform a duplicate value judgment on qgroup_tree,
> otherwise similar problems may still occur.

If you don't really understand the context, the fix is never going to be
correct.

Thanks,
Qu

>
> edward
>
>
diff mbox series

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 401ea09ae4b8..97050a3edc32 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4931,7 +4931,8 @@  int btrfs_get_free_objectid(struct btrfs_root *root, u64 *objectid)
 		goto out;
 	}
 
-	*objectid = root->free_objectid++;
+	while (find_qgroup_rb(root->fs_info, root->free_objectid++));
+	*objectid = root->free_objectid;
 	ret = 0;
 out:
 	mutex_unlock(&root->objectid_mutex);
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index edb84cc03237..3705e7d57057 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -171,7 +171,7 @@  qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
 static void qgroup_rescan_zero_tracking(struct btrfs_fs_info *fs_info);
 
 /* must be called with qgroup_ioctl_lock held */
-static struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
+struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
 					   u64 qgroupid)
 {
 	struct rb_node *n = fs_info->qgroup_tree.rb_node;
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 855a4f978761..96c6aa31ca91 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -425,4 +425,6 @@  bool btrfs_check_quota_leak(struct btrfs_fs_info *fs_info);
 int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
 			      struct btrfs_squota_delta *delta);
 
+struct btrfs_qgroup *find_qgroup_rb(struct btrfs_fs_info *fs_info,
+                                            u64 qgroupid);
 #endif