diff mbox series

btrfs: do not abort transaction if there is already an existing qgroup

Message ID b305a5b0228b40fc62923b0133957c72468600de.1699649085.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: do not abort transaction if there is already an existing qgroup | expand

Commit Message

Qu Wenruo Nov. 10, 2023, 8:44 p.m. UTC
[BUG]
Syzbot reported a regression that after commit 6ed05643ddb1 ("btrfs:
create qgroup earlier in snapshot creation") we can trigger transaction
abort during subvolume creation:

  ------------[ cut here ]------------
  BTRFS: Transaction aborted (error -17)
  WARNING: CPU: 0 PID: 5057 at fs/btrfs/transaction.c:1778 create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
  Modules linked in:
  CPU: 0 PID: 5057 Comm: syz-executor225 Not tainted 6.6.0-syzkaller-15365-g305230142ae0 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
  RIP: 0010:create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
  Call Trace:
   <TASK>
   create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1967
   btrfs_commit_transaction+0xf1c/0x3730 fs/btrfs/transaction.c:2440
   create_snapshot+0x4a5/0x7e0 fs/btrfs/ioctl.c:845
   btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:995
   btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1041
   __btrfs_ioctl_snap_create+0x344/0x460 fs/btrfs/ioctl.c:1294
   btrfs_ioctl_snap_create+0x13c/0x190 fs/btrfs/ioctl.c:1321
   btrfs_ioctl+0xbbf/0xd40
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:871 [inline]
   __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x63/0x6b
  RIP: 0033:0x7f2f791127b9
   </TASK>

[CAUSE]
The error number is -EEXIST, which can happen for qgroup if there is
already an existing qgroup and then we're trying to create a subvolume
for it.

[FIX]
In that case, we can continue creating the subvolume, although it may
lead to qgroup inconsistency, it's not so critical to abort the current
transaction.

So in this case, we can just ignore the non-critical errors, mostly -EEXIST
(there is already a qgroup).

Reported-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/transaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Filipe Manana Nov. 13, 2023, 1:28 p.m. UTC | #1
On Fri, Nov 10, 2023 at 8:45 PM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> Syzbot reported a regression that after commit 6ed05643ddb1 ("btrfs:
> create qgroup earlier in snapshot creation") we can trigger transaction
> abort during subvolume creation:

subvolume -> snapshot

>
>   ------------[ cut here ]------------
>   BTRFS: Transaction aborted (error -17)
>   WARNING: CPU: 0 PID: 5057 at fs/btrfs/transaction.c:1778 create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
>   Modules linked in:
>   CPU: 0 PID: 5057 Comm: syz-executor225 Not tainted 6.6.0-syzkaller-15365-g305230142ae0 #0
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
>   RIP: 0010:create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
>   Call Trace:
>    <TASK>
>    create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1967
>    btrfs_commit_transaction+0xf1c/0x3730 fs/btrfs/transaction.c:2440
>    create_snapshot+0x4a5/0x7e0 fs/btrfs/ioctl.c:845
>    btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:995
>    btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1041
>    __btrfs_ioctl_snap_create+0x344/0x460 fs/btrfs/ioctl.c:1294
>    btrfs_ioctl_snap_create+0x13c/0x190 fs/btrfs/ioctl.c:1321
>    btrfs_ioctl+0xbbf/0xd40
>    vfs_ioctl fs/ioctl.c:51 [inline]
>    __do_sys_ioctl fs/ioctl.c:871 [inline]
>    __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857
>    do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>    do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
>    entry_SYSCALL_64_after_hwframe+0x63/0x6b
>   RIP: 0033:0x7f2f791127b9
>    </TASK>
>
> [CAUSE]
> The error number is -EEXIST, which can happen for qgroup if there is
> already an existing qgroup and then we're trying to create a subvolume

subvolume -> snapshot

> for it.
>
> [FIX]
> In that case, we can continue creating the subvolume, although it may

subvolume -> snapshot

> lead to qgroup inconsistency, it's not so critical to abort the current
> transaction.
>
> So in this case, we can just ignore the non-critical errors, mostly -EEXIST
> (there is already a qgroup).
>
> Reported-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
> Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

With those minor fixes to the changelog:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> ---
>  fs/btrfs/transaction.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 9694a3ca1739..7af9665bebae 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1774,7 +1774,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>         btrfs_release_path(path);
>
>         ret = btrfs_create_qgroup(trans, objectid);
> -       if (ret) {
> +       if (ret && ret != -EEXIST) {
>                 btrfs_abort_transaction(trans, ret);
>                 goto fail;
>         }
> --
> 2.42.1
>
>
David Sterba Nov. 13, 2023, 5:07 p.m. UTC | #2
On Sat, Nov 11, 2023 at 07:14:57AM +1030, Qu Wenruo wrote:
> [BUG]
> Syzbot reported a regression that after commit 6ed05643ddb1 ("btrfs:
> create qgroup earlier in snapshot creation") we can trigger transaction
> abort during subvolume creation:
> 
>   ------------[ cut here ]------------
>   BTRFS: Transaction aborted (error -17)
>   WARNING: CPU: 0 PID: 5057 at fs/btrfs/transaction.c:1778 create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
>   Modules linked in:
>   CPU: 0 PID: 5057 Comm: syz-executor225 Not tainted 6.6.0-syzkaller-15365-g305230142ae0 #0
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
>   RIP: 0010:create_pending_snapshot+0x25f4/0x2b70 fs/btrfs/transaction.c:1778
>   Call Trace:
>    <TASK>
>    create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1967
>    btrfs_commit_transaction+0xf1c/0x3730 fs/btrfs/transaction.c:2440
>    create_snapshot+0x4a5/0x7e0 fs/btrfs/ioctl.c:845
>    btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:995
>    btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1041
>    __btrfs_ioctl_snap_create+0x344/0x460 fs/btrfs/ioctl.c:1294
>    btrfs_ioctl_snap_create+0x13c/0x190 fs/btrfs/ioctl.c:1321
>    btrfs_ioctl+0xbbf/0xd40
>    vfs_ioctl fs/ioctl.c:51 [inline]
>    __do_sys_ioctl fs/ioctl.c:871 [inline]
>    __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857
>    do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>    do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
>    entry_SYSCALL_64_after_hwframe+0x63/0x6b
>   RIP: 0033:0x7f2f791127b9
>    </TASK>
> 
> [CAUSE]
> The error number is -EEXIST, which can happen for qgroup if there is
> already an existing qgroup and then we're trying to create a subvolume
> for it.
> 
> [FIX]
> In that case, we can continue creating the subvolume, although it may
> lead to qgroup inconsistency, it's not so critical to abort the current
> transaction.
> 
> So in this case, we can just ignore the non-critical errors, mostly -EEXIST
> (there is already a qgroup).
> 
> Reported-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
> Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

With the changelog fixups added to misc-next, thanks.
Anand Jain Nov. 14, 2023, 5:13 a.m. UTC | #3
> [CAUSE]
> The error number is -EEXIST, which can happen for qgroup if there is
> already an existing qgroup and then we're trying to create a subvolume
> for it.
> 

We were able to create a qgroup for which the snapshot ID did not exist.
Shouldn't that have failed in the first place?

Thanks, Anand

> [FIX]
> In that case, we can continue creating the subvolume, although it may
> lead to qgroup inconsistency, it's not so critical to abort the current
> transaction.
> 
> So in this case, we can just ignore the non-critical errors, mostly -EEXIST
> (there is already a qgroup).
> 
> Reported-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
> Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/transaction.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 9694a3ca1739..7af9665bebae 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1774,7 +1774,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>   	btrfs_release_path(path);
>   
>   	ret = btrfs_create_qgroup(trans, objectid);
> -	if (ret) {
> +	if (ret && ret != -EEXIST) {
>   		btrfs_abort_transaction(trans, ret);
>   		goto fail;
>   	}
Qu Wenruo Nov. 14, 2023, 7:55 a.m. UTC | #4
On 2023/11/14 15:43, Anand Jain wrote:
> 
> 
>> [CAUSE]
>> The error number is -EEXIST, which can happen for qgroup if there is
>> already an existing qgroup and then we're trying to create a subvolume
>> for it.
>>
> 
> We were able to create a qgroup for which the snapshot ID did not exist.
> Shouldn't that have failed in the first place?

We allowed it from the very beginning, even had interfaces to allow end 
users to modify them directly.

But nowadays, you can no longer change the numbers out of the kernel, 
thus the newly created 0 level qgroups can only have 0 as their rfer/excl.

Thus it won't cause any problem, some may even consider it as a way to 
"preallocate" qgroups.

Thanks,
Qu
> 
> Thanks, Anand
> 
>> [FIX]
>> In that case, we can continue creating the subvolume, although it may
>> lead to qgroup inconsistency, it's not so critical to abort the current
>> transaction.
>>
>> So in this case, we can just ignore the non-critical errors, mostly 
>> -EEXIST
>> (there is already a qgroup).
>>
>> Reported-by: syzbot+4d81015bc10889fd12ea@syzkaller.appspotmail.com
>> Fixes: 6ed05643ddb1 ("btrfs: create qgroup earlier in snapshot creation")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/transaction.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index 9694a3ca1739..7af9665bebae 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -1774,7 +1774,7 @@ static noinline int 
>> create_pending_snapshot(struct btrfs_trans_handle *trans,
>>       btrfs_release_path(path);
>>       ret = btrfs_create_qgroup(trans, objectid);
>> -    if (ret) {
>> +    if (ret && ret != -EEXIST) {
>>           btrfs_abort_transaction(trans, ret);
>>           goto fail;
>>       }
>
David Sterba Nov. 14, 2023, 12:38 p.m. UTC | #5
On Tue, Nov 14, 2023 at 06:25:03PM +1030, Qu Wenruo wrote:
> 
> 
> On 2023/11/14 15:43, Anand Jain wrote:
> > 
> > 
> >> [CAUSE]
> >> The error number is -EEXIST, which can happen for qgroup if there is
> >> already an existing qgroup and then we're trying to create a subvolume
> >> for it.
> >>
> > 
> > We were able to create a qgroup for which the snapshot ID did not exist.
> > Shouldn't that have failed in the first place?
> 
> We allowed it from the very beginning, even had interfaces to allow end 
> users to modify them directly.
> 
> But nowadays, you can no longer change the numbers out of the kernel, 
> thus the newly created 0 level qgroups can only have 0 as their rfer/excl.
> 
> Thus it won't cause any problem, some may even consider it as a way to 
> "preallocate" qgroups.

Is this based on a real world use case or only that the functionality
allows that? Preallocating could be hard as the subvolume numbers are
unpredictable in general, in practice it's +1 from the last one created
so with some luck the new subvolume could match the preallocated qgroup.

Preallocating qgroups might make sense to put them to some higher level
qgroup for accounting and setting limits, but this is again artifical
and unstable when using so I wonder who and how would use it that way.
diff mbox series

Patch

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 9694a3ca1739..7af9665bebae 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1774,7 +1774,7 @@  static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	btrfs_release_path(path);
 
 	ret = btrfs_create_qgroup(trans, objectid);
-	if (ret) {
+	if (ret && ret != -EEXIST) {
 		btrfs_abort_transaction(trans, ret);
 		goto fail;
 	}