[v2] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
diff mbox series

Message ID 20181119141536.20748-1-fdmanana@kernel.org
State New
Headers show
Series
  • [v2] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
Related show

Commit Message

Filipe Manana Nov. 19, 2018, 2:15 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

If the quota enable and snapshot creation ioctls are called concurrently
we can get into a deadlock where the task enabling quotas will deadlock
on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
twice, or the task creating a snapshot tries to commit the transaction
while the task enabling quota waits for the former task to commit the
transaction while holding the mutex. The following time diagrams show how
both cases happen.

First scenario:

           CPU 0                                    CPU 1

 btrfs_ioctl()
  btrfs_ioctl_quota_ctl()
   btrfs_quota_enable()
    mutex_lock(fs_info->qgroup_ioctl_lock)
    btrfs_start_transaction()

                                             btrfs_ioctl()
                                              btrfs_ioctl_snap_create_v2
                                               create_snapshot()
                                                --> adds snapshot to the
                                                    list pending_snapshots
                                                    of the current
                                                    transaction

    btrfs_commit_transaction()
     create_pending_snapshots()
       create_pending_snapshot()
        qgroup_account_snapshot()
         btrfs_qgroup_inherit()
	   mutex_lock(fs_info->qgroup_ioctl_lock)
	    --> deadlock, mutex already locked
	        by this task at
		btrfs_quota_enable()

Second scenario:

           CPU 0                                    CPU 1

 btrfs_ioctl()
  btrfs_ioctl_quota_ctl()
   btrfs_quota_enable()
    mutex_lock(fs_info->qgroup_ioctl_lock)
    btrfs_start_transaction()

                                             btrfs_ioctl()
                                              btrfs_ioctl_snap_create_v2
                                               create_snapshot()
                                                --> adds snapshot to the
                                                    list pending_snapshots
                                                    of the current
                                                    transaction

                                                btrfs_commit_transaction()
                                                 --> waits for task at
                                                     CPU 0 to release
                                                     its transaction
                                                     handle

    btrfs_commit_transaction()
     --> sees another task started
         the transaction commit first
     --> releases its transaction
         handle
     --> waits for the transaction
         commit to be completed by
         the task at CPU 1

                                                 create_pending_snapshot()
                                                  qgroup_account_snapshot()
                                                   btrfs_qgroup_inherit()
                                                    mutex_lock(fs_info->qgroup_ioctl_lock)
                                                     --> deadlock, task at CPU 0
                                                         has the mutex locked but
                                                         it is waiting for us to
                                                         finish the transaction
                                                         commit

So fix this by setting the quota enabled flag in fs_info after committing
the transaction at btrfs_quota_enable(). This ends up serializing quota
enable and snapshot creation as if the snapshot creation happened just
before the quota enable request. The quota rescan task, scheduled after
committing the transaction in btrfs_quote_enable(), will do the accounting.

Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---

V2: Added second deadlock example to changelog and changed the fix to deal
    with that case as well.

 fs/btrfs/qgroup.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Qu Wenruo Nov. 19, 2018, 2:48 p.m. UTC | #1
On 2018/11/19 下午10:15, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If the quota enable and snapshot creation ioctls are called concurrently
> we can get into a deadlock where the task enabling quotas will deadlock
> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> twice, or the task creating a snapshot tries to commit the transaction
> while the task enabling quota waits for the former task to commit the
> transaction while holding the mutex. The following time diagrams show how
> both cases happen.
> 
> First scenario:
> 
>            CPU 0                                    CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>    btrfs_quota_enable()
>     mutex_lock(fs_info->qgroup_ioctl_lock)
>     btrfs_start_transaction()
> 
>                                              btrfs_ioctl()
>                                               btrfs_ioctl_snap_create_v2
>                                                create_snapshot()
>                                                 --> adds snapshot to the
>                                                     list pending_snapshots
>                                                     of the current
>                                                     transaction
> 
>     btrfs_commit_transaction()
>      create_pending_snapshots()
>        create_pending_snapshot()
>         qgroup_account_snapshot()
>          btrfs_qgroup_inherit()
> 	   mutex_lock(fs_info->qgroup_ioctl_lock)
> 	    --> deadlock, mutex already locked
> 	        by this task at
> 		btrfs_quota_enable()
> 
> Second scenario:
> 
>            CPU 0                                    CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>    btrfs_quota_enable()
>     mutex_lock(fs_info->qgroup_ioctl_lock)
>     btrfs_start_transaction()
> 
>                                              btrfs_ioctl()
>                                               btrfs_ioctl_snap_create_v2
>                                                create_snapshot()
>                                                 --> adds snapshot to the
>                                                     list pending_snapshots
>                                                     of the current
>                                                     transaction
> 
>                                                 btrfs_commit_transaction()
>                                                  --> waits for task at
>                                                      CPU 0 to release
>                                                      its transaction
>                                                      handle
> 
>     btrfs_commit_transaction()
>      --> sees another task started
>          the transaction commit first
>      --> releases its transaction
>          handle
>      --> waits for the transaction
>          commit to be completed by
>          the task at CPU 1
> 
>                                                  create_pending_snapshot()
>                                                   qgroup_account_snapshot()
>                                                    btrfs_qgroup_inherit()
>                                                     mutex_lock(fs_info->qgroup_ioctl_lock)
>                                                      --> deadlock, task at CPU 0
>                                                          has the mutex locked but
>                                                          it is waiting for us to
>                                                          finish the transaction
>                                                          commit
> 
> So fix this by setting the quota enabled flag in fs_info after committing
> the transaction at btrfs_quota_enable(). This ends up serializing quota
> enable and snapshot creation as if the snapshot creation happened just
> before the quota enable request. The quota rescan task, scheduled after
> committing the transaction in btrfs_quote_enable(), will do the accounting.
> 
> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
> 
> V2: Added second deadlock example to changelog and changed the fix to deal
>     with that case as well.
> 
>  fs/btrfs/qgroup.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d4917c0cddf5..ae1358253b7b 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>  		btrfs_abort_transaction(trans, ret);
>  		goto out_free_path;
>  	}
> -	spin_lock(&fs_info->qgroup_lock);
> -	fs_info->quota_root = quota_root;
> -	set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> -	spin_unlock(&fs_info->qgroup_lock);
>  
>  	ret = btrfs_commit_transaction(trans);
>  	trans = NULL;
>  	if (ret)
>  		goto out_free_path;

The main concern here is, if we don't set qgroup enabled bit before we
commit transaction, there will be a window where new tree modification
could happen before QGROUP_ENABLED bit set.

There may be some qgroup reserved space related problem in such case,
but I'm not 100% sure to foresee such problem.


The best way to do this is, commit trans first, and before any other one
trying to start transaction, we set the bit.
However I can't find such infrastructure now (I still remember we used
to have a pending bit to change quota enabled flag, but removed later)

Thanks,
Qu

>  
> +	/*
> +	 * Set quota enabled flag after committing the transaction, to avoid
> +	 * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
> +	 * creation.
> +	 */
> +	spin_lock(&fs_info->qgroup_lock);
> +	fs_info->quota_root = quota_root;
> +	set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> +	spin_unlock(&fs_info->qgroup_lock);
> +
>  	ret = qgroup_rescan_init(fs_info, 0, 1);
>  	if (!ret) {
>  	        qgroup_rescan_zero_tracking(fs_info);
>
Filipe Manana Nov. 19, 2018, 3:24 p.m. UTC | #2
On Mon, Nov 19, 2018 at 2:48 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2018/11/19 下午10:15, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > If the quota enable and snapshot creation ioctls are called concurrently
> > we can get into a deadlock where the task enabling quotas will deadlock
> > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> > twice, or the task creating a snapshot tries to commit the transaction
> > while the task enabling quota waits for the former task to commit the
> > transaction while holding the mutex. The following time diagrams show how
> > both cases happen.
> >
> > First scenario:
> >
> >            CPU 0                                    CPU 1
> >
> >  btrfs_ioctl()
> >   btrfs_ioctl_quota_ctl()
> >    btrfs_quota_enable()
> >     mutex_lock(fs_info->qgroup_ioctl_lock)
> >     btrfs_start_transaction()
> >
> >                                              btrfs_ioctl()
> >                                               btrfs_ioctl_snap_create_v2
> >                                                create_snapshot()
> >                                                 --> adds snapshot to the
> >                                                     list pending_snapshots
> >                                                     of the current
> >                                                     transaction
> >
> >     btrfs_commit_transaction()
> >      create_pending_snapshots()
> >        create_pending_snapshot()
> >         qgroup_account_snapshot()
> >          btrfs_qgroup_inherit()
> >          mutex_lock(fs_info->qgroup_ioctl_lock)
> >           --> deadlock, mutex already locked
> >               by this task at
> >               btrfs_quota_enable()
> >
> > Second scenario:
> >
> >            CPU 0                                    CPU 1
> >
> >  btrfs_ioctl()
> >   btrfs_ioctl_quota_ctl()
> >    btrfs_quota_enable()
> >     mutex_lock(fs_info->qgroup_ioctl_lock)
> >     btrfs_start_transaction()
> >
> >                                              btrfs_ioctl()
> >                                               btrfs_ioctl_snap_create_v2
> >                                                create_snapshot()
> >                                                 --> adds snapshot to the
> >                                                     list pending_snapshots
> >                                                     of the current
> >                                                     transaction
> >
> >                                                 btrfs_commit_transaction()
> >                                                  --> waits for task at
> >                                                      CPU 0 to release
> >                                                      its transaction
> >                                                      handle
> >
> >     btrfs_commit_transaction()
> >      --> sees another task started
> >          the transaction commit first
> >      --> releases its transaction
> >          handle
> >      --> waits for the transaction
> >          commit to be completed by
> >          the task at CPU 1
> >
> >                                                  create_pending_snapshot()
> >                                                   qgroup_account_snapshot()
> >                                                    btrfs_qgroup_inherit()
> >                                                     mutex_lock(fs_info->qgroup_ioctl_lock)
> >                                                      --> deadlock, task at CPU 0
> >                                                          has the mutex locked but
> >                                                          it is waiting for us to
> >                                                          finish the transaction
> >                                                          commit
> >
> > So fix this by setting the quota enabled flag in fs_info after committing
> > the transaction at btrfs_quota_enable(). This ends up serializing quota
> > enable and snapshot creation as if the snapshot creation happened just
> > before the quota enable request. The quota rescan task, scheduled after
> > committing the transaction in btrfs_quote_enable(), will do the accounting.
> >
> > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >
> > V2: Added second deadlock example to changelog and changed the fix to deal
> >     with that case as well.
> >
> >  fs/btrfs/qgroup.c | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > index d4917c0cddf5..ae1358253b7b 100644
> > --- a/fs/btrfs/qgroup.c
> > +++ b/fs/btrfs/qgroup.c
> > @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
> >               btrfs_abort_transaction(trans, ret);
> >               goto out_free_path;
> >       }
> > -     spin_lock(&fs_info->qgroup_lock);
> > -     fs_info->quota_root = quota_root;
> > -     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> > -     spin_unlock(&fs_info->qgroup_lock);
> >
> >       ret = btrfs_commit_transaction(trans);
> >       trans = NULL;
> >       if (ret)
> >               goto out_free_path;
>
> The main concern here is, if we don't set qgroup enabled bit before we
> commit transaction, there will be a window where new tree modification
> could happen before QGROUP_ENABLED bit set.

That doesn't seem to make much sense to me, if I understood correctly.
Essentially you're saying stuff done to any tree in the the
transaction we use to
enable quotas must be accounted for. In that case the quota enabled bit should
be done as soon as the transaction is started, because before we set
it and after
we started (or joined) a transaction, a lot could of modifications may
have happened.
Nevertheless I don't think it's unexpected for anyone to have the
accounting happening
only after the quota enable ioctl returns success.

>
> There may be some qgroup reserved space related problem in such case,
> but I'm not 100% sure to foresee such problem.
>
>
> The best way to do this is, commit trans first, and before any other one
> trying to start transaction, we set the bit.
> However I can't find such infrastructure now (I still remember we used
> to have a pending bit to change quota enabled flag, but removed later)
>
> Thanks,
> Qu
>
> >
> > +     /*
> > +      * Set quota enabled flag after committing the transaction, to avoid
> > +      * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
> > +      * creation.
> > +      */
> > +     spin_lock(&fs_info->qgroup_lock);
> > +     fs_info->quota_root = quota_root;
> > +     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> > +     spin_unlock(&fs_info->qgroup_lock);
> > +
> >       ret = qgroup_rescan_init(fs_info, 0, 1);
> >       if (!ret) {
> >               qgroup_rescan_zero_tracking(fs_info);
> >
>
Nikolay Borisov Nov. 19, 2018, 3:36 p.m. UTC | #3
On 19.11.18 г. 16:48 ч., Qu Wenruo wrote:
> There may be some qgroup reserved space related problem in such case,
> but I'm not 100% sure to foresee such problem.

But why is this a problem - we always queue quota rescan following QUOTA
enable, that should take care of proper accounting, no?

> 
> 
> The best way to do this is, commit trans first, and before any other one
> trying to start transaction, we set the bit.
> However I can't find such infrastructure now (I still remember we used
> to have a pending bit to change quota enabled flag, but removed later)
Qu Wenruo Nov. 20, 2018, 12:30 a.m. UTC | #4
On 2018/11/19 下午11:36, Nikolay Borisov wrote:
> 
> 
> On 19.11.18 г. 16:48 ч., Qu Wenruo wrote:
>> There may be some qgroup reserved space related problem in such case,
>> but I'm not 100% sure to foresee such problem.
> 
> But why is this a problem - we always queue quota rescan following QUOTA
> enable, that should take care of proper accounting, no?

For reserved data/metadata space, not qgroup numbers.

But it turns out we have already handle such case for data.

So I'm overreacting to this problem.

> 
>>
>>
>> The best way to do this is, commit trans first, and before any other one
>> trying to start transaction, we set the bit.
>> However I can't find such infrastructure now (I still remember we used
>> to have a pending bit to change quota enabled flag, but removed later)
Qu Wenruo Nov. 20, 2018, 12:32 a.m. UTC | #5
On 2018/11/19 下午11:24, Filipe Manana wrote:
> On Mon, Nov 19, 2018 at 2:48 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2018/11/19 下午10:15, fdmanana@kernel.org wrote:
>>> From: Filipe Manana <fdmanana@suse.com>
>>>
>>> If the quota enable and snapshot creation ioctls are called concurrently
>>> we can get into a deadlock where the task enabling quotas will deadlock
>>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
>>> twice, or the task creating a snapshot tries to commit the transaction
>>> while the task enabling quota waits for the former task to commit the
>>> transaction while holding the mutex. The following time diagrams show how
>>> both cases happen.
>>>
>>> First scenario:
>>>
>>>            CPU 0                                    CPU 1
>>>
>>>  btrfs_ioctl()
>>>   btrfs_ioctl_quota_ctl()
>>>    btrfs_quota_enable()
>>>     mutex_lock(fs_info->qgroup_ioctl_lock)
>>>     btrfs_start_transaction()
>>>
>>>                                              btrfs_ioctl()
>>>                                               btrfs_ioctl_snap_create_v2
>>>                                                create_snapshot()
>>>                                                 --> adds snapshot to the
>>>                                                     list pending_snapshots
>>>                                                     of the current
>>>                                                     transaction
>>>
>>>     btrfs_commit_transaction()
>>>      create_pending_snapshots()
>>>        create_pending_snapshot()
>>>         qgroup_account_snapshot()
>>>          btrfs_qgroup_inherit()
>>>          mutex_lock(fs_info->qgroup_ioctl_lock)
>>>           --> deadlock, mutex already locked
>>>               by this task at
>>>               btrfs_quota_enable()
>>>
>>> Second scenario:
>>>
>>>            CPU 0                                    CPU 1
>>>
>>>  btrfs_ioctl()
>>>   btrfs_ioctl_quota_ctl()
>>>    btrfs_quota_enable()
>>>     mutex_lock(fs_info->qgroup_ioctl_lock)
>>>     btrfs_start_transaction()
>>>
>>>                                              btrfs_ioctl()
>>>                                               btrfs_ioctl_snap_create_v2
>>>                                                create_snapshot()
>>>                                                 --> adds snapshot to the
>>>                                                     list pending_snapshots
>>>                                                     of the current
>>>                                                     transaction
>>>
>>>                                                 btrfs_commit_transaction()
>>>                                                  --> waits for task at
>>>                                                      CPU 0 to release
>>>                                                      its transaction
>>>                                                      handle
>>>
>>>     btrfs_commit_transaction()
>>>      --> sees another task started
>>>          the transaction commit first
>>>      --> releases its transaction
>>>          handle
>>>      --> waits for the transaction
>>>          commit to be completed by
>>>          the task at CPU 1
>>>
>>>                                                  create_pending_snapshot()
>>>                                                   qgroup_account_snapshot()
>>>                                                    btrfs_qgroup_inherit()
>>>                                                     mutex_lock(fs_info->qgroup_ioctl_lock)
>>>                                                      --> deadlock, task at CPU 0
>>>                                                          has the mutex locked but
>>>                                                          it is waiting for us to
>>>                                                          finish the transaction
>>>                                                          commit
>>>
>>> So fix this by setting the quota enabled flag in fs_info after committing
>>> the transaction at btrfs_quota_enable(). This ends up serializing quota
>>> enable and snapshot creation as if the snapshot creation happened just
>>> before the quota enable request. The quota rescan task, scheduled after
>>> committing the transaction in btrfs_quote_enable(), will do the accounting.
>>>
>>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>> ---
>>>
>>> V2: Added second deadlock example to changelog and changed the fix to deal
>>>     with that case as well.
>>>
>>>  fs/btrfs/qgroup.c | 14 ++++++++++----
>>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index d4917c0cddf5..ae1358253b7b 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>>>               btrfs_abort_transaction(trans, ret);
>>>               goto out_free_path;
>>>       }
>>> -     spin_lock(&fs_info->qgroup_lock);
>>> -     fs_info->quota_root = quota_root;
>>> -     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>> -     spin_unlock(&fs_info->qgroup_lock);
>>>
>>>       ret = btrfs_commit_transaction(trans);
>>>       trans = NULL;
>>>       if (ret)
>>>               goto out_free_path;
>>
>> The main concern here is, if we don't set qgroup enabled bit before we
>> commit transaction, there will be a window where new tree modification
>> could happen before QGROUP_ENABLED bit set.
> 
> That doesn't seem to make much sense to me, if I understood correctly.
> Essentially you're saying stuff done to any tree in the the
> transaction we use to
> enable quotas must be accounted for. In that case the quota enabled bit should
> be done as soon as the transaction is started, because before we set
> it and after
> we started (or joined) a transaction, a lot could of modifications may
> have happened.
> Nevertheless I don't think it's unexpected for anyone to have the
> accounting happening
> only after the quota enable ioctl returns success.

The problem is not accounting, the qgroup number won't cause problem.

It's the reserved space. Like some dirty pages are dirtied before quota
enabled, but at run_dealloc() time quota is enabled.

For such case we have io_tree based method to avoid underflow so it
should be OK.

So v2 patch looks OK.

Thanks,
Qu

> 
>>
>> There may be some qgroup reserved space related problem in such case,
>> but I'm not 100% sure to foresee such problem.
>>
>>
>> The best way to do this is, commit trans first, and before any other one
>> trying to start transaction, we set the bit.
>> However I can't find such infrastructure now (I still remember we used
>> to have a pending bit to change quota enabled flag, but removed later)
>>
>> Thanks,
>> Qu
>>
>>>
>>> +     /*
>>> +      * Set quota enabled flag after committing the transaction, to avoid
>>> +      * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
>>> +      * creation.
>>> +      */
>>> +     spin_lock(&fs_info->qgroup_lock);
>>> +     fs_info->quota_root = quota_root;
>>> +     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>> +     spin_unlock(&fs_info->qgroup_lock);
>>> +
>>>       ret = qgroup_rescan_init(fs_info, 0, 1);
>>>       if (!ret) {
>>>               qgroup_rescan_zero_tracking(fs_info);
>>>
>>
David Sterba Nov. 22, 2018, 1:12 p.m. UTC | #6
On Tue, Nov 20, 2018 at 08:32:42AM +0800, Qu Wenruo wrote:
> >>> @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
> >>>               btrfs_abort_transaction(trans, ret);
> >>>               goto out_free_path;
> >>>       }
> >>> -     spin_lock(&fs_info->qgroup_lock);
> >>> -     fs_info->quota_root = quota_root;
> >>> -     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> >>> -     spin_unlock(&fs_info->qgroup_lock);
> >>>
> >>>       ret = btrfs_commit_transaction(trans);
> >>>       trans = NULL;
> >>>       if (ret)
> >>>               goto out_free_path;
> >>
> >> The main concern here is, if we don't set qgroup enabled bit before we
> >> commit transaction, there will be a window where new tree modification
> >> could happen before QGROUP_ENABLED bit set.
> > 
> > That doesn't seem to make much sense to me, if I understood correctly.
> > Essentially you're saying stuff done to any tree in the the
> > transaction we use to
> > enable quotas must be accounted for. In that case the quota enabled bit should
> > be done as soon as the transaction is started, because before we set
> > it and after
> > we started (or joined) a transaction, a lot could of modifications may
> > have happened.
> > Nevertheless I don't think it's unexpected for anyone to have the
> > accounting happening
> > only after the quota enable ioctl returns success.
> 
> The problem is not accounting, the qgroup number won't cause problem.
> 
> It's the reserved space. Like some dirty pages are dirtied before quota
> enabled, but at run_dealloc() time quota is enabled.
> 
> For such case we have io_tree based method to avoid underflow so it
> should be OK.
> 
> So v2 patch looks OK.

Does that mean reviewed-by? In case there's a evolved discussion under a
patch, a clear yes/no is appreciated and an explicit Reviewed-by even
more. I'm about to add this patch to rc4 pull, thre's still some time to
add the tag. Thanks.
Qu Wenruo Nov. 22, 2018, 1:46 p.m. UTC | #7
On 2018/11/22 下午9:12, David Sterba wrote:
> On Tue, Nov 20, 2018 at 08:32:42AM +0800, Qu Wenruo wrote:
>>>>> @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>>>>>               btrfs_abort_transaction(trans, ret);
>>>>>               goto out_free_path;
>>>>>       }
>>>>> -     spin_lock(&fs_info->qgroup_lock);
>>>>> -     fs_info->quota_root = quota_root;
>>>>> -     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>>>> -     spin_unlock(&fs_info->qgroup_lock);
>>>>>
>>>>>       ret = btrfs_commit_transaction(trans);
>>>>>       trans = NULL;
>>>>>       if (ret)
>>>>>               goto out_free_path;
>>>>
>>>> The main concern here is, if we don't set qgroup enabled bit before we
>>>> commit transaction, there will be a window where new tree modification
>>>> could happen before QGROUP_ENABLED bit set.
>>>
>>> That doesn't seem to make much sense to me, if I understood correctly.
>>> Essentially you're saying stuff done to any tree in the the
>>> transaction we use to
>>> enable quotas must be accounted for. In that case the quota enabled bit should
>>> be done as soon as the transaction is started, because before we set
>>> it and after
>>> we started (or joined) a transaction, a lot could of modifications may
>>> have happened.
>>> Nevertheless I don't think it's unexpected for anyone to have the
>>> accounting happening
>>> only after the quota enable ioctl returns success.
>>
>> The problem is not accounting, the qgroup number won't cause problem.
>>
>> It's the reserved space. Like some dirty pages are dirtied before quota
>> enabled, but at run_dealloc() time quota is enabled.
>>
>> For such case we have io_tree based method to avoid underflow so it
>> should be OK.
>>
>> So v2 patch looks OK.
> 
> Does that mean reviewed-by? In case there's a evolved discussion under a
> patch, a clear yes/no is appreciated and an explicit Reviewed-by even
> more. I'm about to add this patch to rc4 pull, thre's still some time to
> add the tag. Thanks.
> 

I'd like to add reviewed-by tab, but I'm still not 100% if this will
cause extra qgroup reserved space related problem.

At least from my side, I can't directly see a case where it will cause
problem.

Does such case mean a reviewed-by tag? Or something LGTM-but-uncertain?

Thanks,
Qu
David Sterba Nov. 22, 2018, 2:42 p.m. UTC | #8
On Thu, Nov 22, 2018 at 09:46:44PM +0800, Qu Wenruo wrote:
> >>> it and after
> >>> we started (or joined) a transaction, a lot could of modifications may
> >>> have happened.
> >>> Nevertheless I don't think it's unexpected for anyone to have the
> >>> accounting happening
> >>> only after the quota enable ioctl returns success.
> >>
> >> The problem is not accounting, the qgroup number won't cause problem.
> >>
> >> It's the reserved space. Like some dirty pages are dirtied before quota
> >> enabled, but at run_dealloc() time quota is enabled.
> >>
> >> For such case we have io_tree based method to avoid underflow so it
> >> should be OK.
> >>
> >> So v2 patch looks OK.
> > 
> > Does that mean reviewed-by? In case there's a evolved discussion under a
> > patch, a clear yes/no is appreciated and an explicit Reviewed-by even
> > more. I'm about to add this patch to rc4 pull, thre's still some time to
> > add the tag. Thanks.
> 
> I'd like to add reviewed-by tab, but I'm still not 100% if this will
> cause extra qgroup reserved space related problem.
> 
> At least from my side, I can't directly see a case where it will cause
> problem.
> 
> Does such case mean a reviewed-by tag? Or something LGTM-but-uncertain?

It means that we can keep the patch in testing branch for a bit longer,
there's still time to put it to a later rc once there's enough
certainty.

Patch
diff mbox series

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index d4917c0cddf5..ae1358253b7b 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1013,16 +1013,22 @@  int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
 		btrfs_abort_transaction(trans, ret);
 		goto out_free_path;
 	}
-	spin_lock(&fs_info->qgroup_lock);
-	fs_info->quota_root = quota_root;
-	set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
-	spin_unlock(&fs_info->qgroup_lock);
 
 	ret = btrfs_commit_transaction(trans);
 	trans = NULL;
 	if (ret)
 		goto out_free_path;
 
+	/*
+	 * Set quota enabled flag after committing the transaction, to avoid
+	 * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
+	 * creation.
+	 */
+	spin_lock(&fs_info->qgroup_lock);
+	fs_info->quota_root = quota_root;
+	set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+	spin_unlock(&fs_info->qgroup_lock);
+
 	ret = qgroup_rescan_init(fs_info, 0, 1);
 	if (!ret) {
 	        qgroup_rescan_zero_tracking(fs_info);