diff mbox series

btrfs: qgroup: don't try to wait flushing if we're already holding a transaction

Message ID 20201204012448.26546-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: qgroup: don't try to wait flushing if we're already holding a transaction | expand

Commit Message

Qu Wenruo Dec. 4, 2020, 1:24 a.m. UTC
There is a chance of racing for qgroup flushing which may lead to
deadlock:

	Thread A		|	Thread B
   (no trans handler hold)	|  (already hold a trans handler)
--------------------------------+--------------------------------
__btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
|- try_flush_qgroup()		| |- try_flushing_qgroup()
   |- QGROUP_FLUSHING bit set   |    |
   |				|    |- test_and_set_bit()
   |				|    |- wait_event()
   |- btrfs_join_transaction()	|
   |- btrfs_commit_transaction()|

			!!! DEAD LOCK !!!

Since thread A want to commit transaction, but thread B is hold a
transaction handler, blocking the commit.
At the same time, thread B is waiting thread A to finish it commit.

This is just a hot fix, and would lead to more EDQUOT when we're near
the qgroup limit.

The root fix would to make all metadata/data reservation to happen
without a transaction handler hold.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

Comments

Filipe Manana Dec. 4, 2020, 11:48 a.m. UTC | #1
On Fri, Dec 4, 2020 at 1:29 AM Qu Wenruo <wqu@suse.com> wrote:
>
> There is a chance of racing for qgroup flushing which may lead to
> deadlock:
>
>         Thread A                |       Thread B
>    (no trans handler hold)      |  (already hold a trans handler)

handler -> handle

"not holding a trans handle" and "holding a trans handle" respectively.

> --------------------------------+--------------------------------
> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
> |- try_flush_qgroup()           | |- try_flushing_qgroup()

try_flushing_qgroup() -> try_flush_qgroup()

>    |- QGROUP_FLUSHING bit set   |    |
>    |                            |    |- test_and_set_bit()
>    |                            |    |- wait_event()
>    |- btrfs_join_transaction()  |
>    |- btrfs_commit_transaction()|
>
>                         !!! DEAD LOCK !!!
>
> Since thread A want to commit transaction, but thread B is hold a
> transaction handler, blocking the commit.

"thread B is hold a transaction handler" -> "thread B is holding a
transaction handle"

> At the same time, thread B is waiting thread A to finish it commit.

waiting for, it -> its

>
> This is just a hot fix, and would lead to more EDQUOT when we're near
> the qgroup limit.
>
> The root fix would to make all metadata/data reservation to happen

would -> would be

> without a transaction handler hold.

without a transaction handler hold -> without holding a transaction handle

>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Other than the grammar, it looks fine, thanks.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/qgroup.c | 31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index fe3046007f52..7785dfa348d2 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -3530,16 +3530,6 @@ static int try_flush_qgroup(struct btrfs_root *root)
>         int ret;
>         bool can_commit = true;
>
> -       /*
> -        * We don't want to run flush again and again, so if there is a running
> -        * one, we won't try to start a new flush, but exit directly.
> -        */
> -       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
> -               wait_event(root->qgroup_flush_wait,
> -                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
> -               return 0;
> -       }
> -
>         /*
>          * If current process holds a transaction, we shouldn't flush, as we
>          * assume all space reservation happens before a transaction handle is
> @@ -3554,6 +3544,27 @@ static int try_flush_qgroup(struct btrfs_root *root)
>             current->journal_info != BTRFS_SEND_TRANS_STUB)
>                 can_commit = false;
>
> +       /*
> +        * We don't want to run flush again and again, so if there is a running
> +        * one, we won't try to start a new flush, but exit directly.
> +        */
> +       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
> +               /*
> +                * We are already holding a trans, thus we can block other
> +                * threads from flushing.
> +                * So exit right now. This increases the chance of EDQUOT for
> +                * heavy load and near limit cases.
> +                * But we can argue that if we're already near limit, EDQUOT
> +                * is unavoidable anyway.
> +                */
> +               if (!can_commit)
> +                       return 0;
> +
> +               wait_event(root->qgroup_flush_wait,
> +                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
> +               return 0;
> +       }
> +
>         ret = btrfs_start_delalloc_snapshot(root);
>         if (ret < 0)
>                 goto out;
> --
> 2.29.2
>
David Sterba Dec. 4, 2020, 5:28 p.m. UTC | #2
On Fri, Dec 04, 2020 at 09:24:47AM +0800, Qu Wenruo wrote:
> There is a chance of racing for qgroup flushing which may lead to
> deadlock:
> 
> 	Thread A		|	Thread B
>    (no trans handler hold)	|  (already hold a trans handler)
> --------------------------------+--------------------------------
> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
> |- try_flush_qgroup()		| |- try_flushing_qgroup()
>    |- QGROUP_FLUSHING bit set   |    |
>    |				|    |- test_and_set_bit()
>    |				|    |- wait_event()
>    |- btrfs_join_transaction()	|
>    |- btrfs_commit_transaction()|
> 
> 			!!! DEAD LOCK !!!
> 
> Since thread A want to commit transaction, but thread B is hold a
> transaction handler, blocking the commit.
> At the same time, thread B is waiting thread A to finish it commit.
> 
> This is just a hot fix, and would lead to more EDQUOT when we're near
> the qgroup limit.
> 
> The root fix would to make all metadata/data reservation to happen
> without a transaction handler hold.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Added to misc-next, thanks.
Qu Wenruo Dec. 5, 2020, 2:55 a.m. UTC | #3
On 2020/12/5 上午1:28, David Sterba wrote:
> On Fri, Dec 04, 2020 at 09:24:47AM +0800, Qu Wenruo wrote:
>> There is a chance of racing for qgroup flushing which may lead to
>> deadlock:
>>
>> 	Thread A		|	Thread B
>>    (no trans handler hold)	|  (already hold a trans handler)
>> --------------------------------+--------------------------------
>> __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
>> |- try_flush_qgroup()		| |- try_flushing_qgroup()
>>    |- QGROUP_FLUSHING bit set   |    |
>>    |				|    |- test_and_set_bit()
>>    |				|    |- wait_event()
>>    |- btrfs_join_transaction()	|
>>    |- btrfs_commit_transaction()|
>>
>> 			!!! DEAD LOCK !!!
>>
>> Since thread A want to commit transaction, but thread B is hold a
>> transaction handler, blocking the commit.
>> At the same time, thread B is waiting thread A to finish it commit.
>>
>> This is just a hot fix, and would lead to more EDQUOT when we're near
>> the qgroup limit.
>>
>> The root fix would to make all metadata/data reservation to happen
>> without a transaction handler hold.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> 
> Added to misc-next, thanks.
> 
Sorry, forgot the fixes tag:

Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we
get -EDQUOT")

Mind to add that in the branch?

Thanks,
Qu
diff mbox series

Patch

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index fe3046007f52..7785dfa348d2 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3530,16 +3530,6 @@  static int try_flush_qgroup(struct btrfs_root *root)
 	int ret;
 	bool can_commit = true;
 
-	/*
-	 * We don't want to run flush again and again, so if there is a running
-	 * one, we won't try to start a new flush, but exit directly.
-	 */
-	if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
-		wait_event(root->qgroup_flush_wait,
-			!test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
-		return 0;
-	}
-
 	/*
 	 * If current process holds a transaction, we shouldn't flush, as we
 	 * assume all space reservation happens before a transaction handle is
@@ -3554,6 +3544,27 @@  static int try_flush_qgroup(struct btrfs_root *root)
 	    current->journal_info != BTRFS_SEND_TRANS_STUB)
 		can_commit = false;
 
+	/*
+	 * We don't want to run flush again and again, so if there is a running
+	 * one, we won't try to start a new flush, but exit directly.
+	 */
+	if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
+		/*
+		 * We are already holding a trans, thus we can block other
+		 * threads from flushing.
+		 * So exit right now. This increases the chance of EDQUOT for
+		 * heavy load and near limit cases.
+		 * But we can argue that if we're already near limit, EDQUOT
+		 * is unavoidable anyway.
+		 */
+		if (!can_commit)
+			return 0;
+
+		wait_event(root->qgroup_flush_wait,
+			!test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
+		return 0;
+	}
+
 	ret = btrfs_start_delalloc_snapshot(root);
 	if (ret < 0)
 		goto out;