[3/5] btrfs: only take normal tickets into account in may_commit_transaction

Message ID 20200309202322.12327-4-josef@toxicpanda.com
State New, archived
Series Deal with a few ENOSPC corner cases

Commit Message

Josef Bacik March 9, 2020, 8:23 p.m. UTC
In debugging a generic/320 failure on ppc64, Nikolay noticed that
sometimes we'd ENOSPC out with plenty of space to reclaim if we had
committed the transaction.  He further discovered that this was because
there was a priority ticket that was small enough to fit in the free
space currently in the space_info.  While that is a problem by itself,
it exposed another flaw, that we consider priority tickets in
may_commit_transaction.

Priority tickets are not allowed to commit the transaction, thus we
shouldn't even consider them in may_commit_transaction.  Instead we need
to only consider current normal tickets.  With this fix in place, we
will properly commit the transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)
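
The effect described in the commit message can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions: model_ticket,
model_space_info, and the three helpers are hypothetical stand-ins, not the
kernel definitions, and the early exit is assumed to be "nothing needed
beyond the current free space means no commit", which is how the thread
describes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures, reduced to what matters. */
struct model_ticket {
	uint64_t bytes;
};

struct model_space_info {
	const struct model_ticket *priority_head; /* NULL when the list is empty */
	const struct model_ticket *normal_head;   /* NULL when the list is empty */
};

/* Pre-patch selection: a priority ticket shadows the first normal ticket. */
static uint64_t bytes_needed_old(const struct model_space_info *info)
{
	const struct model_ticket *ticket = NULL;

	assert(info != NULL);
	if (info->priority_head)
		ticket = info->priority_head;
	else if (info->normal_head)
		ticket = info->normal_head;
	return ticket ? ticket->bytes : 0;
}

/* Post-patch selection: only the normal list is consulted. */
static uint64_t bytes_needed_new(const struct model_space_info *info)
{
	assert(info != NULL);
	return info->normal_head ? info->normal_head->bytes : 0;
}

/*
 * Assumed early exit: if the selected ticket already fits in the free
 * space, a commit is judged unnecessary and is not considered further.
 */
static int commit_considered(uint64_t bytes_needed, uint64_t cur_free_bytes)
{
	return bytes_needed > cur_free_bytes;
}
```

With a 4 KiB priority ticket, a 1 MiB normal ticket, and 64 KiB free, the old
selection sees a ticket that fits and skips the commit, while the new
selection sees the normal ticket and goes on to consider one.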

Comments

Nikolay Borisov March 9, 2020, 8:51 p.m. UTC | #1
On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction.  He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info.  While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
> 
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction.  Instead we need
> to only consider current normal tickets.  With this fix in place, we
> will properly commit the transaction.

My testing shows this fix is correct, but I'd like a bit more
information on how priority tickets confused may_commit_transaction.
Perhaps add the example you gave on Slack?

> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>  
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);

nit: That could be written more simply as:

ticket = list_first_entry_or_null(&space_info->tickets,
				  struct reserve_ticket, list);

>  	bytes_needed = (ticket) ? ticket->bytes : 0;
>
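
To show that the suggested one-liner behaves like the open-coded check, here
is a userspace sketch. The list helpers below are minimal imitations written
for this example (the real ones live in the kernel's list.h), the
reserve_ticket struct is cut down to two fields, and the two first_ticket_*
wrappers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal userspace imitation of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* Simplified: evaluates head more than once, unlike the kernel macro. */
#define list_first_entry_or_null(head, type, member) \
	((head)->next != (head) ? list_first_entry(head, type, member) : NULL)

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Cut-down stand-in for the kernel's struct reserve_ticket. */
struct reserve_ticket {
	uint64_t bytes;
	struct list_head list;
};

/* Open-coded form, as in the patch. */
static struct reserve_ticket *first_ticket_open_coded(struct list_head *tickets)
{
	struct reserve_ticket *ticket = NULL;

	assert(tickets != NULL);
	if (!list_empty(tickets))
		ticket = list_first_entry(tickets, struct reserve_ticket, list);
	return ticket;
}

/* Form suggested in the review nit. */
static struct reserve_ticket *first_ticket_or_null(struct list_head *tickets)
{
	assert(tickets != NULL);
	return list_first_entry_or_null(tickets, struct reserve_ticket, list);
}
```

Both wrappers return NULL for an empty list and the same entry otherwise, so
the following bytes_needed line would not need to change.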
Nikolay Borisov March 9, 2020, 11:13 p.m. UTC | #2
On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction.  He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info.  While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
> 
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction.  Instead we need
> to only consider current normal tickets.  With this fix in place, we
> will properly commit the transaction.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>  
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);
>  	bytes_needed = (ticket) ? ticket->bytes : 0;
> 


Thinking about this a bit more: if we have a ticket here that we have
enough free space to satisfy, we just return. Where is this ticket
supposed to be granted? Can we get into the same situation we had with
the priority tickets? I guess what I'm asking is: why don't we call
"try granting ticket" in this case?
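
For readers unfamiliar with what "granting" would mean here, a rough
userspace model follows. It assumes tickets are served strictly in queue
order and that granting stops at the first ticket that does not fit; both
model_ticket and grant_in_order are hypothetical names for this sketch and do
not reproduce the kernel's logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending reservation; bytes drops to 0 once it is granted. */
struct model_ticket {
	uint64_t bytes;
};

/*
 * Grant tickets from the front of the queue while free space lasts.
 * Granting stops at the first ticket that does not fit, so a small ticket
 * further back cannot jump ahead of a large one (assumed FIFO policy).
 * Returns the number of tickets granted and updates *free_bytes.
 */
static size_t grant_in_order(struct model_ticket *queue, size_t count,
			     uint64_t *free_bytes)
{
	size_t granted = 0;

	assert(free_bytes != NULL);
	assert(queue != NULL || count == 0);
	while (granted < count && queue[granted].bytes <= *free_bytes) {
		*free_bytes -= queue[granted].bytes;
		queue[granted].bytes = 0;
		granted++;
	}
	return granted;
}
```

In this model, a ticket that fits in the current free space is satisfied as
soon as someone runs the grant loop; the question above is which code path is
expected to do that when may_commit_transaction returns early.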
Nikolay Borisov March 10, 2020, 10:27 a.m. UTC | #3
On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction.  He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info.  While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
> 
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction.  Instead we need
> to only consider current normal tickets.  With this fix in place, we
> will properly commit the transaction.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>  
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);

I took another look at handle_reserve_ticket. BTRFS_RESERVE_FLUSH_EVICT
is also handled via the priority list, so we now simply ignore its
priority ticket. Is this correct at all? Note that evict_flush_states
does in fact contain the COMMIT_TRANS state.

>  	bytes_needed = (ticket) ? ticket->bytes : 0;
>
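
The concern can be phrased as a small check over flush-state arrays. The
sketch below is illustrative only: the enum is abbreviated, the MODEL_ names
and array contents are hypothetical, and the only facts taken from this
thread are that ordinary priority tickets may not commit the transaction
while evict_flush_states does include COMMIT_TRANS.

```c
#include <assert.h>
#include <stddef.h>

/* Abbreviated, illustrative state set; not the kernel enum. */
enum model_flush_state {
	MODEL_FLUSH_DELAYED_ITEMS,
	MODEL_ALLOC_CHUNK,
	MODEL_COMMIT_TRANS,
};

/* Hypothetical contents: ordinary priority flushing never commits. */
static const enum model_flush_state model_priority_states[] = {
	MODEL_FLUSH_DELAYED_ITEMS,
	MODEL_ALLOC_CHUNK,
};

/* Hypothetical contents: eviction flushing ends with a commit. */
static const enum model_flush_state model_evict_states[] = {
	MODEL_FLUSH_DELAYED_ITEMS,
	MODEL_ALLOC_CHUNK,
	MODEL_COMMIT_TRANS,
};

/* Returns 1 if the given state sequence may commit the transaction. */
static int states_may_commit(const enum model_flush_state *states, size_t count)
{
	assert(states != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (states[i] == MODEL_COMMIT_TRANS)
			return 1;
	}
	return 0;
}
```

If eviction tickets sit on the priority list yet can reach a commit state,
then "priority tickets never commit" does not hold for every ticket on that
list, which is the case being questioned.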

Patch

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 8d00a9ee9458..d198cfd45cf7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -592,10 +592,7 @@  static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	else
 		cur_free_bytes = 0;
 
-	if (!list_empty(&space_info->priority_tickets))
-		ticket = list_first_entry(&space_info->priority_tickets,
-					  struct reserve_ticket, list);
-	else if (!list_empty(&space_info->tickets))
+	if (!list_empty(&space_info->tickets))
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 	bytes_needed = (ticket) ? ticket->bytes : 0;