Message ID | 20200309202322.12327-4-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Series | Deal with a few ENOSPC corner cases |

On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction. He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info. While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
>
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction. Instead we need
> to only consider current normal tickets. With this fix in place, we
> will properly commit the transaction.

My testing shows this fix is correct, but I'd like a bit more
information on how priority tickets confused may_commit_transaction.
Perhaps put the example you gave on Slack in the changelog?

>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);

nit: That could be written simply as:

	ticket = list_first_entry_or_null(&space_info->tickets,
					  struct reserve_ticket, list);

>  	bytes_needed = (ticket) ? ticket->bytes : 0;
>

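To illustrate the nit above, here is a minimal userspace sketch comparing the open-coded form from the patch with the suggested list_first_entry_or_null() form. The list helpers below are simplified stand-ins written for this example (the kernel's real ones live in <linux/list.h>), struct reserve_ticket is reduced to two fields, and the function names bytes_needed_open_coded, bytes_needed_or_null and forms_agree are hypothetical. It is a sketch of why the two forms should be equivalent, not the kernel code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

/* Toy ticket with only the fields this example needs. */
struct reserve_ticket {
	uint64_t bytes;
	struct list_head list;
};

/* Open-coded form, as in the patch. */
uint64_t bytes_needed_open_coded(struct list_head *tickets)
{
	struct reserve_ticket *ticket = NULL;

	if (!list_empty(tickets))
		ticket = list_first_entry(tickets, struct reserve_ticket, list);
	return ticket ? ticket->bytes : 0;
}

/* Form suggested in the review. */
uint64_t bytes_needed_or_null(struct list_head *tickets)
{
	struct reserve_ticket *ticket =
		list_first_entry_or_null(tickets, struct reserve_ticket, list);

	return ticket ? ticket->bytes : 0;
}

/* Returns 1 if both forms agree on an empty list and a one-ticket list. */
int forms_agree(void)
{
	struct list_head tickets;
	struct reserve_ticket t = { .bytes = 4096 };

	list_init(&tickets);
	assert(bytes_needed_open_coded(&tickets) == 0);
	assert(bytes_needed_or_null(&tickets) == 0);

	list_add_tail(&t.list, &tickets);
	return bytes_needed_open_coded(&tickets) == 4096 &&
	       bytes_needed_or_null(&tickets) == 4096;
}
```

Both forms leave ticket as NULL for an empty list, so the later `(ticket) ? ticket->bytes : 0` check behaves the same either way; the helper only removes the explicit list_empty() test.
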
On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction. He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info. While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
>
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction. Instead we need
> to only consider current normal tickets. With this fix in place, we
> will properly commit the transaction.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);
>  	bytes_needed = (ticket) ? ticket->bytes : 0;
>

Thinking about this a bit more: if we have a ticket here that we have
enough free space to satisfy, we just return. Where is this ticket
supposed to be granted? Can we get into the same situation we had with
the priority tickets? I guess what I'm asking is "why don't we call try
granting ticket" in this case?

On 9.03.20 at 22:23, Josef Bacik wrote:
> In debugging a generic/320 failure on ppc64, Nikolay noticed that
> sometimes we'd ENOSPC out with plenty of space to reclaim if we had
> committed the transaction. He further discovered that this was because
> there was a priority ticket that was small enough to fit in the free
> space currently in the space_info. While that is a problem by itself,
> it exposed another flaw, that we consider priority tickets in
> may_commit_transaction.
>
> Priority tickets are not allowed to commit the transaction, thus we
> shouldn't even consider them in may_commit_transaction. Instead we need
> to only consider current normal tickets. With this fix in place, we
> will properly commit the transaction.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 8d00a9ee9458..d198cfd45cf7 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  	else
>  		cur_free_bytes = 0;
>
> -	if (!list_empty(&space_info->priority_tickets))
> -		ticket = list_first_entry(&space_info->priority_tickets,
> -					  struct reserve_ticket, list);
> -	else if (!list_empty(&space_info->tickets))
> +	if (!list_empty(&space_info->tickets))
>  		ticket = list_first_entry(&space_info->tickets,
>  					  struct reserve_ticket, list);

I took another look at handle_reserve_ticket. In the case of
BTRFS_RESERVE_FLUSH_EVICT, which is also handled by the priority list,
we simply ignore the prio ticket. Is this correct at all?
evict_flush_states does in fact contain the COMMIT_TRANS state?

>  	bytes_needed = (ticket) ? ticket->bytes : 0;
>

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 8d00a9ee9458..d198cfd45cf7 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -592,10 +592,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	else
 		cur_free_bytes = 0;

-	if (!list_empty(&space_info->priority_tickets))
-		ticket = list_first_entry(&space_info->priority_tickets,
-					  struct reserve_ticket, list);
-	else if (!list_empty(&space_info->tickets))
+	if (!list_empty(&space_info->tickets))
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
 	bytes_needed = (ticket) ? ticket->bytes : 0;

In debugging a generic/320 failure on ppc64, Nikolay noticed that
sometimes we'd ENOSPC out with plenty of space to reclaim if we had
committed the transaction. He further discovered that this was because
there was a priority ticket that was small enough to fit in the free
space currently in the space_info. While that is a problem by itself,
it exposed another flaw, that we consider priority tickets in
may_commit_transaction.

Priority tickets are not allowed to commit the transaction, thus we
shouldn't even consider them in may_commit_transaction. Instead we need
to only consider current normal tickets. With this fix in place, we
will properly commit the transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)
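
The failure mode described in the changelog can be sketched with a toy model. This is not the kernel function: struct toy_space_info, toy_reaches_commit and scenario_reaches_commit are hypothetical names, and the byte counts are invented for illustration. The model assumes only what the thread states, namely that may_commit_transaction returns without committing when the ticket it picked already fits in the current free space. Any further checks the real function performs before committing are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the space_info state. */
struct toy_space_info {
	uint64_t free_bytes;       /* free space currently in the space_info */
	const uint64_t *priority;  /* sizes of queued priority tickets */
	size_t nr_priority;
	const uint64_t *normal;    /* sizes of queued normal tickets */
	size_t nr_normal;
};

/*
 * Returns true if the model gets past the "ticket already fits" early
 * return, i.e. it would go on to consider committing the transaction.
 * consider_priority == true models the code before the patch,
 * consider_priority == false models the code after it.
 */
bool toy_reaches_commit(const struct toy_space_info *si, bool consider_priority)
{
	uint64_t bytes_needed = 0;

	if (consider_priority && si->nr_priority > 0)
		bytes_needed = si->priority[0];
	else if (si->nr_normal > 0)
		bytes_needed = si->normal[0];

	/* A ticket that fits in free space means "nothing to do here". */
	return bytes_needed > si->free_bytes;
}

/* Invented scenario: small priority ticket, large normal ticket. */
bool scenario_reaches_commit(bool consider_priority)
{
	static const uint64_t priority[] = { 4096 };        /* 4 KiB */
	static const uint64_t normal[] = { 1024 * 1024 };   /* 1 MiB */
	struct toy_space_info si = {
		.free_bytes = 64 * 1024,                    /* 64 KiB */
		.priority = priority,
		.nr_priority = 1,
		.normal = normal,
		.nr_normal = 1,
	};

	return toy_reaches_commit(&si, consider_priority);
}
```

In this scenario the pre-patch selection looks at the 4 KiB priority ticket, sees that it fits in 64 KiB of free space, and returns early, so the 1 MiB normal ticket never triggers a commit. With priority tickets ignored, the normal ticket is the one examined and the model proceeds toward the commit.
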