diff mbox series

[11/23] btrfs: check tickets after waiting on ordered extents

Message ID 20200131223613.490779-12-josef@toxicpanda.com (mailing list archive)
State New, archived
Headers show
Series Convert data reservations to the ticketing infrastructure | expand

Commit Message

Josef Bacik Jan. 31, 2020, 10:36 p.m. UTC
Right now if the space is free'd up after the ordered extents complete
(which is likely since the reservations are held until they complete),
we would do extra delalloc flushing before we'd notice that we didn't
have any more tickets.  Fix this by moving the tickets check after our
wait_ordered_extents check.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Comments

Nikolay Borisov Feb. 3, 2020, 1:10 p.m. UTC | #1
On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Right now if the space is free'd up after the ordered extents complete
> (which is likely since the reservations are held until they complete),
> we would do extra delalloc flushing before we'd notice that we didn't
> have any more tickets.  Fix this by moving the tickets check after our
> wait_ordered_extents check.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

This patch makes sense only for metadata. Is this your intention -
tweaking the metadata change behavior? Correct me if I'm wrong but

btrfs_start_delalloc_roots from previous patch will essentially call
btrfs_run_delalloc_range for the roots which will create ordered extents in:
btrfs_run_delalloc_range
  cow_file_range
   add_ordered_extents

Following page writeout, from the endio routines we'll eventually do:

finish_ordered_fn
 btrfs_finish_ordered_io
  insert_reserved_file_extent
   btrfs_alloc_reserved_file_extent
    create delayed ref  <---- after the delayed extent is run this will
free some data space. But this happens in transaction commit context and
not when runnig ordered extents
  btrfs_remove_ordered_extent
   btrfs_delalloc_release_metadata <- this is only for metadata
    btrfs_inode_rsv_release
     __btrfs_block_rsv_release <-- frees metadata but not data?


I'm looking those patches thinking every change should be pertinent to
data space but apparently it's not. If so I think it will be best if you
update the cover letter for V2 to mention which patches can go in
independently or give more context why this particular patch is
pertinent to data flush.
Josef Bacik Feb. 3, 2020, 3:57 p.m. UTC | #2
On 2/3/20 8:10 AM, Nikolay Borisov wrote:
> 
> 
> On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
>> Right now if the space is free'd up after the ordered extents complete
>> (which is likely since the reservations are held until they complete),
>> we would do extra delalloc flushing before we'd notice that we didn't
>> have any more tickets.  Fix this by moving the tickets check after our
>> wait_ordered_extents check.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> This patch makes sense only for metadata. Is this your intention -
> tweaking the metadata change behavior? Correct me if I'm wrong but
> 
> btrfs_start_delalloc_roots from previous patch will essentially call
> btrfs_run_delalloc_range for the roots which will create ordered extents in:
> btrfs_run_delalloc_range
>    cow_file_range
>     add_ordered_extents
> 
> Following page writeout, from the endio routines we'll eventually do:
> 
> finish_ordered_fn
>   btrfs_finish_ordered_io
>    insert_reserved_file_extent
>     btrfs_alloc_reserved_file_extent
>      create delayed ref  <---- after the delayed extent is run this will
> free some data space. But this happens in transaction commit context and
> not when runnig ordered extents
>    btrfs_remove_ordered_extent
>     btrfs_delalloc_release_metadata <- this is only for metadata
>      btrfs_inode_rsv_release
>       __btrfs_block_rsv_release <-- frees metadata but not data?
> 
> 
> I'm looking those patches thinking every change should be pertinent to
> data space but apparently it's not. If so I think it will be best if you
> update the cover letter for V2 to mention which patches can go in
> independently or give more context why this particular patch is
> pertinent to data flush.
> 


Specifically what I'm addressing here for data is this behavior

btrfs_start_delalloc_roots()
   ->check space_info->tickets, it's not empty
btrfs_finish_ordered_io()
   -> we drop a previously uncompressed extent (ie larger one) for a newly
      compressed extent, now space_info->tickets _is_ empty
loop again despite having no tickets to flush for.

Does this scenario happen all the time?  Nope, but I noticed it was happening 
sometimes with the data intensive enopsc tests xfstests so I threw it in there 
because it made a tiny difference, plus it's actually correct.

It definitely affects the metadata case more for sure, but I only noticed it 
when I forgot I had compression enabled for one of my xfstests runs.  Thanks,

Joosef
diff mbox series

Patch

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a69e3d9057ff..955f59f4b1d0 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -378,14 +378,6 @@  static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 	while ((delalloc_bytes || dio_bytes) && loops < 3) {
 		btrfs_start_delalloc_roots(fs_info, items);
 
-		spin_lock(&space_info->lock);
-		if (list_empty(&space_info->tickets) &&
-		    list_empty(&space_info->priority_tickets)) {
-			spin_unlock(&space_info->lock);
-			break;
-		}
-		spin_unlock(&space_info->lock);
-
 		loops++;
 		if (wait_ordered && !trans) {
 			btrfs_wait_ordered_roots(fs_info, items, 0, (u64)-1);
@@ -394,6 +386,15 @@  static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 			if (time_left)
 				break;
 		}
+
+		spin_lock(&space_info->lock);
+		if (list_empty(&space_info->tickets) &&
+		    list_empty(&space_info->priority_tickets)) {
+			spin_unlock(&space_info->lock);
+			break;
+		}
+		spin_unlock(&space_info->lock);
+
 		delalloc_bytes = percpu_counter_sum_positive(
 						&fs_info->delalloc_bytes);
 		dio_bytes = percpu_counter_sum_positive(&fs_info->dio_bytes);