Message ID: 20200203204951.517751-1-josef@toxicpanda.com
Series: Convert data reservations to the ticketing infrastructure
On 3.02.20 at 22:49, Josef Bacik wrote:
> v2->v3:
> - added a comment patch for the flushing states for data.
> - I forgot to add cancel_work_sync() for the new async data flusher thread.
> - Cleaned up a few of Nikolay's nits.
>
> v1->v2:
> - dropped the RFC
> - realized that I mis-translated the transaction commit logic from the old way
>   to the new way, so I've reworked a bunch of patches to clearly pull that
>   behavior into the generic flushing code. I've then cleaned it up later to
>   make it easy to bisect down to behavior changes.
> - Cleaned up the priority flushing; there's no need for an explicit state array.
> - Removed the CHUNK_FORCE_ALLOC from the normal flushing as well; simply keep
>   the logic of allocating chunks until we've made our reservation or we are
>   full, then fall back on the normal flushing mechanism.
>
> -------------- Original email --------------
>
> Nikolay reported a problem where we'd occasionally fail generic/371. This test
> has two tasks running in a loop, one that fallocates and rm's, and one that
> pwrites and rm's. There is enough space for both of these files to exist at
> one time, but sometimes the pwrite would fail.
>
> It would fail because we do not serialize data reservations. If one task is
> stuck doing the reclaim work, and another task comes in and steals its
> reservation enough times, we'll give up and return ENOSPC. We validated this by
> adding a printk to the data reservation path to tell us that it was succeeding
> at making a reservation while another task was flushing.
>
> To solve this problem I've converted data reservations over to the ticketing
> system that metadata uses. There are some cleanups and some fixes that had to
> come before we could do that.
> The following are simply cleanups:
>
> [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
> [PATCH 02/20] btrfs: remove orig from shrink_delalloc
> [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc
>
> The following are fixes that are needed to handle data space infos properly:
>
> [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
> [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags
> [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing
> [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning
> [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving
> [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use
>
> I then converted the data reservation path over to the ticketing infrastructure,
> but I did it in a way that mirrored exactly what we currently have. The idea is
> that I want to be able to bisect regressions that happen from behavior change,
> and doing that would be hard if I just had a single patch doing the whole
> conversion at once. So the following patches only move code around logically,
> but preserve the same behavior as before:
>
> [PATCH 10/20] btrfs: add flushing states for handling data
> [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it
> [PATCH 12/20] btrfs: use ticketing for data space reservations
>
> And then the following patches change the behavior of how we flush space
> for data reservations:
>
> [PATCH 13/20] btrfs: run delayed iputs before committing the
> [PATCH 14/20] btrfs: flush delayed refs when trying to reserve data
> [PATCH 15/20] btrfs: serialize data reservations if we are flushing
> [PATCH 16/20] btrfs: rework chunk allocate for data reservations
> [PATCH 17/20] btrfs: drop the commit_cycles stuff for data
>
> And then a cleanup now that the data reservation code is the same as the
> metadata reservation code:
> [PATCH 18/20] btrfs: use the same helper for data and metadata
>
> Finally, a patch to change the flushing from direct to asynchronous, mirroring
> what we do for metadata, for better latency:
>
> [PATCH 19/20] btrfs: do async reclaim for data reservations
>
> And then a final cleanup now that we're where we want to be with the data
> reservation path:
>
> [PATCH 20/20] btrfs: kill the priority_reclaim_space helper
>
> I've marked this as an RFC because I've only tested this with generic/371. I'm
> starting my overnight runs of xfstests now and will likely find regressions, but
> I'd like to get review started so this can get merged ASAP. Thanks,
>
> Josef

For the whole series:

Tested-by: Nikolay Borisov <nborisov@suse.com>

There are no regressions in xfstests; this fixes generic/371, and the tests that exhibited performance problems in v1 are now fixed (as of v2).