Message ID | 5c0655d43b2809932ec8aa40d99b94295469e3f1.1605266377.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: stop incrementing log batch when joining log transaction | expand |
On 11/13/20 6:23 AM, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > When joining a log transaction we acquire the root's log mutex, then > increment the root's log batch and log writers counters while holding > the mutex. However we don't need to increment the log batch there, > because we are holding the mutex and incremented the log writers counter > as well, so any other task trying to sync log will wait for the current > task to finish its logging and still achieve the desired log batching. > > Since the log batch counter is an atomic counter and is incremented twice > at the very beginning of the fsync callback (btrfs_sync_file()), once > before flushing delalloc and once again after waiting for writeback to > complete, eliminating its increment when joining the log transaction > may provide some performance gains in case we have multiple concurrent > tasks doing fsyncs against different files in the same subvolume, as it > reduces contention on the atomic (locking the cacheline and bouncing it). > > When testing fio with 32 jobs, on a 8 cores vm, doing fsyncs against > different files of the same subvolume, on top of a zram device, I could > consistently see gains (higher throughput) between 1% to 2%, which is a > very low value and possibly hard to be observed with a real device (I > couldn't observe consistent gains with my low/mid end NVMe device). > So this change is mostly motivated to just simplify the logic, as updating > the log batch counter is only relevant when an fsync starts and while not > holding the root's log mutex. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> You had me at "simplify the logic" Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks, Josef
On Fri, Nov 13, 2020 at 11:23:30AM +0000, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > When joining a log transaction we acquire the root's log mutex, then > increment the root's log batch and log writers counters while holding > the mutex. However we don't need to increment the log batch there, > because we are holding the mutex and incremented the log writers counter > as well, so any other task trying to sync log will wait for the current > task to finish its logging and still achieve the desired log batching. > > Since the log batch counter is an atomic counter and is incremented twice > at the very beginning of the fsync callback (btrfs_sync_file()), once > before flushing delalloc and once again after waiting for writeback to > complete, eliminating its increment when joining the log transaction > may provide some performance gains in case we have multiple concurrent > tasks doing fsyncs against different files in the same subvolume, as it > reduces contention on the atomic (locking the cacheline and bouncing it). > > When testing fio with 32 jobs, on a 8 cores vm, doing fsyncs against > different files of the same subvolume, on top of a zram device, I could > consistently see gains (higher throughput) between 1% to 2%, which is a > very low value and possibly hard to be observed with a real device (I > couldn't observe consistent gains with my low/mid end NVMe device). > So this change is mostly motivated to just simplify the logic, as updating > the log batch counter is only relevant when an fsync starts and while not > holding the root's log mutex. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> Added to misc-next, thanks.
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 17a08a7bf6cc..bc3396dfec10 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -172,7 +172,6 @@ static int start_log_trans(struct btrfs_trans_handle *trans, root->log_start_pid = current->pid; } - atomic_inc(&root->log_batch); atomic_inc(&root->log_writers); if (ctx && !ctx->logging_new_name) { int index = root->log_transid % 2;