[21/35] btrfs: only run delayed refs if we're committing

Message ID	20180830174225.2200-22-josef@toxicpanda.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-btrfs-owner@kernel.org> From: Josef Bacik <josef@toxicpanda.com> To: linux-btrfs@vger.kernel.org Subject: [PATCH 21/35] btrfs: only run delayed refs if we're committing Date: Thu, 30 Aug 2018 13:42:11 -0400 Message-Id: <20180830174225.2200-22-josef@toxicpanda.com> In-Reply-To: <20180830174225.2200-1-josef@toxicpanda.com> References: <20180830174225.2200-1-josef@toxicpanda.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk
Series	My current patch queue \| expand [00/35] My current patch queue [01/35] btrfs: add btrfs_delete_ref_head helper [02/35] btrfs: add cleanup_ref_head_accounting helper [03/35] btrfs: use cleanup_extent_op in check_ref_cleanup [04/35] btrfs: only track ref_heads in delayed_ref_updates [05/35] btrfs: introduce delayed_refs_rsv [06/35] btrfs: check if free bgs for commit [07/35] btrfs: dump block_rsv whe dumping space info [08/35] btrfs: release metadata before running delayed refs [09/35] btrfs: protect space cache inode alloc with nofs [10/35] btrfs: fix truncate throttling [11/35] btrfs: don't use global rsv for chunk allocation [12/35] btrfs: add ALLOC_CHUNK_FORCE to the flushing code [13/35] btrfs: reset max_extent_size properly [14/35] btrfs: don't enospc all tickets on flush failure [15/35] btrfs: run delayed iputs before committing [16/35] btrfs: loop in inode_rsv_refill [17/35] btrfs: move the dio_sem higher up the callchain [18/35] btrfs: set max_extent_size properly [19/35] btrfs: don't use ctl->free_space for max_extent_size [20/35] btrfs: reset max_extent_size on clear in a bitmap [21/35] btrfs: only run delayed refs if we're committing [22/35] btrfs: make sure we create all new bgs [23/35] btrfs: assert on non-empty delayed iputs [24/35] btrfs: pass delayed_refs_root to btrfs_delayed_ref_lock [25/35] btrfs: make btrfs_destroy_delayed_refs use btrfs_delayed_ref_lock [26/35] btrfs: make btrfs_destroy_delayed_refs use btrfs_delete_ref_head [27/35] btrfs: handle delayed ref head accounting cleanup in abort [28/35] btrfs: call btrfs_create_pending_block_groups unconditionally [29/35] btrfs: just delete pending bgs if we are aborted [30/35] btrfs: cleanup pending bgs on transaction abort [31/35] btrfs: clear delayed_refs_rsv for dirty bg cleanup [32/35] btrfs: only free reserved extent if we didn't insert it [33/35] btrfs: fix insert_reserved error handling [34/35] btrfs: wait on ordered extents on abort cleanup [35/35] MAINTAINERS: update my email address for btrfs

Message ID

20180830174225.2200-22-josef@toxicpanda.com (mailing list archive)

State

New, archived

Headers

From: Josef Bacik <josef@toxicpanda.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 21/35] btrfs: only run delayed refs if we're committing
Date: Thu, 30 Aug 2018 13:42:11 -0400
Message-Id: <20180830174225.2200-22-josef@toxicpanda.com>
In-Reply-To: <20180830174225.2200-1-josef@toxicpanda.com>
References: <20180830174225.2200-1-josef@toxicpanda.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk

Series

My current patch queue | expand

Commit Message

Josef Bacik Aug. 30, 2018, 5:42 p.m. UTC

I noticed in a giant dbench run that we spent a lot of time on lock
contention while running transaction commit.  This is because dbench
results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
they all run the delayed refs first thing, so they all contend with
each other.  This leads to seconds of 0 throughput.  Change this to only
run the delayed refs if we're the ones committing the transaction.  This
makes the latency go away and we get no more lock contention.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

Comments

Omar Sandoval Sept. 1, 2018, 12:28 a.m. UTC | #1

On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> I noticed in a giant dbench run that we spent a lot of time on lock
> contention while running transaction commit.  This is because dbench
> results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> they all run the delayed refs first thing, so they all contend with
> each other.  This leads to seconds of 0 throughput.  Change this to only
> run the delayed refs if we're the ones committing the transaction.  This
> makes the latency go away and we get no more lock contention.

This means that we're going to spend more time running delayed refs
while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
transactions more than before?

> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 24 +++++++++---------------
>  1 file changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index ebb0c0405598..2bb19e2ded5e 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1918,15 +1918,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  	btrfs_trans_release_metadata(trans);
>  	trans->block_rsv = NULL;
>  
> -	/* make a pass through all the delayed refs we have so far
> -	 * any runnings procs may add more while we are here
> -	 */
> -	ret = btrfs_run_delayed_refs(trans, 0);
> -	if (ret) {
> -		btrfs_end_transaction(trans);
> -		return ret;
> -	}
> -
>  	cur_trans = trans->transaction;
>  
>  	/*
> @@ -1939,12 +1930,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  	if (!list_empty(&trans->new_bgs))
>  		btrfs_create_pending_block_groups(trans);
>  
> -	ret = btrfs_run_delayed_refs(trans, 0);
> -	if (ret) {
> -		btrfs_end_transaction(trans);
> -		return ret;
> -	}
> -
>  	if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) {
>  		int run_it = 0;
>  
> @@ -2015,6 +2000,15 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  		spin_unlock(&fs_info->trans_lock);
>  	}
>  
> +	/*
> +	 * We are now the only one in the commit area, we can run delayed refs
> +	 * without hitting a bunch of lock contention from a lot of people
> +	 * trying to commit the transaction at once.
> +	 */
> +	ret = btrfs_run_delayed_refs(trans, 0);
> +	if (ret)
> +		goto cleanup_transaction;
> +
>  	extwriter_counter_dec(cur_trans, trans->type);
>  
>  	ret = btrfs_start_delalloc_flush(fs_info);
> -- 
> 2.14.3
>

Josef Bacik Sept. 4, 2018, 5:54 p.m. UTC | #2

On Fri, Aug 31, 2018 at 05:28:09PM -0700, Omar Sandoval wrote:
> On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> > I noticed in a giant dbench run that we spent a lot of time on lock
> > contention while running transaction commit.  This is because dbench
> > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > they all run the delayed refs first thing, so they all contend with
> > each other.  This leads to seconds of 0 throughput.  Change this to only
> > run the delayed refs if we're the ones committing the transaction.  This
> > makes the latency go away and we get no more lock contention.
> 
> This means that we're going to spend more time running delayed refs
> while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
> transactions more than before?
> 

You'd think that, but the lock contention is enough that it makes it
unfuckingpossible for anything to run for several seconds while everybody
competes for either the delayed refs lock or the extent root lock.

With the delayed refs rsv we actually end up running the delayed refs often
enough because of the extra ENOSPC pressure that we don't really end up with
long chunks of time running delayed refs while blocking out START transactions.

If at some point down the line this turns out to be an actual issue we can
revisit the best way to do this.  Off the top of my head we do something like
wrap it in a "run all the delayed refs" mutex so that all the committers just
wait on whoever wins, and we move it back outside of the start logic in order to
make it better all the way around.  But I don't think that's something we need
to do at this point.  Thanks,

Josef

Omar Sandoval Sept. 4, 2018, 6:04 p.m. UTC | #3

On Tue, Sep 04, 2018 at 01:54:13PM -0400, Josef Bacik wrote:
> On Fri, Aug 31, 2018 at 05:28:09PM -0700, Omar Sandoval wrote:
> > On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> > > I noticed in a giant dbench run that we spent a lot of time on lock
> > > contention while running transaction commit.  This is because dbench
> > > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > > they all run the delayed refs first thing, so they all contend with
> > > each other.  This leads to seconds of 0 throughput.  Change this to only
> > > run the delayed refs if we're the ones committing the transaction.  This
> > > makes the latency go away and we get no more lock contention.
> > 
> > This means that we're going to spend more time running delayed refs
> > while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
> > transactions more than before?
> > 
> 
> You'd think that, but the lock contention is enough that it makes it
> unfuckingpossible for anything to run for several seconds while everybody
> competes for either the delayed refs lock or the extent root lock.
> 
> With the delayed refs rsv we actually end up running the delayed refs often
> enough because of the extra ENOSPC pressure that we don't really end up with
> long chunks of time running delayed refs while blocking out START transactions.
> 
> If at some point down the line this turns out to be an actual issue we can
> revisit the best way to do this.  Off the top of my head we do something like
> wrap it in a "run all the delayed refs" mutex so that all the committers just
> wait on whoever wins, and we move it back outside of the start logic in order to
> make it better all the way around.  But I don't think that's something we need
> to do at this point.  Thanks,

Ok, that's good enough for me.

Reviewed-by: Omar Sandoval <osandov@fb.com>

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index ebb0c0405598..2bb19e2ded5e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1918,15 +1918,6 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	btrfs_trans_release_metadata(trans);
 	trans->block_rsv = NULL;
 
-	/* make a pass through all the delayed refs we have so far
-	 * any runnings procs may add more while we are here
-	 */
-	ret = btrfs_run_delayed_refs(trans, 0);
-	if (ret) {
-		btrfs_end_transaction(trans);
-		return ret;
-	}
-
 	cur_trans = trans->transaction;
 
 	/*
@@ -1939,12 +1930,6 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	if (!list_empty(&trans->new_bgs))
 		btrfs_create_pending_block_groups(trans);
 
-	ret = btrfs_run_delayed_refs(trans, 0);
-	if (ret) {
-		btrfs_end_transaction(trans);
-		return ret;
-	}
-
 	if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) {
 		int run_it = 0;
 
@@ -2015,6 +2000,15 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 		spin_unlock(&fs_info->trans_lock);
 	}
 
+	/*
+	 * We are now the only one in the commit area, we can run delayed refs
+	 * without hitting a bunch of lock contention from a lot of people
+	 * trying to commit the transaction at once.
+	 */
+	ret = btrfs_run_delayed_refs(trans, 0);
+	if (ret)
+		goto cleanup_transaction;
+
 	extwriter_counter_dec(cur_trans, trans->type);
 
 	ret = btrfs_start_delalloc_flush(fs_info);

[21/35] btrfs: only run delayed refs if we're committing

Commit Message

Comments

Patch