diff mbox series

[21/35] btrfs: only run delayed refs if we're committing

Message ID 20180830174225.2200-22-josef@toxicpanda.com (mailing list archive)
State New, archived
Headers show
Series My current patch queue | expand

Commit Message

Josef Bacik Aug. 30, 2018, 5:42 p.m. UTC
I noticed in a giant dbench run that we spent a lot of time on lock
contention while running transaction commit.  This is because dbench
results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
they all run the delayed refs first thing, so they all contend with
each other.  This leads to seconds of 0 throughput.  Change this to only
run the delayed refs if we're the ones committing the transaction.  This
makes the latency go away and we get no more lock contention.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

Comments

Omar Sandoval Sept. 1, 2018, 12:28 a.m. UTC | #1
On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> I noticed in a giant dbench run that we spent a lot of time on lock
> contention while running transaction commit.  This is because dbench
> results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> they all run the delayed refs first thing, so they all contend with
> each other.  This leads to seconds of 0 throughput.  Change this to only
> run the delayed refs if we're the ones committing the transaction.  This
> makes the latency go away and we get no more lock contention.

This means that we're going to spend more time running delayed refs
while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
transactions more than before?

> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 24 +++++++++---------------
>  1 file changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index ebb0c0405598..2bb19e2ded5e 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1918,15 +1918,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  	btrfs_trans_release_metadata(trans);
>  	trans->block_rsv = NULL;
>  
> -	/* make a pass through all the delayed refs we have so far
> -	 * any runnings procs may add more while we are here
> -	 */
> -	ret = btrfs_run_delayed_refs(trans, 0);
> -	if (ret) {
> -		btrfs_end_transaction(trans);
> -		return ret;
> -	}
> -
>  	cur_trans = trans->transaction;
>  
>  	/*
> @@ -1939,12 +1930,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  	if (!list_empty(&trans->new_bgs))
>  		btrfs_create_pending_block_groups(trans);
>  
> -	ret = btrfs_run_delayed_refs(trans, 0);
> -	if (ret) {
> -		btrfs_end_transaction(trans);
> -		return ret;
> -	}
> -
>  	if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) {
>  		int run_it = 0;
>  
> @@ -2015,6 +2000,15 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  		spin_unlock(&fs_info->trans_lock);
>  	}
>  
> +	/*
> +	 * We are now the only one in the commit area, we can run delayed refs
> +	 * without hitting a bunch of lock contention from a lot of people
> +	 * trying to commit the transaction at once.
> +	 */
> +	ret = btrfs_run_delayed_refs(trans, 0);
> +	if (ret)
> +		goto cleanup_transaction;
> +
>  	extwriter_counter_dec(cur_trans, trans->type);
>  
>  	ret = btrfs_start_delalloc_flush(fs_info);
> -- 
> 2.14.3
>
Josef Bacik Sept. 4, 2018, 5:54 p.m. UTC | #2
On Fri, Aug 31, 2018 at 05:28:09PM -0700, Omar Sandoval wrote:
> On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> > I noticed in a giant dbench run that we spent a lot of time on lock
> > contention while running transaction commit.  This is because dbench
> > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > they all run the delayed refs first thing, so they all contend with
> > each other.  This leads to seconds of 0 throughput.  Change this to only
> > run the delayed refs if we're the ones committing the transaction.  This
> > makes the latency go away and we get no more lock contention.
> 
> This means that we're going to spend more time running delayed refs
> while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
> transactions more than before?
> 

You'd think that, but the lock contention is enough that it makes it
unfuckingpossible for anything to run for several seconds while everybody
competes for either the delayed refs lock or the extent root lock.

With the delayed refs rsv we actually end up running the delayed refs often
enough because of the extra ENOSPC pressure that we don't really end up with
long chunks of time running delayed refs while blocking out START transactions.

If at some point down the line this turns out to be an actual issue we can
revisit the best way to do this.  Off the top of my head we do something like
wrap it in a "run all the delayed refs" mutex so that all the committers just
wait on whoever wins, and we move it back outside of the start logic in order to
make it better all the way around.  But I don't think that's something we need
to do at this point.  Thanks,

Josef
Omar Sandoval Sept. 4, 2018, 6:04 p.m. UTC | #3
On Tue, Sep 04, 2018 at 01:54:13PM -0400, Josef Bacik wrote:
> On Fri, Aug 31, 2018 at 05:28:09PM -0700, Omar Sandoval wrote:
> > On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> > > I noticed in a giant dbench run that we spent a lot of time on lock
> > > contention while running transaction commit.  This is because dbench
> > > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > > they all run the delayed refs first thing, so they all contend with
> > > each other.  This leads to seconds of 0 throughput.  Change this to only
> > > run the delayed refs if we're the ones committing the transaction.  This
> > > makes the latency go away and we get no more lock contention.
> > 
> > This means that we're going to spend more time running delayed refs
> > while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
> > transactions more than before?
> > 
> 
> You'd think that, but the lock contention is enough that it makes it
> unfuckingpossible for anything to run for several seconds while everybody
> competes for either the delayed refs lock or the extent root lock.
> 
> With the delayed refs rsv we actually end up running the delayed refs often
> enough because of the extra ENOSPC pressure that we don't really end up with
> long chunks of time running delayed refs while blocking out START transactions.
> 
> If at some point down the line this turns out to be an actual issue we can
> revisit the best way to do this.  Off the top of my head we do something like
> wrap it in a "run all the delayed refs" mutex so that all the committers just
> wait on whoever wins, and we move it back outside of the start logic in order to
> make it better all the way around.  But I don't think that's something we need
> to do at this point.  Thanks,

Ok, that's good enough for me.

Reviewed-by: Omar Sandoval <osandov@fb.com>
diff mbox series

Patch

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index ebb0c0405598..2bb19e2ded5e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1918,15 +1918,6 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	btrfs_trans_release_metadata(trans);
 	trans->block_rsv = NULL;
 
-	/* make a pass through all the delayed refs we have so far
-	 * any runnings procs may add more while we are here
-	 */
-	ret = btrfs_run_delayed_refs(trans, 0);
-	if (ret) {
-		btrfs_end_transaction(trans);
-		return ret;
-	}
-
 	cur_trans = trans->transaction;
 
 	/*
@@ -1939,12 +1930,6 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	if (!list_empty(&trans->new_bgs))
 		btrfs_create_pending_block_groups(trans);
 
-	ret = btrfs_run_delayed_refs(trans, 0);
-	if (ret) {
-		btrfs_end_transaction(trans);
-		return ret;
-	}
-
 	if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) {
 		int run_it = 0;
 
@@ -2015,6 +2000,15 @@  int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 		spin_unlock(&fs_info->trans_lock);
 	}
 
+	/*
+	 * We are now the only one in the commit area, we can run delayed refs
+	 * without hitting a bunch of lock contention from a lot of people
+	 * trying to commit the transaction at once.
+	 */
+	ret = btrfs_run_delayed_refs(trans, 0);
+	if (ret)
+		goto cleanup_transaction;
+
 	extwriter_counter_dec(cur_trans, trans->type);
 
 	ret = btrfs_start_delalloc_flush(fs_info);