[4/8] btrfs: add ALLOC_CHUNK_FORCE to the flushing code
diff mbox series

Message ID 20181203152459.21630-5-josef@toxicpanda.com
State New
Headers show
Series
  • Enospc cleanups and fixeS
Related show

Commit Message

Josef Bacik Dec. 3, 2018, 3:24 p.m. UTC
With my change to no longer take into account the global reserve for
metadata allocation chunks we have this side-effect for mixed block
group fs'es where we are no longer allocating enough chunks for the
data/metadata requirements.  To deal with this add a ALLOC_CHUNK_FORCE
step to the flushing state machine.  This will only get used if we've
already made a full loop through the flushing machinery and tried
committing the transaction.  If we have then we can try and force a
chunk allocation since we likely need it to make progress.  This
resolves the issues I was seeing with the mixed bg tests in xfstests
with my previous patch.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h             |  3 ++-
 fs/btrfs/extent-tree.c       | 18 +++++++++++++++++-
 include/trace/events/btrfs.h |  1 +
 3 files changed, 20 insertions(+), 2 deletions(-)

Comments

Nikolay Borisov Dec. 11, 2018, 10:08 a.m. UTC | #1
On 3.12.18 г. 17:24 ч., Josef Bacik wrote:
> With my change to no longer take into account the global reserve for
> metadata allocation chunks we have this side-effect for mixed block
> group fs'es where we are no longer allocating enough chunks for the
> data/metadata requirements.  To deal with this add a ALLOC_CHUNK_FORCE
> step to the flushing state machine.  This will only get used if we've
> already made a full loop through the flushing machinery and tried
> committing the transaction.  If we have then we can try and force a
> chunk allocation since we likely need it to make progress.  This
> resolves the issues I was seeing with the mixed bg tests in xfstests
> with my previous patch.
> 
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Imo this and the previous patch should be squashed into one.

> ---
>  fs/btrfs/ctree.h             |  3 ++-
>  fs/btrfs/extent-tree.c       | 18 +++++++++++++++++-
>  include/trace/events/btrfs.h |  1 +
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 

<snip>
David Sterba Dec. 11, 2018, 4:47 p.m. UTC | #2
On Tue, Dec 11, 2018 at 12:08:23PM +0200, Nikolay Borisov wrote:
> 
> 
> On 3.12.18 г. 17:24 ч., Josef Bacik wrote:
> > With my change to no longer take into account the global reserve for
> > metadata allocation chunks we have this side-effect for mixed block
> > group fs'es where we are no longer allocating enough chunks for the
> > data/metadata requirements.  To deal with this add a ALLOC_CHUNK_FORCE
> > step to the flushing state machine.  This will only get used if we've
> > already made a full loop through the flushing machinery and tried
> > committing the transaction.  If we have then we can try and force a
> > chunk allocation since we likely need it to make progress.  This
> > resolves the issues I was seeing with the mixed bg tests in xfstests
> > with my previous patch.
> > 
> > Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Imo this and the previous patch should be squashed into one.

I don't see why, separate patches also look good to me. One changes the
logic regarding global reserve and the other fixes behaviour regarding
mixed block groups.

Possibly, if the fix can be applied first and then the overall logic
changed, that's still 2 patches but there's no intermediate state with
the bug. As long as it's not something really disasterous or if the
"one logical thing per patch" is unnecessarily split to 2 patches, I'm
willing to take more patches. This is a bit of a grey zone so if I'm
missing something regarding the split, please let me know.
Nikolay Borisov Dec. 11, 2018, 4:51 p.m. UTC | #3
On 11.12.18 г. 18:47 ч., David Sterba wrote:
> On Tue, Dec 11, 2018 at 12:08:23PM +0200, Nikolay Borisov wrote:
>>
>>
>> On 3.12.18 г. 17:24 ч., Josef Bacik wrote:
>>> With my change to no longer take into account the global reserve for
>>> metadata allocation chunks we have this side-effect for mixed block
>>> group fs'es where we are no longer allocating enough chunks for the
>>> data/metadata requirements.  To deal with this add a ALLOC_CHUNK_FORCE
>>> step to the flushing state machine.  This will only get used if we've
>>> already made a full loop through the flushing machinery and tried
>>> committing the transaction.  If we have then we can try and force a
>>> chunk allocation since we likely need it to make progress.  This
>>> resolves the issues I was seeing with the mixed bg tests in xfstests
>>> with my previous patch.
>>>
>>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>>
>> Imo this and the previous patch should be squashed into one.
> 
> I don't see why, separate patches also look good to me. One changes the
> logic regarding global reserve and the other fixes behaviour regarding
> mixed block groups.

As far as I understand this deficient behavior is a direct result of the
previous patch. In essnse previous patch fixes something and introduces
new problem which is subsequently fixed by this patch. The way I see it
if both patches are squashed the change log should be :

"I do [explanation of the first change]. However this introduces
[explain bug from patch 2] so fix it by [explain fix from 2nd patch]"


> 
> Possibly, if the fix can be applied first and then the overall logic
> changed, that's still 2 patches but there's no intermediate state with
> the bug. As long as it's not something really disasterous or if the
> "one logical thing per patch" is unnecessarily split to 2 patches, I'm
> willing to take more patches. This is a bit of a grey zone so if I'm
> missing something regarding the split, please let me know.
>
David Sterba Dec. 11, 2018, 7:04 p.m. UTC | #4
On Tue, Dec 11, 2018 at 06:51:34PM +0200, Nikolay Borisov wrote:
> 
> 
> On 11.12.18 г. 18:47 ч., David Sterba wrote:
> > On Tue, Dec 11, 2018 at 12:08:23PM +0200, Nikolay Borisov wrote:
> >>
> >>
> >> On 3.12.18 г. 17:24 ч., Josef Bacik wrote:
> >>> With my change to no longer take into account the global reserve for
> >>> metadata allocation chunks we have this side-effect for mixed block
> >>> group fs'es where we are no longer allocating enough chunks for the
> >>> data/metadata requirements.  To deal with this add a ALLOC_CHUNK_FORCE
> >>> step to the flushing state machine.  This will only get used if we've
> >>> already made a full loop through the flushing machinery and tried
> >>> committing the transaction.  If we have then we can try and force a
> >>> chunk allocation since we likely need it to make progress.  This
> >>> resolves the issues I was seeing with the mixed bg tests in xfstests
> >>> with my previous patch.
> >>>
> >>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> >>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> >>
> >> Imo this and the previous patch should be squashed into one.
> > 
> > I don't see why, separate patches also look good to me. One changes the
> > logic regarding global reserve and the other fixes behaviour regarding
> > mixed block groups.
> 
> As far as I understand this deficient behavior is a direct result of the
> previous patch. In essnse previous patch fixes something and introduces
> new problem which is subsequently fixed by this patch. The way I see it
> if both patches are squashed the change log should be :
> 
> "I do [explanation of the first change]. However this introduces
> [explain bug from patch 2] so fix it by [explain fix from 2nd patch]"

Ok, patches merged.

Patch
diff mbox series

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 30da075c042e..7cf6ad021d81 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2750,7 +2750,8 @@  enum btrfs_flush_state {
 	FLUSH_DELALLOC		=	5,
 	FLUSH_DELALLOC_WAIT	=	6,
 	ALLOC_CHUNK		=	7,
-	COMMIT_TRANS		=	8,
+	ALLOC_CHUNK_FORCE	=	8,
+	COMMIT_TRANS		=	9,
 };
 
 int btrfs_alloc_data_chunk_ondemand(struct btrfs_inode *inode, u64 bytes);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 667b992d322d..2d0dd70570ca 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4938,6 +4938,7 @@  static void flush_space(struct btrfs_fs_info *fs_info,
 		btrfs_end_transaction(trans);
 		break;
 	case ALLOC_CHUNK:
+	case ALLOC_CHUNK_FORCE:
 		trans = btrfs_join_transaction(root);
 		if (IS_ERR(trans)) {
 			ret = PTR_ERR(trans);
@@ -4945,7 +4946,9 @@  static void flush_space(struct btrfs_fs_info *fs_info,
 		}
 		ret = do_chunk_alloc(trans,
 				     btrfs_metadata_alloc_profile(fs_info),
-				     CHUNK_ALLOC_NO_FORCE);
+				     (state == ALLOC_CHUNK) ?
+				     CHUNK_ALLOC_NO_FORCE :
+				     CHUNK_ALLOC_FORCE);
 		btrfs_end_transaction(trans);
 		if (ret > 0 || ret == -ENOSPC)
 			ret = 0;
@@ -5081,6 +5084,19 @@  static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
 				commit_cycles--;
 		}
 
+		/*
+		 * We don't want to force a chunk allocation until we've tried
+		 * pretty hard to reclaim space.  Think of the case where we
+		 * free'd up a bunch of space and so have a lot of pinned space
+		 * to reclaim.  We would rather use that than possibly create a
+		 * underutilized metadata chunk.  So if this is our first run
+		 * through the flushing state machine skip ALLOC_CHUNK_FORCE and
+		 * commit the transaction.  If nothing has changed the next go
+		 * around then we can force a chunk allocation.
+		 */
+		if (flush_state == ALLOC_CHUNK_FORCE && !commit_cycles)
+			flush_state++;
+
 		if (flush_state > COMMIT_TRANS) {
 			commit_cycles++;
 			if (commit_cycles > 2) {
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 63d1f9d8b8c7..dd0e6f8d6b6e 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1051,6 +1051,7 @@  TRACE_EVENT(btrfs_trigger_flush,
 		{ FLUSH_DELAYED_REFS_NR,	"FLUSH_DELAYED_REFS_NR"},	\
 		{ FLUSH_DELAYED_REFS,		"FLUSH_ELAYED_REFS"},		\
 		{ ALLOC_CHUNK,			"ALLOC_CHUNK"},			\
+		{ ALLOC_CHUNK_FORCE,		"ALLOC_CHUNK_FORCE"},		\
 		{ COMMIT_TRANS,			"COMMIT_TRANS"})
 
 TRACE_EVENT(btrfs_flush_space,