
btrfs: fix possible infinite loop in data async reclaim

Message ID 24f846bc8860cab91ca134d0a337cc290589a092.1598389008.git.josef@toxicpanda.com
State New, archived
Series btrfs: fix possible infinite loop in data async reclaim

Commit Message

Josef Bacik Aug. 25, 2020, 8:56 p.m. UTC
Dave reported an issue where generic/102 would sometimes hang.  This
turned out to be because we'd get into this spot where we were no longer
making progress on data reservations because our exit condition was not
met.  The loop is basically

while (!space_info->full && !list_empty(&space_info->tickets))
	flush_space(space_info, flush_state);

where flush_state steps through our various flush states, but these don't include
ALLOC_CHUNK_FORCE.  This is because we actually lead with allocating
chunks, and so the assumption was that once you got to the actual
flushing states you could no longer allocate chunks.  This was a stupid
assumption, because you could have deleted block groups that would be
reclaimed by a transaction commit, thus unsetting space_info->full.
This is essentially what happens with generic/102, and so sometimes
you'd get stuck in the flushing loop because we weren't allocating
chunks, but flushing space wasn't giving us what we needed to make
progress.

Fix this by adding ALLOC_CHUNK_FORCE to the end of our flushing states.
That way we will eventually bail out, because we will end up with
space_info->full set again if we freed a chunk previously.  Otherwise, as
is the case for this test, we'll allocate our chunk and continue on our
merry way.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

David Sterba Aug. 26, 2020, 9:13 a.m. UTC | #1
On Tue, Aug 25, 2020 at 04:56:59PM -0400, Josef Bacik wrote:
> Dave reported an issue where generic/102 would sometimes hang.  This
> turned out to be because we'd get into this spot where we were no longer
> making progress on data reservations because our exit condition was not
> met.
> 
> [...]
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Thanks. As the flushing states are added one by one at the end of the
series, I'll add this as a separate patch. Folding it into some other
patch would lose a bit of the information that's in the changelog. This
leaves a short window where the generic/102 hang could still happen, but
then again the flushing sequence is not switched over all at once anyway.

Patch

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 71aa9e0de61e..b733718f45d3 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1044,12 +1044,18 @@  static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
  *   total_bytes_pinned < reservation we will not commit.  This is why the
  *   previous states are actually important, to make sure we know for sure
  *   whether committing the transaction will allow us to make progress.
+ *
+ * ALLOC_CHUNK_FORCE
+ *   For data we start with alloc chunk force, however we could have been full
+ *   before, and then the transaction commit could have freed new block groups,
+ *   so if we now have space to allocate do the force chunk allocation.
  */
 static const enum btrfs_flush_state data_flush_states[] = {
 	FLUSH_DELALLOC_WAIT,
 	RUN_DELAYED_IPUTS,
 	FLUSH_DELAYED_REFS,
 	COMMIT_TRANS,
+	ALLOC_CHUNK_FORCE,
 };
 
 static void btrfs_async_reclaim_data_space(struct work_struct *work)