diff mbox series

[3/8] btrfs: error out when reallocating block for defrag using a stale transaction

Message ID c8083f9239853dc397beda6a3dc97c93da62137b.1695731842.git.fdmanana@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: some fixes and cleanups around btrfs_cow_block() | expand

Commit Message

Filipe Manana Sept. 26, 2023, 12:45 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

At btrfs_realloc_node() we have these checks to verify we are not using a
stale transaction (a past transaction with an unblocked state or higher),
and the only thing we do is to trigger two WARN_ON(). This however is a
critical problem, highly unexpected and if it happens it's most likely due
to a bug, so we should error out and turn the fs into error state so that
such issue is much more easily noticed if it's triggered.

The problem is critical because in btrfs_realloc_node() we COW tree blocks,
and using such stale transaction will lead to not persisting the extent
buffers used for the COW operations, as allocating tree block adds the
range of the respective extent buffers to the ->dirty_pages iotree of the
transaction, and a stale transaction, in the unlocked state or higher,
will not flush dirty extent buffers anymore, therefore resulting in not
persisting the tree block and resource leaks (not cleaning the dirty_pages
iotree for example).

So do the following changes:

1) Return -EUCLEAN if we find a stale transaction;

2) Turn the fs into error state, with error -EUCLEAN, so that no
   transaction can be committed, and generate a stack trace;

3) Combine both conditions into a single if statement, as both are related
   and have the same error message;

4) Mark the check as unlikely, since this is not expected to ever happen.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ctree.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Comments

David Sterba Sept. 26, 2023, 5:41 p.m. UTC | #1
On Tue, Sep 26, 2023 at 01:45:15PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> At btrfs_realloc_node() we have these checks to verify we are not using a
> stale transaction (a past transaction with an unblocked state or higher),
> and the only thing we do is to trigger two WARN_ON(). This however is a
> critical problem, highly unexpected and if it happens it's most likely due
> to a bug, so we should error out and turn the fs into error state so that
> such issue is much more easily noticed if it's triggered.
> 
> The problem is critical because in btrfs_realloc_node() we COW tree blocks,
> and using such stale transaction will lead to not persisting the extent
> buffers used for the COW operations, as allocating tree block adds the
> range of the respective extent buffers to the ->dirty_pages iotree of the
> transaction, and a stale transaction, in the unlocked state or higher,
> will not flush dirty extent buffers anymore, therefore resulting in not
> persisting the tree block and resource leaks (not cleaning the dirty_pages
> iotree for example).
> 
> So do the following changes:
> 
> 1) Return -EUCLEAN if we find a stale transaction;
> 
> 2) Turn the fs into error state, with error -EUCLEAN, so that no
>    transaction can be committed, and generate a stack trace;
> 
> 3) Combine both conditions into a single if statement, as both are related
>    and have the same error message;
> 
> 4) Mark the check as unlikely, since this is not expected to ever happen.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/ctree.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 4eef1a7d1db6..8619172bcba1 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -817,8 +817,22 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans,
>  	int progress_passed = 0;
>  	struct btrfs_disk_key disk_key;
>  
> -	WARN_ON(trans->transaction != fs_info->running_transaction);
> -	WARN_ON(trans->transid != fs_info->generation);
> +	/*
> +	 * COWing must happen through a running transaction, which always
> +	 * matches the current fs generation (it's a transaction with a state
> +	 * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs
> +	 * into error state to prevent the commit of any transaction.
> +	 */
> +	if (unlikely(trans->transaction != fs_info->running_transaction ||
> +		     trans->transid != fs_info->generation)) {
> +		btrfs_handle_fs_error(fs_info, -EUCLEAN,

Same question as in patch 1, could this be btrfs_abort_transaction()?

> +"unexpected transaction when attempting to reallocate parent %llu for root %llu, transaction %llu running transaction %llu fs generation %llu",
> +				      parent->start, btrfs_root_id(root),
> +				      trans->transid,
> +				      fs_info->running_transaction->transid,
> +				      fs_info->generation);
> +		return -EUCLEAN;
> +	}
>  
>  	parent_nritems = btrfs_header_nritems(parent);
>  	blocksize = fs_info->nodesize;
> -- 
> 2.40.1
diff mbox series

Patch

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 4eef1a7d1db6..8619172bcba1 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -817,8 +817,22 @@  int btrfs_realloc_node(struct btrfs_trans_handle *trans,
 	int progress_passed = 0;
 	struct btrfs_disk_key disk_key;
 
-	WARN_ON(trans->transaction != fs_info->running_transaction);
-	WARN_ON(trans->transid != fs_info->generation);
+	/*
+	 * COWing must happen through a running transaction, which always
+	 * matches the current fs generation (it's a transaction with a state
+	 * less than TRANS_STATE_UNBLOCKED). If it doesn't, then turn the fs
+	 * into error state to prevent the commit of any transaction.
+	 */
+	if (unlikely(trans->transaction != fs_info->running_transaction ||
+		     trans->transid != fs_info->generation)) {
+		btrfs_handle_fs_error(fs_info, -EUCLEAN,
+"unexpected transaction when attempting to reallocate parent %llu for root %llu, transaction %llu running transaction %llu fs generation %llu",
+				      parent->start, btrfs_root_id(root),
+				      trans->transid,
+				      fs_info->running_transaction->transid,
+				      fs_info->generation);
+		return -EUCLEAN;
+	}
 
 	parent_nritems = btrfs_header_nritems(parent);
 	blocksize = fs_info->nodesize;