diff mbox series

[02/18] btrfs: fix start transaction qgroup rsv double free

Message ID 90d1a33e3722d5533a8bb595b658aae81d1e6c21.1688597211.git.boris@bur.io (mailing list archive)
State New, archived
Headers show
Series btrfs: simple quotas | expand

Commit Message

Boris Burkov July 5, 2023, 11:20 p.m. UTC
btrfs_start_transaction reserves metadata space of the PERTRANS type
before it identifies a transaction to start/join. This allows flushing
when reserving that space without a deadlock. However, it results in a
race which temporarily breaks qgroup rsv accounting.

T1                                              T2
start_transaction
do_stuff
                                            start_transaction
                                                qgroup_reserve_meta_pertrans
commit_transaction
    qgroup_free_meta_all_pertrans
                                            hit an error starting txn
                                            goto reserve_fail
                                            qgroup_free_meta_pertrans (already freed!)

The basic issue is that there is nothing preventing another commit from
committing before start_transaction finishes (in fact sometimes we
intentionally wait for it..) so any error path that frees the reserve is
at risk of this race.

While this exact space was getting freed anyway, and it's not a huge
deal to double free it (just a warning, the free code catches this), it
can result in incorrectly freeing some other pertrans reservation in
this same reservation, which could then lead to spuriously granting
reservations we might not have the space for. Therefore, I do believe it
is worth fixing.

To fix it, use the existing prealloc->pertrans conversion mechanism.
When we first reserve the space, we reserve prealloc space and only when
we are sure we have a transaction do we convert it to pertrans. This way
any racing commits do not blow away our reservation, but we still get a
pertrans reservation that is freed when _this_ transaction gets committed.

This issue can be reproduced by running generic/269 with either qgroups
or squotas enabled via mkfs on the scratch device.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/transaction.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Comments

Josef Bacik July 13, 2023, 2:02 p.m. UTC | #1
On Wed, Jul 05, 2023 at 04:20:39PM -0700, Boris Burkov wrote:
> btrfs_start_transaction reserves metadata space of the PERTRANS type
> before it identifies a transaction to start/join. This allows flushing
> when reserving that space without a deadlock. However, it results in a
> race which temporarily breaks qgroup rsv accounting.
> 
> T1                                              T2
> start_transaction
> do_stuff
>                                             start_transaction
>                                                 qgroup_reserve_meta_pertrans
> commit_transaction
>     qgroup_free_meta_all_pertrans
>                                             hit an error starting txn
>                                             goto reserve_fail
>                                             qgroup_free_meta_pertrans (already freed!)
> 
> The basic issue is that there is nothing preventing another commit from
> committing before start_transaction finishes (in fact sometimes we
> intentionally wait for it..) so any error path that frees the reserve is
> at risk of this race.
> 
> While this exact space was getting freed anyway, and it's not a huge
> deal to double free it (just a warning, the free code catches this), it
> can result in incorrectly freeing some other pertrans reservation in
> this same reservation, which could then lead to spuriously granting
> reservations we might not have the space for. Therefore, I do believe it
> is worth fixing.
> 
> To fix it, use the existing prealloc->pertrans conversion mechanism.
> When we first reserve the space, we reserve prealloc space and only when
> we are sure we have a transaction do we convert it to pertrans. This way
> any racing commits do not blow away our reservation, but we still get a
> pertrans reservation that is freed when _this_ transaction gets committed.
> 
> This issue can be reproduced by running generic/269 with either qgroups
> or squotas enabled via mkfs on the scratch device.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
diff mbox series

Patch

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index cf306351b148..d40b0a39a552 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -591,8 +591,13 @@  start_transaction(struct btrfs_root *root, unsigned int num_items,
 		u64 delayed_refs_bytes = 0;
 
 		qgroup_reserved = num_items * fs_info->nodesize;
-		ret = btrfs_qgroup_reserve_meta_pertrans(root, qgroup_reserved,
-				enforce_qgroups);
+		/*
+		 * Use prealloc for now, as there might be a currently running
+		 * transaction that could free this reserved space prematurely
+		 * by committing.
+		 */
+		ret = btrfs_qgroup_reserve_meta_prealloc(root, qgroup_reserved,
+							 enforce_qgroups, false);
 		if (ret)
 			return ERR_PTR(ret);
 
@@ -705,6 +710,14 @@  start_transaction(struct btrfs_root *root, unsigned int num_items,
 		h->reloc_reserved = reloc_reserved;
 	}
 
+	/*
+	 * Now that we have found a transaction to be a part of, convert the
+	 * qgroup reservation from prealloc to pertrans. A different transaction
+	 * can't race in and free our pertrans out from under us.
+	 */
+	if (qgroup_reserved)
+		btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+
 got_it:
 	if (!current->journal_info)
 		current->journal_info = h;
@@ -752,7 +765,7 @@  start_transaction(struct btrfs_root *root, unsigned int num_items,
 		btrfs_block_rsv_release(fs_info, &fs_info->trans_block_rsv,
 					num_bytes, NULL);
 reserve_fail:
-	btrfs_qgroup_free_meta_pertrans(root, qgroup_reserved);
+	btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
 	return ERR_PTR(ret);
 }