diff mbox

Btrfs: fix racy system chunk allocation when setting block group ro

Message ID 1431972715-16833-1-git-send-email-fdmanana@suse.com (mailing list archive)
State Accepted
Headers show

Commit Message

Filipe Manana May 18, 2015, 6:11 p.m. UTC
If while setting a block group read-only we end up allocating a system
chunk, through check_system_chunk(), we were not doing it while holding
the chunk mutex which is a problem if a concurrent chunk allocation is
happening, through do_chunk_alloc(), as it means both block groups can
end up using the same logical addresses and physical regions in the
device(s). So make sure we hold the chunk mutex.

Cc: stable@vger.kernel.org  # 4.0+
Fixes: 2f0810880f08 ("btrfs: delete chunk allocation attemp when
                      setting block group ro")

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/extent-tree.c | 2 ++
 fs/btrfs/volumes.c     | 1 +
 2 files changed, 3 insertions(+)

Comments

Holger Hoffstätte May 19, 2015, 11:32 a.m. UTC | #1
On Mon, 18 May 2015 19:11:40 +0100, Filipe Manana wrote:

> If while setting a block group read-only we end up allocating a system
> chunk, through check_system_chunk(), we were not doing it while holding
> the chunk mutex which is a problem if a concurrent chunk allocation is
> happening, through do_chunk_alloc(), as it means both block groups can
> end up using the same logical addresses and physical regions in the
> device(s). So make sure we hold the chunk mutex.
> 
> Cc: stable@vger.kernel.org  # 4.0+
> Fixes: 2f0810880f08 ("btrfs: delete chunk allocation attemp when
>                       setting block group ro")

Hello Filipe,

good find, as usual. But from the description it's not clear to me whether
this also fixes rebalance seemingly not having an effect (as reported many
times now) or "just" the on-disk block data getting munched together on a
racy day? Just thought I'd ask for clarification before I start patching.

Thanks!

Holger


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana May 19, 2015, 12:30 p.m. UTC | #2
On Tue, May 19, 2015 at 12:32 PM, Holger Hoffstätte
<holger.hoffstaette@googlemail.com> wrote:
> On Mon, 18 May 2015 19:11:40 +0100, Filipe Manana wrote:
>
>> If while setting a block group read-only we end up allocating a system
>> chunk, through check_system_chunk(), we were not doing it while holding
>> the chunk mutex which is a problem if a concurrent chunk allocation is
>> happening, through do_chunk_alloc(), as it means both block groups can
>> end up using the same logical addresses and physical regions in the
>> device(s). So make sure we hold the chunk mutex.
>>
>> Cc: stable@vger.kernel.org  # 4.0+
>> Fixes: 2f0810880f08 ("btrfs: delete chunk allocation attemp when
>>                       setting block group ro")
>
> Hello Filipe,
>
> good find, as usual. But from the description it's not clear to me whether
> this also fixes rebalance seemingly not having an effect (as reported many
> times now) or "just" the on-disk block data getting munched together on a
> racy day? Just thought I'd ask for clarification before I start patching.

No, it doesn't fix the balance/conversion issue. It's a completely
separate issue/regression. Chris is working on the balance/conversion
issue.

thanks

>
> Thanks!
>
> Holger
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4bf489b..1d757d8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8846,7 +8846,9 @@  again:
 out:
 	if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
 		alloc_flags = update_block_group_flags(root, cache->flags);
+		lock_chunks(root->fs_info->chunk_root);
 		check_system_chunk(trans, root, alloc_flags);
+		unlock_chunks(root->fs_info->chunk_root);
 	}
 	mutex_unlock(&root->fs_info->ro_block_group_mutex);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 56b91ed..d2b276c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4641,6 +4641,7 @@  int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 {
 	u64 chunk_offset;
 
+	ASSERT(mutex_is_locked(&extent_root->fs_info->chunk_mutex));
 	chunk_offset = find_next_chunk(extent_root->fs_info);
 	return __btrfs_alloc_chunk(trans, extent_root, chunk_offset, type);
 }