
[1/2] btrfs: reschedule when updating chunk maps at the end of a device replace

Message ID e3c4aaf1800d85bd4ffc397c2a8b8c291818e286.1722264391.git.fdmanana@suse.com (mailing list archive)
State New, archived
Series btrfs: some updates to the dev replace finishing path

Commit Message

Filipe Manana July 29, 2024, 2:51 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

At the end of a device replace we must go over all the chunk maps and
update their stripes to point to the target device instead of the source
device. We iterate over the chunk maps while holding a write lock and we
never reschedule, which can result in monopolizing a CPU for too long and
in blocking readers for too long (the mapping tree lock is a non-sleeping
rwlock, so waiting readers spin).
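
For reference, a minimal sketch of the loop in question as it is before this
change, reconstructed from the hunks below. The name of the chunk map lookup
helper (btrfs_find_chunk_map_nolock()) is an assumption, since the hunks do
not show that call:

	write_lock(&fs_info->mapping_tree_lock);
	do {
		struct btrfs_chunk_map *map;

		/* Find the next chunk map at or after 'start' (assumed helper). */
		map = btrfs_find_chunk_map_nolock(fs_info, start, U64_MAX);
		if (!map)
			break;
		for (i = 0; i < map->num_stripes; i++)
			if (map->stripes[i].dev == srcdev)
				map->stripes[i].dev = tgtdev;
		start = map->start + map->chunk_len;
		btrfs_free_chunk_map(map);
		/* No rescheduling point here, no matter how many chunk maps exist. */
	} while (start);
	write_unlock(&fs_info->mapping_tree_lock);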

So improve on this by rescheduling if necessary. This is safe because at
this point we are holding the chunk mutex, which means no new chunks can
be allocated and therefore we don't risk missing a new chunk map that
covers a range behind the last one we processed before rescheduling.
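
The helper used below, cond_resched_rwlock_write(), only acts when a
reschedule is due or another task is contending the lock; in that case it
drops the write lock, yields the CPU and then retakes the lock. A simplified
sketch of that behaviour (not the actual scheduler implementation) shows why
the chunk mutex matters: the mapping tree lock can be released and reacquired
in the middle of the iteration, and the chunk mutex is what guarantees no new
chunk map shows up while it is dropped:

/*
 * Simplified sketch of what cond_resched_rwlock_write() effectively does;
 * the real helper lives in the scheduler core.
 */
static inline int sketch_cond_resched_rwlock_write(rwlock_t *lock)
{
	int ret = 0;

	if (need_resched() || rwlock_needbreak(lock)) {
		write_unlock(lock);
		cond_resched();		/* another task may run here */
		write_lock(lock);
		ret = 1;		/* the lock was dropped and retaken */
	}
	return ret;
}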

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/dev-replace.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Patch

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index f638c458d285..20cf5e95f2bc 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -827,6 +827,14 @@  static void btrfs_dev_replace_update_device_in_mapping_tree(
 	u64 start = 0;
 	int i;
 
+	/*
+	 * The chunk mutex must be held so that no new chunks can be created
+	 * while we are updating existing chunks. This guarantees we don't miss
+	 * any new chunk that gets created for a range that falls before the
+	 * range of the last chunk we processed.
+	 */
+	lockdep_assert_held(&fs_info->chunk_mutex);
+
 	write_lock(&fs_info->mapping_tree_lock);
 	do {
 		struct btrfs_chunk_map *map;
@@ -839,6 +847,7 @@  static void btrfs_dev_replace_update_device_in_mapping_tree(
 				map->stripes[i].dev = tgtdev;
 		start = map->start + map->chunk_len;
 		btrfs_free_chunk_map(map);
+		cond_resched_rwlock_write(&fs_info->mapping_tree_lock);
 	} while (start);
 	write_unlock(&fs_info->mapping_tree_lock);
 }