[08/19] btrfs: make unmirrored BGs readonly only if we have at least one writable BG

Message ID 20190607131025.31996-9-naohiro.aota@wdc.com (mailing list archive)
State New, archived
Series btrfs zoned block device support

Commit Message

Naohiro Aota June 7, 2019, 1:10 p.m. UTC
If the btrfs volume has mirrored block groups, it unconditionally makes
un-mirrored block groups read only. When all of the mirrored block groups
are themselves read only, this leaves the volume with no writable block
groups at all. So, check that we have at least one writable mirrored block
group before setting un-mirrored block groups read only.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Josef Bacik June 13, 2019, 2:09 p.m. UTC | #1
On Fri, Jun 07, 2019 at 10:10:14PM +0900, Naohiro Aota wrote:
> If the btrfs volume has mirrored block groups, it unconditionally makes
> un-mirrored block groups read only. When we have mirrored block groups, but
> don't have writable block groups, this will drop all writable block groups.
> So, check if we have at least one writable mirrored block group before
> setting un-mirrored block groups read only.
> 

I don't understand why you want this.  Thanks,

Josef
Naohiro Aota June 18, 2019, 7:42 a.m. UTC | #2
On 2019/06/13 23:09, Josef Bacik wrote:
> On Fri, Jun 07, 2019 at 10:10:14PM +0900, Naohiro Aota wrote:
>> If the btrfs volume has mirrored block groups, it unconditionally makes
>> un-mirrored block groups read only. When we have mirrored block groups, but
>> don't have writable block groups, this will drop all writable block groups.
>> So, check if we have at least one writable mirrored block group before
>> setting un-mirrored block groups read only.
>>
> 
> I don't understand why you want this.  Thanks,
> 
> Josef
> 

This is necessary to handle cases such as btrfs/124.

When we mount a degraded RAID1 FS and write to it, and then
re-mount it with all devices present, the write pointers of the
corresponding zones of the written BG differ.  Patch 07 marks such a
block group as "wp_broken" and makes it read only.  In this situation,
the only RAID1 BGs we have are read only because of "wp_broken", and
the un-mirrored BGs are also marked read only because RAID1 BGs exist.
As a result, all the BGs are read only, so we
cannot even start the rebalance to fix the situation.
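
The scenario above can be sketched outside the kernel as follows. This is only a simplified userspace model, not the kernel code: the enum, the structs, the fixed-size arrays (standing in for the kernel's per-profile lists), and the function names has_writable_redundant_bg() and mark_unmirrored_ro() are all hypothetical names chosen for illustration. It also assumes the space_info already passed the allocation-profile check, and that "making un-mirrored BGs read only" amounts to setting a flag on the RAID0 and SINGLE block groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's RAID index enum. */
enum raid_type {
	RAID_RAID10,
	RAID_RAID1,
	RAID_DUP,
	RAID_RAID0,
	RAID_SINGLE,
	RAID_RAID5,
	RAID_RAID6,
	NR_RAID_TYPES
};

#define MAX_BGS 4

/* Hypothetical model of a block group: only the read-only flag matters. */
struct block_group {
	bool ro;
};

/* Hypothetical model of a space_info: arrays replace the kernel lists. */
struct space_info {
	struct block_group bgs[NR_RAID_TYPES][MAX_BGS];
	size_t nr_bgs[NR_RAID_TYPES];
};

/* As in the patch, every profile except RAID0 and SINGLE counts here. */
static bool is_redundant(int type)
{
	return type != RAID_RAID0 && type != RAID_SINGLE;
}

/* Return true if at least one redundant block group is still writable. */
static bool has_writable_redundant_bg(const struct space_info *si)
{
	assert(si != NULL);

	for (int t = 0; t < NR_RAID_TYPES; t++) {
		if (!is_redundant(t))
			continue;
		for (size_t i = 0; i < si->nr_bgs[t]; i++) {
			if (!si->bgs[t][i].ro)
				return true;
		}
	}
	return false;
}

/*
 * Mark un-mirrored block groups read only, but only when a writable
 * redundant block group remains. Returns true if the marking was done.
 */
static bool mark_unmirrored_ro(struct space_info *si)
{
	if (!has_writable_redundant_bg(si))
		return false;	/* keep un-mirrored BGs writable */

	for (int t = 0; t < NR_RAID_TYPES; t++) {
		if (is_redundant(t))
			continue;
		for (size_t i = 0; i < si->nr_bgs[t]; i++)
			si->bgs[t][i].ro = true;
	}
	return true;
}
```

In this model, a space_info holding one read-only ("wp_broken") RAID1 block group and one SINGLE block group keeps the SINGLE block group writable, so a rebalance would still have somewhere to write. Once a writable RAID1 block group exists, the SINGLE block group is marked read only as before.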
Josef Bacik June 18, 2019, 1:35 p.m. UTC | #3
On Tue, Jun 18, 2019 at 07:42:46AM +0000, Naohiro Aota wrote:
> On 2019/06/13 23:09, Josef Bacik wrote:
> > On Fri, Jun 07, 2019 at 10:10:14PM +0900, Naohiro Aota wrote:
> >> If the btrfs volume has mirrored block groups, it unconditionally makes
> >> un-mirrored block groups read only. When we have mirrored block groups, but
> >> don't have writable block groups, this will drop all writable block groups.
> >> So, check if we have at least one writable mirrored block group before
> >> setting un-mirrored block groups read only.
> >>
> > 
> > I don't understand why you want this.  Thanks,
> > 
> > Josef
> > 
> 
> This is necessary to handle e.g. btrfs/124 case.
> 
> When we mount degraded RAID1 FS and write to it, and then
> re-mount with full device, the write pointers of corresponding
> zones of written BG differ.  The patch 07 mark such block group
> as "wp_broken" and make it read only.  In this situation, we only
> have read only RAID1 BGs because of "wp_broken" and un-mirrored BGs
> are also marked read only, because we have RAID1 BGs.
> As a result, all the BGs are now read only, so that we
> cannot even start the rebalance to fix the situation.

Ah ok, please add this explanation to the changelog.  Thanks,

Josef
Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ebd0d6eae038..3d41d840fe5c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10791,6 +10791,9 @@  int btrfs_read_block_groups(struct btrfs_fs_info *info)
 	}
 
 	list_for_each_entry_rcu(space_info, &info->space_info, list) {
+		bool has_rw = false;
+		int i;
+
 		if (!(get_alloc_profile(info, space_info->flags) &
 		      (BTRFS_BLOCK_GROUP_RAID10 |
 		       BTRFS_BLOCK_GROUP_RAID1 |
@@ -10798,6 +10801,25 @@  int btrfs_read_block_groups(struct btrfs_fs_info *info)
 		       BTRFS_BLOCK_GROUP_RAID6 |
 		       BTRFS_BLOCK_GROUP_DUP)))
 			continue;
+
+		/* check if we have at least one writable mirrored block group */
+		for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) {
+			if (i == BTRFS_RAID_RAID0 || i == BTRFS_RAID_SINGLE)
+				continue;
+			list_for_each_entry(cache, &space_info->block_groups[i],
+					    list) {
+				if (!cache->ro) {
+					has_rw = true;
+					break;
+				}
+			}
+			if (has_rw)
+				break;
+		}
+
+		if (!has_rw)
+			continue;
+
 		/*
 		 * avoid allocating from un-mirrored block group if there are
 		 * mirrored block groups.