[1/2] btrfs: reject device with CHANGING_FSID_V2

Message ID 83e6a50ea2040a27e0dc05a09a9213b79e8938c8.1695244296.git.anand.jain@oracle.com (mailing list archive)
State New, archived
Series btrfs: reject device with CHANGING_FSID_V2 flag

Commit Message

Anand Jain Sept. 20, 2023, 9:51 p.m. UTC
The BTRFS_SUPER_FLAG_CHANGING_FSID_V2 flag indicates a transient state
in which the userspace btrfstune -m|-M operation failed to complete the
fsid change on the device.

When this flag is set, the kernel tries to determine automatically which
partner devices a given device belongs with, based on the fsid,
metadata_uuid and generation values.

The btrfstune -m|-M feature is especially useful in virtual cloud setups, where
compute instances (disk images) are quickly copied, fsid changed, and
launched. Given numerous disk images with the same metadata_uuid but
different fsid, there's no clear way a device can be correctly assembled
with the proper partners when the CHANGING_FSID_V2 flag is set. So, the
disk could be assembled incorrectly, as in the example below:

Before this patch:

Consider the following two filesystems:
   /dev/loop[2-3] are raw copies of /dev/loop[0-1], and the btrfstune -m
operation fails.

In this scenario, the fsid change on /dev/loop0 is interrupted and the
CHANGING_FSID_V2 flag remains set, as shown below.

  $ p="device|devid|^metadata_uuid|^fsid|^incom|^generation|^flags"

  $ btrfs inspect dump-super /dev/loop0 | egrep "$p"
  superblock: bytenr=65536, device=/dev/loop0
  flags			0x1000000001
  fsid			7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
  metadata_uuid		bb040a9f-233a-4de2-ad84-49aa5a28059b
  generation		9
  num_devices		2
  incompat_flags	0x741
  dev_item.devid	1

  $ btrfs inspect dump-super /dev/loop1 | egrep "$p"
  superblock: bytenr=65536, device=/dev/loop1
  flags			0x1
  fsid			11d2af4d-1b71-45a9-83f6-f2100766939d
  metadata_uuid		bb040a9f-233a-4de2-ad84-49aa5a28059b
  generation		10
  num_devices		2
  incompat_flags	0x741
  dev_item.devid	2

  $ btrfs inspect dump-super /dev/loop2 | egrep "$p"
  superblock: bytenr=65536, device=/dev/loop2
  flags			0x1
  fsid			7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
  metadata_uuid		bb040a9f-233a-4de2-ad84-49aa5a28059b
  generation		8
  num_devices		2
  incompat_flags	0x741
  dev_item.devid	1

  $ btrfs inspect dump-super /dev/loop3 | egrep "$p"
  superblock: bytenr=65536, device=/dev/loop3
  flags			0x1
  fsid			7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
  metadata_uuid		bb040a9f-233a-4de2-ad84-49aa5a28059b
  generation		8
  num_devices		2
  incompat_flags	0x741
  dev_item.devid	2

It is normal that some devices aren't instantly discovered during
system boot or iSCSI discovery. The controlled scan below demonstrates
this.

  $ btrfs device scan --forget
  $ btrfs device scan /dev/loop0
  Scanning for btrfs filesystems on '/dev/loop0'
  $ mount /dev/loop3 /btrfs
  $ btrfs filesystem show -m
  Label: none  uuid: 7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
	Total devices 2 FS bytes used 144.00KiB
	devid    1 size 300.00MiB used 48.00MiB path /dev/loop0
	devid    2 size 300.00MiB used 40.00MiB path /dev/loop3

/dev/loop0 and /dev/loop3 are incorrectly partnered.

This kernel patch removes functions and code connected to the
CHANGING_FSID_V2 flag.

With this patch, devices with the CHANGING_FSID_V2 flag are now rejected.
And its partner will fail to mount with the extra -o degraded option.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
Moreover, a btrfs-progs patch (below) has eliminated the use of the
CHANGING_FSID_V2 flag entirely:

   [PATCH] btrfs-progs: btrfstune -m|M remove 2-stage commit

The compatibility concerns are addressed as follows:

  New-kernel new-progs - has no CHANGING_FSID_V2 flag.
  Old-kernel new-progs - has no CHANGING_FSID_V2 flag, kernel code unused.
  Old-kernel old-progs - bug may occur.
  New-kernel old-progs - Should use host with the newer btrfs-progs to fix.

For legacy systems, to help fix such a condition in userspace instead,
the patchset below ports the kernel's CHANGING_FSID_V2 code to btrfs-progs.

   [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel

If that cannot fix some cases, users can reunite the devices manually
with the patchset:

   [PATCH 00/10] btrfs-progs: check and tune: add device and noscan options

 fs/btrfs/disk-io.c | 10 ----------
 fs/btrfs/volumes.c |  7 +++++++
 2 files changed, 7 insertions(+), 10 deletions(-)

Comments

David Sterba Sept. 22, 2023, 11:23 a.m. UTC | #1
On Thu, Sep 21, 2023 at 05:51:13AM +0800, Anand Jain wrote:
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index dc577b3c53f6..95746ddf7dc3 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3173,7 +3173,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>  	u32 nodesize;
>  	u32 stripesize;
>  	u64 generation;
> -	u64 features;
>  	u16 csum_type;
>  	struct btrfs_super_block *disk_super;
>  	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
> @@ -3255,15 +3254,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>  
>  	disk_super = fs_info->super_copy;
>  
> -
> -	features = btrfs_super_flags(disk_super);
> -	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
> -		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
> -		btrfs_set_super_flags(disk_super, features);
> -		btrfs_info(fs_info,
> -			"found metadata UUID change in progress flag, clearing");
> -	}

This is removed from the mount path but it's still rejected because at
some point the device scanning will be called and will return -EINVAL.

> -
>  	memcpy(fs_info->super_for_commit, fs_info->super_copy,
>  	       sizeof(*fs_info->super_for_commit));
>  
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bc8d46cbc7c5..c845c60ec207 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -791,6 +791,13 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>  	bool fsid_change_in_progress = (btrfs_super_flags(disk_super) &
>  					BTRFS_SUPER_FLAG_CHANGING_FSID_V2);
>  
> +	if (fsid_change_in_progress) {
> +		btrfs_err(NULL,
> +"device %s has incomplete FSID changes please use btrfstune to complete",

This could say it's specifically metadata_uuid.

> +			  path);
> +		return ERR_PTR(-EINVAL);

We could probably return -EAGAIN as it's not a hard error.

Please let me know if you agree with the changes, I'll fix it in the
commit.

Anand Jain Sept. 22, 2023, 12:40 p.m. UTC | #2
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index dc577b3c53f6..95746ddf7dc3 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -3173,7 +3173,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>   	u32 nodesize;
>>   	u32 stripesize;
>>   	u64 generation;
>> -	u64 features;
>>   	u16 csum_type;
>>   	struct btrfs_super_block *disk_super;
>>   	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
>> @@ -3255,15 +3254,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>   
>>   	disk_super = fs_info->super_copy;
>>   
>> -
>> -	features = btrfs_super_flags(disk_super);
>> -	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
>> -		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
>> -		btrfs_set_super_flags(disk_super, features);
>> -		btrfs_info(fs_info,
>> -			"found metadata UUID change in progress flag, clearing");
>> -	}
> 
> This is removed from the mount path but it's still rejected because at
> some point the device scanning will be called and will return -EINVAL.
> 

Correct. This mount thread calls btrfs_scan_one_device() with the 
mounting device as the path and verifies its superblock until it reaches 
device_list_add(), where we return -EINVAL.


>> -
>>   	memcpy(fs_info->super_for_commit, fs_info->super_copy,
>>   	       sizeof(*fs_info->super_for_commit));
>>   
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index bc8d46cbc7c5..c845c60ec207 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -791,6 +791,13 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>>   	bool fsid_change_in_progress = (btrfs_super_flags(disk_super) &
>>   					BTRFS_SUPER_FLAG_CHANGING_FSID_V2);
>>   
>> +	if (fsid_change_in_progress) {
>> +		btrfs_err(NULL,
>> +"device %s has incomplete FSID changes please use btrfstune to complete",
> 
> This could say it's specifically metadata_uuid.
> 
>> +			  path);
>> +		return ERR_PTR(-EINVAL);

Here.

> 
> We could probably return -EAGAIN as it's not a hard error.
> 

-EAGAIN is fine.

> Please let me know if you agree with the changes, I'll fix it in the
> commit.

Thanks!
Anand

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dc577b3c53f6..95746ddf7dc3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3173,7 +3173,6 @@  int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	u32 nodesize;
 	u32 stripesize;
 	u64 generation;
-	u64 features;
 	u16 csum_type;
 	struct btrfs_super_block *disk_super;
 	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
@@ -3255,15 +3254,6 @@  int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 
 	disk_super = fs_info->super_copy;
 
-
-	features = btrfs_super_flags(disk_super);
-	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
-		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
-		btrfs_set_super_flags(disk_super, features);
-		btrfs_info(fs_info,
-			"found metadata UUID change in progress flag, clearing");
-	}
-
 	memcpy(fs_info->super_for_commit, fs_info->super_copy,
 	       sizeof(*fs_info->super_for_commit));
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bc8d46cbc7c5..c845c60ec207 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -791,6 +791,13 @@  static noinline struct btrfs_device *device_list_add(const char *path,
 	bool fsid_change_in_progress = (btrfs_super_flags(disk_super) &
 					BTRFS_SUPER_FLAG_CHANGING_FSID_V2);
 
+	if (fsid_change_in_progress) {
+		btrfs_err(NULL,
+"device %s has incomplete FSID changes please use btrfstune to complete",
+			  path);
+		return ERR_PTR(-EINVAL);
+	}
+
 	error = lookup_bdev(path, &path_devt);
 	if (error) {
 		btrfs_err(NULL, "failed to lookup block device for path %s: %d",