mbox series

[00/10] btrfs-progs: check and tune: add device and noscan options

Message ID cover.1687943122.git.anand.jain@oracle.com (mailing list archive)
Headers show
Series btrfs-progs: check and tune: add device and noscan options | expand

Message

Anand Jain June 28, 2023, 11:56 a.m. UTC
By default, btrfstune and btrfs check scans all and only the block devices
in the system.

To scan regular files without mapping them to a loop device, adds the
--device option.

To indicate not to scan the system for other devices, adds the --noscan
option.

For example:

  The command below will scan both regular files and the devices
  provided in the --device option, along with the system block devices.

        btrfstune -m --device /tdev/td1,/tdev/td2 /tdev/td3
  or
        btrfs check --device /tdev/td1 --device /tdev/td2 /tdev/td3

  In some cases, if you need to avoid the default system scan for the
  block device, you can use the --noscan option.

        btrfstune -m --noscan --device /tdev/td1,/tdev/td2 /tdev/td3

        btrfs check --noscan --device /tdev/td1,/tdev/td2 /tdev/td3

 This patch bundle depends on the preparatory patch bundle sent before:
    [PATCH 0/6 v3] btrfs-progs: cleanup and preparatory around device scan

 And, replaces [1] in the mailing list, as the --device option helper
 function is peeled and transformed into a common helper function
 in a separate patch.
    [1] [PATCH 0/4] btrfs-progs: tune: add --device and --noscan option

Anand Jain (10):
  btrfs-progs: common: add --device option helpers
  btrfs-progs: tune: consolidate return goto free-out
  btrfs-progs: tune: introduce --device option
  btrfs-progs: docs: update btrfstune --device option
  btrfs-progs: tune: introduce --noscan option
  btrfs-progs: docs: update btrfstune --noscan option
  btrfs-progs: check: introduce --device option
  btrfs-progs: docs: update btrfs check --device option
  btrfs-progs: check: introduce --noscan option
  btrfs-progs: docs: update btrfs check --noscan option

 Documentation/btrfs-check.rst |  6 +++
 Documentation/btrfstune.rst   |  7 ++++
 check/main.c                  | 45 +++++++++++++++++++++-
 common/device-scan.c          | 32 ++++++++++++++++
 common/device-scan.h          |  2 +
 tune/main.c                   | 71 ++++++++++++++++++++++++++++-------
 6 files changed, 148 insertions(+), 15 deletions(-)

Comments

David Sterba July 13, 2023, 6:35 p.m. UTC | #1
On Wed, Jun 28, 2023 at 07:56:07PM +0800, Anand Jain wrote:
> By default, btrfstune and btrfs check scans all and only the block devices
> in the system.
> 
> To scan regular files without mapping them to a loop device, adds the
> --device option.
> 
> To indicate not to scan the system for other devices, adds the --noscan
> option.
> 
> For example:
> 
>   The command below will scan both regular files and the devices
>   provided in the --device option, along with the system block devices.
> 
>         btrfstune -m --device /tdev/td1,/tdev/td2 /tdev/td3
>   or
>         btrfs check --device /tdev/td1 --device /tdev/td2 /tdev/td3
> 
>   In some cases, if you need to avoid the default system scan for the
>   block device, you can use the --noscan option.
> 
>         btrfstune -m --noscan --device /tdev/td1,/tdev/td2 /tdev/td3
> 
>         btrfs check --noscan --device /tdev/td1,/tdev/td2 /tdev/td3

From the examples above I don't understand which devices get scanned or
not, there are the --device ones and then the agtument. Also for
examples please use something recognizable like /dev/sdx, /dev/sdy,
otherwise it looks like it requires some special type of device.

I'd expect that --noscan will not scan any device that is part of the
filesystem that is pointed to by the agrument (/tdev/td3 in this case).

If the option --noscan + --device are meant to work together that only
the given devices are either scanned or not (this is the part I don't
see clearly), then I'd suggest to let --noscan take a value of device
that won't be scanned.

Eventually there could be a special value --noscan=all to not scan all
devices. Also please drop the syntax where the devices are separated by
",", one option per device should work for everybody and you can avoid
parsting the option value.
Anand Jain July 19, 2023, 7:48 a.m. UTC | #2
On 14/07/2023 02:35, David Sterba wrote:
> On Wed, Jun 28, 2023 at 07:56:07PM +0800, Anand Jain wrote:
>> By default, btrfstune and btrfs check scans all and only the block devices
>> in the system.
>>
>> To scan regular files without mapping them to a loop device, adds the
>> --device option.
>>
>> To indicate not to scan the system for other devices, adds the --noscan
>> option.
>>
>> For example:
>>
>>    The command below will scan both regular files and the devices
>>    provided in the --device option, along with the system block devices.
>>
>>          btrfstune -m --device /tdev/td1,/tdev/td2 /tdev/td3
>>    or
>>          btrfs check --device /tdev/td1 --device /tdev/td2 /tdev/td3
>>
>>    In some cases, if you need to avoid the default system scan for the
>>    block device, you can use the --noscan option.
>>
>>          btrfstune -m --noscan --device /tdev/td1,/tdev/td2 /tdev/td3
>>
>>          btrfs check --noscan --device /tdev/td1,/tdev/td2 /tdev/td3
> 
>  From the examples above I don't understand which devices get scanned or
> not, there are the --device ones and then the agtument. Also for
> examples please use something recognizable like /dev/sdx, /dev/sdy,
> otherwise it looks like it requires some special type of device.

Let me try again..

The --noscan option in btrfstune is designed to disable automatic 
system-wide scanning + detection and assembly of Btrfs devices.
Instead, it utilizes the devices provided in the command line argument,
at the option --device and the last argument.

I used devices like "/tdev/td1," and so on to indicate that it's
not just the block device as an argument but also the regular file
images. Seems like I didn't explain it clearly. I'll try update this.


> I'd expect that --noscan will not scan any device that is part of the
> filesystem that is pointed to by the agrument (/tdev/td3 in this case).

Indeed true, except for the devices specified in the --device option and 
the last argument of the 'btrfstune' command. I propose this approach 
due to the split brain (CHANGING_FSID) scenario, where devices to be 
assembled may have different fsids but matching metadata_uuid, making 
automatic assembly unsafe.


In my upcoming kernel patch, I have a test case that demonstrates bug in 
the kernel's automatic device assembly during the split brain scenarios 
and as below:

------------
Consider two filesystems with two unique fsids.

	267baf3a-d949-4e8d-bb09-79edab766d39 (/dev/loop0 /dev/loop1)
	b698d8f4-cc56-4480-b202-c29882cfa13e (/dev/loop2 /dev/loop3)

The metadata_uuid (271ca08e-ed4c-4c80-8919-fe59533a635b) is the same
for both fsids.

Suppose 267baf3a-d949-4e8d-bb09-79edab766d39 attempts to change its fsid 
using 'btrfstune -m' option, but encounters a write failure on devid 1.

	$ /incomplete-fsid-change/btrfstune -m /dev/loop0
	dbg: Wrote devid 2 skip_write_sb_cnt 0
	dbg: Wrote devid 1 skip_write_sb_cnt 1
	dbg: Wrote devid 2 skip_write_sb_cnt 1
	dbg: SKIP devid 1 skip_write_sb_cnt 2  <-- Last sb write failed


Check if one of the device has CHANGING_FSID set.
Also, note that
  All devices /dev/loop[0-3] have same metadata_uuid.
  /dev/loop[0-1] have different fsids.
  /dev/loop[2-3] have greater generation number.


	$ cat sb
	btrfs inspect dump-super $1 | \
			egrep
	"device|devid|^metadata_uuid|^fsid|^incom|^generation|^flags"



	$ sb /dev/loo0
superblock: bytenr=65536, device=/dev/loop0
flags			0x1000000001   <-- CHANGEING_FSID_V2
fsid			267baf3a-d949-4e8d-bb09-79edab766d39 <--
metadata_uuid		271ca08e-ed4c-4c80-8919-fe59533a635b
generation		13
num_devices		2
incompat_flags		0x761
dev_item.devid		1

	$ sb /dev/loop1
superblock: bytenr=65536, device=/dev/loop1
flags			0x1
fsid			54ba1e89-b754-4a18-b464-0cc5ec47187a <--
metadata_uuid		271ca08e-ed4c-4c80-8919-fe59533a635b
generation		14 <--
num_devices		2
incompat_flags		0x761
dev_item.devid		2

	$ sb /dev/loop2
superblock: bytenr=65536, device=/dev/loop2
flags			0x1
fsid			b698d8f4-cc56-4480-b202-c29882cfa13e
metadata_uuid		271ca08e-ed4c-4c80-8919-fe59533a635b
generation		15 <--
num_devices		2
incompat_flags		0x761
dev_item.devid		1

	$ sb /dev/loop3
superblock: bytenr=65536, device=/dev/loop3
flags			0x1
fsid			b698d8f4-cc56-4480-b202-c29882cfa13e
metadata_uuid		271ca08e-ed4c-4c80-8919-fe59533a635b
generation		15
num_devices		2
incompat_flags		0x761
dev_item.devid		2



Despite /dev/loop0 and /dev/loop3 not belonging to the same fsid, in the
context of recovering from the CHANGING_FSID scenario, kernel can
assemble them together.

$ btrfs dev scan --forget
$ btrfs dev scan /dev/loop3
Scanning for btrfs filesystems on '/dev/loop3'

$ mount /dev/loop0 /btrfs
$ btrfs fi show -m /btrfs
Label: none  uuid: b698d8f4-cc56-4480-b202-c29882cfa13e
	Total devices 2 FS bytes used 144.00KiB
	devid    1 size 300.00MiB used 48.00MiB path /dev/loop0  <--
	devid    2 size 300.00MiB used 40.00MiB path /dev/loop3  <--


[  984.959694] BTRFS error (device loop3): parent transid verify failed 
on 30507008 wanted 15 found 14


In my view, automatic device assembly is unsafe. Devices with the
CHANGING_FSID flag should be treated similar to a filesystem needing
fsck and fix or assemble them back in the userland.
----------------

> If the option --noscan + --device are meant to work together that only
> the given devices are either scanned or not (this is the part I don't
> see clearly),

Only the specified devices are scanned and used for assembling the
volume if their metadata_uuids match and devids line up.

> then I'd suggest to let --noscan take a value of device
> that won't be scanned.

Oh. Do you mean something like, for example:

  (not related to any of the test case):

     btrfstune -m --noscan=/dev/sda --noscan=/dev/sdb  /dev/sdc

I'm okay, but I think we need to rename 'noscan' to something else?

> 
> Eventually there could be a special value --noscan=all to not scan all
> devices.

--noscan and --device options are also present in the
"btrfs inspect dump-tree" sub-command.
I hope its different usage won't confuse the users.

> Also please drop the syntax where the devices are separated by
> ",", one option per device should work for everybody and you can avoid
> parsting the option value.

I kept it the same as in the mount -o device option;
I'm ok dropping support for ','.

Thanks, Anand