mbox series

[0/6] btrfs: metadata uuid fixes and enhancements

Message ID 20191212110132.11063-1-Damenly_Su@gmx.com (mailing list archive)
Headers show
Series btrfs: metadata uuid fixes and enhancements | expand

Message

Su Yue Dec. 12, 2019, 11:01 a.m. UTC
From: Su Yue <Damenly_Su@gmx.com>

This patchset fixes one reproducible bug and add two split-brain
cases ignored.

The origin code thinks the final state of successful synced device is
always has INCOMPAT_METADATA_UUID feature. However, a device without
the feature flag can be the one pull into disk. This is what handled
in the patchset. Test images are added in btrfs-progs part.

Patch[1] fixes a bug about wrong fsid copy.
Patch[2] is for the later patches.
Patch[3-5] add the forgotten cases.
Patch[6] just does simple code movement for grace.

The set passes xfstests-dev without regressions.

Su Yue (6):
  btrfs: metadata_uuid: fix failed assertion due to unsuccessful device
    scan
  btrfs: metadata_uuid: move split-brain handling from fs_id() to new
    function
  btrfs: split-brain case for scanned changing device with
    INCOMPAT_METADATA_UUID
  btrfs: split-brain case for scanned changed device without
    INCOMPAT_METADATA_UUID
  btrfs: copy fsid and metadata_uuid for pulled disk without
    INCOMPAT_METADATA_UUID
  btrfs: metadata_uuid: move partly logic into find_fsid_inprogress()

 fs/btrfs/volumes.c | 193 +++++++++++++++++++++++++++++----------------
 1 file changed, 125 insertions(+), 68 deletions(-)

Comments

Nikolay Borisov Dec. 13, 2019, 8:03 a.m. UTC | #1
On 12.12.19 г. 13:01 ч., damenly.su@gmail.com wrote:
> From: Su Yue <Damenly_Su@gmx.com>
> 
> This patchset fixes one reproducible bug and add two split-brain
> cases ignored.
> 
> The origin code thinks the final state of successful synced device is
> always has INCOMPAT_METADATA_UUID feature. However, a device without
> the feature flag can be the one pull into disk. This is what handled
> in the patchset. Test images are added in btrfs-progs part.
> 
> Patch[1] fixes a bug about wrong fsid copy.
> Patch[2] is for the later patches.
> Patch[3-5] add the forgotten cases.
> Patch[6] just does simple code movement for grace.
> 
> The set passes xfstests-dev without regressions.
> 
> Su Yue (6):
>   btrfs: metadata_uuid: fix failed assertion due to unsuccessful device
>     scan
>   btrfs: metadata_uuid: move split-brain handling from fs_id() to new
>     function
>   btrfs: split-brain case for scanned changing device with
>     INCOMPAT_METADATA_UUID
>   btrfs: split-brain case for scanned changed device without
>     INCOMPAT_METADATA_UUID
>   btrfs: copy fsid and metadata_uuid for pulled disk without
>     INCOMPAT_METADATA_UUID
>   btrfs: metadata_uuid: move partly logic into find_fsid_inprogress()
> 
>  fs/btrfs/volumes.c | 193 +++++++++++++++++++++++++++++----------------
>  1 file changed, 125 insertions(+), 68 deletions(-)
> 


I'm currently on holiday but the fsid change feature has a design
document here:
https://github.com/btrfs/btrfs-dev-docs/blob/master/fsid-change.txt

it lists all the cases I have handled. If you think there are other
please first describe them in prose following the parlance set out in
the document to ease reasoning.
Su Yue Dec. 16, 2019, 12:49 a.m. UTC | #2
On 2019/12/13 4:03 PM, Nikolay Borisov wrote:
>
>
> On 12.12.19 г. 13:01 ч., damenly.su@gmail.com wrote:
>> From: Su Yue <Damenly_Su@gmx.com>
>>
>> This patchset fixes one reproducible bug and add two split-brain
>> cases ignored.
>>
>> The origin code thinks the final state of successful synced device is
>> always has INCOMPAT_METADATA_UUID feature. However, a device without
>> the feature flag can be the one pull into disk. This is what handled
>> in the patchset. Test images are added in btrfs-progs part.
>>
>> Patch[1] fixes a bug about wrong fsid copy.
>> Patch[2] is for the later patches.
>> Patch[3-5] add the forgotten cases.
>> Patch[6] just does simple code movement for grace.
>>
>> The set passes xfstests-dev without regressions.
>>
>> Su Yue (6):
>>    btrfs: metadata_uuid: fix failed assertion due to unsuccessful device
>>      scan
>>    btrfs: metadata_uuid: move split-brain handling from fs_id() to new
>>      function
>>    btrfs: split-brain case for scanned changing device with
>>      INCOMPAT_METADATA_UUID
>>    btrfs: split-brain case for scanned changed device without
>>      INCOMPAT_METADATA_UUID
>>    btrfs: copy fsid and metadata_uuid for pulled disk without
>>      INCOMPAT_METADATA_UUID
>>    btrfs: metadata_uuid: move partly logic into find_fsid_inprogress()
>>
>>   fs/btrfs/volumes.c | 193 +++++++++++++++++++++++++++++----------------
>>   1 file changed, 125 insertions(+), 68 deletions(-)
>>
>
>
> I'm currently on holiday but the fsid change feature has a design
> document here:
> https://github.com/btrfs/btrfs-dev-docs/blob/master/fsid-change.txt
>
> it lists all the cases I have handled. If you think there are other
> please first describe them in prose following the parlance set out in
> the document to ease reasoning.
>

Thanks. I am going to do it.
The following is just a rough version for people to understand.
The pull request version will be more official like expressions in
btrfs-dev-docs.


 > 4. Failure during transaction x + 1. When such a failure happens the
 > filesystem in question will be partitioned in two sets P and Q. Where P
 > would have the C flag and a new value for fsid written to it as well
as the
 > old FSID value written in the ‘metadata_uuid’ field. In contrast Q would
 > have just the old fsid and the IP flag, also the superblock’s generation
 > number of disks in P will be higher than those in Q. Again two cases
needs
 > to be handled:

I won't argue the Q has the old fsid and IP flag. There is another state
of P.


0) There are two devices with fsid A, without metadata_uuid. E.g
        d1[A, 0, 0]
        d2[A, 0, 0]
        (The first element is the FSID, the 2nd is METADATA_UUID, the 3rd is
         the incompat flag bits)

1) After running "btrfstune -M B":
        d1[B, A, METADATA_UUID]
        d2[B, A, METADATA_UUID]

2) During "btrfstune -M a",
2.1) After first btrfs_commit_transaction() of set_metadata_uuid() finished:

        d1[B, A, METADATA_UUID | FSID_CHANGING_V2]
        d2[B, A, METADATA_UUID | FSID_CHANGING_V2]

2.2) Then run into the branch  btrfstune.c: line 141.

	    if (new_uuid && uuid_changed && memcmp(disk_super->metadata_uuid,
                                                new_fsid,
BTRFS_FSID_SIZE) == 0) {
                 /*
                  * Changing fsid to be the same as metadata uuid, so just
                  * disable the flag
                  */
                 memcpy(disk_super->fsid, &new_fsid, BTRFS_FSID_SIZE);
                 incompat_flags &= ~BTRFS_FEATURE_INCOMPAT_METADATA_UUID;
                 btrfs_set_super_incompat_flags(disk_super, incompat_flags);
                 memset(disk_super->metadata_uuid, 0, BTRFS_FSID_SIZE);

      Now @disk_super is [A, 0, FSID_CHANGING_V2], without METADATA_UUID
flag.

2.3) Then we go to the final btrfs_commit_transaction() of
set_metadata_uuid().
      But powerloss happened, and only d1 get synced:

       d1[A, 0, 0]                                 ---> P
       d2[B, A, METADATA_UUID | FSID_CHANGING_V2]  ---> Q



      So there are 2 cases you forgot to consider.
      - P is scanned first, then Q.
      - Q is scanned first, then P.