Message ID | 20200403134436.9095-1-nborisov@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RESEND] btrfs: Remove received information from snapshot on ro->rw switch | expand |
On 4/3/20 9:44 AM, Nikolay Borisov wrote: > Currently when a read-only snapshot is received and subsequently its > ro property is set to false i.e. switched to rw-mode the > received_uuid/stime/rtime/stransid/rtransid of that subvol remains > intact. However, once the received volume is switched to RW mode we > cannot guaranteee that it contains the same data, so it makes sense > to remove those fields which indicate this volume was ever > send/received. Additionally, sending such volume can cause conflicts > due to the presence of received_uuid. > > Signed-off-by: Nikolay Borisov <nborisov@suse.com> > Suggested-by: David Sterba <dsterba@suse.cz> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks, Josef
On Fri, Apr 3, 2020 at 2:45 PM Nikolay Borisov <nborisov@suse.com> wrote: > > Currently when a read-only snapshot is received and subsequently its > ro property is set to false i.e. switched to rw-mode the > received_uuid/stime/rtime/stransid/rtransid of that subvol remains > intact. However, once the received volume is switched to RW mode we > cannot guaranteee that it contains the same data, so it makes sense > to remove those fields which indicate this volume was ever > send/received. Additionally, sending such volume can cause conflicts > due to the presence of received_uuid. > > Signed-off-by: Nikolay Borisov <nborisov@suse.com> > Suggested-by: David Sterba <dsterba@suse.cz> > --- > > I've been carrying this patch in my tree for around 2.5 years. It stems from > multiple reports on the mailing list about people changing the RO->RW mode on > a received snapshot and getting unexpected behavior. AFAIR this patch resolved > that. I agree with it, and this sporadically causes problems that get reported on the list. I spent a bunch of time back and forth with a user that ran into a problem caused by switching snapshots from RO to RW. So I would like to add the following paragraphs and Link tag to the changelog: " Preserving the received_uuid (and related fields) after transitioning the snapshot from RO to RW and then changing the snapshot has a potential for causing send to fail in many unexpected ways if we later turn it back to RO and use it for an incremental send operation. A recent example, in the thread to which the Link tag below points to, we had a user with a filesystem that had multiple snapshots with the same received_uuid but with different content due to a transition from RO to RW and then back to RO. When an incremental send was attempted using two of those snapshots, it resulted in send emitting a clone operation that was intended to clone from the parent root to the send root - however because both roots had the same received_uuid, the receiver ended up picking the send root as the source root for the clone operation, which happened to result in the clone operation to fail with -EINVAL, due to the source and destination offsets being the same (and on the same root and file). In this particular case there was no harm, but we could end up in a case where the clone operation succeeds but clones wrong data due to picking up an incorrect root - as in the case of multiple snapshots with the same received_uuid, it has no way to know which one is the correct one if they have different content. Link: https://lore.kernel.org/linux-btrfs/CAOaVUnV3L6RpcqJ5gaqzNXWXK0VMkEVXCdihawH1PgS6TiMchQ@mail.gmail.com/ " A few comments below as well. > > > fs/btrfs/ioctl.c | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c > index 40b729dce91c..39840b654600 100644 > --- a/fs/btrfs/ioctl.c > +++ b/fs/btrfs/ioctl.c > @@ -1873,6 +1873,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, > struct btrfs_trans_handle *trans; > u64 root_flags; > u64 flags; > + bool clear_received_state = false; > int ret = 0; > > if (!inode_owner_or_capable(inode)) > @@ -1917,6 +1918,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, > btrfs_set_root_flags(&root->root_item, > root_flags & ~BTRFS_ROOT_SUBVOL_RDONLY); > spin_unlock(&root->root_item_lock); > + clear_received_state = true; > } else { > spin_unlock(&root->root_item_lock); > btrfs_warn(fs_info, > @@ -1933,6 +1935,31 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, > goto out_reset; > } > > + if (clear_received_state) { > + if (!btrfs_is_empty_uuid(root->root_item.received_uuid)) { This could be combined into a single if to reduce indentation: if (clear_received_state && !btrfs_is_empty_uuid(...)) Also, there's here indentation weirdness due to using spaces instead of tabs, and checkpatch complains about it: ERROR: code indent should use tabs where possible #80: FILE: fs/btrfs/ioctl.c:1996: +^I if (!btrfs_is_empty_uuid(root->root_item.received_uuid)) {$ ERROR: code indent should use tabs where possible #83: FILE: fs/btrfs/ioctl.c:1999: +^I ret = btrfs_uuid_tree_remove(trans,$ ERROR: code indent should use tabs where possible #84: FILE: fs/btrfs/ioctl.c:2000: +^I root->root_item.received_uuid,$ (...) > + struct btrfs_root_item *root_item = &root->root_item; > + > + ret = btrfs_uuid_tree_remove(trans, > + root->root_item.received_uuid, > + BTRFS_UUID_KEY_RECEIVED_SUBVOL, > + root->root_key.objectid); If we reach here, we should have reserved one additional metadata item when starting the transaction. And add a comment about what each item is used for - 1 for updating the root item in the root tree, another 1 for removing an item from the uuid tree. Also, we can use root_item there instead of root->root_item. Other than that, it looks fine to me. Thanks. > + > + if (ret && ret != -ENOENT) { > + btrfs_abort_transaction(trans, ret); > + btrfs_end_transaction(trans); > + goto out_reset; > + } > + > + memset(root_item->received_uuid, 0, BTRFS_UUID_SIZE); > + btrfs_set_root_stransid(root_item, 0); > + btrfs_set_root_rtransid(root_item, 0); > + btrfs_set_stack_timespec_sec(&root_item->stime, 0); > + btrfs_set_stack_timespec_nsec(&root_item->stime, 0); > + btrfs_set_stack_timespec_sec(&root_item->rtime, 0); > + btrfs_set_stack_timespec_nsec(&root_item->rtime, 0); > + } > + } > + > ret = btrfs_update_root(trans, fs_info->tree_root, > &root->root_key, &root->root_item); > if (ret < 0) { > -- > 2.17.1 >
On 08/09/2021 12:32, Filipe Manana wrote: > On Fri, Apr 3, 2020 at 2:45 PM Nikolay Borisov <nborisov@suse.com> wrote: >> >> Currently when a read-only snapshot is received and subsequently its >> ro property is set to false i.e. switched to rw-mode the >> received_uuid/stime/rtime/stransid/rtransid of that subvol remains >> intact. However, once the received volume is switched to RW mode we >> cannot guaranteee that it contains the same data, so it makes sense >> to remove those fields which indicate this volume was ever >> send/received. Additionally, sending such volume can cause conflicts >> due to the presence of received_uuid. >> >> Signed-off-by: Nikolay Borisov <nborisov@suse.com> >> Suggested-by: David Sterba <dsterba@suse.cz> >> --- >> >> I've been carrying this patch in my tree for around 2.5 years. It stems from >> multiple reports on the mailing list about people changing the RO->RW mode on >> a received snapshot and getting unexpected behavior. AFAIR this patch resolved >> that. > > I agree with it, and this sporadically causes problems that get > reported on the list. > I spent a bunch of time back and forth with a user that ran into a > problem caused by switching snapshots from RO to RW. > > So I would like to add the following paragraphs and Link tag to the changelog: > > " > Preserving the received_uuid (and related fields) after transitioning the > snapshot from RO to RW and then changing the snapshot has a potential for > causing send to fail in many unexpected ways if we later turn it back to > RO and use it for an incremental send operation. > > A recent example, in the thread to which the Link tag below points to, we > had a user with a filesystem that had multiple snapshots with the same > received_uuid but with different content due to a transition from RO to RW > and then back to RO. Should btrfs-check be enhanced to detect subvolumes with duplicated received_uuid? Presumably the repair would be to change all of them to new UUIDs (or maybe keep the oldest one unchanged?). Graham When an incremental send was attempted using two of > those snapshots, it resulted in send emitting a clone operation that was > intended to clone from the parent root to the send root - however because > both roots had the same received_uuid, the receiver ended up picking the > send root as the source root for the clone operation, which happened to > result in the clone operation to fail with -EINVAL, due to the source and > destination offsets being the same (and on the same root and file). In this > particular case there was no harm, but we could end up in a case where the > clone operation succeeds but clones wrong data due to picking up an > incorrect root - as in the case of multiple snapshots with the same > received_uuid, it has no way to know which one is the correct one if they > have different content. > > Link: https://lore.kernel.org/linux-btrfs/CAOaVUnV3L6RpcqJ5gaqzNXWXK0VMkEVXCdihawH1PgS6TiMchQ@mail.gmail.com/ > " > > > A few comments below as well. > >> >> >> fs/btrfs/ioctl.c | 27 +++++++++++++++++++++++++++ >> 1 file changed, 27 insertions(+) >> >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >> index 40b729dce91c..39840b654600 100644 >> --- a/fs/btrfs/ioctl.c >> +++ b/fs/btrfs/ioctl.c >> @@ -1873,6 +1873,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, >> struct btrfs_trans_handle *trans; >> u64 root_flags; >> u64 flags; >> + bool clear_received_state = false; >> int ret = 0; >> >> if (!inode_owner_or_capable(inode)) >> @@ -1917,6 +1918,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, >> btrfs_set_root_flags(&root->root_item, >> root_flags & ~BTRFS_ROOT_SUBVOL_RDONLY); >> spin_unlock(&root->root_item_lock); >> + clear_received_state = true; >> } else { >> spin_unlock(&root->root_item_lock); >> btrfs_warn(fs_info, >> @@ -1933,6 +1935,31 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, >> goto out_reset; >> } >> >> + if (clear_received_state) { >> + if (!btrfs_is_empty_uuid(root->root_item.received_uuid)) { > > This could be combined into a single if to reduce indentation: if > (clear_received_state && !btrfs_is_empty_uuid(...)) > > Also, there's here indentation weirdness due to using spaces instead > of tabs, and checkpatch complains about it: > > ERROR: code indent should use tabs where possible > #80: FILE: fs/btrfs/ioctl.c:1996: > +^I if (!btrfs_is_empty_uuid(root->root_item.received_uuid)) {$ > > ERROR: code indent should use tabs where possible > #83: FILE: fs/btrfs/ioctl.c:1999: > +^I ret = btrfs_uuid_tree_remove(trans,$ > > ERROR: code indent should use tabs where possible > #84: FILE: fs/btrfs/ioctl.c:2000: > +^I root->root_item.received_uuid,$ > (...) > >> + struct btrfs_root_item *root_item = &root->root_item; >> + >> + ret = btrfs_uuid_tree_remove(trans, >> + root->root_item.received_uuid, >> + BTRFS_UUID_KEY_RECEIVED_SUBVOL, >> + root->root_key.objectid); > > If we reach here, we should have reserved one additional metadata item > when starting the transaction. > And add a comment about what each item is used for - 1 for updating > the root item in the root tree, another 1 for removing an item from > the uuid tree. > > Also, we can use root_item there instead of root->root_item. > > Other than that, it looks fine to me. > > Thanks. > >> + >> + if (ret && ret != -ENOENT) { >> + btrfs_abort_transaction(trans, ret); >> + btrfs_end_transaction(trans); >> + goto out_reset; >> + } >> + >> + memset(root_item->received_uuid, 0, BTRFS_UUID_SIZE); >> + btrfs_set_root_stransid(root_item, 0); >> + btrfs_set_root_rtransid(root_item, 0); >> + btrfs_set_stack_timespec_sec(&root_item->stime, 0); >> + btrfs_set_stack_timespec_nsec(&root_item->stime, 0); >> + btrfs_set_stack_timespec_sec(&root_item->rtime, 0); >> + btrfs_set_stack_timespec_nsec(&root_item->rtime, 0); >> + } >> + } >> + >> ret = btrfs_update_root(trans, fs_info->tree_root, >> &root->root_key, &root->root_item); >> if (ret < 0) { >> -- >> 2.17.1 >> > >
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 40b729dce91c..39840b654600 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1873,6 +1873,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, struct btrfs_trans_handle *trans; u64 root_flags; u64 flags; + bool clear_received_state = false; int ret = 0; if (!inode_owner_or_capable(inode)) @@ -1917,6 +1918,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, btrfs_set_root_flags(&root->root_item, root_flags & ~BTRFS_ROOT_SUBVOL_RDONLY); spin_unlock(&root->root_item_lock); + clear_received_state = true; } else { spin_unlock(&root->root_item_lock); btrfs_warn(fs_info, @@ -1933,6 +1935,31 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file, goto out_reset; } + if (clear_received_state) { + if (!btrfs_is_empty_uuid(root->root_item.received_uuid)) { + struct btrfs_root_item *root_item = &root->root_item; + + ret = btrfs_uuid_tree_remove(trans, + root->root_item.received_uuid, + BTRFS_UUID_KEY_RECEIVED_SUBVOL, + root->root_key.objectid); + + if (ret && ret != -ENOENT) { + btrfs_abort_transaction(trans, ret); + btrfs_end_transaction(trans); + goto out_reset; + } + + memset(root_item->received_uuid, 0, BTRFS_UUID_SIZE); + btrfs_set_root_stransid(root_item, 0); + btrfs_set_root_rtransid(root_item, 0); + btrfs_set_stack_timespec_sec(&root_item->stime, 0); + btrfs_set_stack_timespec_nsec(&root_item->stime, 0); + btrfs_set_stack_timespec_sec(&root_item->rtime, 0); + btrfs_set_stack_timespec_nsec(&root_item->rtime, 0); + } + } + ret = btrfs_update_root(trans, fs_info->tree_root, &root->root_key, &root->root_item); if (ret < 0) {
Currently when a read-only snapshot is received and subsequently its ro property is set to false i.e. switched to rw-mode the received_uuid/stime/rtime/stransid/rtransid of that subvol remains intact. However, once the received volume is switched to RW mode we cannot guaranteee that it contains the same data, so it makes sense to remove those fields which indicate this volume was ever send/received. Additionally, sending such volume can cause conflicts due to the presence of received_uuid. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Suggested-by: David Sterba <dsterba@suse.cz> --- I've been carrying this patch in my tree for around 2.5 years. It stems from multiple reports on the mailing list about people changing the RO->RW mode on a received snapshot and getting unexpected behavior. AFAIR this patch resolved that. fs/btrfs/ioctl.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) -- 2.17.1