Message ID | a6612cd9432b8ae6429cceee561c0259232cc554.1654602414.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] btrfs: reject log replay if there is unsupported RO flag | expand |
On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote: > [BUG] > If we have a btrfs image with dirty log, along with an unsupported RO > compatible flag: > > log_root 30474240 > ... > compat_flags 0x0 > compat_ro_flags 0x40000003 > ( FREE_SPACE_TREE | > FREE_SPACE_TREE_VALID | > unknown flag: 0x40000000 ) > > Then even if we can only mount it RO, we will still cause metadata > update for log replay: > > BTRFS info (device dm-1): flagging fs with big metadata feature > BTRFS info (device dm-1): using free space tree > BTRFS info (device dm-1): has skinny extents > BTRFS info (device dm-1): start tree-log replay > > This is definitely against RO compact flag requirement. > > [CAUSE] > RO compact flag only forces us to do RO mount, but we will still do log > replay for plain RO mount. > > Thus this will result us to do log replay and update metadata. > > This can be very problematic for new RO compat flag, for example older > kernel can not understand v2 cache, and if we allow metadata update on > RO mount and invalidate/corrupt v2 cache. > > [FIX] > Just reject the mount unless rescue=nologreplay is provided: > > BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead > > We don't want to set rescue=nologreply directly, as this would make the > end user to read the old data, and cause confusion. > > Since the such case is really rare, we're mostly fine to just reject the > mount with an error message, which also includes the proper workaround. > > Cc: stable@vger.kernel.org #4.9+ > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > Changelog: > v2: > - Reject the mount instead of setting nologreplay > To avoid the confusion which can return old data. > Unfortunately I don't have a better to shrink the new error message > into one 80-char line. > --- > fs/btrfs/disk-io.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index fe309db9f5ff..f20bd8024334 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3655,6 +3655,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device > err = -EINVAL; > goto fail_alloc; > } > + /* > + * We have unsupported RO compat features, although RO mounted, we > + * should not cause any metadata write, including log replay. > + * Or we can screw up whatever the new feature requires. > + */ > + if (unlikely(features && btrfs_super_log_root(disk_super) && > + !btrfs_test_opt(fs_info, NOLOGREPLAY))) { > + btrfs_err(fs_info, > +"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead", unsupport -> unsupported The "optional" sounds superfluous. You are CC'ing stable 4.9+, but IIRC the rescue= mount option is relatively recent, so the message will need to be updated (4.9, at least, doesn't have it). Other than that it looks fine to me. It's clear that it's a problem with the free space tree, but with verity (the only other RO compat feature), I'm not sure we need this constraint, and it may happen more often since verity support is recent, unlike the free space tree which has been around for years. Thanks. > + features); > + err = -EINVAL; > + goto fail_alloc; > + } > + > > if (sectorsize < PAGE_SIZE) { > struct btrfs_subpage_info *subpage_info; > -- > 2.36.1 >
On 2022/6/7 20:35, Filipe Manana wrote: > On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote: >> [BUG] >> If we have a btrfs image with dirty log, along with an unsupported RO >> compatible flag: >> >> log_root 30474240 >> ... >> compat_flags 0x0 >> compat_ro_flags 0x40000003 >> ( FREE_SPACE_TREE | >> FREE_SPACE_TREE_VALID | >> unknown flag: 0x40000000 ) >> >> Then even if we can only mount it RO, we will still cause metadata >> update for log replay: >> >> BTRFS info (device dm-1): flagging fs with big metadata feature >> BTRFS info (device dm-1): using free space tree >> BTRFS info (device dm-1): has skinny extents >> BTRFS info (device dm-1): start tree-log replay >> >> This is definitely against RO compact flag requirement. >> >> [CAUSE] >> RO compact flag only forces us to do RO mount, but we will still do log >> replay for plain RO mount. >> >> Thus this will result us to do log replay and update metadata. >> >> This can be very problematic for new RO compat flag, for example older >> kernel can not understand v2 cache, and if we allow metadata update on >> RO mount and invalidate/corrupt v2 cache. >> >> [FIX] >> Just reject the mount unless rescue=nologreplay is provided: >> >> BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead >> >> We don't want to set rescue=nologreply directly, as this would make the >> end user to read the old data, and cause confusion. >> >> Since the such case is really rare, we're mostly fine to just reject the >> mount with an error message, which also includes the proper workaround. >> >> Cc: stable@vger.kernel.org #4.9+ >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> Changelog: >> v2: >> - Reject the mount instead of setting nologreplay >> To avoid the confusion which can return old data. >> Unfortunately I don't have a better to shrink the new error message >> into one 80-char line. >> --- >> fs/btrfs/disk-io.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >> index fe309db9f5ff..f20bd8024334 100644 >> --- a/fs/btrfs/disk-io.c >> +++ b/fs/btrfs/disk-io.c >> @@ -3655,6 +3655,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device >> err = -EINVAL; >> goto fail_alloc; >> } >> + /* >> + * We have unsupported RO compat features, although RO mounted, we >> + * should not cause any metadata write, including log replay. >> + * Or we can screw up whatever the new feature requires. >> + */ >> + if (unlikely(features && btrfs_super_log_root(disk_super) && >> + !btrfs_test_opt(fs_info, NOLOGREPLAY))) { >> + btrfs_err(fs_info, >> +"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead", > > unsupport -> unsupported > > The "optional" sounds superfluous. > > You are CC'ing stable 4.9+, but IIRC the rescue= mount option is relatively > recent, so the message will need to be updated (4.9, at least, doesn't have it). Yep, thus when this get merged, I'll manually backport the patch to older kernels. Thanks, Qu > > Other than that it looks fine to me. > > It's clear that it's a problem with the free space tree, but with verity (the only > other RO compat feature), I'm not sure we need this constraint, and it may happen > more often since verity support is recent, unlike the free space tree which has > been around for years. > > Thanks. > >> + features); >> + err = -EINVAL; >> + goto fail_alloc; >> + } >> + >> >> if (sectorsize < PAGE_SIZE) { >> struct btrfs_subpage_info *subpage_info; >> -- >> 2.36.1 >>
On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote: > [BUG] > If we have a btrfs image with dirty log, along with an unsupported RO > compatible flag: > > log_root 30474240 > ... > compat_flags 0x0 > compat_ro_flags 0x40000003 > ( FREE_SPACE_TREE | > FREE_SPACE_TREE_VALID | > unknown flag: 0x40000000 ) > > Then even if we can only mount it RO, we will still cause metadata > update for log replay: > > BTRFS info (device dm-1): flagging fs with big metadata feature > BTRFS info (device dm-1): using free space tree > BTRFS info (device dm-1): has skinny extents > BTRFS info (device dm-1): start tree-log replay > > This is definitely against RO compact flag requirement. > > [CAUSE] > RO compact flag only forces us to do RO mount, but we will still do log > replay for plain RO mount. > > Thus this will result us to do log replay and update metadata. > > This can be very problematic for new RO compat flag, for example older > kernel can not understand v2 cache, and if we allow metadata update on > RO mount and invalidate/corrupt v2 cache. > > [FIX] > Just reject the mount unless rescue=nologreplay is provided: I agree that it's better to fail the mount and keep the log. The way out of that is to use newer kernel that supports it or explicitly clear the log. > BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead > > We don't want to set rescue=nologreply directly, as this would make the > end user to read the old data, and cause confusion. > > Since the such case is really rare, we're mostly fine to just reject the > mount with an error message, which also includes the proper workaround. I think this is a use case for 'btrfs rescue zero-log' that is not caused by a bug in the tree log code (ie. the original purpose for the tool), so this should be also documented.
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index fe309db9f5ff..f20bd8024334 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3655,6 +3655,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device err = -EINVAL; goto fail_alloc; } + /* + * We have unsupported RO compat features, although RO mounted, we + * should not cause any metadata write, including log replay. + * Or we can screw up whatever the new feature requires. + */ + if (unlikely(features && btrfs_super_log_root(disk_super) && + !btrfs_test_opt(fs_info, NOLOGREPLAY))) { + btrfs_err(fs_info, +"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead", + features); + err = -EINVAL; + goto fail_alloc; + } + if (sectorsize < PAGE_SIZE) { struct btrfs_subpage_info *subpage_info;
[BUG] If we have a btrfs image with dirty log, along with an unsupported RO compatible flag: log_root 30474240 ... compat_flags 0x0 compat_ro_flags 0x40000003 ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID | unknown flag: 0x40000000 ) Then even if we can only mount it RO, we will still cause metadata update for log replay: BTRFS info (device dm-1): flagging fs with big metadata feature BTRFS info (device dm-1): using free space tree BTRFS info (device dm-1): has skinny extents BTRFS info (device dm-1): start tree-log replay This is definitely against RO compact flag requirement. [CAUSE] RO compact flag only forces us to do RO mount, but we will still do log replay for plain RO mount. Thus this will result us to do log replay and update metadata. This can be very problematic for new RO compat flag, for example older kernel can not understand v2 cache, and if we allow metadata update on RO mount and invalidate/corrupt v2 cache. [FIX] Just reject the mount unless rescue=nologreplay is provided: BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead We don't want to set rescue=nologreply directly, as this would make the end user to read the old data, and cause confusion. Since the such case is really rare, we're mostly fine to just reject the mount with an error message, which also includes the proper workaround. Cc: stable@vger.kernel.org #4.9+ Signed-off-by: Qu Wenruo <wqu@suse.com> --- Changelog: v2: - Reject the mount instead of setting nologreplay To avoid the confusion which can return old data. Unfortunately I don't have a better to shrink the new error message into one 80-char line. --- fs/btrfs/disk-io.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)