diff mbox series

[v2] btrfs: reject log replay if there is unsupported RO flag

Message ID a6612cd9432b8ae6429cceee561c0259232cc554.1654602414.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series [v2] btrfs: reject log replay if there is unsupported RO flag | expand

Commit Message

Qu Wenruo June 7, 2022, 11:48 a.m. UTC
[BUG]
If we have a btrfs image with dirty log, along with an unsupported RO
compatible flag:

log_root		30474240
...
compat_flags		0x0
compat_ro_flags		0x40000003
			( FREE_SPACE_TREE |
			  FREE_SPACE_TREE_VALID |
			  unknown flag: 0x40000000 )

Then even if we can only mount it RO, we will still cause metadata
update for log replay:

 BTRFS info (device dm-1): flagging fs with big metadata feature
 BTRFS info (device dm-1): using free space tree
 BTRFS info (device dm-1): has skinny extents
 BTRFS info (device dm-1): start tree-log replay

This is definitely against RO compact flag requirement.

[CAUSE]
RO compact flag only forces us to do RO mount, but we will still do log
replay for plain RO mount.

Thus this will result us to do log replay and update metadata.

This can be very problematic for new RO compat flag, for example older
kernel can not understand v2 cache, and if we allow metadata update on
RO mount and invalidate/corrupt v2 cache.

[FIX]
Just reject the mount unless rescue=nologreplay is provided:

  BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead

We don't want to set rescue=nologreply directly, as this would make the
end user to read the old data, and cause confusion.

Since the such case is really rare, we're mostly fine to just reject the
mount with an error message, which also includes the proper workaround.

Cc: stable@vger.kernel.org #4.9+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Reject the mount instead of setting nologreplay
  To avoid the confusion which can return old data.
  Unfortunately I don't have a better to shrink the new error message
  into one 80-char line.
---
 fs/btrfs/disk-io.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Filipe Manana June 7, 2022, 12:35 p.m. UTC | #1
On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote:
> [BUG]
> If we have a btrfs image with dirty log, along with an unsupported RO
> compatible flag:
> 
> log_root		30474240
> ...
> compat_flags		0x0
> compat_ro_flags		0x40000003
> 			( FREE_SPACE_TREE |
> 			  FREE_SPACE_TREE_VALID |
> 			  unknown flag: 0x40000000 )
> 
> Then even if we can only mount it RO, we will still cause metadata
> update for log replay:
> 
>  BTRFS info (device dm-1): flagging fs with big metadata feature
>  BTRFS info (device dm-1): using free space tree
>  BTRFS info (device dm-1): has skinny extents
>  BTRFS info (device dm-1): start tree-log replay
> 
> This is definitely against RO compact flag requirement.
> 
> [CAUSE]
> RO compact flag only forces us to do RO mount, but we will still do log
> replay for plain RO mount.
> 
> Thus this will result us to do log replay and update metadata.
> 
> This can be very problematic for new RO compat flag, for example older
> kernel can not understand v2 cache, and if we allow metadata update on
> RO mount and invalidate/corrupt v2 cache.
> 
> [FIX]
> Just reject the mount unless rescue=nologreplay is provided:
> 
>   BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead
> 
> We don't want to set rescue=nologreply directly, as this would make the
> end user to read the old data, and cause confusion.
> 
> Since the such case is really rare, we're mostly fine to just reject the
> mount with an error message, which also includes the proper workaround.
> 
> Cc: stable@vger.kernel.org #4.9+
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Reject the mount instead of setting nologreplay
>   To avoid the confusion which can return old data.
>   Unfortunately I don't have a better to shrink the new error message
>   into one 80-char line.
> ---
>  fs/btrfs/disk-io.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index fe309db9f5ff..f20bd8024334 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3655,6 +3655,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>  		err = -EINVAL;
>  		goto fail_alloc;
>  	}
> +	/*
> +	 * We have unsupported RO compat features, although RO mounted, we
> +	 * should not cause any metadata write, including log replay.
> +	 * Or we can screw up whatever the new feature requires.
> +	 */
> +	if (unlikely(features && btrfs_super_log_root(disk_super) &&
> +		     !btrfs_test_opt(fs_info, NOLOGREPLAY))) {
> +		btrfs_err(fs_info,
> +"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead",

unsupport -> unsupported

The "optional" sounds superfluous.

You are CC'ing stable 4.9+, but IIRC the rescue= mount option is relatively
recent, so the message will need to be updated (4.9, at least, doesn't have it).

Other than that it looks fine to me.

It's clear that it's a problem with the free space tree, but with verity (the only
other RO compat feature), I'm not sure we need this constraint, and it may happen
more often since verity support is recent, unlike the free space tree which has
been around for years.

Thanks.

> +			  features);
> +		err = -EINVAL;
> +		goto fail_alloc;
> +	}
> +
>  
>  	if (sectorsize < PAGE_SIZE) {
>  		struct btrfs_subpage_info *subpage_info;
> -- 
> 2.36.1
>
Qu Wenruo June 7, 2022, 12:51 p.m. UTC | #2
On 2022/6/7 20:35, Filipe Manana wrote:
> On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote:
>> [BUG]
>> If we have a btrfs image with dirty log, along with an unsupported RO
>> compatible flag:
>>
>> log_root		30474240
>> ...
>> compat_flags		0x0
>> compat_ro_flags		0x40000003
>> 			( FREE_SPACE_TREE |
>> 			  FREE_SPACE_TREE_VALID |
>> 			  unknown flag: 0x40000000 )
>>
>> Then even if we can only mount it RO, we will still cause metadata
>> update for log replay:
>>
>>   BTRFS info (device dm-1): flagging fs with big metadata feature
>>   BTRFS info (device dm-1): using free space tree
>>   BTRFS info (device dm-1): has skinny extents
>>   BTRFS info (device dm-1): start tree-log replay
>>
>> This is definitely against RO compact flag requirement.
>>
>> [CAUSE]
>> RO compact flag only forces us to do RO mount, but we will still do log
>> replay for plain RO mount.
>>
>> Thus this will result us to do log replay and update metadata.
>>
>> This can be very problematic for new RO compat flag, for example older
>> kernel can not understand v2 cache, and if we allow metadata update on
>> RO mount and invalidate/corrupt v2 cache.
>>
>> [FIX]
>> Just reject the mount unless rescue=nologreplay is provided:
>>
>>    BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead
>>
>> We don't want to set rescue=nologreply directly, as this would make the
>> end user to read the old data, and cause confusion.
>>
>> Since the such case is really rare, we're mostly fine to just reject the
>> mount with an error message, which also includes the proper workaround.
>>
>> Cc: stable@vger.kernel.org #4.9+
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Reject the mount instead of setting nologreplay
>>    To avoid the confusion which can return old data.
>>    Unfortunately I don't have a better to shrink the new error message
>>    into one 80-char line.
>> ---
>>   fs/btrfs/disk-io.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index fe309db9f5ff..f20bd8024334 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -3655,6 +3655,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>   		err = -EINVAL;
>>   		goto fail_alloc;
>>   	}
>> +	/*
>> +	 * We have unsupported RO compat features, although RO mounted, we
>> +	 * should not cause any metadata write, including log replay.
>> +	 * Or we can screw up whatever the new feature requires.
>> +	 */
>> +	if (unlikely(features && btrfs_super_log_root(disk_super) &&
>> +		     !btrfs_test_opt(fs_info, NOLOGREPLAY))) {
>> +		btrfs_err(fs_info,
>> +"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead",
>
> unsupport -> unsupported
>
> The "optional" sounds superfluous.
>
> You are CC'ing stable 4.9+, but IIRC the rescue= mount option is relatively
> recent, so the message will need to be updated (4.9, at least, doesn't have it).

Yep, thus when this get merged, I'll manually backport the patch to
older kernels.

Thanks,
Qu
>
> Other than that it looks fine to me.
>
> It's clear that it's a problem with the free space tree, but with verity (the only
> other RO compat feature), I'm not sure we need this constraint, and it may happen
> more often since verity support is recent, unlike the free space tree which has
> been around for years.
>
> Thanks.
>
>> +			  features);
>> +		err = -EINVAL;
>> +		goto fail_alloc;
>> +	}
>> +
>>
>>   	if (sectorsize < PAGE_SIZE) {
>>   		struct btrfs_subpage_info *subpage_info;
>> --
>> 2.36.1
>>
David Sterba June 7, 2022, 3:50 p.m. UTC | #3
On Tue, Jun 07, 2022 at 07:48:24PM +0800, Qu Wenruo wrote:
> [BUG]
> If we have a btrfs image with dirty log, along with an unsupported RO
> compatible flag:
> 
> log_root		30474240
> ...
> compat_flags		0x0
> compat_ro_flags		0x40000003
> 			( FREE_SPACE_TREE |
> 			  FREE_SPACE_TREE_VALID |
> 			  unknown flag: 0x40000000 )
> 
> Then even if we can only mount it RO, we will still cause metadata
> update for log replay:
> 
>  BTRFS info (device dm-1): flagging fs with big metadata feature
>  BTRFS info (device dm-1): using free space tree
>  BTRFS info (device dm-1): has skinny extents
>  BTRFS info (device dm-1): start tree-log replay
> 
> This is definitely against RO compact flag requirement.
> 
> [CAUSE]
> RO compact flag only forces us to do RO mount, but we will still do log
> replay for plain RO mount.
> 
> Thus this will result us to do log replay and update metadata.
> 
> This can be very problematic for new RO compat flag, for example older
> kernel can not understand v2 cache, and if we allow metadata update on
> RO mount and invalidate/corrupt v2 cache.
> 
> [FIX]
> Just reject the mount unless rescue=nologreplay is provided:

I agree that it's better to fail the mount and keep the log. The way out
of that is to use newer kernel that supports it or explicitly clear the
log.

>   BTRFS error (device dm-1): cannot replay dirty log with unsupport optional features (0x40000000), try rescue=nologreplay instead
> 
> We don't want to set rescue=nologreply directly, as this would make the
> end user to read the old data, and cause confusion.
> 
> Since the such case is really rare, we're mostly fine to just reject the
> mount with an error message, which also includes the proper workaround.

I think this is a use case for 'btrfs rescue zero-log' that is not
caused by a bug in the tree log code (ie. the original purpose for the
tool), so this should be also documented.
diff mbox series

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index fe309db9f5ff..f20bd8024334 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3655,6 +3655,20 @@  int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		err = -EINVAL;
 		goto fail_alloc;
 	}
+	/*
+	 * We have unsupported RO compat features, although RO mounted, we
+	 * should not cause any metadata write, including log replay.
+	 * Or we can screw up whatever the new feature requires.
+	 */
+	if (unlikely(features && btrfs_super_log_root(disk_super) &&
+		     !btrfs_test_opt(fs_info, NOLOGREPLAY))) {
+		btrfs_err(fs_info,
+"cannot replay dirty log with unsupport optional features (0x%llx), try rescue=nologreplay instead",
+			  features);
+		err = -EINVAL;
+		goto fail_alloc;
+	}
+
 
 	if (sectorsize < PAGE_SIZE) {
 		struct btrfs_subpage_info *subpage_info;