diff mbox series

btrfs: properly reject clear_cache and v1 cache for block-group-tree

Message ID 832315ce8d970a393a2948e4cc21690a1d9e1cac.1682559926.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: properly reject clear_cache and v1 cache for block-group-tree | expand

Commit Message

Qu Wenruo April 27, 2023, 1:45 a.m. UTC
[BUG]
With block-group-tree feature enabled, mounting it with clear_cache
would cause the following transaction abort at mount or remount:

 BTRFS info (device dm-4): force clearing of disk cache
 BTRFS info (device dm-4): using free space tree
 BTRFS info (device dm-4): auto enabling async discard
 BTRFS info (device dm-4): clearing free space tree
 BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
 BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
 BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes
 BTRFS error (device dm-4): super block corruption detected before writing it to disk
 BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
 BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.

[CAUSE]
For block-group-tree feature, we have an artificial dependency on
free-space-tree.

This means if we detects block-group-tree without v2 cache, we consider
it a corruption and cause the problem.

For clear_cache mount option, it would temporary disable v2 cache, then
re-enable it.

But unfortunately for that temporary v2 cache disabled status, we refuse
to write a superblock with bg tree only flag, thus leads to the above
transaction abortion.

[FIX]
For now, just reject clear_cache and v1 cache mount option for block
group tree.
So now we got a graceful rejection other than a transaction abort:

 BTRFS info (device dm-4): force clearing of disk cache
 BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature
 BTRFS error (device dm-4): open_ctree failed

Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
For the proper fix, we need to change the behavior of clear_cache and v1
cache switch.

For pure clear_cache without switch cache version, we should allow
rebuilding v2 cache without fully disable v2 cache.
---
 fs/btrfs/super.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

David Sterba April 27, 2023, 11:32 p.m. UTC | #1
On Thu, Apr 27, 2023 at 09:45:32AM +0800, Qu Wenruo wrote:
> [BUG]
> With block-group-tree feature enabled, mounting it with clear_cache
> would cause the following transaction abort at mount or remount:
> 
>  BTRFS info (device dm-4): force clearing of disk cache
>  BTRFS info (device dm-4): using free space tree
>  BTRFS info (device dm-4): auto enabling async discard
>  BTRFS info (device dm-4): clearing free space tree
>  BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
>  BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
>  BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes
>  BTRFS error (device dm-4): super block corruption detected before writing it to disk
>  BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
>  BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
> 
> [CAUSE]
> For block-group-tree feature, we have an artificial dependency on
> free-space-tree.
> 
> This means if we detects block-group-tree without v2 cache, we consider
> it a corruption and cause the problem.
> 
> For clear_cache mount option, it would temporary disable v2 cache, then
> re-enable it.
> 
> But unfortunately for that temporary v2 cache disabled status, we refuse
> to write a superblock with bg tree only flag, thus leads to the above
> transaction abortion.
> 
> [FIX]
> For now, just reject clear_cache and v1 cache mount option for block
> group tree.
> So now we got a graceful rejection other than a transaction abort:
> 
>  BTRFS info (device dm-4): force clearing of disk cache
>  BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature
>  BTRFS error (device dm-4): open_ctree failed
> 
> Cc: stable@vger.kernel.org # 6.1+
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> For the proper fix, we need to change the behavior of clear_cache and v1
> cache switch.
> 
> For pure clear_cache without switch cache version, we should allow
> rebuilding v2 cache without fully disable v2 cache.

There was an issue to clarify docs about the space caches and it's a
mess, once we had v2 the number of states has increased and it's
becoming user unfriendly. At least we have the block-group-tree and
free-space-tree tied together and this should be the focus of the
feature compatibility.

I think it should be acceptable to slightly change the behaviour for
some obscure combination v1/v2/clear/bgt and either reject some of them
or suggest more than one step how to do it. E.g. first fully convert
from v1 to v2 and then to bgt, or allow v2/bgt/clear in one go.

Added to misc-next, thanks.
Qu Wenruo April 27, 2023, 11:37 p.m. UTC | #2
On 2023/4/28 07:32, David Sterba wrote:
> On Thu, Apr 27, 2023 at 09:45:32AM +0800, Qu Wenruo wrote:
>> [BUG]
>> With block-group-tree feature enabled, mounting it with clear_cache
>> would cause the following transaction abort at mount or remount:
>>
>>   BTRFS info (device dm-4): force clearing of disk cache
>>   BTRFS info (device dm-4): using free space tree
>>   BTRFS info (device dm-4): auto enabling async discard
>>   BTRFS info (device dm-4): clearing free space tree
>>   BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
>>   BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
>>   BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes
>>   BTRFS error (device dm-4): super block corruption detected before writing it to disk
>>   BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
>>   BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
>>
>> [CAUSE]
>> For block-group-tree feature, we have an artificial dependency on
>> free-space-tree.
>>
>> This means if we detects block-group-tree without v2 cache, we consider
>> it a corruption and cause the problem.
>>
>> For clear_cache mount option, it would temporary disable v2 cache, then
>> re-enable it.
>>
>> But unfortunately for that temporary v2 cache disabled status, we refuse
>> to write a superblock with bg tree only flag, thus leads to the above
>> transaction abortion.
>>
>> [FIX]
>> For now, just reject clear_cache and v1 cache mount option for block
>> group tree.
>> So now we got a graceful rejection other than a transaction abort:
>>
>>   BTRFS info (device dm-4): force clearing of disk cache
>>   BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature
>>   BTRFS error (device dm-4): open_ctree failed
>>
>> Cc: stable@vger.kernel.org # 6.1+
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> For the proper fix, we need to change the behavior of clear_cache and v1
>> cache switch.
>>
>> For pure clear_cache without switch cache version, we should allow
>> rebuilding v2 cache without fully disable v2 cache.
> 
> There was an issue to clarify docs about the space caches and it's a
> mess, once we had v2 the number of states has increased and it's
> becoming user unfriendly. At least we have the block-group-tree and
> free-space-tree tied together and this should be the focus of the
> feature compatibility.
> 
> I think it should be acceptable to slightly change the behaviour for
> some obscure combination v1/v2/clear/bgt and either reject some of them
> or suggest more than one step how to do it. E.g. first fully convert
> from v1 to v2 and then to bgt, or allow v2/bgt/clear in one go.

On the clear cache behavior, I'm working on making it independent from 
the current cache version.
E.g. on v2 cache, clear_cache would just clear the cache, without 
(temporarily) falling back to v1.

This should slightly reduce the complexity for the free space cache matrix.

To me, the biggest inconsistency comes from the fact that free space 
cache version can be too easily modified just by a mount option.

While compat_ro flags normally should be enabled/disabled by btrfstune 
tools.

Thanks,
Qu

> 
> Added to misc-next, thanks.
Christoph Anton Mitterer April 27, 2023, 11:41 p.m. UTC | #3
On Fri, 2023-04-28 at 07:37 +0800, Qu Wenruo wrote:
> On the clear cache behavior, I'm working on making it independent
> from 
> the current cache version.
> E.g. on v2 cache, clear_cache would just clear the cache, without 
> (temporarily) falling back to v1.

If you make any such changes, please be sure to also adapt the docs...
or tell me how it (then) works and I may help you out with some doc
PRs.

Cheers,
Chris.
Qu Wenruo April 28, 2023, 12:12 a.m. UTC | #4
On 2023/4/28 07:41, Christoph Anton Mitterer wrote:
> On Fri, 2023-04-28 at 07:37 +0800, Qu Wenruo wrote:
>> On the clear cache behavior, I'm working on making it independent
>> from
>> the current cache version.
>> E.g. on v2 cache, clear_cache would just clear the cache, without
>> (temporarily) falling back to v1.
> 
> If you make any such changes, please be sure to also adapt the docs...
> or tell me how it (then) works and I may help you out with some doc
> PRs.

I'm not yet determined what's the proper way to go.

If I can finish the clear_cache behavior in small enough patches and 
backport for v6.1+, I believe there would be no extra docs needed, 
except mentioning bgt needs no-holes and free-space-tree.

Every unsupported combination would got proper dmesg prompt.

Thanks,
Qu

> 
> Cheers,
> Chris.
diff mbox series

Patch

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 581845bc206a..eefae0318d4f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -826,7 +826,12 @@  int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	    !btrfs_test_opt(info, CLEAR_CACHE)) {
 		btrfs_err(info, "cannot disable free space tree");
 		ret = -EINVAL;
-
+	}
+	if (btrfs_fs_compat_ro(info, BLOCK_GROUP_TREE) &&
+	    (btrfs_test_opt(info, CLEAR_CACHE) ||
+	     !btrfs_test_opt(info, FREE_SPACE_TREE))) {
+		btrfs_err(info, "cannot disable free space tree with block-group-tree feature");
+		ret = -EINVAL;
 	}
 	if (!ret)
 		ret = btrfs_check_mountopts_zoned(info);