mbox series

[0/7] btrfs-progs: tune: add resume support for csum conversion

Message ID cover.1684913599.git.wqu@suse.com (mailing list archive)
Headers show
Series btrfs-progs: tune: add resume support for csum conversion | expand

Message

Qu Wenruo May 24, 2023, 7:41 a.m. UTC
[RESUME SUPPORT]
This patchset adds the resume support for "btrfstune --csum".

The main part is in resume for data csum change, as we have 4 different
status during data csum conversion.

Thankfully for data csum conversion, everything is protected by
metadata COW and we can detect the current status by the existence of
both old and new csum items, and their coverage.

For resume of metadata conversion, there is nothing we can really do but
go through all the tree blocks and verify them against both new and old
csum type.

[TESTING]
For the tests, currently errors are injected after every possible
transaction commit, and then resume from that interrupted status.
So far the patchset passes all above tests.

But I'm not sure what's the better way to craft the test case.

Should we go log-writes? Or use pre-built images?

[TODO]
- Test cases for resume

- Support for revert if errors are found
  If we hit data csum mismatch and can not repair from any copy, then
  we should revert back to the old csum.

- Support for pre-cautious metadata check
  We'd better read and very metadata before rewriting them.

- Performance tuning
  We want to take a balance between too frequent commit transaction
  and too large transaction.
  This is especially true for data csum objectid change and maybe
  old data csum deletion.

- UI enhancement
  A btrfs-convert like UI would be better.


Qu Wenruo (7):
  btrfs-progs: tune: implement resume support for metadata checksum
  btrfs-progs: tune: implement resume support for generating new data
    csum
  btrfs-progs: tune: implement resume support for csum tree without any
    new csum item
  btrfs-progs: tune: implement resume support for empty csum tree
  btrfs-progs: tune: implement resume support for half deleted old csums
  btrfs-progs: tune: implement resume support for data csum objectid
    change
  btrfs-progs: tune: reject csum change if the fs is already using the
    target csum type

 tune/change-csum.c | 461 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 418 insertions(+), 43 deletions(-)

Comments

Anand Jain May 24, 2023, 8:42 a.m. UTC | #1
Why not use the generation numbers "From" and "To" for converting
to the new checksum and keeping track of the generation number of
the most recent successful conversion?

Thanks, Anand


On 24/05/2023 15:41, Qu Wenruo wrote:
> [RESUME SUPPORT]
> This patchset adds the resume support for "btrfstune --csum".
> 
> The main part is in resume for data csum change, as we have 4 different
> status during data csum conversion.
> 
> Thankfully for data csum conversion, everything is protected by
> metadata COW and we can detect the current status by the existence of
> both old and new csum items, and their coverage.
> 
> For resume of metadata conversion, there is nothing we can really do but
> go through all the tree blocks and verify them against both new and old
> csum type.
> 
> [TESTING]
> For the tests, currently errors are injected after every possible
> transaction commit, and then resume from that interrupted status.
> So far the patchset passes all above tests.
> 
> But I'm not sure what's the better way to craft the test case.
> 
> Should we go log-writes? Or use pre-built images?
> 
> [TODO]
> - Test cases for resume
> 
> - Support for revert if errors are found
>    If we hit data csum mismatch and can not repair from any copy, then
>    we should revert back to the old csum.
> 
> - Support for pre-cautious metadata check
>    We'd better read and very metadata before rewriting them.
> 
> - Performance tuning
>    We want to take a balance between too frequent commit transaction
>    and too large transaction.
>    This is especially true for data csum objectid change and maybe
>    old data csum deletion.
> 
> - UI enhancement
>    A btrfs-convert like UI would be better.
> 
> 
> Qu Wenruo (7):
>    btrfs-progs: tune: implement resume support for metadata checksum
>    btrfs-progs: tune: implement resume support for generating new data
>      csum
>    btrfs-progs: tune: implement resume support for csum tree without any
>      new csum item
>    btrfs-progs: tune: implement resume support for empty csum tree
>    btrfs-progs: tune: implement resume support for half deleted old csums
>    btrfs-progs: tune: implement resume support for data csum objectid
>      change
>    btrfs-progs: tune: reject csum change if the fs is already using the
>      target csum type
> 
>   tune/change-csum.c | 461 ++++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 418 insertions(+), 43 deletions(-)
>
Qu Wenruo May 24, 2023, 8:48 a.m. UTC | #2
On 2023/5/24 16:42, Anand Jain wrote:
>
>
> Why not use the generation numbers "From" and "To" for converting
> to the new checksum and keeping track of the generation number of
> the most recent successful conversion?

Firstly, generation only makes sense for COW based operations (data csum
modification part), thus it doesn't work for metadata csum rewrite,
which happen in place.

Secondly, there is only one major generation number, and how do you
define "from" and "to"?

>
> Thanks, Anand
>
>
> On 24/05/2023 15:41, Qu Wenruo wrote:
>> [RESUME SUPPORT]
>> This patchset adds the resume support for "btrfstune --csum".
>>
>> The main part is in resume for data csum change, as we have 4 different
>> status during data csum conversion.
>>
>> Thankfully for data csum conversion, everything is protected by
>> metadata COW and we can detect the current status by the existence of
>> both old and new csum items, and their coverage.
>>
>> For resume of metadata conversion, there is nothing we can really do but
>> go through all the tree blocks and verify them against both new and old
>> csum type.
>>
>> [TESTING]
>> For the tests, currently errors are injected after every possible
>> transaction commit, and then resume from that interrupted status.
>> So far the patchset passes all above tests.
>>
>> But I'm not sure what's the better way to craft the test case.
>>
>> Should we go log-writes? Or use pre-built images?
>>
>> [TODO]
>> - Test cases for resume
>>
>> - Support for revert if errors are found
>>    If we hit data csum mismatch and can not repair from any copy, then
>>    we should revert back to the old csum.
>>
>> - Support for pre-cautious metadata check
>>    We'd better read and very metadata before rewriting them.
>>
>> - Performance tuning
>>    We want to take a balance between too frequent commit transaction
>>    and too large transaction.
>>    This is especially true for data csum objectid change and maybe
>>    old data csum deletion.
>>
>> - UI enhancement
>>    A btrfs-convert like UI would be better.
>>
>>
>> Qu Wenruo (7):
>>    btrfs-progs: tune: implement resume support for metadata checksum
>>    btrfs-progs: tune: implement resume support for generating new data
>>      csum
>>    btrfs-progs: tune: implement resume support for csum tree without any
>>      new csum item
>>    btrfs-progs: tune: implement resume support for empty csum tree
>>    btrfs-progs: tune: implement resume support for half deleted old csums
>>    btrfs-progs: tune: implement resume support for data csum objectid
>>      change
>>    btrfs-progs: tune: reject csum change if the fs is already using the
>>      target csum type
>>
>>   tune/change-csum.c | 461 ++++++++++++++++++++++++++++++++++++++++-----
>>   1 file changed, 418 insertions(+), 43 deletions(-)
>>
>
David Sterba May 24, 2023, 3:55 p.m. UTC | #3
On Wed, May 24, 2023 at 03:41:23PM +0800, Qu Wenruo wrote:
> [RESUME SUPPORT]
> This patchset adds the resume support for "btrfstune --csum".

Great, thanks.

> The main part is in resume for data csum change, as we have 4 different
> status during data csum conversion.
> 
> Thankfully for data csum conversion, everything is protected by
> metadata COW and we can detect the current status by the existence of
> both old and new csum items, and their coverage.
> 
> For resume of metadata conversion, there is nothing we can really do but
> go through all the tree blocks and verify them against both new and old
> csum type.
> 
> [TESTING]
> For the tests, currently errors are injected after every possible
> transaction commit, and then resume from that interrupted status.
> So far the patchset passes all above tests.

So you've inserted some "if/return EINVAL", right?

> But I'm not sure what's the better way to craft the test case.
> 
> Should we go log-writes? Or use pre-built images?

Log writes maybe, the device mapper targets are already used in the test
suite. You could use your testing injection code to generate them.

But it's clear that we could generate them only once, or we'd have to
store the image generators (possible).

Alternatively, can we create some error injection framework? In the
simplest form, define injection points in the code watching for some
conndition and trigger them from outside eg. set by an environment
variable. Maybe there's something better.

> [TODO]
> - Test cases for resume
> 
> - Support for revert if errors are found
>   If we hit data csum mismatch and can not repair from any copy, then
>   we should revert back to the old csum.
> 
> - Support for pre-cautious metadata check
>   We'd better read and very metadata before rewriting them.

Read and verify? Agreed.

> - Performance tuning
>   We want to take a balance between too frequent commit transaction
>   and too large transaction.
>   This is especially true for data csum objectid change and maybe
>   old data csum deletion.
> 
> - UI enhancement
>   A btrfs-convert like UI would be better.

Yeah it's growing beyond what one simple option can handle. The separate
command group for btrfstune is in the work, so far still on the code
level, the subcommands are roughly matching the conversion features but
not yet finalized.
Qu Wenruo May 24, 2023, 10:24 p.m. UTC | #4
On 2023/5/24 23:55, David Sterba wrote:
> On Wed, May 24, 2023 at 03:41:23PM +0800, Qu Wenruo wrote:
>> [RESUME SUPPORT]
>> This patchset adds the resume support for "btrfstune --csum".
>
> Great, thanks.
>
>> The main part is in resume for data csum change, as we have 4 different
>> status during data csum conversion.
>>
>> Thankfully for data csum conversion, everything is protected by
>> metadata COW and we can detect the current status by the existence of
>> both old and new csum items, and their coverage.
>>
>> For resume of metadata conversion, there is nothing we can really do but
>> go through all the tree blocks and verify them against both new and old
>> csum type.
>>
>> [TESTING]
>> For the tests, currently errors are injected after every possible
>> transaction commit, and then resume from that interrupted status.
>> So far the patchset passes all above tests.
>
> So you've inserted some "if/return EINVAL", right?

Yes for my local tests.

>
>> But I'm not sure what's the better way to craft the test case.
>>
>> Should we go log-writes? Or use pre-built images?
>
> Log writes maybe, the device mapper targets are already used in the test
> suite. You could use your testing injection code to generate them.

Log writes allows us to resume at each FUA/FLUSH, thus doesn't need the
error injection part.

It has the best coverage for data csum part, but not for metadata
rewrites, as it doesn't generate any FUA/FLUSH.

>
> But it's clear that we could generate them only once, or we'd have to
> store the image generators (possible).

Pre-built images are much better and controllable for us to test
metadata resume and other corner cases, but less coverage overall.

>
> Alternatively, can we create some error injection framework? In the
> simplest form, define injection points in the code watching for some
> conndition and trigger them from outside eg. set by an environment
> variable. Maybe there's something better.

For injection framework, we at least need some way to trigger them by
some possibility.
Which can already be a little too complex.

>
>> [TODO]
>> - Test cases for resume
>>
>> - Support for revert if errors are found
>>    If we hit data csum mismatch and can not repair from any copy, then
>>    we should revert back to the old csum.
>>
>> - Support for pre-cautious metadata check
>>    We'd better read and very metadata before rewriting them.
>
> Read and verify? Agreed.

It's already read-and-verify during metadata rewrites, the pre-cautious
part is reading the whole metadata, before doing metadata rewrites.

But the best practice is "btrfs check --readonly" before calling
"btrfstune --csum".

Thanks,
Qu


>
>> - Performance tuning
>>    We want to take a balance between too frequent commit transaction
>>    and too large transaction.
>>    This is especially true for data csum objectid change and maybe
>>    old data csum deletion.
>>
>> - UI enhancement
>>    A btrfs-convert like UI would be better.
>
> Yeah it's growing beyond what one simple option can handle. The separate
> command group for btrfstune is in the work, so far still on the code
> level, the subcommands are roughly matching the conversion features but
> not yet finalized.
David Sterba Aug. 9, 2023, 3:50 p.m. UTC | #5
On Thu, May 25, 2023 at 06:24:42AM +0800, Qu Wenruo wrote:
> > Alternatively, can we create some error injection framework? In the
> > simplest form, define injection points in the code watching for some
> > conndition and trigger them from outside eg. set by an environment
> > variable. Maybe there's something better.
> 
> For injection framework, we at least need some way to trigger them by
> some possibility.
> Which can already be a little too complex.

I've implemented the environment variable approach, it's in devel and
example injection points are in the checksum switch. It's really simple
and I'm not sure I like it but at least we have a starting point.