[0/3] btrfs-progs: Do proper extent item generation repair

Message ID 20200117072959.27929-1-wqu@suse.com (mailing list archive)

Message

Qu Wenruo Jan. 17, 2020, 7:29 a.m. UTC
Before this patchset, the only way to repair an invalid extent item
generation was to use --init-extent-tree, which is really a bad idea.

Rebuilding the whole extent tree just for one corrupted extent item?
I must have been insane at that time.

This patchset introduces proper extent item generation repair
functionality for both modes, and alters the existing test case to also
test repair.

Qu Wenruo (3):
  btrfs-progs: check/lowmem: Repair invalid extent item generation
  btrfs-progs: check/original: Repair extent item generation
  btrfs-progs: tests/fsck-044: Enable repair test for invalid extent
    item generation

 check/main.c                                  | 66 +++++++++++++++++
 check/mode-lowmem.c                           | 74 +++++++++++++++++++
 .../.lowmem_repairable                        |  0
 .../test.sh                                   | 19 -----
 4 files changed, 140 insertions(+), 19 deletions(-)
 create mode 100644 tests/fsck-tests/044-invalid-extent-item-generation/.lowmem_repairable
 delete mode 100755 tests/fsck-tests/044-invalid-extent-item-generation/test.sh
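
For readers following the thread, the idea in the cover letter (fix only the bad generation field instead of rebuilding the whole extent tree) can be sketched roughly as below. This is only an illustration and not the code from the patches: the struct, the function names, the "generation must not exceed super generation + 1" rule, and the choice to rewrite the field with the current transaction id are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for an extent item.
 * This is NOT the btrfs on-disk layout.
 */
struct sketch_extent_item {
	uint64_t bytenr;
	uint64_t generation;
};

/*
 * Assumed validity rule: an extent item generation is acceptable
 * as long as it does not exceed the superblock generation + 1.
 */
static bool sketch_generation_is_valid(uint64_t generation, uint64_t super_gen)
{
	/* Avoid overflow of super_gen + 1 at the top of the range. */
	if (super_gen == UINT64_MAX)
		return true;
	return generation <= super_gen + 1;
}

/*
 * Assumed repair: overwrite only the bad generation field with the
 * id of the transaction doing the repair, leaving the rest of the
 * item (and the rest of the extent tree) untouched.
 *
 * Returns true if the item was modified, false if it was already valid.
 */
static bool sketch_repair_generation(struct sketch_extent_item *item,
				     uint64_t super_gen, uint64_t transid)
{
	assert(item != NULL);

	if (sketch_generation_is_valid(item->generation, super_gen))
		return false;

	item->generation = transid;
	return true;
}
```

The point of the sketch is the scope of the change: one field in one item is rewritten, so the cost is independent of the size of the extent tree, unlike --init-extent-tree.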

Comments

Josef Bacik Jan. 17, 2020, 2:28 p.m. UTC | #1
On 1/17/20 2:29 AM, Qu Wenruo wrote:
> Before this patchset, the only way to repair invalid extent item
> generation is to use --init-extent-tree, which is really a bad idea.
> 
> To rebuild the whole extent tree just for one corrupted extent item?
> I must be insane at that time.
> 
> This patch introduces the proper extent item generation repair
> functionality for both mode, and alter existing test case to also test
> repair.
> 
> Qu Wenruo (3):
>    btrfs-progs: check/lowmem: Repair invalid extent item generation
>    btrfs-progs: check/original: Repair extent item generation
>    btrfs-progs: tests/fsck-044: Enable repair test for invalid extent
>      item generation
> 
>   check/main.c                                  | 66 +++++++++++++++++
>   check/mode-lowmem.c                           | 74 +++++++++++++++++++
>   .../.lowmem_repairable                        |  0
>   .../test.sh                                   | 19 -----
>   4 files changed, 140 insertions(+), 19 deletions(-)
>   create mode 100644 tests/fsck-tests/044-invalid-extent-item-generation/.lowmem_repairable
>   delete mode 100755 tests/fsck-tests/044-invalid-extent-item-generation/test.sh
> 

If we have a generation > super generation, that means the block is from the
future and we shouldn't trust anything in it, right?  I haven't touched this code
in a while, but that just meant we threw it away, and any extent references that
were in that block were just re-created.  Is that not what's happening now?
This seems like a bad way to go about fixing this particular problem.  Thanks,

Josef
Qu Wenruo Jan. 18, 2020, 1:08 a.m. UTC | #2
On 2020/1/17 10:28 PM, Josef Bacik wrote:
> On 1/17/20 2:29 AM, Qu Wenruo wrote:
>> Before this patchset, the only way to repair invalid extent item
>> generation is to use --init-extent-tree, which is really a bad idea.
>>
>> To rebuild the whole extent tree just for one corrupted extent item?
>> I must be insane at that time.
>>
>> This patch introduces the proper extent item generation repair
>> functionality for both mode, and alter existing test case to also test
>> repair.
>>
>> Qu Wenruo (3):
>>    btrfs-progs: check/lowmem: Repair invalid extent item generation
>>    btrfs-progs: check/original: Repair extent item generation
>>    btrfs-progs: tests/fsck-044: Enable repair test for invalid extent
>>      item generation
>>
>>   check/main.c                                  | 66 +++++++++++++++++
>>   check/mode-lowmem.c                           | 74 +++++++++++++++++++
>>   .../.lowmem_repairable                        |  0
>>   .../test.sh                                   | 19 -----
>>   4 files changed, 140 insertions(+), 19 deletions(-)
>>   create mode 100644
>> tests/fsck-tests/044-invalid-extent-item-generation/.lowmem_repairable
>>   delete mode 100755
>> tests/fsck-tests/044-invalid-extent-item-generation/test.sh
>>
> 
> If we have a generation > super generation that means that block is from
> the future and we shouldn't trust anything in it right?

From current reports, it's mostly caused by:
- A bug around 2014
- Memory corruption

For the latter case, as long as it's the only bug, it's easy to fix.
For the former case, although we don't have a concrete cause, it doesn't
seem to cause tons of similar problems.

Either way, from current reports they can be fixed, so I think it's
kinda OK to do such a simple fix instead of always going with the slow
--init-extent-tree.

Thanks,
Qu

>  I haven't
> touched this code in a while, but that just meant we threw it away and
> any extent references that were in that block were just re-created.  Is
> that not what's happening now? This seems like a bad way to go about
> fixing this particular problem.  Thanks,
> 
> Josef