4.17-rc1 FS went read-only during balance
diff mbox

Message ID d935c0e0-c2c8-a0ec-bb07-5c3879dd1be0@gmx.com
State New
Headers show

Commit Message

Qu Wenruo April 23, 2018, 8:23 a.m. UTC
On 2018年04月23日 16:04, Dmitrii Tcvetkov wrote:
>>>>> TL;DR It seems as regression in 4.17, but I managed to find a
>>>>> workaround to make filesystem rw mountable again.
>>>>>
>>>>> Kernel built from tag v4.17-rc1
>>>>> btrfs-progs 4.16
>>>>>
>>>>> Tonight two my machines (PC (ECC RAM) and laptop(non-ECC RAM)) were
>>>>> doing usual weekly balance with this command via cron:
>>>>> btrfs balance start -musage=50 -dusage=50 <mountpoint>
>>>>> Both machines run same kernel version. 
>>>>>
>>>>> On PC that caused root and "data" filesystems to go readonly. Root
>>>>> is on an SSD with data single and metadata DUP, "data" filesystem
>>>>> is on 2 HDDs with RAID1 for data and metadata.
>>>>>
>>>>> On laptop only /home went ro, it's on NVMe SSD with data single and
>>>>> metadata DUP. 
>>>>>
>>>>> Btrfs check of PC rootfs was without any errors in both modes, I did
>>>>> them once each before reboot on readonly filesystem with --force
>>>>> flag and then from live usb. Same output without any errors.
>>>>>
>>>>> After reboot kernel refused rw mount rootfs with the same error as
>>>>> during cron balance, ro mount was accepted, error during rw mount:
>>>>> BTRFS: error (device dm-17) in merge_reloc_roots:2465: errno=-117    
>>>   
>>>> 117 means EUCLEAN, which could be caused by the newly introduced
>>>> first_key and level check.  
>>>   
>>>> Please apply this hotfix to fix it.
>>>> btrfs: Only check first key for committed tree blocks
>>>> (Which is included in latest pull request)  
>>>   
>>>> Also, please consider enable CONFIG_BTRFS_DEBUG to provide extra
>>>> debug info.  
>>>   
>>>> Thanks,
>>>> Qu  
>>>
>>> I tried 4.17-rc2 (as the pull request was pulled) with
>>> CONFIG_BTRFS_DEBUG on LVM snapshot of laptop home partition (/dev/vdb)
>>> in a VM (VM kernel sees only snapshot so no UUID collisions). Dmesg
>>> attached.  
>>
>> Thanks for the info and your previous btrfs-image.
>>
>> The image itself shows nothing wrong, so it should be runtime problem.
>> Would you please apply these two debug patches?
>> https://patchwork.kernel.org/patch/10335133/
>> https://patchwork.kernel.org/patch/10335135/
>>
>> And the attached diff file?
>>
>> My guess is the parent node is not initialized correctly in this case.
>>
>> Thanks,
>> Qu
> 
> Dmesg from kernel with all three patches applied attached.
> 
Thanks for the debug info, it really helps a lot!

It turns out that I'm just a super idiot, a typo in replace_path()
caused this, and it could not be trigger unless we enter it from
relocation recovery.

Please try the attached patch to see if it solves the problem.

Thanks,
Qu

Comments

Dmitrii Tcvetkov April 23, 2018, 8:40 a.m. UTC | #1
> >>>>> TL;DR It seems as regression in 4.17, but I managed to find a
> >>>>> workaround to make filesystem rw mountable again.
> >>>>>
> >>>>> Kernel built from tag v4.17-rc1
> >>>>> btrfs-progs 4.16
> >>>>>
> >>>>> Tonight two my machines (PC (ECC RAM) and laptop(non-ECC RAM)) were
> >>>>> doing usual weekly balance with this command via cron:
> >>>>> btrfs balance start -musage=50 -dusage=50 <mountpoint>
> >>>>> Both machines run same kernel version. 
> >>>>>
> >>>>> On PC that caused root and "data" filesystems to go readonly. Root
> >>>>> is on an SSD with data single and metadata DUP, "data" filesystem
> >>>>> is on 2 HDDs with RAID1 for data and metadata.
> >>>>>
> >>>>> On laptop only /home went ro, it's on NVMe SSD with data single and
> >>>>> metadata DUP. 
> >>>>>
> >>>>> Btrfs check of PC rootfs was without any errors in both modes, I did
> >>>>> them once each before reboot on readonly filesystem with --force
> >>>>> flag and then from live usb. Same output without any errors.
> >>>>>
> >>>>> After reboot kernel refused rw mount rootfs with the same error as
> >>>>> during cron balance, ro mount was accepted, error during rw mount:
> >>>>> BTRFS: error (device dm-17) in merge_reloc_roots:2465: errno=-117      
> >>>     
> >>>> 117 means EUCLEAN, which could be caused by the newly introduced
> >>>> first_key and level check.    
> >>>     
> >>>> Please apply this hotfix to fix it.
> >>>> btrfs: Only check first key for committed tree blocks
> >>>> (Which is included in latest pull request)    
> >>>     
> >>>> Also, please consider enable CONFIG_BTRFS_DEBUG to provide extra
> >>>> debug info.    
> >>>     
> >>>> Thanks,
> >>>> Qu    
> >>>
> >>> I tried 4.17-rc2 (as the pull request was pulled) with
> >>> CONFIG_BTRFS_DEBUG on LVM snapshot of laptop home partition (/dev/vdb)
> >>> in a VM (VM kernel sees only snapshot so no UUID collisions). Dmesg
> >>> attached.    
> >>
> >> Thanks for the info and your previous btrfs-image.
> >>
> >> The image itself shows nothing wrong, so it should be runtime problem.
> >> Would you please apply these two debug patches?
> >> https://patchwork.kernel.org/patch/10335133/
> >> https://patchwork.kernel.org/patch/10335135/
> >>
> >> And the attached diff file?
> >>
> >> My guess is the parent node is not initialized correctly in this case.
> >>
> >> Thanks,
> >> Qu  
> > 
> > Dmesg from kernel with all three patches applied attached.
> >   
> Thanks for the debug info, it really helps a lot!
> 
> It turns out that I'm just a super idiot, a typo in replace_path()
> caused this, and it could not be trigger unless we enter it from
> relocation recovery.
> 
> Please try the attached patch to see if it solves the problem.
> 
> Thanks,
> Qu
Glad to help, the patch solved the problem, 
rw mount is successful and balance finished, no errors or debug output,
btrfs check is clean in both modes.

[    2.842718] BTRFS: device label home devid 1 transid 277952 /dev/vdb
[    2.924965] BTRFS: device label root devid 1 transid 84092 /dev/vda2
[    3.072271] BTRFS info (device vda2): use lzo compression, level 0
[    3.072897] BTRFS info (device vda2): enabling auto defrag
[    3.073476] BTRFS info (device vda2): using free space tree
[    3.074049] BTRFS info (device vda2): has skinny extents
[    5.411821] BTRFS info (device vda2): using free space tree
[   24.925293] BTRFS info (device vdb): using free space tree
[   24.925324] BTRFS info (device vdb): has skinny extents
[   31.711868] BTRFS info (device vdb): continuing balance
[   31.721658] BTRFS info (device vdb): checking UUID tree
[   31.822920] BTRFS info (device vdb): relocating block group 69889687552flags data 
[   33.730399] BTRFS info (device vdb): found 12 extents
[   36.950699] BTRFS info (device vdb): found 12 extents
[   37.030813] BTRFS info (device vdb): relocating block group 67742203904flags metadata|dup 
[   37.104174] BTRFS info (device vdb): relocating block group 67708649472 flags system|dup 
[   37.189843] BTRFS info (device vdb): found 1 extents

Patch
diff mbox

From 4b70eb864192ec5cf54a7e67e2957ddf0e5c0f6f Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu@suse.com>
Date: Mon, 23 Apr 2018 16:13:55 +0800
Subject: [PATCH] btrfs: Fix wrong first_key parameter in replace_path

Commit 581c1760415c ("btrfs: Validate child tree block's level and first
key") introduced new @first_key parameter for read_tree_block(), however
caller in replace_path() is parasing wrong key to read_tree_block().

It should use parameter @first_key other than @key.

Normally it won't expose problem as @key is normally initialzied to the
same value of @first_key we expect.
However in relocation recovery case, @key can be set to (0, 0, 0), and
since no valid key in relocation tree can be (0, 0, 0), it will cause
read_tree_block() to return -EUCLEAN and interrupt relocation recovery.

Fix it by setting @first_key correctly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 00b7d3231821..b041b945a7ae 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1841,7 +1841,7 @@  int replace_path(struct btrfs_trans_handle *trans,
 		old_bytenr = btrfs_node_blockptr(parent, slot);
 		blocksize = fs_info->nodesize;
 		old_ptr_gen = btrfs_node_ptr_generation(parent, slot);
-		btrfs_node_key_to_cpu(parent, &key, slot);
+		btrfs_node_key_to_cpu(parent, &first_key, slot);
 
 		if (level <= max_level) {
 			eb = path->nodes[level];
-- 
2.17.0