btrfs restore memory corruption (bug: 82701)

Message ID 1408689825.22226.14.camel@localhost.localdomain (mailing list archive)
State New, archived

Commit Message

Gui Hecheng Aug. 22, 2014, 6:43 a.m. UTC
On Thu, 2014-08-21 at 16:19 +0200, Marc Dietrich wrote:
> Am Donnerstag, 21. August 2014, 17:52:16 schrieb Gui Hecheng:
> > On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> > > Hi,
> > > 
> > > I did a checkout of the latest btrfs-progs to repair my damaged
> > > filesystem.
> > > Running btrfs restore gives me several "failed to inflate: -6" errors and
> > > crashes with some memory corruption. I ran it again with valgrind and got:
> > > 
> > > valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
> > > 
> > > ==8528== Memcheck, a memory error detector
> > > ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> > > ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> > > ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> > > ==8528== Parent PID: 8453
> > > ==8528==
> > > ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> > > ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> > > ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> > > ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > 
> > -------------------[snip]---------------------------------
> > 
> > > ==8528== Invalid read of size 1
> > > ==8528==    at 0x4C2BF15: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x684c186 is 1,110 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> > > ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==
> > > ==8528== Invalid read of size 8
> > > ==8528==    at 0x4C2BF40: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x684c178 is 1,096 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> > > ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==
> > > ==8528== Invalid read of size 8
> > > ==8528==    at 0x4C2BF52: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x684c168 is 1,080 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> > > ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==
> > > ==8528== Invalid read of size 1
> > > ==8528==    at 0x4C2BFE4: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x6a385f8 is 2,680 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> > > ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==
> > 
> > ----------------------------------------------------------
> > For the errors above, you could try whether the following helps:
> > 
> > diff --git a/cmds-restore.c b/cmds-restore.c
> > index 239ea6c..dde7de8 100644
> > --- a/cmds-restore.c
> > +++ b/cmds-restore.c
> > @@ -182,6 +182,7 @@ again:
> >                 c = path->nodes[level];
> >                 if (slot >= btrfs_header_nritems(c)) {
> >                         level++;
> > +                       offset = 1;
> >                         if (level == BTRFS_MAX_LEVEL)
> >                                 return 1;
> >                         continue;
> > 
> > It doesn't seem to go the right way when entering the next level;
> > it should start at the first slot at least.
> 
> Can't tell if it's the right thing to do, but at least I haven't seen *this* 
> leak message for a while now.

I think it helps somewhat, because it shows that you have got past all
the inline extents.
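
To make the effect of the one-line change more concrete, here is a toy model of the slot search. It is only a sketch: toy_path, toy_next_slot and toy_demo are made-up names, the tree is reduced to per-level slot counts, and it assumes that offset grows whenever a child cannot be read and is then added to the current slot at every level. It is not the btrfs-progs code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4	/* stand-in for BTRFS_MAX_LEVEL, value chosen for the demo */

/* Hypothetical, heavily reduced path: one current slot per tree level. */
struct toy_path {
	int slots[TOY_MAX_LEVEL];		/* current slot at each level */
	int nritems[TOY_MAX_LEVEL];		/* item count, 0 = no node at this level */
	unsigned unreadable[TOY_MAX_LEVEL];	/* bit n set = child in slot n cannot be read */
};

/*
 * Find the level and slot where the walk continues.
 * Returns 0 on success, 1 when no further slot exists.
 */
int toy_next_slot(const struct toy_path *p, int reset_offset,
		  int *out_level, int *out_slot)
{
	int level = 1;
	int offset = 1;
	int slot;

	assert(p != NULL && out_level != NULL && out_slot != NULL);

	while (level < TOY_MAX_LEVEL) {
		if (p->nritems[level] == 0)
			return 1;

		slot = p->slots[level] + offset;
		if (slot >= p->nritems[level]) {
			/* this node is exhausted, continue one level up */
			level++;
			if (reset_offset)
				offset = 1;	/* the proposed change */
			continue;
		}

		if (!(p->unreadable[level] & (1u << slot))) {
			*out_level = level;
			*out_slot = slot;
			return 0;
		}
		offset++;	/* skip the unreadable child */
	}
	return 1;
}

/* Demo: level 1 sits at slot 1 of 3 and its slot 2 is unreadable. */
int toy_demo(int reset_offset)
{
	struct toy_path p = {
		.slots      = { 0, 1, 0, 0 },
		.nritems    = { 0, 3, 4, 0 },
		.unreadable = { 0, 1u << 2, 0, 0 },
	};
	int level = 0, slot = 0;

	if (toy_next_slot(&p, reset_offset, &level, &slot))
		return -1;
	return slot;
}
```

In this model, toy_demo(1) resumes at slot 1 of the parent, while toy_demo(0) carries the stale offset of 2 upwards and skips straight to slot 2.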

> Additionally, I get many of these (unrelated) leaks now:

For the invalid read below...
I have no idea why @decompress_lzo() is not satisfied with an @inbuf
of exactly the size of the on-disk bytes.
Or maybe the compressed data has simply been damaged...

BTW, when you wrote your data, did that kernel have the following commit
for btrfs?
	commit: 59516f6017c589e7316418fda6128ba8f829a77f

If *NO*, then you may try the following and see if it makes any
difference:
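---------------------------------------------------------
diff --git a/cmds-restore.c b/cmds-restore.c
index dde7de8..ae1ea72 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -297,7 +297,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
        ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
        offset = btrfs_file_extent_offset(leaf, fi);
        num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
-       size_left = disk_size;
+       size_left = num_bytes;
        if (compress == BTRFS_COMPRESS_NONE)
                bytenr += offset;
 
@@ -376,7 +376,7 @@ again:
                goto out;
        }
 
-       ret = decompress(inbuf, outbuf, disk_size, &ram_size, compress);
+       ret = decompress(inbuf, outbuf, num_bytes, &ram_size, compress);
        if (ret) {
                num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
                                              bytenr, length);
------------------------------------------------------------------------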
*NOTE*: the above is just an experiment and not a proper fix, but please
don't worry, it does no harm.
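
The trial patch above changes the input length passed to decompress(). Whichever length turns out to be right, an overrun like the one valgrind reports in the trace below (a read 4 bytes past a 4,096-byte block inside lzo1x_decompress_safe) can be caught by checking every length field against the number of bytes actually read before the data reaches the decompressor. The following is a rough sketch of such a check. The layout (a 4-byte little-endian total length followed by segments that each carry a 4-byte little-endian length) is an assumption made for illustration, and read_le32 and count_segments_checked are hypothetical names, not btrfs-progs functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value; the caller guarantees 4 readable bytes. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk a buffer laid out as
 *   [4-byte total length][4-byte segment length][segment bytes]...
 * and count the segments without ever reading past in_len.
 * Returns the segment count, or -1 if a header or payload would
 * overrun the buffer.
 */
int count_segments_checked(const unsigned char *in, size_t in_len)
{
	size_t pos, total;
	uint32_t seg;
	int count = 0;

	if (in == NULL || in_len < 4)
		return -1;

	total = read_le32(in);
	if (total < 4 || total > in_len)
		return -1;	/* claimed length exceeds what was read */

	pos = 4;
	while (pos < total) {
		if (total - pos < 4)
			return -1;	/* truncated segment header */
		seg = read_le32(in + pos);
		pos += 4;
		if (seg == 0 || seg > total - pos)
			return -1;	/* segment would run off the end */
		pos += seg;
		count++;
	}
	return count;
}
```

A caller would only hand a segment to the decompressor after this walk succeeds; a damaged length field then shows up as a clean error instead of an out-of-bounds read.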

-Gui

> ==3007== Invalid read of size 1
> ==3007==    at 0x57A11B1: lzo1x_decompress_safe (in /usr/lib64/liblzo2.so.2.0.0)
> ==3007==    by 0x41E2C4: decompress (cmds-restore.c:122)
> ==3007==    by 0x41F19D: search_dir (cmds-restore.c:378)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==  Address 0x6887774 is 4 bytes after a block of size 4,096 alloc'd
> ==3007==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3007==    by 0x41EE61: search_dir (cmds-restore.c:309)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> 
> Thanks so far!
> 
> Marc
> 
> 
> > > ==8528== Invalid read of size 2
> > > ==8528==    at 0x4C2BFA0: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x6b0bfb8 is 632 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x4261CA: btrfs_release_path (ctree.c:61)
> > > ==8528==    by 0x426212: btrfs_free_path (ctree.c:51)
> > > ==8528==    by 0x41F93B: search_dir (cmds-restore.c:911)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==
> > > ==8528== Invalid read of size 2
> > > ==8528==    at 0x4C2BFB3: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > ==8528==  Address 0x6b0bfb4 is 628 bytes inside a block of size 4,224 free'd
> > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > ==8528==    by 0x4261CA: btrfs_release_path (ctree.c:61)
> > > ==8528==    by 0x426212: btrfs_free_path (ctree.c:51)
> > > ==8528==    by 0x41F93B: search_dir (cmds-restore.c:911)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > ==8528==
> > > ==8528==
> > > ==8528== HEAP SUMMARY:
> > > ==8528==     in use at exit: 0 bytes in 0 blocks
> > > ==8528==   total heap usage: 260,452 allocs, 260,452 frees, 278,189,550 bytes allocated
> > > ==8528==
> > > ==8528== All heap blocks were freed -- no leaks are possible
> > > ==8528==
> > > ==8528== For counts of detected and suppressed errors, rerun with: -v
> > > ==8528== Use --track-origins=yes to see where uninitialised values come from
> > > ==8528== ERROR SUMMARY: 16597 errors from 7 contexts (suppressed: 2 from 2)
> > > 
> > > see: https://bugzilla.kernel.org/show_bug.cgi?id=82701
> > > 
> > > Marc
> > > 
> > > p.s.
> > > 
> > > I wonder if this list should be autosubscribed to btrfs related bugs
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html



Comments

Marc Dietrich Aug. 22, 2014, 8:42 a.m. UTC | #1
Am Freitag, 22. August 2014, 14:43:45 schrieb Gui Hecheng:
> On Thu, 2014-08-21 at 16:19 +0200, Marc Dietrich wrote:
> > Am Donnerstag, 21. August 2014, 17:52:16 schrieb Gui Hecheng:
> > > On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> > > > Hi,
> > > > 
> > > > I did a checkout of the latest btrfs-progs to repair my damaged
> > > > filesystem.
> > > > Running btrfs restore gives me several "failed to inflate: -6" errors and
> > > > crashes with some memory corruption. I ran it again with valgrind and got:
> > > > 
> > > > valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
> > > > 
> > > > ==8528== Memcheck, a memory error detector
> > > > ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> > > > ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> > > > ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> > > > ==8528== Parent PID: 8453
> > > > ==8528==
> > > > ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> > > > ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> > > > ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> > > > ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > 
> > > -------------------[snip]---------------------------------
> > > .... leaks ...
> > > ----------------------------------------------------------
>
> For the invalid read below...
> I have no idea why @decompress_lzo() is not satisfied with an @inbuf
> of exactly the size of the on-disk bytes.
> Or maybe the compressed data has simply been damaged...
> 
> BTW, when you wrote your data, did that kernel have the following commit
> for btrfs?
> 	commit: 59516f6017c589e7316418fda6128ba8f829a77f

mmh, I used the master branch which is still on 3.14.2 (from k.org).

Ah, there is a development branch on another repo (repo.or.cz). Why oh why?

> 
> If *NO*, then you may try the following and see if it makes any
> difference:
> ---------------------------------------------------------
> diff --git a/cmds-restore.c b/cmds-restore.c
> index dde7de8..ae1ea72 100644
> --- a/cmds-restore.c
> +++ b/cmds-restore.c
> @@ -297,7 +297,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
>         ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
>         offset = btrfs_file_extent_offset(leaf, fi);
>         num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
> -       size_left = disk_size;
> +       size_left = num_bytes;
>         if (compress == BTRFS_COMPRESS_NONE)
>                 bytenr += offset;
> 
> @@ -376,7 +376,7 @@ again:
>                 goto out;
>         }
> 
> -       ret = decompress(inbuf, outbuf, disk_size, &ram_size, compress);
> +       ret = decompress(inbuf, outbuf, num_bytes, &ram_size, compress);
>         if (ret) {
>                 num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
>                                               bytenr, length);
> ------------------------------------------------------------------------
> *NOTE*: the above is just a trial, it is actually not proper, but please
> don't worry, it does no harm.

well, my restore finished after 1 week (~ 400 GB of compressed data), of
which 100 GB were lost. It wasn't important data, so I'm willing to redo the
complete restore if you (or the btrfs team) are interested in fixing
these bugs in the near future.

I will upload the latest valgrind log for the final run to the bugzilla on
kernel.org (https://bugzilla.kernel.org/show_bug.cgi?id=82701).

I wonder if there is a corrupted btrfs disk image which can be used as a
reference and which triggers all the current error paths (or maybe several
images with one error in each, as other projects do). On the other hand, I
guess this would be a huge pile of work.

Marc


Gui Hecheng Aug. 22, 2014, 9:02 a.m. UTC | #2
On Fri, 2014-08-22 at 10:42 +0200, Marc Dietrich wrote:
> Am Freitag, 22. August 2014, 14:43:45 schrieb Gui Hecheng:
> > On Thu, 2014-08-21 at 16:19 +0200, Marc Dietrich wrote:
> > > Am Donnerstag, 21. August 2014, 17:52:16 schrieb Gui Hecheng:
> > > > On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> > > > > Hi,
> > > > > 
> > > > > I did a checkout of the latest btrfs-progs to repair my damaged
> > > > > filesystem.
> > > > > Running btrfs restore gives me several "failed to inflate: -6" errors and
> > > > > crashes with some memory corruption. I ran it again with valgrind and got:
> > > > > 
> > > > > valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
> > > > > 
> > > > > ==8528== Memcheck, a memory error detector
> > > > > ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> > > > > ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> > > > > ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> > > > > ==8528== Parent PID: 8453
> > > > > ==8528==
> > > > > ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> > > > > ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> > > > > ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> > > > > ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > 
> > > > -------------------[snip]---------------------------------
> > > > .... leaks ...
> > > > ----------------------------------------------------------
> >
> > For the invalid read below...
> > I have no idea why @decompress_lzo() is not satisfied with an @inbuf
> > of exactly the size of the on-disk bytes.
> > Or maybe the compressed data has simply been damaged...
> > 
> > BTW, when you wrote your data, did that kernel have the following commit
> > for btrfs?
> > 	commit: 59516f6017c589e7316418fda6128ba8f829a77f
> 
> mmh, I used the master branch which is still on 3.14.2 (from k.org).
> 
> Ah, there is a development branch on another repo (repo.or.cz). Why oh why?

There is a development branch for btrfs-progs from David at
http://github.com/kdave/btrfs-progs.git if you would like to try it.

But what I mean here is the *kernel* version you were running when you
wrote your data.
There is a change for btrfs-restore which depends on a kernel commit.
If you wrote your data with an older kernel and use the 3.14.2
btrfs-progs to restore it, you may see strange behavior.
For now, I only suspect such a scenario.

Thanks,
-Gui

> > 
> > If *NO*, then you may try the following and see if it makes any
> > difference:
> > ---------------------------------------------------------
> > diff --git a/cmds-restore.c b/cmds-restore.c
> > index dde7de8..ae1ea72 100644
> > --- a/cmds-restore.c
> > +++ b/cmds-restore.c
> > @@ -297,7 +297,7 @@ static int copy_one_extent(struct btrfs_root *root,
> > int fd,
> >         ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
> >         offset = btrfs_file_extent_offset(leaf, fi);
> >         num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
> > -       size_left = disk_size;
> > +       size_left = num_bytes;
> >         if (compress == BTRFS_COMPRESS_NONE)
> >                 bytenr += offset;
> > 
> > @@ -376,7 +376,7 @@ again:
> >                 goto out;
> >         }
> > 
> > -       ret = decompress(inbuf, outbuf, disk_size, &ram_size, compress);
> > +       ret = decompress(inbuf, outbuf, num_bytes, &ram_size, compress);
> >         if (ret) {
> >                 num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
> >                                               bytenr, length);
> > ------------------------------------------------------------------------
> > *NOTE*: the above is only an experiment and not a proper fix, but
> > don't worry, it does no harm.
> 
> Well, my restore finished after 1 week (~400 GB of compressed data), of
> which 100 GB was lost. It wasn't important data, so I'm willing to redo the
> complete restore if you (or the btrfs team) are interested in fixing
> these bugs in the near future.
> 
> I will upload the latest valgrind log for the final run to the bugzilla on
> kernel.org (https://bugzilla.kernel.org/show_bug.cgi?id=82701).
> 
> I wonder if there is a corrupted btrfs disk image which can be used as a
> reference and which triggers all the current error paths (or maybe several
> images with one error in each, as other projects do). On the other hand, I
> guess this would be a huge pile of work.
> 
> Marc
> 
> 
> 
> > -Gui
> > 
> > > ==3007== Invalid read of size 1
> > > ==3007==    at 0x57A11B1: lzo1x_decompress_safe (in /usr/lib64/liblzo2.so.2.0.0)
> > > ==3007==    by 0x41E2C4: decompress (cmds-restore.c:122)
> > > ==3007==    by 0x41F19D: search_dir (cmds-restore.c:378)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==  Address 0x6887774 is 4 bytes after a block of size 4,096 alloc'd
> > > ==3007==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==3007==    by 0x41EE61: search_dir (cmds-restore.c:309)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > ==3007==    by 0x41F8D7: search_dir (cmds-restore.c:895)
> > > 
> > > Thanks so far!
> > > 
> > > Marc
> > > 
> > > > > ==8528== Invalid read of size 2
> > > > > ==8528==    at 0x4C2BFA0: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > ==8528==  Address 0x6b0bfb8 is 632 bytes inside a block of size 4,224 free'd
> > > > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > > > ==8528==    by 0x4261CA: btrfs_release_path (ctree.c:61)
> > > > > ==8528==    by 0x426212: btrfs_free_path (ctree.c:51)
> > > > > ==8528==    by 0x41F93B: search_dir (cmds-restore.c:911)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==
> > > > > ==8528== Invalid read of size 2
> > > > > ==8528==    at 0x4C2BFB3: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> > > > > ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > ==8528==  Address 0x6b0bfb4 is 628 bytes inside a block of size 4,224 free'd
> > > > > ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> > > > > ==8528==    by 0x4261CA: btrfs_release_path (ctree.c:61)
> > > > > ==8528==    by 0x426212: btrfs_free_path (ctree.c:51)
> > > > > ==8528==    by 0x41F93B: search_dir (cmds-restore.c:911)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==
> > > > > ==8528==
> > > > > ==8528== HEAP SUMMARY:
> > > > > ==8528==     in use at exit: 0 bytes in 0 blocks
> > > > > ==8528==   total heap usage: 260,452 allocs, 260,452 frees, 278,189,550 bytes allocated
> > > > > ==8528==
> > > > > ==8528== All heap blocks were freed -- no leaks are possible
> > > > > ==8528==
> > > > > ==8528== For counts of detected and suppressed errors, rerun with: -v
> > > > > ==8528== Use --track-origins=yes to see where uninitialised values come from
> > > > > ==8528== ERROR SUMMARY: 16597 errors from 7 contexts (suppressed: 2 from 2)
> > > > > 
> > > > > see: https://bugzilla.kernel.org/show_bug.cgi?id=82701
> > > > > 
> > > > > Marc
> > > > > 
> > > > > p.s.
> > > > > 
> > > > > I wonder if this list should be autosubscribed to btrfs related bugs
> > > > > 
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
> > > > > in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marc Dietrich Aug. 25, 2014, 8:58 a.m. UTC | #3
On Friday, 22 August 2014, 10:42:18, Marc Dietrich wrote:
> On Friday, 22 August 2014, 14:43:45, Gui Hecheng wrote:
> > On Thu, 2014-08-21 at 16:19 +0200, Marc Dietrich wrote:
> > > On Thursday, 21 August 2014, 17:52:16, Gui Hecheng wrote:
> > > > On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> > > > > Hi,
> > > > > 
> > > > > I did a checkout of the latest btrfs progs to repair my damaged
> > > > > filesystem.
> > > > > Running btrfs restore gives me several failed to inflate: -6 and crashes
> > > > > with some memory corruption. I ran it again with valgrind and got:
> > > > > 
> > > > > valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
> > > > > 
> > > > > ==8528== Memcheck, a memory error detector
> > > > > ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> > > > > ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> > > > > ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> > > > > ==8528== Parent PID: 8453
> > > > > ==8528==
> > > > > ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> > > > > ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> > > > > ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> > > > > ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > 
> > > > -------------------[snip]---------------------------------
> > > > .... leaks ...
> > > > ----------------------------------------------------------
> > 
> > For the leak below...
> > I have no idea why @decompress_lzo() is not satisfied with an @inbuf
> > that is exactly the size of the disk bytes.
> > Or maybe the compressed data has simply been damaged...
> > 
> > BTW, did the kernel you wrote your data with have the following commit
> > for btrfs?
> > 
> > 	commit: 59516f6017c589e7316418fda6128ba8f829a77f
> 
> mmh, I used the master branch which is still on 3.14.2 (from k.org).
> 
> Ah, there is a development branch on another repo (repo.or.cz). Why oh why?

Gui,

sorry to quote an earlier mail; I forgot to add you as CC on your latest post
and I'm not subscribed to the list.

> There is a development branch for btrfs-progs from David:
> http://github.com/kdave/btrfs-progs.git if you would like to try.

ok, thanks will try.

> But here, what I mean is your *kernel* version when you wrote your data.

I've been using btrfs since 3.14 or so (and maybe also some random distro kernel
based on 3.11). The partition contained a lot of larger git trees and virtual 
machines - yes, not ideal for btrfs but a nice testcase ...

> There is a change for btrfs-restore which depends on a kernel commit.
> If you wrote your data with an older kernel and then use the 3.14.2
> btrfs-progs to restore, you may see strange results.

Wow. That should never happen, I think. Userspace should always be able to fix
corruptions made by earlier kernels (except disk layout changes maybe).

> For now, this scenario is only a suspicion.

Possible. So how to proceed? If I check out the latest btrfs-progs from the repo
above and restore again, are you still interested in the results?

It seems there are lots of people reporting corruptions on the list and also
lots of fixes posted. Maybe it's better to start afresh (format the
partition) and report any problems that happen after that. What do you think?

Marc
Gui Hecheng Aug. 25, 2014, 10:21 a.m. UTC | #4
On Mon, 2014-08-25 at 10:58 +0200, Marc Dietrich wrote:
> On Friday, 22 August 2014, 10:42:18, Marc Dietrich wrote:
> > On Friday, 22 August 2014, 14:43:45, Gui Hecheng wrote:
> > > On Thu, 2014-08-21 at 16:19 +0200, Marc Dietrich wrote:
> > > > On Thursday, 21 August 2014, 17:52:16, Gui Hecheng wrote:
> > > > > On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I did a checkout of the latest btrfs progs to repair my damaged
> > > > > > filesystem.
> > > > > > Running btrfs restore gives me several failed to inflate: -6 and crashes
> > > > > > with some memory corruption. I ran it again with valgrind and got:
> > > > > > 
> > > > > > valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
> > > > > > 
> > > > > > ==8528== Memcheck, a memory error detector
> > > > > > ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> > > > > > ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> > > > > > ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> > > > > > ==8528== Parent PID: 8453
> > > > > > ==8528==
> > > > > > ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> > > > > > ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> > > > > > ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > > ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> > > > > > ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > > > > ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> > > > > > ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> > > > > > ==8528==    by 0x4043FE: main (btrfs.c:286)
> > > > > 
> > > > > -------------------[snip]---------------------------------
> > > > > .... leaks ...
> > > > > ----------------------------------------------------------
> > > 
> > > For the leak below...
> > > I have no idea why @decompress_lzo() is not satisfied with an @inbuf
> > > that is exactly the size of the disk bytes.
> > > Or maybe the compressed data has simply been damaged...
> > > 
> > > BTW, did the kernel you wrote your data with have the following commit
> > > for btrfs?
> > > 
> > > 	commit: 59516f6017c589e7316418fda6128ba8f829a77f
> > 
> > mmh, I used the master branch which is still on 3.14.2 (from k.org).
> > 
> > Ah, there is a development branch on another repo (repo.or.cz). Why oh why?
> 
> Gui,
> 
> sorry to quote an earlier mail; I forgot to add you as CC on your latest post
> and I'm not subscribed to the list.
> 
> > There is a development branch for btrfs-progs from David:
> > http://github.com/kdave/btrfs-progs.git if you would like to try.
> 
> ok, thanks will try.
> 
> > But here, what I mean is your *kernel* version when you wrote your data.
> 
> I've been using btrfs since 3.14 or so (and maybe also some random distro kernel
> based on 3.11). The partition contained a lot of larger git trees and virtual 
> machines - yes, not ideal for btrfs but a nice testcase ...
> 
> > There is a change for btrfs-restore which depends on a kernel commit.
> > If you wrote your data with an older kernel and then use the 3.14.2
> > btrfs-progs to restore, you may see strange results.
> 
> Wow. That should never happen, I think. Userspace should always be able to fix
> corruptions made by earlier kernels (except disk layout changes maybe).
> 
> > For now, this scenario is only a suspicion.
> 
> Possible. So how to proceed? If I check out the latest btrfs-progs from the repo
> above and restore again, are you still interested in the results?

Ah, I think you could clone the progs from the repo and apply the two
small changes that I mentioned before.
Yes, I am still trying to follow the issues with restore. It seems
btrfs-restore needs more effort from btrfs developers, since it doesn't
survive tough scenarios.
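
As an illustration of the kind of hardening this would take (a minimal
sketch only, not the actual cmds-restore.c code): before handing a
compressed extent to the decompressor, restore could check that every
length header stays within the bytes that were actually read, since the
valgrind log shows lzo1x_decompress_safe reading past the end of the
input buffer. The layout assumed here (a 4-byte little-endian total
length followed by segments that each carry a 4-byte length) is a
simplification that ignores details such as padding at page boundaries,
and check_lzo_extent() and read_le32() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZO_LEN 4 /* size of each little-endian length header (assumed layout) */

/* Read a 32-bit little-endian value without assuming alignment. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: walk the segment headers of a compressed extent
 * and check that every length stays inside the buffer we actually read.
 * Returns the number of segments, or -1 if any header points past the end.
 */
static int check_lzo_extent(const unsigned char *inbuf, size_t in_len)
{
	size_t pos, tot_len;
	int segments = 0;

	assert(inbuf != NULL);
	if (in_len < LZO_LEN)
		return -1;	/* not even room for the total length */

	tot_len = read_le32(inbuf);
	if (tot_len < LZO_LEN || tot_len > in_len)
		return -1;	/* total length header exceeds the buffer */

	pos = LZO_LEN;
	while (pos < tot_len) {
		size_t seg_len;

		if (tot_len - pos < LZO_LEN)
			return -1;	/* no room for a segment header */
		seg_len = read_le32(inbuf + pos);
		pos += LZO_LEN;
		if (seg_len == 0 || seg_len > tot_len - pos)
			return -1;	/* segment would run past the data */
		pos += seg_len;
		segments++;
	}
	return segments;
}
```

A caller would skip the extent (or try another copy) when this returns
-1, instead of passing the buffer on to the decompressor.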

> It seems there are lots of people reporting corruptions on the list and also
> lots of fixes posted. Maybe it's better to start afresh (format the
> partition) and report any problems that happen after that. What do you think?

Oh, I think you've just found a really good case for btrfs-restore.
Maybe you could keep an image of that, just like Zooko did here:
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg36701.html

Thanks,
-Gui

> Marc



Patch

diff --git a/cmds-restore.c b/cmds-restore.c
index dde7de8..ae1ea72 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -297,7 +297,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
        ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
        offset = btrfs_file_extent_offset(leaf, fi);
        num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
-       size_left = disk_size;
+       size_left = num_bytes;
        if (compress == BTRFS_COMPRESS_NONE)
                bytenr += offset;