diff mbox

btrfs-show vs. btrfs different output

Message ID 514B346D.1010403@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Eric Sandeen March 21, 2013, 4:25 p.m. UTC
On 3/21/13 10:29 AM, Jon Nelson wrote:
> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 3/21/13 10:04 AM, Jon Nelson wrote:
> ...
>>> 2. the current git btrfs-show and btrfs fi show both output
>>> *different* devices for device with UUID
>>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they're both wrong.
>>
>> does blkid output find that uuid anywhere?
>>
>> Since you're working in git, can you maybe do a little bisecting
>> to find out when it changed?  Should be a fairly quick test?
> 
> blkid does /not/ report that uuid anywhere.
> 
> git bisect, if I did it correctly, says:
> 
> 
> 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit
> commit 6eba9002956ac40db87d42fb653a0524dc568810
> Author: Goffredo Baroncelli <kreijack@inwind.it>
> Date:   Tue Sep 4 19:59:26 2012 +0200
> 
>     Correct un-initialized fsid variable
> 
> :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243
> 03952051b5e25e0b67f0f910c84d93eb90de8480 M      disk-io.c

Ok, I think this is another case of greedily scanning stale
backup superblocks (did you ever have btrfs on the whole sda
or sdb?)

btrfs_read_dev_super() currently tries to scan all 3 superblocks
(primary & 2 backups).  I'm guessing that you have some stale
backup superblocks on sda and/or sdb.

Before the above commit, if the first sb didn't look valid,
it'd skip to the 2nd.  If the 2nd (stale) one looked OK,
it'd compare its fsid to an uniniitialized variable (fsid)
which would fail (since the "fsid" contents were random.)
Same for the 3rd backup if found, and eventually it'd return
-1 as failure and not report the device.

After the commit, it'd skip the first invalid sb as well.
But this time, it takes the fsid from the 2nd superblock as
"good" and makes it through the loop thinking that it's found
something valid.  Hence the report of a device which you didn't
expect even though the first superblock is indeed wiped out.

There are some patches floating around to stop this
backup superblock scanning altogether.

This might fix it for you; it basically returns failure
if any superblock on the device is found to be bad.

What we really need is the right bits in the right places
to let the administrator know if a device looks like it might
be corrupt & in need of fixing, vs. ignoring it altogether.

Not sure if this is something we want upstream but you could
test if if you like.

-Eric

[PATCH] Fail btrfs_read_dev_super if any super looks invalid.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Anand Jain March 22, 2013, 3:06 a.m. UTC | #1
> What we really need is the right bits in the right places
> to let the administrator know if a device looks like it might
> be corrupt & in need of fixing, vs. ignoring it altogether.


--
 > 2. the current git btrfs-show and btrfs fi show both output
 > *different* devices for device with UUID
 > b5dc52bd-21bf-4173-8049-d54d88c82240, and they're both wrong.
--

  here per Jon experience showing stale btrfs on /dev/sdb (note
  that good btrfs is on /dev/sdb3) is wrong.

  More over when we show something like /dev/sdb and /dev/sdb3
  both have two different btrfs, its a bit scary.

  I would vote for ignoring it altogether since we have
  btrfs-find-root and btrfs-select-super to help find the
  backup SB for recovery if needed.

Thanks.

-Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jon Nelson March 22, 2013, 1:59 p.m. UTC | #2
On Thu, Mar 21, 2013 at 11:25 AM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 3/21/13 10:29 AM, Jon Nelson wrote:
>> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>>> On 3/21/13 10:04 AM, Jon Nelson wrote:
>> ...
>>>> 2. the current git btrfs-show and btrfs fi show both output
>>>> *different* devices for device with UUID
>>>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they're both wrong.
>>>
>>> does blkid output find that uuid anywhere?
>>>
>>> Since you're working in git, can you maybe do a little bisecting
>>> to find out when it changed?  Should be a fairly quick test?
>>
>> blkid does /not/ report that uuid anywhere.
>>
>> git bisect, if I did it correctly, says:
>>
>>
>> 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit
>> commit 6eba9002956ac40db87d42fb653a0524dc568810
>> Author: Goffredo Baroncelli <kreijack@inwind.it>
>> Date:   Tue Sep 4 19:59:26 2012 +0200
>>
>>     Correct un-initialized fsid variable
>>
>> :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243
>> 03952051b5e25e0b67f0f910c84d93eb90de8480 M      disk-io.c
>
> Ok, I think this is another case of greedily scanning stale
> backup superblocks (did you ever have btrfs on the whole sda
> or sdb?)
>
> btrfs_read_dev_super() currently tries to scan all 3 superblocks
> (primary & 2 backups).  I'm guessing that you have some stale
> backup superblocks on sda and/or sdb.
>
> Before the above commit, if the first sb didn't look valid,
> it'd skip to the 2nd.  If the 2nd (stale) one looked OK,
> it'd compare its fsid to an uniniitialized variable (fsid)
> which would fail (since the "fsid" contents were random.)
> Same for the 3rd backup if found, and eventually it'd return
> -1 as failure and not report the device.
>
> After the commit, it'd skip the first invalid sb as well.
> But this time, it takes the fsid from the 2nd superblock as
> "good" and makes it through the loop thinking that it's found
> something valid.  Hence the report of a device which you didn't
> expect even though the first superblock is indeed wiped out.
>
> There are some patches floating around to stop this
> backup superblock scanning altogether.
>
> This might fix it for you; it basically returns failure
> if any superblock on the device is found to be bad.
>
> What we really need is the right bits in the right places
> to let the administrator know if a device looks like it might
> be corrupt & in need of fixing, vs. ignoring it altogether.
>
> Not sure if this is something we want upstream but you could
> test if if you like.

I did test and it appears to resolve the issue for me.
Thank you!
Eric Sandeen March 22, 2013, 10:34 p.m. UTC | #3
On 3/22/13 8:59 AM, Jon Nelson wrote:
> On Thu, Mar 21, 2013 at 11:25 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 3/21/13 10:29 AM, Jon Nelson wrote:
>>> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>>>> On 3/21/13 10:04 AM, Jon Nelson wrote:
>>> ...
>>>>> 2. the current git btrfs-show and btrfs fi show both output
>>>>> *different* devices for device with UUID
>>>>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they're both wrong.
>>>>
>>>> does blkid output find that uuid anywhere?
>>>>
>>>> Since you're working in git, can you maybe do a little bisecting
>>>> to find out when it changed?  Should be a fairly quick test?
>>>
>>> blkid does /not/ report that uuid anywhere.
>>>
>>> git bisect, if I did it correctly, says:
>>>
>>>
>>> 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit
>>> commit 6eba9002956ac40db87d42fb653a0524dc568810
>>> Author: Goffredo Baroncelli <kreijack@inwind.it>
>>> Date:   Tue Sep 4 19:59:26 2012 +0200
>>>
>>>     Correct un-initialized fsid variable
>>>
>>> :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243
>>> 03952051b5e25e0b67f0f910c84d93eb90de8480 M      disk-io.c
>>
>> Ok, I think this is another case of greedily scanning stale
>> backup superblocks (did you ever have btrfs on the whole sda
>> or sdb?)
>>
>> btrfs_read_dev_super() currently tries to scan all 3 superblocks
>> (primary & 2 backups).  I'm guessing that you have some stale
>> backup superblocks on sda and/or sdb.
>>
>> Before the above commit, if the first sb didn't look valid,
>> it'd skip to the 2nd.  If the 2nd (stale) one looked OK,
>> it'd compare its fsid to an uniniitialized variable (fsid)
>> which would fail (since the "fsid" contents were random.)
>> Same for the 3rd backup if found, and eventually it'd return
>> -1 as failure and not report the device.
>>
>> After the commit, it'd skip the first invalid sb as well.
>> But this time, it takes the fsid from the 2nd superblock as
>> "good" and makes it through the loop thinking that it's found
>> something valid.  Hence the report of a device which you didn't
>> expect even though the first superblock is indeed wiped out.
>>
>> There are some patches floating around to stop this
>> backup superblock scanning altogether.
>>
>> This might fix it for you; it basically returns failure
>> if any superblock on the device is found to be bad.
>>
>> What we really need is the right bits in the right places
>> to let the administrator know if a device looks like it might
>> be corrupt & in need of fixing, vs. ignoring it altogether.
>>
>> Not sure if this is something we want upstream but you could
>> test if if you like.
> 
> I did test and it appears to resolve the issue for me.
> Thank you!

Thanks.  I need to get back to finding the right overall solution
here, but have been busy elsewhere.  It's on the list ;)

Anand is looking at it too and has some patches on the list.

-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/disk-io.c b/disk-io.c
index a9fd374..3aca16f 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -1104,7 +1104,6 @@  struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr,
 int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 {
 	u8 fsid[BTRFS_FSID_SIZE];
-	int fsid_is_initialized = 0;
 	struct btrfs_super_block buf;
 	int i;
 	int ret;
@@ -1130,24 +1129,22 @@  int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 		if (ret < sizeof(buf))
 			break;
 
-		if (btrfs_super_bytenr(&buf) != bytenr )
-			continue;
-		/* if magic is NULL, the device was removed */
-		if (buf.magic == 0 && i == 0) 
+		/* Bail out on any invalid superblock */
+		if (btrfs_super_bytenr(&buf) != bytenr ||
+		    buf.magic != cpu_to_le64(BTRFS_MAGIC))
 			return -1;
-		if (buf.magic != cpu_to_le64(BTRFS_MAGIC))
-			continue;
 
-		if (!fsid_is_initialized) {
+		/* if sb 0 looks valid use its fsid as trusted */
+		if (i == 0)
 			memcpy(fsid, buf.fsid, sizeof(fsid));
-			fsid_is_initialized = 1;
-		} else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
+		else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
 			/*
 			 * the superblocks (the original one and
 			 * its backups) contain data of different
-			 * filesystems -> the super cannot be trusted
+			 * filesystems -> the super cannot be trusted,
+			 * hence the fs cannot be trusted.
 			 */
-			continue;
+			return -1;
 		}
 
 		if (btrfs_super_generation(&buf) > transid) {