btrfs-progs: a copy of superblock is zero may not mean btrfs is not there

Message ID 1365753306-19412-1-git-send-email-anand.jain@oracle.com (mailing list archive)
State Under Review, archived

Commit Message

Anand Jain April 12, 2013, 7:55 a.m. UTC
If one copy of the superblock is zeroed, that does not
confirm that btrfs is absent from that disk. When there
is more than one copy of the superblock, we should
rather let the for loop continue and check the other copies.

The following test case and results justify the
fix:

mkfs.btrfs /dev/sdb /dev/sdc -f
mount /dev/sdb /btrfs
dd if=/dev/zero bs=1 count=8 of=/dev/sdc seek=$((64*1024+64))
~/before/btrfs-select-super -s 1 /dev/sdc
using SB copy 1, bytenr 67108864

Here btrfs-select-super just wrote a superblock to a mounted btrfs.

with the fix:
./btrfs-select-super -s 1 /dev/sdc
/dev/sdc is currently mounted. Aborting.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 disk-io.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
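
For reference, the dd offset in the test case, 64*1024+64, is the primary
superblock offset (64 KiB) plus the offset of the 8-byte magic field inside
the superblock (64 bytes), and the bytenr 67108864 reported for copy 1 is
64 MiB. A minimal C sketch of that arithmetic follows. The macro and function
names are illustrative assumptions, not the btrfs-progs identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Layout constants as used by btrfs; the names here are illustrative only. */
#define SB_PRIMARY_OFFSET (64ULL * 1024) /* copy 0 lives at 64 KiB */
#define SB_MIRROR_SHIFT   12
#define SB_MAX_MIRRORS    3
#define SB_MAGIC_OFFSET   64 /* csum[32] + fsid[16] + bytenr (8) + flags (8) */

/* Byte offset of superblock copy `mirror` on the device. */
static uint64_t sb_copy_offset(int mirror)
{
	assert(mirror >= 0 && mirror < SB_MAX_MIRRORS);
	if (mirror == 0)
		return SB_PRIMARY_OFFSET;
	/* 16 KiB << (12 * mirror): 64 MiB for copy 1, 256 GiB for copy 2 */
	return (16ULL * 1024) << (SB_MIRROR_SHIFT * mirror);
}

/* Byte offset of the magic field of superblock copy `mirror`. */
static uint64_t sb_magic_offset(int mirror)
{
	return sb_copy_offset(mirror) + SB_MAGIC_OFFSET;
}
```

sb_magic_offset(0) gives the seek value passed to dd above, so the 8 bytes
written there zero exactly the magic of the primary copy and nothing else.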

Comments

David Sterba April 16, 2013, 11:57 a.m. UTC | #1
On Fri, Apr 12, 2013 at 03:55:06PM +0800, Anand Jain wrote:
> If one of the copy of the superblock is zero it does not
> confirm to us that btrfs isn't there on that disk. When
> we are having more than one copy of superblock we should
> rather let the for loop to continue to check other copies.
> 
> the following test case and results would justify the
> fix
> 
> mkfs.btrfs /dev/sdb /dev/sdc -f
> mount /dev/sdb /btrfs
> dd if=/dev/zero bs=1 count=8 of=/dev/sdc seek=$((64*1024+64))
> ~/before/btrfs-select-super -s 1 /dev/sdc
> using SB copy 1, bytenr 67108864
> 
> here btrfs-select-super just wrote superblock to a mounted btrfs

Why doesn't check_mounted() catch this in the first place? I.e., based on
the status in /proc/mounts, not on random bytes in the superblock.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain April 17, 2013, 2:19 a.m. UTC | #2
On 04/16/2013 07:57 PM, David Sterba wrote:
> On Fri, Apr 12, 2013 at 03:55:06PM +0800, Anand Jain wrote:
>> If one of the copy of the superblock is zero it does not
>> confirm to us that btrfs isn't there on that disk. When
>> we are having more than one copy of superblock we should
>> rather let the for loop to continue to check other copies.
>>
>> the following test case and results would justify the
>> fix
>>
>> mkfs.btrfs /dev/sdb /dev/sdc -f
>> mount /dev/sdb /btrfs
>> dd if=/dev/zero bs=1 count=8 of=/dev/sdc seek=$((64*1024+64))
>> ~/before/btrfs-select-super -s 1 /dev/sdc
>> using SB copy 1, bytenr 67108864
>>
>> here btrfs-select-super just wrote superblock to a mounted btrfs
>
> Why does not check_mounted() catch this in the first place? Ie. based on
> the status in /proc/mounts not on random bytes in the superblock.

  The reason is that, as of now, /proc/mounts only knows about devid 1.

Thanks, Anand
David Sterba April 17, 2013, 5:12 p.m. UTC | #3
On Wed, Apr 17, 2013 at 10:19:09AM +0800, Anand Jain wrote:
> 
> 
> On 04/16/2013 07:57 PM, David Sterba wrote:
> >On Fri, Apr 12, 2013 at 03:55:06PM +0800, Anand Jain wrote:
> >>If one of the copy of the superblock is zero it does not
> >>confirm to us that btrfs isn't there on that disk. When
> >>we are having more than one copy of superblock we should
> >>rather let the for loop to continue to check other copies.
> >>
> >>the following test case and results would justify the
> >>fix
> >>
> >>mkfs.btrfs /dev/sdb /dev/sdc -f
> >>mount /dev/sdb /btrfs
> >>dd if=/dev/zero bs=1 count=8 of=/dev/sdc seek=$((64*1024+64))
> >>~/before/btrfs-select-super -s 1 /dev/sdc
> >>using SB copy 1, bytenr 67108864
> >>
> >>here btrfs-select-super just wrote superblock to a mounted btrfs
> >
> >Why does not check_mounted() catch this in the first place? Ie. based on
> >the status in /proc/mounts not on random bytes in the superblock.
> 
>  the reason is, as of now /proc/mounts just knows about the devid 1.

My oversight, it's mkfs on sdb and select-super on sdc, but then sdc is
already open and the open(O_EXCL) should prevent that, right? The same
way mkfs checks whether all the devices are available.
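
A minimal sketch of the O_EXCL check mentioned above, assuming Linux
semantics, where opening a block device with O_EXCL (without O_CREAT) fails
with EBUSY if the device is in use by the system, for example mounted. The
helper name probe_exclusive() is hypothetical and is not a btrfs-progs
function.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open `path` exclusively.
 * Returns 0 if the open succeeded, or a negative errno value otherwise.
 * On Linux, -EBUSY for a block device means it is already in use.
 */
static int probe_exclusive(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY | O_EXCL);
	if (fd < 0)
		return -errno;
	close(fd);
	return 0;
}
```

A tool could call probe_exclusive("/dev/sdc") before writing and abort on
-EBUSY. This only tells whether the device is held at the moment of the
call; it says nothing about what is on the disk.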

david
Anand Jain April 18, 2013, 8:36 a.m. UTC | #4
>>>> If one of the copy of the superblock is zero it does not
>>>> confirm to us that btrfs isn't there on that disk. When
>>>> we are having more than one copy of superblock we should
>>>> rather let the for loop to continue to check other copies.
>>>>
>>>> the following test case and results would justify the
>>>> fix
>>>>
>>>> mkfs.btrfs /dev/sdb /dev/sdc -f
>>>> mount /dev/sdb /btrfs
>>>> dd if=/dev/zero bs=1 count=8 of=/dev/sdc seek=$((64*1024+64))
>>>> ~/before/btrfs-select-super -s 1 /dev/sdc
>>>> using SB copy 1, bytenr 67108864
>>>>
>>>> here btrfs-select-super just wrote superblock to a mounted btrfs
>>>
>>> Why does not check_mounted() catch this in the first place? Ie. based on
>>> the status in /proc/mounts not on random bytes in the superblock.
>>
>>   the reason is, as of now /proc/mounts just knows about the devid 1.
>
> My oversight, it's mkfs on sdb and select-super on sdc, but then sdc is
> already open and the open(O_EXCL) should prevent that, right? The same
> way mkfs checks whether all the devices are available.

  Thanks for the comments.
  Checking with O_EXCL would help to avoid this problem,
  but it doesn't address the actual problem.

  Here, IMO, this is wrong:
------
int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr, u64 flags)
{
::
                 /* if magic is NULL, the device was removed */ <----
                 if (buf.magic == 0 && i == 0)
                       return -1;
-----
  since it inhibits the check of the backup superblocks
  when the primary superblock is wrongly overwritten with
  zeros.

  In general, callers which set the BTRFS_SCAN_BACKUP_SB flag
  are affected.

  For example, btrfs-find-root would be expected to fail, and it
  does, as in the example below with a single disk.

--------
mkfs.btrfs /dev/dm-5 -f
dd if=/dev/zero bs=1 count=8 of=/dev/dm-5 seek=$((64*1024+64))

~/before/btrfs-find-root /dev/dm-5
No valid Btrfs found on /dev/dm-5
Open ctree failed

with the fix:
btrfs-find-root /dev/dm-5
Super think's the tree root is at 29364224, chunk root 20971520
Well block 4194304 seems great, but generation doesn't match, have=2, want=4
Well block 4206592 seems great, but generation doesn't match, have=3, want=4
Found tree root at 29364224
----------
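
The scan behaviour argued for above can be sketched as follows. This is a
simplified illustration, not the btrfs-progs code: struct sb_copy,
pick_super() and SKETCH_BTRFS_MAGIC are hypothetical names, the copies are
assumed to be already read into memory, and byte order conversion is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BTRFS_MAGIC 0x4D5F53665248425FULL /* ASCII "_BHRfS_M" */

/* Simplified, hypothetical stand-in for the fields the loop inspects. */
struct sb_copy {
	uint64_t bytenr;     /* offset the copy claims to live at */
	uint64_t magic;
	uint64_t generation;
};

/*
 * Return the index of the valid copy with the highest generation,
 * or -1 if no copy is valid. expected[i] is the offset copy i was read from.
 */
static int pick_super(const struct sb_copy *copies, const uint64_t *expected,
		      size_t n)
{
	int best = -1;
	uint64_t best_gen = 0;

	assert(copies != NULL && expected != NULL);
	for (size_t i = 0; i < n; i++) {
		if (copies[i].bytenr != expected[i])
			continue;
		/* A zeroed magic on copy 0 skips that copy, it does not abort. */
		if (copies[i].magic != SKETCH_BTRFS_MAGIC)
			continue;
		if (copies[i].generation > best_gen) {
			best_gen = copies[i].generation;
			best = (int)i;
		}
	}
	return best;
}
```

With the primary magic zeroed and copy 1 intact, the loop still returns
copy 1, which matches the btrfs-find-root result shown with the fix.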

Thanks, Anand

Patch

diff --git a/disk-io.c b/disk-io.c
index 589b37a..3f85c21 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -1138,9 +1138,12 @@  int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr,
 
 		if (btrfs_super_bytenr(&buf) != bytenr )
 			continue;
-		/* if magic is NULL, the device was removed */
-		if (buf.magic == 0 && i == 0) 
-			return -1;
+		/* if magic is NULL, either the device was removed
+		*  OR user / application inflected the disk albeit
+		*  with the most common zeros.
+		*  so only this doesn't confirm that this disk
+		*  isn't part of btrfs
+		*/
 		if (buf.magic != cpu_to_le64(BTRFS_MAGIC))
 			continue;