[2/2] btrfs: btrfs_rm_device() should zero mirror SB as well

Message ID 1374773376-29853-2-git-send-email-anand.jain@oracle.com (mailing list archive)
State Superseded, archived

Commit Message

Anand Jain July 25, 2013, 5:29 p.m. UTC
It is very unlikely that all copies of the SB on a disk
are zeroed unintentionally; that should only happen when
the device is removed. This fix ensures that all copies
on the disk are zeroed when the disk is intentionally
removed.

reproducer:
-------------------
btrfs dev del /dev/sdc /btrfs
echo $?
0
umount /btrfs
btrfs fi show
Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
        Total devices 1 FS bytes used 28.00KiB
        devid    1 size 2.00GiB used 20.00MiB path /dev/sdb

./btrfs-select-super -s 1 /dev/sdc
using SB copy 1, bytenr 67108864

btrfs fi show
Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
        Total devices 1 FS bytes used 28.00KiB
        devid    2 size 2.00GiB used 0.00 path /dev/sdc   <-- WRONG
        devid    1 size 2.00GiB used 20.00MiB path /dev/sdb

mount /dev/sdc /btrfs
btrfs fi show --kernel
Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c mounted: /btrfs
        Group profile: metadata: single  data: single
        Total devices 1 FS bytes used 28.00KiB
        devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
---------------------

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)
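
For reference, the bytenr that btrfs-select-super prints in the reproducer (67108864 for copy 1) follows from where btrfs places its superblock copies. The sketch below restates that arithmetic in userspace C. The constants and the shift formula are written from memory of the kernel's btrfs headers of this period and should be checked against the tree; sb_copies_on_device() is a hypothetical helper that only reuses the bounds check from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values as recalled from the btrfs kernel headers; verify against the tree. */
#define BTRFS_SUPER_INFO_OFFSET   (64 * 1024)
#define BTRFS_SUPER_INFO_SIZE     4096
#define BTRFS_SUPER_MIRROR_MAX    3
#define BTRFS_SUPER_MIRROR_SHIFT  12

/* Byte offset of superblock copy `mirror` (0 is the primary copy). */
static uint64_t sb_offset(int mirror)
{
	uint64_t start = 16 * 1024;

	assert(mirror >= 0 && mirror < BTRFS_SUPER_MIRROR_MAX);
	if (mirror)
		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
	return BTRFS_SUPER_INFO_OFFSET;
}

/*
 * Hypothetical helper: count the copies that fit on a device of
 * dev_size bytes, using the same bounds check as the patch.
 */
static int sb_copies_on_device(uint64_t dev_size)
{
	int i, n = 0;

	for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
		if (sb_offset(i) + BTRFS_SUPER_INFO_SIZE >= dev_size)
			break;
		n++;
	}
	return n;
}
```

Under these assumptions, a 2.00GiB device such as /dev/sdc in the reproducer holds only copies 0 and 1, so after dev del zeroes the primary copy, the copy at 64MiB is the one that remains readable.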

Comments

Eric Sandeen Aug. 9, 2013, 8:47 p.m. UTC | #1
On 7/25/13 12:29 PM, Anand Jain wrote:
> It is very unlikely that all copies of the SB on a disk
> are zeroed unintentionally; that should only happen when
> the device is removed. This fix ensures that all copies
> on the disk are zeroed when the disk is intentionally
> removed.
> 
> reproducer:
> -------------------
> btrfs dev del /dev/sdc /btrfs
> echo $?
> 0
> umount /btrfs
> btrfs fi show
> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
>         Total devices 1 FS bytes used 28.00KiB
>         devid    1 size 2.00GiB used 20.00MiB path /dev/sdb

Great, so dev del makes it unfindable . . . 

> ./btrfs-select-super -s 1 /dev/sdc
> using SB copy 1, bytenr 67108864

Now you use a rescue tool to resurrect the fs from a backup
superblock.  And:

> btrfs fi show
> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
>         Total devices 1 FS bytes used 28.00KiB
>         devid    2 size 2.00GiB used 0.00 path /dev/sdc   <-- WRONG
>         devid    1 size 2.00GiB used 20.00MiB path /dev/sdb

Ok, now it's findable.  Isn't that exactly how this should behave?
What is wrong about this?

> mount /dev/sdc /btrfs
> btrfs fi show --kernel
> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c mounted: /btrfs
>         Group profile: metadata: single  data: single
>         Total devices 1 FS bytes used 28.00KiB
>         devid    1 size 2.00GiB used 20.00MiB path /dev/sdb

Oh good, you could bring it back after a potential administrative error,
using a recovery tool (btrfs-select-super)!  Isn't that a good thing?

IOW: what does this change actually fix?

-Eric

> ---------------------
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/volumes.c |   30 ++++++++++++++++++++++++++++++
>  1 files changed, 30 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 557a743..090f57c 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1641,12 +1641,42 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
>  	 * remove it from the devices list and zero out the old super
>  	 */
>  	if (clear_super && disk_super) {
> +		u64 bytenr;
> +		int i;
> +
>  		/* make sure this device isn't detected as part of
>  		 * the FS anymore
>  		 */
>  		memset(&disk_super->magic, 0, sizeof(disk_super->magic));
>  		set_buffer_dirty(bh);
>  		sync_dirty_buffer(bh);
> +
> +		/* clear the mirror copies of the super block on the
> +		 * disk being removed; the 0th copy has been taken care
> +		 * of above, and the loop below takes care of the rest
> +		 */
> +		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
> +			brelse(bh);
> +			bytenr = btrfs_sb_offset(i);
> +			if (bytenr + BTRFS_SUPER_INFO_SIZE >=
> +					i_size_read(bdev->bd_inode))
> +				break;
> +			bh = __bread(bdev, bytenr / 4096,
> +					BTRFS_SUPER_INFO_SIZE);
> +			if (!bh)
> +				continue;
> +
> +			disk_super = (struct btrfs_super_block *)bh->b_data;
> +
> +			if (btrfs_super_bytenr(disk_super) != bytenr ||
> +				btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
> +				continue;
> +			}
> +			memset(&disk_super->magic, 0,
> +						sizeof(disk_super->magic));
> +			set_buffer_dirty(bh);
> +			sync_dirty_buffer(bh);
> +		}
>  	}
>  
>  	ret = 0;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain Aug. 10, 2013, 2:04 a.m. UTC | #2
>> btrfs fi show
>> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
>>          Total devices 1 FS bytes used 28.00KiB
>>          devid    2 size 2.00GiB used 0.00 path /dev/sdc   <-- WRONG
>>          devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
>
> Ok, now it's findable.  Isn't that exactly how this should behave?
> What is wrong about this?

  Total devices is still 1.

>> mount /dev/sdc /btrfs
>> btrfs fi show --kernel
>> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c mounted: /btrfs
>>          Group profile: metadata: single  data: single
>>          Total devices 1 FS bytes used 28.00KiB
>>          devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
>
> Oh good, you could bring it back after a potential administrative error,
> using a recovery tool (btrfs-select-super)!  Isn't that a good thing?

  Note that btrfs fi show was run here with the new
  --kernel option, and it does not show /dev/sdc even
  though that device was used for the mount. It's all
  messed up.

  If the user wants to bring back an intentionally deleted
  disk, they should call btrfs dev add instead, so that it
  takes care of integrating the disk back into the FS.

  Recovery tools are for recovering from corruption, and
  a delete is not corruption. It is an intentional step
  that the user decided to take, and the undo for it is
  'dev add'.

> IOW: what does this change actually fix?

  It writes zeros to all copies of the SB when a disk is
  deleted (previously we zeroed only the first copy).
  That way a deleted disk can be told apart from a
  corrupted one.

  Otherwise, allowing these things would cost us in terms
  of support for administrative errors, which we don't
  have to encourage.

Thanks, Anand
Eric Sandeen Aug. 10, 2013, 10:10 p.m. UTC | #3
On 8/9/13 9:04 PM, anand jain wrote:
> 
> 
>>> btrfs fi show
>>> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
>>>          Total devices 1 FS bytes used 28.00KiB
>>>          devid    2 size 2.00GiB used 0.00 path /dev/sdc   <-- WRONG
>>>          devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
>>
>> Ok, now it's findable.  Isn't that exactly how this should behave?
>> What is wrong about this?
> 
>  Total devices is still 1.

Hm, so it is.  But that inconsistency indicates a bug
somewhere else, doesn't it?  (FWIW the above works for me,
w/ the right number of devices shown after removal...)

Anyway, I wonder if we've ever resolved the "when should we look
for backup superblocks?" question.  Because that would inform
decisions about when they should be read, when they must be zeroed, etc.

I thought the plan was that backup superblocks should not be
read unless we explicitly specify it via mount option or btrfs
commandline option.

If we must add code to zero all the superblocks on removal to fix
something that's still discovering them, that seems
to mean backups are still being automatically read.  Should they be?

What is the design intent for when backup SBs should be used?
Maybe then I could better understand the reason for this change.

Thanks,
-Eric

>>> mount /dev/sdc /btrfs
>>> btrfs fi show --kernel
>>> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c mounted: /btrfs
>>>          Group profile: metadata: single  data: single
>>>          Total devices 1 FS bytes used 28.00KiB
>>>          devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
>>
>> Oh good, you could bring it back after a potential administrative error,
>> using a recovery tool (btrfs-select-super)!  Isn't that a good thing?
> 
>  Note that btrfs fi show was run here with the new
>  --kernel option, and it does not show /dev/sdc even
>  though that device was used for the mount. It's all
>  messed up.
> 
>  If the user wants to bring back an intentionally deleted
>  disk, they should call btrfs dev add instead, so that it
>  takes care of integrating the disk back into the FS.
> 
>  Recovery tools are for recovering from corruption, and
>  a delete is not corruption. It is an intentional step
>  that the user decided to take, and the undo for it is
>  'dev add'.
> 
>> IOW: what does this change actually fix?
> 
>  It writes zeros to all copies of the SB when a disk is
>  deleted (previously we zeroed only the first copy).
>  That way a deleted disk can be told apart from a
>  corrupted one.
> 
>  Otherwise, allowing these things would cost us in terms
>  of support for administrative errors, which we don't
>  have to encourage.
> 
> Thanks, Anand

Anand Jain Aug. 12, 2013, 2:39 a.m. UTC | #4
Hi Eric,

>>>> btrfs fi show
>>>> Label: none  uuid: e7aae9f0-1aa8-41f5-8fb6-d4d8f80cdb2c
>>>>           Total devices 1 FS bytes used 28.00KiB
>>>>           devid    2 size 2.00GiB used 0.00 path /dev/sdc   <-- WRONG
>>>>           devid    1 size 2.00GiB used 20.00MiB path /dev/sdb
>>>
>>> Ok, now it's findable.  Isn't that exactly how this should behave?
>>> What is wrong about this?
>>
>>   Total devices is still 1.
>
> Hm, so it is.  But that inconsistency indicates a bug
> somewhere else, doesn't it?

  Nope. To undo a dev-delete, the admin has to use dev-add.

  Here the idea is to release the disk cleanly once
  we know we don't need it anymore.

Thanks, Anand

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 557a743..090f57c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1641,12 +1641,42 @@  int btrfs_rm_device(struct btrfs_root *root, char *device_path)
 	 * remove it from the devices list and zero out the old super
 	 */
 	if (clear_super && disk_super) {
+		u64 bytenr;
+		int i;
+
 		/* make sure this device isn't detected as part of
 		 * the FS anymore
 		 */
 		memset(&disk_super->magic, 0, sizeof(disk_super->magic));
 		set_buffer_dirty(bh);
 		sync_dirty_buffer(bh);
+
+		/* clear the mirror copies of the super block on the
+		 * disk being removed; the 0th copy has been taken care
+		 * of above, and the loop below takes care of the rest
+		 */
+		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+			brelse(bh);
+			bytenr = btrfs_sb_offset(i);
+			if (bytenr + BTRFS_SUPER_INFO_SIZE >=
+					i_size_read(bdev->bd_inode))
+				break;
+			bh = __bread(bdev, bytenr / 4096,
+					BTRFS_SUPER_INFO_SIZE);
+			if (!bh)
+				continue;
+
+			disk_super = (struct btrfs_super_block *)bh->b_data;
+
+			if (btrfs_super_bytenr(disk_super) != bytenr ||
+				btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+				continue;
+			}
+			memset(&disk_super->magic, 0,
+						sizeof(disk_super->magic));
+			set_buffer_dirty(bh);
+			sync_dirty_buffer(bh);
+		}
 	}
 
 	ret = 0;
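
To make the intent of the loop easier to follow outside the kernel, here is a minimal userspace sketch that applies the same steps to an in-memory disk image instead of a block device. It is not the patch's code: there is no buffer_head handling, the names (clear_all_super_magic, get_le64, put_le64, the SB_* macros) are hypothetical, and the field offsets (bytenr at byte 48, magic at byte 64) and the magic value are taken from my reading of struct btrfs_super_block and should be verified. Unlike the patch, which clears copy 0 unconditionally, the sketch applies the bytenr and magic check to every copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed on-disk layout values; verify against the kernel headers. */
#define SB_INFO_OFFSET   (64 * 1024)
#define SB_INFO_SIZE     4096
#define SB_MIRROR_MAX    3
#define SB_MIRROR_SHIFT  12
#define SB_MAGIC         0x4D5F53665248425FULL	/* "_BHRfS_M" */
#define SB_BYTENR_OFF    48	/* after csum[32] and fsid[16] */
#define SB_MAGIC_OFF     64	/* after bytenr and flags */

/* Byte offset of superblock copy `mirror` (0 is the primary copy). */
static uint64_t sb_offset(int mirror)
{
	uint64_t start = 16 * 1024;

	if (mirror)
		return start << (SB_MIRROR_SHIFT * mirror);
	return SB_INFO_OFFSET;
}

/* Read a little-endian 64-bit value regardless of host byte order. */
static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Write a little-endian 64-bit value; handy for building a test image. */
static void put_le64(uint8_t *p, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Zero the magic of every superblock copy that fits in the image and
 * still looks valid. Returns the number of copies cleared.
 */
static int clear_all_super_magic(uint8_t *img, uint64_t img_size)
{
	int i, cleared = 0;

	assert(img != NULL);
	for (i = 0; i < SB_MIRROR_MAX; i++) {
		uint64_t bytenr = sb_offset(i);
		uint8_t *sb;

		/* Same bounds check as the patch: stop at the first copy that does not fit. */
		if (bytenr + SB_INFO_SIZE >= img_size)
			break;
		sb = img + bytenr;

		/* Skip anything that does not look like a live copy at this offset. */
		if (get_le64(sb + SB_BYTENR_OFF) != bytenr ||
		    get_le64(sb + SB_MAGIC_OFF) != SB_MAGIC)
			continue;

		memset(sb + SB_MAGIC_OFF, 0, sizeof(uint64_t));
		cleared++;
	}
	return cleared;
}
```

A caller would build an image with put_le64() at sb_offset(0) and sb_offset(1), run clear_all_super_magic() once, and expect a second run to find nothing left to clear, which is the state the patch aims for after dev del.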