diff mbox

btrfs-progs: wipe copies of the stale superblock beyond -b size

Message ID 20180326082742.9235-10-anand.jain@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Anand Jain March 26, 2018, 8:27 a.m. UTC
During the mkfs.btrfs -b <blockcount> btrfs_prepare_device() zeros all
the superblock bytenr locations only if the bytenr is below the
blockcount. The problem with this is that if the BTRFS is recreated
with a smaller size then we will leave the stale superblock in the disk
which shall confuse the recovery. As shown in the test case below.

 mkfs.btrfs -qf /dev/mapper/vg-lv
 mkfs.btrfs -qf -b1G /dev/mapper/vg-lv
 btrfs in dump-super -a /dev/mapper/vg-lv | grep '.fsid|superblock:'

 superblock: bytenr=65536, device=/dev/mapper/vg-lv
 dev_item.fsid		ebc67d01-7fc5-43f0-90b4-d1925002551e [match]
 superblock: bytenr=67108864, device=/dev/mapper/vg-lv
 dev_item.fsid		ebc67d01-7fc5-43f0-90b4-d1925002551e [match]
 superblock: bytenr=274877906944, device=/dev/mapper/vg-lv
 dev_item.fsid		b97a9206-593b-4933-a424-c6a6ee23fe7c [match]

So if we find a valid superblock zero it even if it's beyond the
blockcount.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 utils.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Comments

Nikolay Borisov March 26, 2018, 7:25 p.m. UTC | #1
On 26.03.2018 11:27, Anand Jain wrote:
> During the mkfs.btrfs -b <blockcount> btrfs_prepare_device() zeros all
> the superblock bytenr locations only if the bytenr is below the
> blockcount. The problem with this is that if the BTRFS is recreated
> with a smaller size then we will leave the stale superblock in the disk
> which shall confuse the recovery. As shown in the test case below.
> 
>  mkfs.btrfs -qf /dev/mapper/vg-lv
>  mkfs.btrfs -qf -b1G /dev/mapper/vg-lv
>  btrfs in dump-super -a /dev/mapper/vg-lv | grep '.fsid|superblock:'
> 
>  superblock: bytenr=65536, device=/dev/mapper/vg-lv
>  dev_item.fsid		ebc67d01-7fc5-43f0-90b4-d1925002551e [match]
>  superblock: bytenr=67108864, device=/dev/mapper/vg-lv
>  dev_item.fsid		ebc67d01-7fc5-43f0-90b4-d1925002551e [match]
>  superblock: bytenr=274877906944, device=/dev/mapper/vg-lv
>  dev_item.fsid		b97a9206-593b-4933-a424-c6a6ee23fe7c [match]
> 
> So if we find a valid superblock zero it even if it's beyond the
> blockcount.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  utils.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/utils.c b/utils.c
> index e9cb3a82fda6..6a9408b06e73 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -365,6 +365,41 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
>  		return 1;
>  	}
>  
> +	/*
> +	 * Check for the BTRFS SB copies up until btrfs_device_size() and zero
> +	 * it. So that kernel (or user for the manual recovery) don't have to
> +	 * confuse with the stale SB copy during recovery.
> +	 */
> +	if (block_count != btrfs_device_size(fd, &st)) {
> +		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
> +			struct btrfs_super_block *disk_super;
> +			char buf[BTRFS_SUPER_INFO_SIZE];
> +			disk_super = (struct btrfs_super_block *)buf;
> +
> +			/* Already zeroed above */
> +			if (btrfs_sb_offset(i) < block_count)
> +				continue;
> +
> +			/* Beyond actual disk size */
> +			if (btrfs_sb_offset(i) >= btrfs_device_size(fd, &st))
> +				continue;
> +
> +			/* Does not contain any stale SB */
> +			if (btrfs_read_dev_super(fd, disk_super,
> +						 btrfs_sb_offset(i), 0))
> +				continue;
> +
> +			ret = zero_dev_clamped(fd, btrfs_sb_offset(i),
> +						BTRFS_SUPER_INFO_SIZE,
> +						btrfs_device_size(fd, &st));
> +			if (ret < 0) {
> +				error("failed to zero device '%s' bytenr %llu: %s",
> +					file, btrfs_sb_offset(i), strerror(-ret));
> +				return 1;
> +			}
> +		}
> +	}
> +

I told you this code can be made a lot simpler, simply by modifying the
last argument passed to zero_dev_clamped. I even posted the resulting
diff which was just 3 lines changed.

I agree that it's a good idea to wipe all available superblock when we
use -b. However I don't agree with your approach - it adds a loop, it
adds a bunch of checks and makes the complexity orders of magnitude
higher than it could be. So I'm asking again - is there any inherent
benefit which I'm missing in your newly added 35 lines of code against
just passing the block device to zero_dev_clamped and letting the
existing logic take care of everything?

>  	*block_count_ret = block_count;
>  	return 0;
>  }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain March 26, 2018, 11:50 p.m. UTC | #2
> I told you this code can be made a lot simpler, simply by modifying the
> last argument passed to zero_dev_clamped. I even posted the resulting
> diff which was just 3 lines changed.
> 
> I agree that it's a good idea to wipe all available superblock when we
> use -b. However I don't agree with your approach - it adds a loop, it
> adds a bunch of checks and makes the complexity orders of magnitude
> higher than it could be. So I'm asking again - is there any inherent
> benefit which I'm missing in your newly added 35 lines of code against
> just passing the block device to zero_dev_clamped and letting the
> existing logic take care of everything?

I had that idea for v1 as well, but I didn't do it because it would
zero bytenr_copy#2 even when there is no btrfs superblock, (which is
fine only with in block_count).

Some might view it as corrupting the usr data (for which they have
specified -b option?)? I am just discussing I don't have any usecase
to prove it though. Do you have any idea?
If you think it should be ok, I shall go ahead and zero without
checking for the btrfs superblock beyond block_count.

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nikolay Borisov March 27, 2018, 6:19 a.m. UTC | #3
On 27.03.2018 02:50, Anand Jain wrote:
> 
> 
>> I told you this code can be made a lot simpler, simply by modifying the
>> last argument passed to zero_dev_clamped. I even posted the resulting
>> diff which was just 3 lines changed.
>>
>> I agree that it's a good idea to wipe all available superblock when we
>> use -b. However I don't agree with your approach - it adds a loop, it
>> adds a bunch of checks and makes the complexity orders of magnitude
>> higher than it could be. So I'm asking again - is there any inherent
>> benefit which I'm missing in your newly added 35 lines of code against
>> just passing the block device to zero_dev_clamped and letting the
>> existing logic take care of everything?
> 
> I had that idea for v1 as well, but I didn't do it because it would
> zero bytenr_copy#2 even when there is no btrfs superblock, (which is
> fine only with in block_count).
> 
> Some might view it as corrupting the usr data (for which they have
> specified -b option?)? I am just discussing I don't have any usecase
> to prove it though. Do you have any idea?
> If you think it should be ok, I shall go ahead and zero without
> checking for the btrfs superblock beyond block_count.

Right, so the pertinent question is - is anyone expecting to do mkfs on
a volume irrespective of the options and expect to be able to recover
data prior to the mkfs? I think the answer is 'no', but let's see what
the community says.

> 
> Thanks, Anand
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain March 27, 2018, 11:27 p.m. UTC | #4
On 03/27/2018 02:19 PM, Nikolay Borisov wrote:
> 
> 
> On 27.03.2018 02:50, Anand Jain wrote:
>>
>>
>>> I told you this code can be made a lot simpler, simply by modifying the
>>> last argument passed to zero_dev_clamped. I even posted the resulting
>>> diff which was just 3 lines changed.
>>>
>>> I agree that it's a good idea to wipe all available superblock when we
>>> use -b. However I don't agree with your approach - it adds a loop, it
>>> adds a bunch of checks and makes the complexity orders of magnitude
>>> higher than it could be. So I'm asking again - is there any inherent
>>> benefit which I'm missing in your newly added 35 lines of code against
>>> just passing the block device to zero_dev_clamped and letting the
>>> existing logic take care of everything?
>>
>> I had that idea for v1 as well, but I didn't do it because it would
>> zero bytenr_copy#2 even when there is no btrfs superblock, (which is
>> fine only with in block_count).
>>
>> Some might view it as corrupting the usr data (for which they have
>> specified -b option?)? I am just discussing I don't have any usecase
>> to prove it though. Do you have any idea?
>> If you think it should be ok, I shall go ahead and zero without
>> checking for the btrfs superblock beyond block_count.
> 
> Right, so the pertinent question is - is anyone expecting to do mkfs on
> a volume irrespective of the options and expect to be able to recover
> data prior to the mkfs? I think the answer is 'no', but let's see what
> the community says.

  Writing zero even if there is no superblock is not a question of
  recovering btrfs, its a question whether there was any other good
  data that is intersting to the user, for which they have specified
  -b option.

  Asking community is a good idea, lets take a conservative approach
  to zero stale superblock only if we find a valid superblock, atleast
  for now, we can add cleanups later.

Thanks, Aannd

>>
>> Thanks, Anand
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/utils.c b/utils.c
index e9cb3a82fda6..6a9408b06e73 100644
--- a/utils.c
+++ b/utils.c
@@ -365,6 +365,41 @@  int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
 		return 1;
 	}
 
+	/*
+	 * Check for the BTRFS SB copies up until btrfs_device_size() and zero
+	 * it. So that kernel (or user for the manual recovery) don't have to
+	 * confuse with the stale SB copy during recovery.
+	 */
+	if (block_count != btrfs_device_size(fd, &st)) {
+		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+			struct btrfs_super_block *disk_super;
+			char buf[BTRFS_SUPER_INFO_SIZE];
+			disk_super = (struct btrfs_super_block *)buf;
+
+			/* Already zeroed above */
+			if (btrfs_sb_offset(i) < block_count)
+				continue;
+
+			/* Beyond actual disk size */
+			if (btrfs_sb_offset(i) >= btrfs_device_size(fd, &st))
+				continue;
+
+			/* Does not contain any stale SB */
+			if (btrfs_read_dev_super(fd, disk_super,
+						 btrfs_sb_offset(i), 0))
+				continue;
+
+			ret = zero_dev_clamped(fd, btrfs_sb_offset(i),
+						BTRFS_SUPER_INFO_SIZE,
+						btrfs_device_size(fd, &st));
+			if (ret < 0) {
+				error("failed to zero device '%s' bytenr %llu: %s",
+					file, btrfs_sb_offset(i), strerror(-ret));
+				return 1;
+			}
+		}
+	}
+
 	*block_count_ret = block_count;
 	return 0;
 }