[v2] Btrfs: Check metadata redundancy on balance

Message ID muotgb$4mf$1@ger.gmane.org (mailing list archive)
State Superseded

Commit Message

Sam Tygier Oct. 3, 2015, 3:50 p.m. UTC
Currently BTRFS allows you to make bad choices of data and
metadata levels. For example, -d raid1 -m raid0 means you can
only use half your total disk space, but will lose everything
if 1 disk fails. It should give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such
a set of raid levels.

A simple way to hit this is to create a single-device fs,
which will default to single:dup, then add a second device and
attempt to convert to raid1 with the command
btrfs balance start -dconvert=raid1 /mnt
This will result in a filesystem with raid1:dup, which will not
survive the loss of one drive. I personally don't see why the tools
should allow this, but in the previous thread a warning was
considered sufficient.

Changes in v2
Use btrfs_get_num_tolerated_disk_barrier_failures()

Signed-off-by: Sam Tygier <samtygier@yahoo.co.uk>

From: Sam Tygier <samtygier@yahoo.co.uk>
Date: Sat, 3 Oct 2015 16:43:48 +0100
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance, check that the metadata mode
is at least as redundant as the data mode. For example, give a warning
when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Anand Jain Oct. 5, 2015, 2:33 a.m. UTC | #1
Sam,

On 10/03/2015 11:50 PM, sam tygier wrote:
> Currently BTRFS allows you to make bad choices of data and
> metadata levels. For example -d raid1 -m raid0 means you can
> only use half your total disk space, but will lose everything
> if 1 disk fails. It should give a warning in these cases.

  Nice test case. However, the way we calculate the impact of a
  lost device would be per chunk, as in the upcoming patch set.

     [PATCH 1/5] btrfs: Introduce a new function to check if all chunks a OK for degraded mount

  The above patch set should catch the bug here. Would you be able to
  confirm whether this patch is still needed, or apply your patch on top
  of it?

Thanks, Anand


> This patch is a follow up to
> [PATCH v2] btrfs-progs: check metadata redundancy
> in order to cover the case of using balance to convert to such
> a set of raid levels.
>
> A simple example to hit this is to create a single device fs,
> which will default to single:dup, then to add a second device and
> attempt to convert to raid1 with the command
> btrfs balance start -dconvert=raid1  /mnt
> this will result in a filesystem with raid1:dup, which will not
> survive the loss of one drive. I personally don't see why the tools
> should allow this, but in the previous thread a warning was
> considered sufficient.
>
> Changes in v2
> Use btrfs_get_num_tolerated_disk_barrier_failures()
>
> Signed-off-by: Sam Tygier <samtygier@yahoo.co.uk>
>
> From: Sam Tygier <samtygier@yahoo.co.uk>
> Date: Sat, 3 Oct 2015 16:43:48 +0100
> Subject: [PATCH] Btrfs: Check metadata redundancy on balance
>
> When converting a filesystem via balance check that metadata mode
> is at least as redundant as the data mode. For example give warning
> when:
> -dconvert=raid1 -mconvert=single
> ---
>   fs/btrfs/volumes.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 6fc73586..40247e9 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3584,6 +3584,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>   		}
>   	} while (read_seqretry(&fs_info->profiles_lock, seq));
>
> +	if (btrfs_get_num_tolerated_disk_barrier_failures(bctl->meta.target) <
> +		btrfs_get_num_tolerated_disk_barrier_failures(bctl->data.target)) {
> +		btrfs_info(fs_info,
> +			"Warning: metadata has lower redundancy than data\n");
> +	}
> +
>   	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
>   		fs_info->num_tolerated_disk_barrier_failures = min(
>   			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sam Tygier Oct. 7, 2015, 8:19 a.m. UTC | #2
On 05/10/15 03:33, Anand Jain wrote:
> 
> Sam,
> 
> On 10/03/2015 11:50 PM, sam tygier wrote:
>> Currently BTRFS allows you to make bad choices of data and
>> metadata levels. For example -d raid1 -m raid0 means you can
>> only use half your total disk space, but will lose everything
>> if 1 disk fails. It should give a warning in these cases.
> 
>   Nice test case. However, the way we calculate the impact of a
>   lost device would be per chunk, as in the upcoming patch set.
> 
>      [PATCH 1/5] btrfs: Introduce a new function to check if all chunks a OK for degraded mount
> 
>   The above patch set should catch the bug here. Would you be able to
>   confirm whether this patch is still needed, or apply your patch on top
>   of it?
> 
> Thanks, Anand
> 

If I understand the per-chunk work correctly, it is to handle the case
where, although there are not enough disks remaining to guarantee being
able to mount degraded, the arrangement of existing chunks happens to
allow it (e.g. all the single chunks happen to be on a surviving disk).
So while the example case in "[PATCH 0/5] Btrfs: Per-chunk degradable
check" can survive a 1 disk loss, the raid levels do not guarantee
survivability of a 1 disk loss after more data is written.

My patch warns about combinations of raid levels that give poor
guarantees when losing disks but still waste disk space. For example,
data=raid1 metadata=single wastes space by writing the data
twice, but would not guarantee survival of a 1 disk loss (even if the
per-chunk patches allow some 1 disk losses to survive) and could lose
everything if a bit flip happened in a critical metadata chunk.

So I think my patch is useful with or without the per-chunk work.

Thanks,
Sam

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6fc73586..40247e9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3584,6 +3584,12 @@  int btrfs_balance(struct btrfs_balance_control *bctl,
 		}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (btrfs_get_num_tolerated_disk_barrier_failures(bctl->meta.target) <
+		btrfs_get_num_tolerated_disk_barrier_failures(bctl->data.target)) {
+		btrfs_info(fs_info,
+			"Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		fs_info->num_tolerated_disk_barrier_failures = min(
 			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),