diff mbox

btrfs: Allow forced conversion of metadata to dup profile on multiple devices

Message ID 5304D781.6020309@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Austin S. Hemmelgarn Feb. 19, 2014, 4:10 p.m. UTC
Currently, btrfs balance start fails when trying to convert metadata or
system chunks to dup profile on filesystems with multiple devices.  This
requires that a conversion from a multi-device filesystem to a single
device filesystem use the following methodology:
    1. btrfs balance start -dconvert=single -mconvert=single \
       -sconvert=single -f /
    2. btrfs device delete / /dev/sdx
    3. btrfs balance start -mconvert=dup -sconvert=dup /
This results in a period of time (possibly very long if the devices are
big) where you don't have the protection guarantees of multiple copies
of metadata chunks.

After applying this patch, one can instead use the following methodology
for conversion from a multi-device filesystem to a single device
filesystem:
    1. btrfs balance start -dconvert=single -mconvert=dup \
       -sconvert=dup -f /
    2. btrfs device delete / /dev/sdx
This greatly reduces the chances of the operation causing data loss due
to a read error during the device delete.

Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
---
 fs/btrfs/volumes.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

 		allowed |= BTRFS_BLOCK_GROUP_RAID5;
@@ -3221,6 +3219,21 @@ int btrfs_balance(struct btrfs_balance_control
*bctl,
 				goto out;
 			}
 		}
+		if (((bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+		    (bctl->sys.target & ~BTRFS_BLOCK_GROUP_DUP) ||
+		    (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+		    (bctl->meta.target & ~BTRFS_BLOCK_GROUP_DUP)) &&
+		    (num_devs > 1)) {
+			if (bctl->flags & BTRFS_BALANCE_FORCE) {
+				btrfs_info(fs_info, "force conversion of metadata "
+					   "to dup profile on multiple devices");
+			} else {
+				btrfs_err(fs_info, "balance will reduce metadata "
+					  "integrity, use force if you want this");
+				ret = -EINVAL;
+				goto out;
+			}
+		}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
  	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {

Comments

David Sterba Feb. 20, 2014, 4:57 p.m. UTC | #1
On Wed, Feb 19, 2014 at 11:10:41AM -0500, Austin S Hemmelgarn wrote:
> Currently, btrfs balance start fails when trying to convert metadata or
> system chunks to dup profile on filesystems with multiple devices.  This
> requires that a conversion from a multi-device filesystem to a single
> device filesystem use the following methodology:
>     1. btrfs balance start -dconvert=single -mconvert=single \
>        -sconvert=single -f /
>     2. btrfs device delete / /dev/sdx
>     3. btrfs balance start -mconvert=dup -sconvert=dup /
> This results in a period of time (possibly very long if the devices are
> big) where you don't have the protection guarantees of multiple copies
> of metadata chunks.
> 
> After applying this patch, one can instead use the following methodology
> for conversion from a multi-device filesystem to a single device
> filesystem:
>     1. btrfs balance start -dconvert=single -mconvert=dup \
>        -sconvert=dup -f /
>     2. btrfs device delete / /dev/sdx
> This greatly reduces the chances of the operation causing data loss due
> to a read error during the device delete.
> 
> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.cz>

Sounds useful. The muliple devices + DUP is allowed setup when the
device is added, this patch only adds the 'delete' counterpart. The
imroved data loss protection during the process is a good thing.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Dryomov Feb. 24, 2014, 1:37 p.m. UTC | #2
On Thu, Feb 20, 2014 at 6:57 PM, David Sterba <dsterba@suse.cz> wrote:
> On Wed, Feb 19, 2014 at 11:10:41AM -0500, Austin S Hemmelgarn wrote:
>> Currently, btrfs balance start fails when trying to convert metadata or
>> system chunks to dup profile on filesystems with multiple devices.  This
>> requires that a conversion from a multi-device filesystem to a single
>> device filesystem use the following methodology:
>>     1. btrfs balance start -dconvert=single -mconvert=single \
>>        -sconvert=single -f /
>>     2. btrfs device delete / /dev/sdx
>>     3. btrfs balance start -mconvert=dup -sconvert=dup /
>> This results in a period of time (possibly very long if the devices are
>> big) where you don't have the protection guarantees of multiple copies
>> of metadata chunks.
>>
>> After applying this patch, one can instead use the following methodology
>> for conversion from a multi-device filesystem to a single device
>> filesystem:
>>     1. btrfs balance start -dconvert=single -mconvert=dup \
>>        -sconvert=dup -f /
>>     2. btrfs device delete / /dev/sdx
>> This greatly reduces the chances of the operation causing data loss due
>> to a read error during the device delete.
>>
>> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
> Reviewed-by: David Sterba <dsterba@suse.cz>
>
> Sounds useful. The muliple devices + DUP is allowed setup when the
> device is added, this patch only adds the 'delete' counterpart. The
> imroved data loss protection during the process is a good thing.

Hi,

Have you actually tried to queue it?  Unless I'm missing something, it won't
compile, and on top of that, it seems to be corrupted too..

IIRC muliple devices + DUP is "allowed" only until the first balance, has that
changed?

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Austin S. Hemmelgarn Feb. 24, 2014, 1:44 p.m. UTC | #3
On 2014-02-24 08:37, Ilya Dryomov wrote:
> On Thu, Feb 20, 2014 at 6:57 PM, David Sterba <dsterba@suse.cz> wrote:
>> On Wed, Feb 19, 2014 at 11:10:41AM -0500, Austin S Hemmelgarn wrote:
>>> Currently, btrfs balance start fails when trying to convert metadata or
>>> system chunks to dup profile on filesystems with multiple devices.  This
>>> requires that a conversion from a multi-device filesystem to a single
>>> device filesystem use the following methodology:
>>>     1. btrfs balance start -dconvert=single -mconvert=single \
>>>        -sconvert=single -f /
>>>     2. btrfs device delete / /dev/sdx
>>>     3. btrfs balance start -mconvert=dup -sconvert=dup /
>>> This results in a period of time (possibly very long if the devices are
>>> big) where you don't have the protection guarantees of multiple copies
>>> of metadata chunks.
>>>
>>> After applying this patch, one can instead use the following methodology
>>> for conversion from a multi-device filesystem to a single device
>>> filesystem:
>>>     1. btrfs balance start -dconvert=single -mconvert=dup \
>>>        -sconvert=dup -f /
>>>     2. btrfs device delete / /dev/sdx
>>> This greatly reduces the chances of the operation causing data loss due
>>> to a read error during the device delete.
>>>
>>> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
>> Reviewed-by: David Sterba <dsterba@suse.cz>
>>
>> Sounds useful. The muliple devices + DUP is allowed setup when the
>> device is added, this patch only adds the 'delete' counterpart. The
>> imroved data loss protection during the process is a good thing.
> 
> Hi,
> 
> Have you actually tried to queue it?  Unless I'm missing something, it won't
> compile, and on top of that, it seems to be corrupted too..
The patch itself was made using git, AFAICT it should be fine.  I've
personally built and tested it using UML.
> 
> IIRC muliple devices + DUP is "allowed" only until the first balance, has that
> changed?

This is just a limitation of how the kernel handles balances, DUP
profiles with multiple devices work, it's just terribly inefficient.
The primary use case is converting a multi-device FS with RAID for
metadata to a single device FS without having to reduce integrity.
> Thanks,
> 
>                 Ilya
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Dryomov Feb. 24, 2014, 2:12 p.m. UTC | #4
On Mon, Feb 24, 2014 at 3:44 PM, Austin S Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2014-02-24 08:37, Ilya Dryomov wrote:
>> On Thu, Feb 20, 2014 at 6:57 PM, David Sterba <dsterba@suse.cz> wrote:
>>> On Wed, Feb 19, 2014 at 11:10:41AM -0500, Austin S Hemmelgarn wrote:
>>>> Currently, btrfs balance start fails when trying to convert metadata or
>>>> system chunks to dup profile on filesystems with multiple devices.  This
>>>> requires that a conversion from a multi-device filesystem to a single
>>>> device filesystem use the following methodology:
>>>>     1. btrfs balance start -dconvert=single -mconvert=single \
>>>>        -sconvert=single -f /
>>>>     2. btrfs device delete / /dev/sdx
>>>>     3. btrfs balance start -mconvert=dup -sconvert=dup /
>>>> This results in a period of time (possibly very long if the devices are
>>>> big) where you don't have the protection guarantees of multiple copies
>>>> of metadata chunks.
>>>>
>>>> After applying this patch, one can instead use the following methodology
>>>> for conversion from a multi-device filesystem to a single device
>>>> filesystem:
>>>>     1. btrfs balance start -dconvert=single -mconvert=dup \
>>>>        -sconvert=dup -f /
>>>>     2. btrfs device delete / /dev/sdx
>>>> This greatly reduces the chances of the operation causing data loss due
>>>> to a read error during the device delete.
>>>>
>>>> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
>>> Reviewed-by: David Sterba <dsterba@suse.cz>
>>>
>>> Sounds useful. The muliple devices + DUP is allowed setup when the
>>> device is added, this patch only adds the 'delete' counterpart. The
>>> imroved data loss protection during the process is a good thing.
>>
>> Hi,
>>
>> Have you actually tried to queue it?  Unless I'm missing something, it won't
>> compile, and on top of that, it seems to be corrupted too..
> The patch itself was made using git, AFAICT it should be fine.  I've
> personally built and tested it using UML.

It doesn't look fine.  It was generated with git, but it got corrupted
on the way: either how you pasted it or the email client you use is the
problem.

On Wed, Feb 19, 2014 at 6:10 PM, Austin S Hemmelgarn
<ahferroin7@gmail.com> wrote:
> Currently, btrfs balance start fails when trying to convert metadata or
> system chunks to dup profile on filesystems with multiple devices.  This
> requires that a conversion from a multi-device filesystem to a single
> device filesystem use the following methodology:
>     1. btrfs balance start -dconvert=single -mconvert=single \
>        -sconvert=single -f /
>     2. btrfs device delete / /dev/sdx
>     3. btrfs balance start -mconvert=dup -sconvert=dup /
> This results in a period of time (possibly very long if the devices are
> big) where you don't have the protection guarantees of multiple copies
> of metadata chunks.
>
> After applying this patch, one can instead use the following methodology
> for conversion from a multi-device filesystem to a single device
> filesystem:
>     1. btrfs balance start -dconvert=single -mconvert=dup \
>        -sconvert=dup -f /
>     2. btrfs device delete / /dev/sdx
> This greatly reduces the chances of the operation causing data loss due
> to a read error during the device delete.
>
> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
> ---
>  fs/btrfs/volumes.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 07629e9..38a9522 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3152,10 +3152,8 @@ int btrfs_balance(struct btrfs_balance_control
> *bctl,

^^^, that should be a single line

>                 num_devices--;
>         }
>         btrfs_dev_replace_unlock(&fs_info->dev_replace);
> -       allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
> -       if (num_devices == 1)
> -               allowed |= BTRFS_BLOCK_GROUP_DUP;
> -       else if (num_devices > 1)
> +       allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE | BTRFS_BLOCK_GROUP_DUP;
> +       if (num_devices > 1)
>                 allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1);
>         if (num_devices > 2)
>                 allowed |= BTRFS_BLOCK_GROUP_RAID5;
> @@ -3221,6 +3219,21 @@ int btrfs_balance(struct btrfs_balance_control
> *bctl,

^^^, ditto

>                                 goto out;
>                         }
>                 }
> +               if (((bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
> +                   (bctl->sys.target & ~BTRFS_BLOCK_GROUP_DUP) ||
> +                   (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
> +                   (bctl->meta.target & ~BTRFS_BLOCK_GROUP_DUP)) &&
> +                   (num_devs > 1)) {
> +                       if (bctl->flags & BTRFS_BALANCE_FORCE) {
> +                               btrfs_info(fs_info, "force conversion of metadata "
> +                                          "to dup profile on multiple devices");
> +                       } else {
> +                               btrfs_err(fs_info, "balance will reduce metadata "
> +                                         "integrity, use force if you want this");
> +                               ret = -EINVAL;
> +                               goto out;
> +                       }
> +               }
>         } while (read_seqretry(&fs_info->profiles_lock, seq));
>         if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {

^^^, there should be 3 lines of context here.  It looks like an empty
line between "} while ..." and "if (bctl..." is missing.

There is probably more, those just stand out.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Austin S. Hemmelgarn Feb. 24, 2014, 2:40 p.m. UTC | #5
On 2014-02-24 09:12, Ilya Dryomov wrote:
> On Mon, Feb 24, 2014 at 3:44 PM, Austin S Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2014-02-24 08:37, Ilya Dryomov wrote:
>>> On Thu, Feb 20, 2014 at 6:57 PM, David Sterba <dsterba@suse.cz> wrote:
>>>> On Wed, Feb 19, 2014 at 11:10:41AM -0500, Austin S Hemmelgarn wrote:
>>>>> Currently, btrfs balance start fails when trying to convert metadata or
>>>>> system chunks to dup profile on filesystems with multiple devices.  This
>>>>> requires that a conversion from a multi-device filesystem to a single
>>>>> device filesystem use the following methodology:
>>>>>     1. btrfs balance start -dconvert=single -mconvert=single \
>>>>>        -sconvert=single -f /
>>>>>     2. btrfs device delete / /dev/sdx
>>>>>     3. btrfs balance start -mconvert=dup -sconvert=dup /
>>>>> This results in a period of time (possibly very long if the devices are
>>>>> big) where you don't have the protection guarantees of multiple copies
>>>>> of metadata chunks.
>>>>>
>>>>> After applying this patch, one can instead use the following methodology
>>>>> for conversion from a multi-device filesystem to a single device
>>>>> filesystem:
>>>>>     1. btrfs balance start -dconvert=single -mconvert=dup \
>>>>>        -sconvert=dup -f /
>>>>>     2. btrfs device delete / /dev/sdx
>>>>> This greatly reduces the chances of the operation causing data loss due
>>>>> to a read error during the device delete.
>>>>>
>>>>> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
>>>> Reviewed-by: David Sterba <dsterba@suse.cz>
>>>>
>>>> Sounds useful. The muliple devices + DUP is allowed setup when the
>>>> device is added, this patch only adds the 'delete' counterpart. The
>>>> imroved data loss protection during the process is a good thing.
>>>
>>> Hi,
>>>
>>> Have you actually tried to queue it?  Unless I'm missing something, it won't
>>> compile, and on top of that, it seems to be corrupted too..
>> The patch itself was made using git, AFAICT it should be fine.  I've
>> personally built and tested it using UML.
> 
> It doesn't look fine.  It was generated with git, but it got corrupted
> on the way: either how you pasted it or the email client you use is the
> problem.
> 
> On Wed, Feb 19, 2014 at 6:10 PM, Austin S Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> Currently, btrfs balance start fails when trying to convert metadata or
>> system chunks to dup profile on filesystems with multiple devices.  This
>> requires that a conversion from a multi-device filesystem to a single
>> device filesystem use the following methodology:
>>     1. btrfs balance start -dconvert=single -mconvert=single \
>>        -sconvert=single -f /
>>     2. btrfs device delete / /dev/sdx
>>     3. btrfs balance start -mconvert=dup -sconvert=dup /
>> This results in a period of time (possibly very long if the devices are
>> big) where you don't have the protection guarantees of multiple copies
>> of metadata chunks.
>>
>> After applying this patch, one can instead use the following methodology
>> for conversion from a multi-device filesystem to a single device
>> filesystem:
>>     1. btrfs balance start -dconvert=single -mconvert=dup \
>>        -sconvert=dup -f /
>>     2. btrfs device delete / /dev/sdx
>> This greatly reduces the chances of the operation causing data loss due
>> to a read error during the device delete.
>>
>> Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
>> ---
>>  fs/btrfs/volumes.c | 21 +++++++++++++++++----
>>  1 file changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 07629e9..38a9522 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -3152,10 +3152,8 @@ int btrfs_balance(struct btrfs_balance_control
>> *bctl,
> 
> ^^^, that should be a single line
> 
>>                 num_devices--;
>>         }
>>         btrfs_dev_replace_unlock(&fs_info->dev_replace);
>> -       allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
>> -       if (num_devices == 1)
>> -               allowed |= BTRFS_BLOCK_GROUP_DUP;
>> -       else if (num_devices > 1)
>> +       allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE | BTRFS_BLOCK_GROUP_DUP;
>> +       if (num_devices > 1)
>>                 allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1);
>>         if (num_devices > 2)
>>                 allowed |= BTRFS_BLOCK_GROUP_RAID5;
>> @@ -3221,6 +3219,21 @@ int btrfs_balance(struct btrfs_balance_control
>> *bctl,
> 
> ^^^, ditto
> 
>>                                 goto out;
>>                         }
>>                 }
>> +               if (((bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
>> +                   (bctl->sys.target & ~BTRFS_BLOCK_GROUP_DUP) ||
>> +                   (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
>> +                   (bctl->meta.target & ~BTRFS_BLOCK_GROUP_DUP)) &&
>> +                   (num_devs > 1)) {
>> +                       if (bctl->flags & BTRFS_BALANCE_FORCE) {
>> +                               btrfs_info(fs_info, "force conversion of metadata "
>> +                                          "to dup profile on multiple devices");
>> +                       } else {
>> +                               btrfs_err(fs_info, "balance will reduce metadata "
>> +                                         "integrity, use force if you want this");
>> +                               ret = -EINVAL;
>> +                               goto out;
>> +                       }
>> +               }
>>         } while (read_seqretry(&fs_info->profiles_lock, seq));
>>         if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
> 
> ^^^, there should be 3 lines of context here.  It looks like an empty
> line between "} while ..." and "if (bctl..." is missing.
> 
> There is probably more, those just stand out.
> 
> Thanks,
> 
>                 Ilya
> 
I'll rebase and regenerate the patch tonight when I get off of work, I'm
thinking my mail client might have mangled it somehow.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 07629e9..38a9522 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3152,10 +3152,8 @@  int btrfs_balance(struct btrfs_balance_control
*bctl,
 		num_devices--;
 	}
 	btrfs_dev_replace_unlock(&fs_info->dev_replace);
-	allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
-	if (num_devices == 1)
-		allowed |= BTRFS_BLOCK_GROUP_DUP;
-	else if (num_devices > 1)
+	allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE | BTRFS_BLOCK_GROUP_DUP;
+	if (num_devices > 1)
 		allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1);
 	if (num_devices > 2)