[v2,2/2] btrfs: Fix lost-data-profile caused by balance bg

Message ID 9ec09c79f6a86fc7174249efd641f85bead0ef40.1443611403.git.zhaolei@cn.fujitsu.com (mailing list archive)
State Superseded
Headers show

Commit Message

Zhaolei Sept. 30, 2015, 11:11 a.m. UTC
Reproduce:
 (on the integration-4.3 branch)

 TEST_DEV=(/dev/vdg /dev/vdh)
 TEST_DIR=/mnt/tmp

 umount "$TEST_DEV" >/dev/null
 mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}"

 mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
 btrfs balance start -dusage=0 $TEST_DIR
 btrfs filesystem usage $TEST_DIR

 dd if=/dev/zero of="$TEST_DIR"/file count=100
 btrfs filesystem usage $TEST_DIR

Result:
 The first "btrfs filesystem usage" shows no data chunk:
 # btrfs filesystem usage $TEST_DIR
 Overall:
    ...
 Metadata,single: Size:8.00MiB, Used:0.00B
    /dev/vdg        8.00MiB
 Metadata,RAID1: Size:122.88MiB, Used:112.00KiB
    /dev/vdg      122.88MiB
    /dev/vdh      122.88MiB
 System,single: Size:4.00MiB, Used:0.00B
    /dev/vdg        4.00MiB
 System,RAID1: Size:8.00MiB, Used:16.00KiB
    /dev/vdg        8.00MiB
    /dev/vdh        8.00MiB
 Unallocated:
    /dev/vdg        1.06GiB
    /dev/vdh        1.07GiB

 And the second "btrfs filesystem usage" shows that the data chunks
 changed from raid1 to single:
 # btrfs filesystem usage $TEST_DIR
 Overall:
    ...
 Data,single: Size:256.00MiB, Used:0.00B
    /dev/vdh      256.00MiB
 Metadata,single: Size:8.00MiB, Used:0.00B
    /dev/vdg        8.00MiB
 Metadata,RAID1: Size:122.88MiB, Used:112.00KiB
    /dev/vdg      122.88MiB
    /dev/vdh      122.88MiB
 System,single: Size:4.00MiB, Used:0.00B
    /dev/vdg        4.00MiB
 System,RAID1: Size:8.00MiB, Used:16.00KiB
    /dev/vdg        8.00MiB
    /dev/vdh        8.00MiB
 Unallocated:
    /dev/vdg        1.06GiB
    /dev/vdh      841.92MiB

Reason:
 When a filesystem holds no data, btrfs balance deletes its last
 data chunk, so "fi usage" then reports no data chunk.

 On the next write to the filesystem, the only available data
 profile is 0x0, so all new data chunks are allocated with the
 single profile.

Fix:
 Allocate a data chunk explicitly to ensure we don't lose the
 raid profile for data.
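
For a filesystem already bitten by this bug (data chunks re-allocated as single), the raid1 data profile can be restored from user space with a convert balance. A hedged sketch: the `-dconvert` and `soft` filters are standard btrfs-progs syntax, while the helper name and the mount point are only examples.

```shell
# Convert existing single data chunks back to raid1 on an affected
# filesystem.  The "soft" filter skips chunks that already have the
# target profile.
restore_data_profile() {
	local mnt="$1"
	btrfs balance start -dconvert=raid1,soft "$mnt"
}

# Example (not run here): restore_data_profile /mnt/tmp
```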

Test:
 Tested with the above script, and confirmed the logic via debug
 output.

Changelog v1->v2:
1: Updated the patch description in the "Fix" section
2: Use BTRFS_BLOCK_GROUP_DATA instead of 1 for
   btrfs_force_chunk_alloc
3: Only reserve a chunk when balancing data chunks
All suggested-by: Filipe Manana <fdmanana@gmail.com>

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Filipe Manana Sept. 30, 2015, 4:17 p.m. UTC | #1
On Wed, Sep 30, 2015 at 12:11 PM, Zhao Lei <zhaolei@cn.fujitsu.com> wrote:
> Reproduce:
>  (In integration-4.3 branch)
> [...]
>
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

thanks


Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6fc73586..cd9e5bd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3277,6 +3277,7 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 	u64 limit_data = bctl->data.limit;
 	u64 limit_meta = bctl->meta.limit;
 	u64 limit_sys = bctl->sys.limit;
+	int chunk_reserved = 0;
 
 	/* step one make some room on all the devices */
 	devices = &fs_info->fs_devices->devices;
@@ -3326,6 +3327,8 @@ again:
 	key.type = BTRFS_CHUNK_ITEM_KEY;
 
 	while (1) {
+		u64 chunk_type;
+
 		if ((!counting && atomic_read(&fs_info->balance_pause_req)) ||
 		    atomic_read(&fs_info->balance_cancel_req)) {
 			ret = -ECANCELED;
@@ -3371,8 +3374,10 @@ again:
 			spin_unlock(&fs_info->balance_lock);
 		}
 
+		chunk_type = btrfs_chunk_type(leaf, chunk);
 		ret = should_balance_chunk(chunk_root, leaf, chunk,
 					   found_key.offset);
+
 		btrfs_release_path(path);
 		if (!ret) {
 			mutex_unlock(&fs_info->delete_unused_bgs_mutex);
@@ -3387,6 +3392,25 @@ again:
 			goto loop;
 		}
 
+		if ((chunk_type & BTRFS_BLOCK_GROUP_DATA) && !chunk_reserved) {
+			trans = btrfs_start_transaction(chunk_root, 0);
+			if (IS_ERR(trans)) {
+				mutex_unlock(&fs_info->delete_unused_bgs_mutex);
+				ret = PTR_ERR(trans);
+				goto error;
+			}
+
+			ret = btrfs_force_chunk_alloc(trans, chunk_root,
+						      BTRFS_BLOCK_GROUP_DATA);
+			if (ret < 0) {
+				mutex_unlock(&fs_info->delete_unused_bgs_mutex);
+				goto error;
+			}
+
+			btrfs_end_transaction(trans, chunk_root);
+			chunk_reserved = 1;
+		}
+
 		ret = btrfs_relocate_chunk(chunk_root,
 					   found_key.offset);
 		mutex_unlock(&fs_info->delete_unused_bgs_mutex);