[02/13] btrfs: Do per-chunk check for mount time check

Message ID	1462889372-5274-4-git-send-email-anand.jain@oracle.com (mailing list archive)
State	New, archived

Headers

From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Cc: dsterba@suse.cz, yauhen.kharuzhy@zavadatar.com
Subject: [PATCH 02/13] btrfs: Do per-chunk check for mount time check
Date: Tue, 10 May 2016 22:09:21 +0800
Message-Id: <1462889372-5274-4-git-send-email-anand.jain@oracle.com>
In-Reply-To: <1462889372-5274-1-git-send-email-anand.jain@oracle.com>
References: <1462889372-5274-1-git-send-email-anand.jain@oracle.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk

Commit Message

Anand Jain May 10, 2016, 2:09 p.m. UTC

From: Qu Wenruo <quwenruo@cn.fujitsu.com>

Use btrfs_check_degradable() to perform the mount-time degraded check.

With this patch, the following case can now be mounted:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdc
 # mount /dev/sdb /mnt/btrfs -o degraded
 The single data chunk resides only on sdb, and the RAID1 metadata
 tolerates one missing device, so a degraded mount is OK.

The following case still fails, as expected:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdb
 # mount /dev/sdc /mnt/btrfs -o degraded
 The data chunk resides only on sdb, which is now missing, so a degraded
 mount is not OK.

Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Reported-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

[Btrfs: use btrfs_error instead of btrfs_err during mount]
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/disk-io.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
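
The two cases in the commit message can be modelled in user space. The sketch below is only an illustration of the per-chunk idea, not the kernel implementation: struct chunk_model, check_degradable_model() and commit_message_case() are hypothetical names, and the return convention (<0, 0, >0) is an assumption chosen to match how the patch consumes the result of btrfs_check_degradable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRIPES 2
#define NR_DEVS 2

enum { DEV_SDB = 0, DEV_SDC = 1 };

/* Hypothetical, simplified chunk description; not the kernel's structures. */
struct chunk_model {
	int num_stripes;         /* number of devices this chunk is stored on */
	int devid[MAX_STRIPES];  /* device holding each stripe or copy */
	int tolerated_missing;   /* 0 for single, 1 for RAID1 */
};

/*
 * Assumed return convention:
 *  <0  some chunk lost more devices than its profile tolerates
 *   0  no device is missing
 *  >0  devices are missing, but every chunk is still within tolerance
 */
int check_degradable_model(const struct chunk_model *chunks, size_t nr_chunks,
			   const bool dev_missing[NR_DEVS])
{
	bool any_missing = false;

	for (size_t d = 0; d < NR_DEVS; d++)
		any_missing |= dev_missing[d];

	for (size_t i = 0; i < nr_chunks; i++) {
		int missing = 0;

		/* Count how many stripes of this chunk sit on missing devices. */
		for (int s = 0; s < chunks[i].num_stripes; s++)
			if (dev_missing[chunks[i].devid[s]])
				missing++;
		if (missing > chunks[i].tolerated_missing)
			return -1;
	}
	return any_missing ? 1 : 0;
}

/*
 * Layout described in the commit message: RAID1 metadata on sdb and sdc,
 * one single data chunk on sdb. Pass -1 for "no device missing".
 */
int commit_message_case(int missing_devid)
{
	const struct chunk_model chunks[] = {
		{ 2, { DEV_SDB, DEV_SDC }, 1 },  /* metadata, RAID1 */
		{ 1, { DEV_SDB },          0 },  /* data, single */
	};
	bool dev_missing[NR_DEVS] = { false, false };

	assert(missing_devid < NR_DEVS);
	if (missing_devid >= 0)
		dev_missing[missing_devid] = true;
	return check_degradable_model(chunks, 2, dev_missing);
}
```

In this model, commit_message_case(DEV_SDC) returns a positive value (mountable with -o degraded) and commit_message_case(DEV_SDB) returns a negative value (not mountable), which matches the two cases above.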

Comments

Hugo Mills Sept. 12, 2016, 9:49 p.m. UTC | #1

What happened to these patches? (Particularly the per-chunk
degraded checks). We've just had someone on IRC who could have used
the capability...

   Hugo.

On Tue, May 10, 2016 at 10:09:21PM +0800, Anand Jain wrote:
> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
> 
> Now use the btrfs_check_degraded() to do mount time degraded check.
> 
> With this patch, now we can mount with the following case:
>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>  # wipefs -a /dev/sdc
>  # mount /dev/sdb /mnt/btrfs -o degraded
>  As the single data chunk is only in sdb, so it's OK to mount as degraded,
>  as missing one device is OK for RAID1.
> 
> But still fail with the following case as expected:
>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>  # wipefs -a /dev/sdb
>  # mount /dev/sdc /mnt/btrfs -o degraded
>  As the data chunk is only in sdb, so it's not OK to mount it as degraded.
> 
> Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Reported-by: Anand Jain <anand.jain@oracle.com>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> 
> [Btrfs: use btrfs_error instead of btrfs_err during mount]
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/disk-io.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index d01f89d130e0..4f91a049fbca 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2885,6 +2885,16 @@ int open_ctree(struct super_block *sb,
>  		goto fail_tree_roots;
>  	}
>  
> +	ret = btrfs_check_degradable(fs_info, fs_info->sb->s_flags);
> +	if (ret < 0) {
> +		btrfs_err(fs_info, "degraded writable mount failed %d", ret);
> +		goto fail_tree_roots;
> +	} else if (ret > 0 && !btrfs_test_opt(chunk_root, DEGRADED)) {
> +		btrfs_warn(fs_info,
> +			"Some device missing, but still degraded mountable, please mount with -o degraded option");
> +		ret = -EACCES;
> +		goto fail_tree_roots;
> +	}
>  	/*
>  	 * keep the device that is marked to be the target device for the
>  	 * dev_replace procedure
> @@ -2988,14 +2998,6 @@ retry_root_backup:
>  	}
>  	fs_info->num_tolerated_disk_barrier_failures =
>  		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
> -	if (fs_info->fs_devices->missing_devices >
> -	     fs_info->num_tolerated_disk_barrier_failures &&
> -	    !(sb->s_flags & MS_RDONLY)) {
> -		pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
> -			fs_info->fs_devices->missing_devices,
> -			fs_info->num_tolerated_disk_barrier_failures);
> -		goto fail_sysfs;
> -	}
>  
>  	fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
>  					       "btrfs-cleaner");
> -- 
> 2.7.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Qu Wenruo Sept. 13, 2016, 12:10 a.m. UTC | #2

IIRC it's now part of Anand Jain's hot device replace patchset.

And no one knows when hot device replace will be merged; until then,
the per-chunk degraded check won't be merged.

Thanks,
Qu

At 09/13/2016 05:49 AM, Hugo Mills wrote:
>    What happened to these patches? (Particularly the per-chunk
> degraded checks). We've just had someone on IRC who could have used
> the capability...
>
>    Hugo.
>
> On Tue, May 10, 2016 at 10:09:21PM +0800, Anand Jain wrote:
>> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>
>> Now use the btrfs_check_degraded() to do mount time degraded check.
>>
>> With this patch, now we can mount with the following case:
>>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>>  # wipefs -a /dev/sdc
>>  # mount /dev/sdb /mnt/btrfs -o degraded
>>  As the single data chunk is only in sdb, so it's OK to mount as degraded,
>>  as missing one device is OK for RAID1.
>>
>> But still fail with the following case as expected:
>>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>>  # wipefs -a /dev/sdb
>>  # mount /dev/sdc /mnt/btrfs -o degraded
>>  As the data chunk is only in sdb, so it's not OK to mount it as degraded.
>>
>> Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
>> Reported-by: Anand Jain <anand.jain@oracle.com>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>
>> [Btrfs: use btrfs_error instead of btrfs_err during mount]
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>  fs/btrfs/disk-io.c | 18 ++++++++++--------
>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index d01f89d130e0..4f91a049fbca 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -2885,6 +2885,16 @@ int open_ctree(struct super_block *sb,
>>  		goto fail_tree_roots;
>>  	}
>>
>> +	ret = btrfs_check_degradable(fs_info, fs_info->sb->s_flags);
>> +	if (ret < 0) {
>> +		btrfs_err(fs_info, "degraded writable mount failed %d", ret);
>> +		goto fail_tree_roots;
>> +	} else if (ret > 0 && !btrfs_test_opt(chunk_root, DEGRADED)) {
>> +		btrfs_warn(fs_info,
>> +			"Some device missing, but still degraded mountable, please mount with -o degraded option");
>> +		ret = -EACCES;
>> +		goto fail_tree_roots;
>> +	}
>>  	/*
>>  	 * keep the device that is marked to be the target device for the
>>  	 * dev_replace procedure
>> @@ -2988,14 +2998,6 @@ retry_root_backup:
>>  	}
>>  	fs_info->num_tolerated_disk_barrier_failures =
>>  		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
>> -	if (fs_info->fs_devices->missing_devices >
>> -	     fs_info->num_tolerated_disk_barrier_failures &&
>> -	    !(sb->s_flags & MS_RDONLY)) {
>> -		pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
>> -			fs_info->fs_devices->missing_devices,
>> -			fs_info->num_tolerated_disk_barrier_failures);
>> -		goto fail_sysfs;
>> -	}
>>
>>  	fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
>>  					       "btrfs-cleaner");
>> --
>> 2.7.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



Anand Jain Sept. 14, 2016, 7:29 a.m. UTC | #3

On 09/13/2016 05:49 AM, Hugo Mills wrote:
>    What happened to these patches? (Particularly the per-chunk
> degraded checks).

   The per-chunk degraded-check patch helps to work around the issue,
   and the workaround is needed to test hotspare support.
   The final fix for the same issue is
     [RFC] btrfs: create degraded-RAID1 chunks
   which needs more review.

Thanks, Anand

> We've just had someone on IRC who could have used
> the capability...
>
>    Hugo.



Anand Jain Nov. 8, 2016, 12:32 p.m. UTC | #4

Hi David,

This patch and the related patches 1/13..5/13 provide a good interim
workaround for the regression caused by the patch
----
commit 95669976bd7d30ae265db938ecb46a6b7f8cb893
Author: Miao Xie <miaox@cn.fujitsu.com>
Date:   Thu Jul 24 11:37:14 2014 +0800

     Btrfs: don't consider the missing device when allocating new chunks
----
ref [1]

[1]
https://patchwork.kernel.org/patch/8965291/

I would like to know your opinion.

The final solution is complex and is in the RFC [2], which isn't
ready for integration yet.

[2]
https://patchwork.kernel.org/patch/8965301/

Thanks, Anand


On 05/10/16 22:09, Anand Jain wrote:
> From: Qu Wenruo <quwenruo@cn.fujitsu.com>
>
> Now use the btrfs_check_degraded() to do mount time degraded check.
>
> With this patch, now we can mount with the following case:
>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>  # wipefs -a /dev/sdc
>  # mount /dev/sdb /mnt/btrfs -o degraded
>  As the single data chunk is only in sdb, so it's OK to mount as degraded,
>  as missing one device is OK for RAID1.
>
> But still fail with the following case as expected:
>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>  # wipefs -a /dev/sdb
>  # mount /dev/sdc /mnt/btrfs -o degraded
>  As the data chunk is only in sdb, so it's not OK to mount it as degraded.
>
> Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Reported-by: Anand Jain <anand.jain@oracle.com>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>
> [Btrfs: use btrfs_error instead of btrfs_err during mount]
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/disk-io.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index d01f89d130e0..4f91a049fbca 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2885,6 +2885,16 @@ int open_ctree(struct super_block *sb,
>  		goto fail_tree_roots;
>  	}
>
> +	ret = btrfs_check_degradable(fs_info, fs_info->sb->s_flags);
> +	if (ret < 0) {
> +		btrfs_err(fs_info, "degraded writable mount failed %d", ret);
> +		goto fail_tree_roots;
> +	} else if (ret > 0 && !btrfs_test_opt(chunk_root, DEGRADED)) {
> +		btrfs_warn(fs_info,
> +			"Some device missing, but still degraded mountable, please mount with -o degraded option");
> +		ret = -EACCES;
> +		goto fail_tree_roots;
> +	}
>  	/*
>  	 * keep the device that is marked to be the target device for the
>  	 * dev_replace procedure
> @@ -2988,14 +2998,6 @@ retry_root_backup:
>  	}
>  	fs_info->num_tolerated_disk_barrier_failures =
>  		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
> -	if (fs_info->fs_devices->missing_devices >
> -	     fs_info->num_tolerated_disk_barrier_failures &&
> -	    !(sb->s_flags & MS_RDONLY)) {
> -		pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
> -			fs_info->fs_devices->missing_devices,
> -			fs_info->num_tolerated_disk_barrier_failures);
> -		goto fail_sysfs;
> -	}
>
>  	fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
>  					       "btrfs-cleaner");
>

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d01f89d130e0..4f91a049fbca 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2885,6 +2885,16 @@  int open_ctree(struct super_block *sb,
 		goto fail_tree_roots;
 	}
 
+	ret = btrfs_check_degradable(fs_info, fs_info->sb->s_flags);
+	if (ret < 0) {
+		btrfs_err(fs_info, "degraded writable mount failed %d", ret);
+		goto fail_tree_roots;
+	} else if (ret > 0 && !btrfs_test_opt(chunk_root, DEGRADED)) {
+		btrfs_warn(fs_info,
+			"Some device missing, but still degraded mountable, please mount with -o degraded option");
+		ret = -EACCES;
+		goto fail_tree_roots;
+	}
 	/*
 	 * keep the device that is marked to be the target device for the
 	 * dev_replace procedure
@@ -2988,14 +2998,6 @@  retry_root_backup:
 	}
 	fs_info->num_tolerated_disk_barrier_failures =
 		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
-	if (fs_info->fs_devices->missing_devices >
-	     fs_info->num_tolerated_disk_barrier_failures &&
-	    !(sb->s_flags & MS_RDONLY)) {
-		pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
-			fs_info->fs_devices->missing_devices,
-			fs_info->num_tolerated_disk_barrier_failures);
-		goto fail_sysfs;
-	}
 
 	fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
 					       "btrfs-cleaner");
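
For comparison, the decision made by the removed hunk and by the added hunk can be written as two small pure functions. This is a user-space sketch only: old_check_refuses() and new_check_result() are hypothetical names, and the inputs stand in for fs_devices->missing_devices, num_tolerated_disk_barrier_failures, the MS_RDONLY flag, the return value of btrfs_check_degradable() and the DEGRADED mount option.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Removed hunk: one filesystem-wide comparison. A writable mount is
 * refused as soon as the number of missing devices exceeds the
 * tolerated count, regardless of which chunks are actually affected.
 */
bool old_check_refuses(unsigned long long missing_devices,
		       unsigned int tolerated_failures, bool read_only)
{
	return missing_devices > tolerated_failures && !read_only;
}

/*
 * Added hunk: act on the result of the per-chunk check.
 *  check_ret < 0  -> fail the mount with that error
 *  check_ret > 0  -> devices are missing; require -o degraded
 *  check_ret == 0 -> continue mounting
 */
int new_check_result(int check_ret, bool degraded_opt)
{
	if (check_ret < 0)
		return check_ret;
	if (check_ret > 0 && !degraded_opt)
		return -EACCES;
	return 0;
}
```

Assuming the tolerated count is 0 for a filesystem that has a single data chunk, old_check_refuses(1, 0, false) is true whichever device is missing, so the old check also rejects the sdc-missing case that the commit message describes as safe. With the new logic, a positive per-chunk result with -o degraded lets the mount continue.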
