diff mbox

btrfs-progs: mkfs: Enable -d dup for single device

Message ID fff170b03b5707d4b4457f36da271fadf4bdeade.1444732174.git.zhaolei@cn.fujitsu.com (mailing list archive)
State Superseded
Headers show

Commit Message

Zhaolei Oct. 13, 2015, 10:29 a.m. UTC
Current code don't support dup profile in single device, except it
is in mixed mode, because following reason:
1: In some ssd with deduplication function, it have no effect.
2: For a physical device, it the entire disk broken, -d dup can
   not help.
3: Half performance comparing with single profile.
4: We have a workaround: Create multi-partition in single device,
   and btefs will treat them as multi device.

Instead of refuse -d dup, we have a better choise:
Give user a chance to select, and output a warning to notice above
problem.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 mkfs.c  |  2 +-
 utils.c | 10 ++++------
 utils.h |  2 +-
 3 files changed, 6 insertions(+), 8 deletions(-)

Comments

David Sterba Oct. 13, 2015, 11:28 a.m. UTC | #1
On Tue, Oct 13, 2015 at 06:29:41PM +0800, Zhao Lei wrote:
> Current code don't support dup profile in single device, except it
> is in mixed mode, because following reason:
> 1: In some ssd with deduplication function, it have no effect.
> 2: For a physical device, it the entire disk broken, -d dup can
>    not help.
> 3: Half performance comparing with single profile.
> 4: We have a workaround: Create multi-partition in single device,
>    and btefs will treat them as multi device.

While the above makes sense is true, I'm not sure that DUP was disabled
for these reasons. I'm sure that I read a comment from Chris that dup
for data is intentionally disabled because this would lead to
corruption, the code for DUP for metadata would not work for data. And I
can't find the comment, but the doubt is there. So unless I find it or
get otherwise convicend that it's ok, I won't merge the patch. I hope
you understand that.

What I remember from the comment is that "it's slightly offset that would
lead to corruption".
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhaolei Oct. 13, 2015, 11:40 a.m. UTC | #2
Hi, David Sterba

Thanks for review.

> -----Original Message-----
> From: David Sterba [mailto:dsterba@suse.cz]
> Sent: Tuesday, October 13, 2015 7:29 PM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: linux-btrfs@vger.kernel.org; clm@fb.com
> Subject: Re: [PATCH] btrfs-progs: mkfs: Enable -d dup for single device
> 
> On Tue, Oct 13, 2015 at 06:29:41PM +0800, Zhao Lei wrote:
> > Current code don't support dup profile in single device, except it is
> > in mixed mode, because following reason:
> > 1: In some ssd with deduplication function, it have no effect.
> > 2: For a physical device, it the entire disk broken, -d dup can
> >    not help.
> > 3: Half performance comparing with single profile.
> > 4: We have a workaround: Create multi-partition in single device,
> >    and btefs will treat them as multi device.
> 
> While the above makes sense is true, I'm not sure that DUP was disabled for
> these reasons. I'm sure that I read a comment from Chris that dup for data is
> intentionally disabled because this would lead to corruption, the code for DUP
> for metadata would not work for data. And I can't find the comment, but the
> doubt is there. So unless I find it or get otherwise convicend that it's ok, I won't
> merge the patch. I hope you understand that.
> 
> What I remember from the comment is that "it's slightly offset that would lead
> to corruption".

Before this patch, I had done git blame to search why the condition was added,
and hadn't found the exact reason.

I will queue xfstests(btrfs/generic) at this profile with all mount option
for multi-times, to check is something wrong with this.

Thanks
Zhaolei


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Oct. 13, 2015, 12:36 p.m. UTC | #3
On Tue, Oct 13, 2015 at 07:40:30PM +0800, Zhao Lei wrote:
> > What I remember from the comment is that "it's slightly offset that would lead
> > to corruption".
> 
> Before this patch, I had done git blame to search why the condition was added,
> and hadn't found the exact reason.

found it: commit bc3f116fec194f1d7329b160c266fe16b9266a1e and it was not
aobut data/dup but mixed bgs with sectorisze != nodesize:

 26 +       nodesize = btrfs_super_nodesize(disk_super);
 27 +       leafsize = btrfs_super_leafsize(disk_super);
 28 +       sectorsize = btrfs_super_sectorsize(disk_super);
 29 +       stripesize = btrfs_super_stripesize(disk_super);
 30 +
 31 +       /*
 32 +        * mixed block groups end up with duplicate but slightly offset
 33 +        * extent buffers for the same range.  It leads to corruptions
 34 +        */
 35 +       if ((features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS) &&
 36 +           (sectorsize != leafsize)) {
 37 +               printk(KERN_WARNING "btrfs: unequal leaf/node/sector sizes "
 38 +                               "are not allowed for mixed block groups on %s\n",
 39 +                               sb->s_id);
 40 +               goto fail_alloc;
 41 +       }
 42 +

> I will queue xfstests(btrfs/generic) at this profile with all mount option
> for multi-times, to check is something wrong with this.

Thanks. We need to cover more: the balance conversion forbids data/dup profile,
I'm not sure if scrub handles that properly, and the ususal suspects in the
rescue tools (fsck, restore, chunk-recover).
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhaolei Oct. 13, 2015, 1:13 p.m. UTC | #4
Hi, David Sterba

> -----Original Message-----
> From: David Sterba [mailto:dsterba@suse.cz]
> Sent: Tuesday, October 13, 2015 8:36 PM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: dsterba@suse.cz; linux-btrfs@vger.kernel.org; clm@fb.com
> Subject: Re: [PATCH] btrfs-progs: mkfs: Enable -d dup for single device
> 
> On Tue, Oct 13, 2015 at 07:40:30PM +0800, Zhao Lei wrote:
> > > What I remember from the comment is that "it's slightly offset that
> > > would lead to corruption".
> >
> > Before this patch, I had done git blame to search why the condition
> > was added, and hadn't found the exact reason.
> 
> found it: commit bc3f116fec194f1d7329b160c266fe16b9266a1e and it was not
> aobut data/dup but mixed bgs with sectorisze != nodesize:
> 
>  26 +       nodesize = btrfs_super_nodesize(disk_super);
>  27 +       leafsize = btrfs_super_leafsize(disk_super);
>  28 +       sectorsize = btrfs_super_sectorsize(disk_super);
>  29 +       stripesize = btrfs_super_stripesize(disk_super);
>  30 +
>  31 +       /*
>  32 +        * mixed block groups end up with duplicate but slightly offset
>  33 +        * extent buffers for the same range.  It leads to corruptions
>  34 +        */
>  35 +       if ((features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
> &&
>  36 +           (sectorsize != leafsize)) {
>  37 +               printk(KERN_WARNING "btrfs: unequal
> leaf/node/sector sizes "
>  38 +                               "are not allowed for mixed block
> groups on %s\n",
>  39 +                               sb->s_id);
>  40 +               goto fail_alloc;
>  41 +       }
>  42 +
> 
Thanks for this information, I'll investigate is similar problem in non-mixed
with dup.

> > I will queue xfstests(btrfs/generic) at this profile with all mount
> > option for multi-times, to check is something wrong with this.
> 
> Thanks. We need to cover more: the balance conversion forbids data/dup
> profile, I'm not sure if scrub handles that properly, and the ususal suspects in
> the rescue tools (fsck, restore, chunk-recover).
>
Agree, a new profile may be have potential problem because existing code
haven't check the support status.

IMHO, it is still necessary except we can prove this function should not exist.
But we'll need to do more works to confirm above potential problem.

Thanks
Zhaolei


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Oct. 27, 2015, 2:20 p.m. UTC | #5
On Tue, Oct 13, 2015 at 06:29:41PM +0800, Zhao Lei wrote:
> Current code don't support dup profile in single device, except it
> is in mixed mode, because following reason:
> 1: In some ssd with deduplication function, it have no effect.
> 2: For a physical device, it the entire disk broken, -d dup can
>    not help.
> 3: Half performance comparing with single profile.
> 4: We have a workaround: Create multi-partition in single device,
>    and btefs will treat them as multi device.
> 
> Instead of refuse -d dup, we have a better choise:
> Give user a chance to select, and output a warning to notice above
> problem.
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>

FYI, I've queued the patch for 4.4 (or later) whenever we're ready on
the kernel side.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhaolei Oct. 28, 2015, 1:41 a.m. UTC | #6
Hi, David Sterba

> -----Original Message-----
> From: David Sterba [mailto:dsterba@suse.cz]
> Sent: Tuesday, October 27, 2015 10:20 PM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH] btrfs-progs: mkfs: Enable -d dup for single device
> 
> On Tue, Oct 13, 2015 at 06:29:41PM +0800, Zhao Lei wrote:
> > Current code don't support dup profile in single device, except it is
> > in mixed mode, because following reason:
> > 1: In some ssd with deduplication function, it have no effect.
> > 2: For a physical device, it the entire disk broken, -d dup can
> >    not help.
> > 3: Half performance comparing with single profile.
> > 4: We have a workaround: Create multi-partition in single device,
> >    and btefs will treat them as multi device.
> >
> > Instead of refuse -d dup, we have a better choise:
> > Give user a chance to select, and output a warning to notice above
> > problem.
> >
> > Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> 
> FYI, I've queued the patch for 4.4 (or later) whenever we're ready on the kernel
> side.
Thanks.

I'm doing relative tests(xfstests, userland tools, functions as scrub, balance, ...)
on it.

Thanks
Zhaolei

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/mkfs.c b/mkfs.c
index ecd6fbf..7fa7cfc 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1578,7 +1578,7 @@  int main(int ac, char **av)
 		}
 	}
 	ret = test_num_disk_vs_raid(metadata_profile, data_profile,
-			dev_cnt, mixed);
+			dev_cnt, mixed, ssd);
 	if (ret)
 		exit(1);
 
diff --git a/utils.c b/utils.c
index f1e3248..d81c2d9 100644
--- a/utils.c
+++ b/utils.c
@@ -2425,7 +2425,7 @@  static int group_profile_devs_min(u64 flag)
 }
 
 int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
-	u64 dev_cnt, int mixed)
+	u64 dev_cnt, int mixed, int ssd)
 {
 	u64 allowed = 0;
 
@@ -2466,11 +2466,9 @@  int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
 		return 1;
 	}
 
-	if (!mixed && (data_profile & BTRFS_BLOCK_GROUP_DUP)) {
-		fprintf(stderr,
-			"ERROR: DUP for data is allowed only in mixed mode\n");
-		return 1;
-	}
+	warning_on(!mixed && (data_profile & BTRFS_BLOCK_GROUP_DUP) && ssd,
+		   "DUP have no effect if your SSD have deduplication function");
+
 	return 0;
 }
 
diff --git a/utils.h b/utils.h
index 044ea15..b85f3fe 100644
--- a/utils.h
+++ b/utils.h
@@ -167,7 +167,7 @@  int test_dev_for_mkfs(char *file, int force_overwrite);
 int get_label_mounted(const char *mount_path, char *labelp);
 int get_label_unmounted(const char *dev, char *label);
 int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
-	u64 dev_cnt, int mixed);
+	u64 dev_cnt, int mixed, int ssd);
 int group_profile_max_safe_loss(u64 flags);
 int is_vol_small(char *file);
 int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,