Btrfs: do not async metadata csums if we have hardware crc32c

Message ID 1348510264-5781-1-git-send-email-jbacik@fusionio.com (mailing list archive)
State New, archived

Commit Message

Josef Bacik Sept. 24, 2012, 6:11 p.m. UTC
We offload csumming because it is CPU intensive, except that it is not on
modern Intel CPUs.  So check whether the CPU supports hardware crc32c, and
if it does, just do the csumming in the current thread's context.  Otherwise
we farm it off.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/disk-io.c |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)
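
The decision described in the commit message can be sketched in userspace C. This is only an illustration under stated assumptions: the names `csum_path`, `cpu_has_hw_crc32c` and `choose_csum_path` are hypothetical and do not exist in btrfs, and the CPU check uses the GCC/Clang builtin `__builtin_cpu_supports` rather than the kernel's `cpu_has_xmm4_2` test used in the patch.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in btrfs. */
enum csum_path { CSUM_INLINE, CSUM_ASYNC };

/* Report whether the CPU offers the SSE4.2 crc32 instruction (userspace check). */
bool cpu_has_hw_crc32c(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	return __builtin_cpu_supports("sse4.2") != 0;
#else
	return false;	/* assumption: no fast path on other architectures */
#endif
}

/* Checksum in the submitting thread when crc32c is cheap, otherwise offload. */
enum csum_path choose_csum_path(bool hw_crc32c)
{
	return hw_crc32c ? CSUM_INLINE : CSUM_ASYNC;
}
```

A caller would use `choose_csum_path(cpu_has_hw_crc32c())` once per submission, or cache the result, since the CPU feature does not change at runtime.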

Comments

Arne Jansen Sept. 24, 2012, 6:19 p.m. UTC | #1
On 09/24/12 20:11, Josef Bacik wrote:
> The reason we offload csumming is because it is CPU intensive, except it is
> not on modern intel CPUs.  So check to see if we support hardware crc32c,
> and if we do just do the csumming in our current threads context.  Otherwise
> we can farm it off.  Thanks,
> 
> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> ---
>  fs/btrfs/disk-io.c |   17 +++++++++++++++++
>  1 files changed, 17 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index dcaf556..830b9af 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -31,6 +31,7 @@
>  #include <linux/migrate.h>
>  #include <linux/ratelimit.h>
>  #include <asm/unaligned.h>
> +#include <asm/cpufeature.h>
>  #include "compat.h"
>  #include "ctree.h"
>  #include "disk-io.h"
> @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
>  	}
>  
>  	/*
> +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> +	 * the hardware then there is no reason to do the csum stuff
> +	 * asynchronously, it will be faster to do it inline, so test to see if
> +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> +	 * threads context.
> +	 */
> +#ifdef CONFIG_X86
> +	if (cpu_has_xmm4_2) {
> +		printk(KERN_ERR "doing it the fast way\n");

You'll probably go to hell for the printk...

> +		ret = btree_csum_one_bio(bio);
> +		if (ret)
> +			return ret;
> +		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
> +	}
> +#endif
> +	/*
>  	 * kthread helpers are used to submit writes so that checksumming
>  	 * can happen in parallel across all CPUs
>  	 */
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Sept. 24, 2012, 6:33 p.m. UTC | #2
On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c,
> > and if we do just do the csumming in our current threads context.  Otherwise
> > we can farm it off.  Thanks,
> > 
> > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  	}
> >  
> >  	/*
> > +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > +	 * the hardware then there is no reason to do the csum stuff
> > +	 * asynchronously, it will be faster to do it inline, so test to see if
> > +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> > +	 * threads context.
> > +	 */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
> 
> You'll probably go to hell for the printk...
> 

Hahah oops, at least I remembered to take out the other printk, it had much more
colorful language ;).  Thanks,

Josef
Chris Mason Sept. 24, 2012, 6:58 p.m. UTC | #3
On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c,
> > and if we do just do the csumming in our current threads context.  Otherwise
> > we can farm it off.  Thanks,
> > 
> > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  	}
> >  
> >  	/*
> > +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > +	 * the hardware then there is no reason to do the csum stuff
> > +	 * asynchronously, it will be faster to do it inline, so test to see if
> > +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> > +	 * threads context.
> > +	 */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
> 
> You'll probably go to hell for the printk...

;)

Testing with dd on my recent intel box, I can hardware crc32c at
1.3GB/s.  Anything beyond that and you really want more cpus jumping
into the mix.

I wanted to use this test for data crcs too, but I suppose the helpers
only really hurt for the synchronous IO.

-chris
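
For reference, the checksum under discussion is CRC32C (the Castagnoli polynomial). The sketch below is a plain bitwise software version, shown only to make clear what work the hardware instruction replaces. It is not the kernel's implementation, and the function name `crc32c_sw` is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32C over buf, using the reflected Castagnoli polynomial
 * 0x82F63B78. Pass 0 as crc for a fresh checksum, or a previous return
 * value to continue over further data.
 */
uint32_t crc32c_sw(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}
```

The standard check value for this CRC is 0xE3069283 for the nine ASCII bytes "123456789". A loop like this handles one bit per iteration, which is why an unaccelerated checksum was worth spreading across CPUs.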
David Sterba Sept. 24, 2012, 9:03 p.m. UTC | #4
On Mon, Sep 24, 2012 at 02:11:04PM -0400, Josef Bacik wrote:
> +#ifdef CONFIG_X86
> +	if (cpu_has_xmm4_2) {
> +		printk(KERN_ERR "doing it the fast way\n");
> +		ret = btree_csum_one_bio(bio);
> +		if (ret)
> +			return ret;
> +		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
> +	}
> +#endif

Could you please put the check into a separate helper and avoid the
#ifdef? This is a second candidate for a standalone utils.c where non-fs
support code could reside. Or you can call it hellpers.c .


david
David Sterba Sept. 25, 2012, 10:51 a.m. UTC | #5
On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
> Could you please put the check into a separate helper

Please note that checksum will become a variable per-filesystem
property, stored within the superblock, so the helper should be passed a
fs_info pointer.

thanks,
david
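
The helper requested in the two messages above could look roughly like the following userspace sketch. Everything here is hypothetical: `fs_info_sketch` stands in for the real fs_info structure, the `csum_type` field and its enum values are invented for illustration, and the CPU capability is passed in as a parameter so that no `#ifdef` appears at the call site.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real btrfs structures look different. */
enum csum_type_sketch { CSUM_TYPE_CRC32C, CSUM_TYPE_OTHER };

struct fs_info_sketch {
	enum csum_type_sketch csum_type;	/* per-filesystem checksum algorithm */
};

/*
 * Return true when checksumming for this filesystem is cheap enough to do
 * in the submitting thread. Only crc32c with CPU support counts as cheap;
 * any other algorithm is assumed to need the async helpers.
 */
bool csum_is_cheap(const struct fs_info_sketch *fs_info, bool cpu_has_hw_crc32c)
{
	switch (fs_info->csum_type) {
	case CSUM_TYPE_CRC32C:
		return cpu_has_hw_crc32c;
	default:
		return false;
	}
}
```

Keeping the architecture check behind the `cpu_has_hw_crc32c` argument means the submit hook itself would need no conditional compilation, and adding a checksum algorithm would only touch the switch statement.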
ching Sept. 25, 2012, 11:40 a.m. UTC | #6
On 09/25/2012 06:51 PM, David Sterba wrote:
> On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
>> Could you please put the check into a separate helper
> Please note that checksum will become a variable per-filesystem
> property, stored within the superblock, so the helper should be passed a
> fs_info pointer.
>
> thanks,
> david

How about enhancing the "*thread_pool=/number" mount option instead?

thread_pool=n     enable threadpool/**/for compression and checksum, MAY improve bandwidth/*
*/thread_pool=0     disable threadpool for compression and checksum, MIGHT reduce latency
thread_pool=-1 or not provided    automatically managed (current behaviour and default choice)


This should allow users to trade off latency against bandwidth. Furthermore, you would not need to assume that btrfs will only ever use the crc32c algorithm.
ching Sept. 25, 2012, 11:54 a.m. UTC | #7
On 09/25/2012 06:51 PM, David Sterba wrote:
> On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
>> Could you please put the check into a separate helper
> Please note that checksum will become a variable per-filesystem
> property, stored within the superblock, so the helper should be passed a
> fs_info pointer.
>
> thanks,
> david

How about enhancing the "thread_pool=number" mount option instead?

thread_pool=n                     enable threadpool for compression and checksum, MAY improve bandwidth
thread_pool=0                     disable threadpool for compression and checksum, MIGHT reduce latency
thread_pool=-1 or not provided    automatically managed (current behavior and default choice)


This should allow users to trade off latency against bandwidth. Furthermore, you would not need to assume that btrfs will only ever use the crc32c algorithm.
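
The three proposed values could be interpreted along the lines of the sketch below. It only illustrates the proposal and is not existing btrfs behaviour: `pool_mode`, `pool_cfg` and `parse_thread_pool` are hypothetical names, and treating a missing option as "automatic" follows the table above.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for the proposed thread_pool semantics. */
enum pool_mode { POOL_AUTO, POOL_DISABLED, POOL_FIXED };

struct pool_cfg {
	enum pool_mode mode;
	int nthreads;	/* meaningful only when mode == POOL_FIXED */
};

/*
 * Parse the value of a "thread_pool=" option. arg == NULL means the option
 * was not given. Returns 0 on success, -1 on a malformed or out-of-range value.
 */
int parse_thread_pool(const char *arg, struct pool_cfg *out)
{
	char *end;
	long v;

	out->nthreads = 0;
	if (arg == NULL) {
		out->mode = POOL_AUTO;
		return 0;
	}

	errno = 0;
	v = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0' || v < -1 || v > INT_MAX)
		return -1;

	if (v == -1) {
		out->mode = POOL_AUTO;		/* automatically managed */
	} else if (v == 0) {
		out->mode = POOL_DISABLED;	/* do the work inline */
	} else {
		out->mode = POOL_FIXED;		/* use exactly v helper threads */
		out->nthreads = (int)v;
	}
	return 0;
}
```

With this shape, `thread_pool=0` gives the low-latency inline path regardless of the checksum algorithm, which is the point of the proposal.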


David Sterba Sept. 25, 2012, 12:55 p.m. UTC | #8
On Tue, Sep 25, 2012 at 07:40:17PM +0800, ching wrote:
> How about enhancing the "*thread_pool=/number" mount option instead?
> 
> thread_pool=n     enable threadpool/**/for compression and checksum, MAY improve bandwidth/*
> */thread_pool=0     disable threadpool for compression and checksum, MIGHT reduce latency
> thread_pool=-1 or not provided    automatically managed (current behaviour and default choice)

Sorry, I don't understand the syntax, can you please write it more
clearly? Thanks.

> This should allow users to trade off latency against bandwidth.
> Furthermore, you would not need to assume that btrfs will only ever
> use the crc32c algorithm.

Some sort of finer control over the threads makes sense. We should
distinguish between CPU-bound processing, where parallelism wins, and
I/O-bound processing, where it is not helpful to add more and more
threads that hammer a single device (namely in the case of HDDs, so this
should be tunable for devices with cheap seeks).

david
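
The distinction drawn above can be written down as a small policy function. This is a sketch of one possible policy, not anything btrfs implements: the names are hypothetical, and using a single thread for rotational devices and a caller-supplied cap for devices with cheap seeks are assumptions chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical classification of the work a pool performs. */
enum work_kind { WORK_CPU_BOUND, WORK_IO_BOUND };

/*
 * Pick a worker count. CPU-bound work scales with the number of CPUs.
 * I/O-bound work against a rotational device gets one thread so that
 * concurrent submitters do not cause extra seeks; for other devices the
 * count is limited by a tunable cap (io_cap, 0 meaning "no cap").
 */
unsigned int pick_workers(enum work_kind kind, unsigned int ncpus,
			  bool rotational, unsigned int io_cap)
{
	if (ncpus == 0)
		ncpus = 1;
	if (kind == WORK_CPU_BOUND)
		return ncpus;
	if (rotational)
		return 1;
	if (io_cap != 0 && io_cap < ncpus)
		return io_cap;
	return ncpus;
}
```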

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dcaf556..830b9af 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -31,6 +31,7 @@ 
 #include <linux/migrate.h>
 #include <linux/ratelimit.h>
 #include <asm/unaligned.h>
+#include <asm/cpufeature.h>
 #include "compat.h"
 #include "ctree.h"
 #include "disk-io.h"
@@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
 	}
 
 	/*
+	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
+	 * the hardware then there is no reason to do the csum stuff
+	 * asynchronously, it will be faster to do it inline, so test to see if
+	 * our CPU can do hardware crc32c and if it can just do the csum in our
+	 * threads context.
+	 */
+#ifdef CONFIG_X86
+	if (cpu_has_xmm4_2) {
+		printk(KERN_ERR "doing it the fast way\n");
+		ret = btree_csum_one_bio(bio);
+		if (ret)
+			return ret;
+		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
+	}
+#endif
+	/*
 	 * kthread helpers are used to submit writes so that checksumming
 	 * can happen in parallel across all CPUs
 	 */