diff mbox

dm stripe: add DAX support

Message ID 1466792610-30369-1-git-send-email-toshi.kani@hpe.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kani, Toshi June 24, 2016, 6:23 p.m. UTC
Change dm-stripe to implement direct_access function,
stripe_direct_access(), which maps bdev and sector and
calls direct_access function of its physical target device.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 drivers/md/dm-stripe.c |   25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Mike Snitzer June 24, 2016, 6:29 p.m. UTC | #1
On Fri, Jun 24 2016 at  2:23pm -0400,
Toshi Kani <toshi.kani@hpe.com> wrote:

> Change dm-stripe to implement direct_access function,
> stripe_direct_access(), which maps bdev and sector and
> calls direct_access function of its physical target device.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Cc: Alasdair Kergon <agk@redhat.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> ---
>  drivers/md/dm-stripe.c |   25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index 48f1c01..8925f6a 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -308,6 +308,30 @@ static int stripe_map(struct dm_target *ti, struct bio *bio)
>  	return DM_MAPIO_REMAPPED;
>  }
>  
> +static long stripe_direct_access(struct dm_target *ti, sector_t sector,
> +		void __pmem **kaddr, pfn_t *pfn, long size)
> +{
> +	struct stripe_c *sc;
> +	struct block_device *bdev;
> +	uint32_t stripe;
> +	struct blk_dax_ctl dax = {
> +		.size = size,
> +	};
> +	long ret;
> +
> +	sc = ti->private;
> +	stripe_map_sector(sc, sector, &stripe, &dax.sector);
> +
> +	dax.sector += sc->stripe[stripe].physical_start;
> +	bdev = sc->stripe[stripe].dev->bdev;
> +
> +	ret = bdev_direct_access(bdev, &dax);
> +	*kaddr = dax.addr;
> +	*pfn = dax.pfn;
> +
> +	return ret;
> +}
> +
>  /*
>   * Stripe status:
>   *
> @@ -425,6 +449,7 @@ static struct target_type stripe_target = {
>  	.status = stripe_status,
>  	.iterate_devices = stripe_iterate_devices,
>  	.io_hints = stripe_io_hints,
> +	.direct_access = stripe_direct_access,
>  };
>  
>  int __init dm_stripe_init(void)

Thanks, once jens queues the block changes I'll be able to pull this in
to linux-dm.git

BTW, if in your testing you could evaluate/quantify any extra overhead
from DM that'd be useful to share.  It could be there are bottlenecks
that need to be fixed, etc.

Mike
Kani, Toshi June 24, 2016, 6:42 p.m. UTC | #2
On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:
> On Fri, Jun 24 2016 at  2:23pm -0400,

> Toshi Kani <toshi.kani@hpe.com> wrote:

 :
> Thanks, once jens queues the block changes I'll be able to pull this in

> to linux-dm.git

> 

> BTW, if in your testing you could evaluate/quantify any extra overhead

> from DM that'd be useful to share.  It could be there are bottlenecks

> that need to be fixed, etc.


I ran some performance tests as sanity checks (results are in ballpark), but
have not verified them carefully.  Yes, I will check to see if there are any
bottlenecks.

Thanks!
-Toshi
kernel test robot June 24, 2016, 7:58 p.m. UTC | #3
Hi,

[auto build test WARNING on dm/for-next]
[also build test WARNING on v4.7-rc4 next-20160624]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Toshi-Kani/dm-stripe-add-DAX-support/20160625-022600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git for-next
config: m68k-sun3_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m68k 

All warnings (new ones prefixed by >>):

   drivers/md/dm-stripe.c:452:2: error: unknown field 'direct_access' specified in initializer
     .direct_access = stripe_direct_access,
     ^
   drivers/md/dm-stripe.c:452:2: warning: missing braces around initializer [-Wmissing-braces]
   drivers/md/dm-stripe.c:452:2: warning: (near initialization for 'stripe_target.list') [-Wmissing-braces]
>> drivers/md/dm-stripe.c:452:2: warning: initialization from incompatible pointer type
   drivers/md/dm-stripe.c:452:2: warning: (near initialization for 'stripe_target.list.next')

vim +452 drivers/md/dm-stripe.c

   436	
   437		blk_limits_io_min(limits, chunk_size);
   438		blk_limits_io_opt(limits, chunk_size * sc->stripes);
   439	}
   440	
   441	static struct target_type stripe_target = {
   442		.name   = "striped",
   443		.version = {1, 5, 1},
   444		.module = THIS_MODULE,
   445		.ctr    = stripe_ctr,
   446		.dtr    = stripe_dtr,
   447		.map    = stripe_map,
   448		.end_io = stripe_end_io,
   449		.status = stripe_status,
   450		.iterate_devices = stripe_iterate_devices,
   451		.io_hints = stripe_io_hints,
 > 452		.direct_access = stripe_direct_access,
   453	};
   454	
   455	int __init dm_stripe_init(void)
   456	{
   457		int r;
   458	
   459		r = dm_register_target(&stripe_target);
   460		if (r < 0)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
kernel test robot June 24, 2016, 7:59 p.m. UTC | #4
Hi,

[auto build test ERROR on dm/for-next]
[also build test ERROR on v4.7-rc4 next-20160624]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Toshi-Kani/dm-stripe-add-DAX-support/20160625-022600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git for-next
config: x86_64-randconfig-s5-06250328 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

>> drivers/md/dm-stripe.c:452:2: error: unknown field 'direct_access' specified in initializer
     .direct_access = stripe_direct_access,
     ^
>> drivers/md/dm-stripe.c:452:19: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
     .direct_access = stripe_direct_access,
                      ^~~~~~~~~~~~~~~~~~~~
   drivers/md/dm-stripe.c:452:19: note: (near initialization for 'stripe_target.list.next')
>> drivers/md/dm-stripe.c:441:43: warning: missing braces around initializer [-Wmissing-braces]
    static struct target_type stripe_target = {
                                              ^
   drivers/md/dm-stripe.c:441:43: note: (near initialization for 'stripe_target')
   cc1: some warnings being treated as errors

vim +/direct_access +452 drivers/md/dm-stripe.c

   435		unsigned chunk_size = sc->chunk_size << SECTOR_SHIFT;
   436	
   437		blk_limits_io_min(limits, chunk_size);
   438		blk_limits_io_opt(limits, chunk_size * sc->stripes);
   439	}
   440	
 > 441	static struct target_type stripe_target = {
   442		.name   = "striped",
   443		.version = {1, 5, 1},
   444		.module = THIS_MODULE,
   445		.ctr    = stripe_ctr,
   446		.dtr    = stripe_dtr,
   447		.map    = stripe_map,
   448		.end_io = stripe_end_io,
   449		.status = stripe_status,
   450		.iterate_devices = stripe_iterate_devices,
   451		.io_hints = stripe_io_hints,
 > 452		.direct_access = stripe_direct_access,
   453	};
   454	
   455	int __init dm_stripe_init(void)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Mike Snitzer June 24, 2016, 8:02 p.m. UTC | #5
On Fri, Jun 24 2016 at  3:58pm -0400,
kbuild test robot <lkp@intel.com> wrote:

> Hi,
> 
> [auto build test WARNING on dm/for-next]
> [also build test WARNING on v4.7-rc4 next-20160624]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Toshi-Kani/dm-stripe-add-DAX-support/20160625-022600
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git for-next
> config: m68k-sun3_defconfig (attached as .config)
> compiler: m68k-linux-gcc (GCC) 4.9.0
> reproduce:
>         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         make.cross ARCH=m68k 
> 
> All warnings (new ones prefixed by >>):
> 
>    drivers/md/dm-stripe.c:452:2: error: unknown field 'direct_access' specified in initializer
>      .direct_access = stripe_direct_access,
>      ^
>    drivers/md/dm-stripe.c:452:2: warning: missing braces around initializer [-Wmissing-braces]
>    drivers/md/dm-stripe.c:452:2: warning: (near initialization for 'stripe_target.list') [-Wmissing-braces]
> >> drivers/md/dm-stripe.c:452:2: warning: initialization from incompatible pointer type
>    drivers/md/dm-stripe.c:452:2: warning: (near initialization for 'stripe_target.list.next')

FYI, overzealous kbuild robot build failure here.. I haven't staged the
prereq DM core changes yet because I'm waiting for Jens to take the 2
block changes those depend on.

Mike
Kani, Toshi June 24, 2016, 8:46 p.m. UTC | #6
On Fri, 2016-06-24 at 16:02 -0400, Mike Snitzer wrote:
> On Fri, Jun 24 2016 at  3:58pm -0400,

> kbuild test robot <lkp@intel.com> wrote:

> 

> > 

> > Hi,

> > 

> > [auto build test WARNING on dm/for-next]

> > [also build test WARNING on v4.7-rc4 next-20160624]

> > [if your patch is applied to the wrong git tree, please drop us a note to

> > help improve the system]

> > 

> > url:    https://github.com/0day-ci/linux/commits/Toshi-Kani/dm-stripe-add-

> > DAX-support/20160625-022600

> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linu

> > x-dm.git for-next

> > config: m68k-sun3_defconfig (attached as .config)

> > compiler: m68k-linux-gcc (GCC) 4.9.0

> > reproduce:

> >         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.gi

> > t/plain/sbin/make.cross -O ~/bin/make.cross

> >         chmod +x ~/bin/make.cross

> >         # save the attached .config to linux build tree

> >         make.cross ARCH=m68k 

> > 

> > All warnings (new ones prefixed by >>):

> > 

> >    drivers/md/dm-stripe.c:452:2: error: unknown field 'direct_access'

> > specified in initializer

> >      .direct_access = stripe_direct_access,

> >      ^

> >    drivers/md/dm-stripe.c:452:2: warning: missing braces around

> > initializer [-Wmissing-braces]

> >    drivers/md/dm-stripe.c:452:2: warning: (near initialization for

> > 'stripe_target.list') [-Wmissing-braces]

> > > 

> > > > 

> > > > drivers/md/dm-stripe.c:452:2: warning: initialization from

> > > > incompatible pointer type

> >    drivers/md/dm-stripe.c:452:2: warning: (near initialization for

> > 'stripe_target.list.next')

>

> FYI, overzealous kbuild robot build failure here.. I haven't staged the

> prereq DM core changes yet because I'm waiting for Jens to take the 2

> block changes those depend on.


This patch applies on top of wip.  Could this be because 'for-next' does not
have the previous dm/dm-linear series?

Thanks,
-Toshi
Kani, Toshi July 12, 2016, 10:22 p.m. UTC | #7
On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:

> BTW, if in your testing you could evaluate/quantify any extra overhead

> from DM that'd be useful to share.  It could be there are bottlenecks

> that need to be fixed, etc.


Here are some results from fio benchmark.  The test is single-threaded and is
bound to one CPU.

 DAX  LVM   IOPS   NOTE
 ---------------------------------------
  Y    N    790K
  Y    Y    754K   5% overhead with LVM
  N    N    567K
  N    Y    457K   20% overhead with LVM

 DAX: Y: mount -o dax,noatime, N: mount -o noatime
 LVM: Y: dm-linear on pmem0 device, N: pmem0 device
 fio: bs=4k, size=2G, direct=1, rw=randread, numjobs=1

Among the 5% overhead with DAX/LVM, the new DM direct_access interfaces
account for less than 0.5%.

 dm_blk_direct_access 0.28%
 linear_direct_access 0.17%

The average latency increases slightly from 0.93us to 0.95us.  I think most of
the overhead comes from the submit_bio() path, which is used only for
accessing metadata with DAX.  I believe this is due to cloning bio for each
request in DM.  There is 12% more L2 miss in total.

Without DAX, 20% overhead is observed with LVM.  Average latency increases
from 1.39us to 1.82us.  Without DAX, bio is cloned for both data and metadata.

Thanks,
-Toshi
Mike Snitzer July 13, 2016, 2:01 a.m. UTC | #8
On Tue, Jul 12 2016 at  6:22pm -0400,
Kani, Toshimitsu <toshi.kani@hpe.com> wrote:

> On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:
> > 
> > BTW, if in your testing you could evaluate/quantify any extra overhead
> > from DM that'd be useful to share.  It could be there are bottlenecks
> > that need to be fixed, etc.
> 
> Here are some results from fio benchmark.  The test is single-threaded and is
> bound to one CPU.
> 
>  DAX  LVM   IOPS   NOTE
>  ---------------------------------------
>   Y    N    790K
>   Y    Y    754K   5% overhead with LVM
>   N    N    567K
>   N    Y    457K   20% overhead with LVM
> 
>  DAX: Y: mount -o dax,noatime, N: mount -o noatime
>  LVM: Y: dm-linear on pmem0 device, N: pmem0 device
>  fio: bs=4k, size=2G, direct=1, rw=randread, numjobs=1
> 
> Among the 5% overhead with DAX/LVM, the new DM direct_access interfaces
> account for less than 0.5%.
> 
>  dm_blk_direct_access 0.28%
>  linear_direct_access 0.17%
> 
> The average latency increases slightly from 0.93us to 0.95us.  I think most of
> the overhead comes from the submit_bio() path, which is used only for
> accessing metadata with DAX.  I believe this is due to cloning bio for each
> request in DM.  There is 12% more L2 miss in total.
> 
> Without DAX, 20% overhead is observed with LVM.  Average latency increases
> from 1.39us to 1.82us.  Without DAX, bio is cloned for both data and metadata.

Thanks for putting this summary together.  Unfortunately none of the DM
changes can be queued for 4.8 until Jens takes the 2 block core patches:
https://patchwork.kernel.org/patch/9196021/
https://patchwork.kernel.org/patch/9196019/

Not sure what the hold up and/or issue is with them.  But I've asked
twice (and implicilty a 3rd time here).  Hopefully they land in time for
4.8.

Mike
Kani, Toshi July 13, 2016, 3:03 p.m. UTC | #9
On Tue, 2016-07-12 at 22:01 -0400, Mike Snitzer wrote:
> On Tue, Jul 12 2016 at  6:22pm -0400,

> Kani, Toshimitsu <toshi.kani@hpe.com> wrote:

> > On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:

> > >  

 :
> Thanks for putting this summary together.  Unfortunately none of the DM

> changes can be queued for 4.8 until Jens takes the 2 block core patches:

> https://patchwork.kernel.org/patch/9196021/

> https://patchwork.kernel.org/patch/9196019/

> 

> Not sure what the hold up and/or issue is with them.  But I've asked

> twice (and implicilty a 3rd time here).  Hopefully they land in time for

> 4.8.


Hi Jens,

Can you take the two patches above?  These patches add QUEUE_FLAG_DAX and its
sysfs show func.  They allow device-mapper to set dax-capability based on its
configuration, and also allow user applications to check dax-capability via
sysfs.

Thanks,
-Toshi
Mike Snitzer July 21, 2016, 12:01 a.m. UTC | #10
On Wed, Jul 13 2016 at 11:03am -0400,
Kani, Toshimitsu <toshi.kani@hpe.com> wrote:

> On Tue, 2016-07-12 at 22:01 -0400, Mike Snitzer wrote:
> > On Tue, Jul 12 2016 at  6:22pm -0400,
> > Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> > > On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:
> > > >  
>  :
> > Thanks for putting this summary together.  Unfortunately none of the DM
> > changes can be queued for 4.8 until Jens takes the 2 block core patches:
> > https://patchwork.kernel.org/patch/9196021/
> > https://patchwork.kernel.org/patch/9196019/
> > 
> > Not sure what the hold up and/or issue is with them.  But I've asked
> > twice (and implicilty a 3rd time here).  Hopefully they land in time for
> > 4.8.
> 
> Hi Jens,
> 
> Can you take the two patches above?  These patches add QUEUE_FLAG_DAX and its
> sysfs show func.  They allow device-mapper to set dax-capability based on its
> configuration, and also allow user applications to check dax-capability via
> sysfs.

Hi Jens,

As I shared with you before, here are the 2 patches I staged in
linux-next via linux-dm.git's 'for-next' (same as in the patchwork
references above) -- these were staged just so we had linux-next
coverage until you picked them up.

Please pick these up for 4.8 -- the changes are straightforward, are
required for DM's DAX support, and Dan Williams has also acked them:

https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=6c00531c6affa42b165df73a1eac3289bc45f4c4
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=51d1a5abcd0e88cb4528f35643245ea59cf234f1

Feel free to pick them up from patchwork or cherry-pick from
linux-dm.git

Thanks!
Mike
Jens Axboe July 21, 2016, 3:03 a.m. UTC | #11
On 07/20/2016 06:01 PM, Mike Snitzer wrote:
> On Wed, Jul 13 2016 at 11:03am -0400,
> Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
>
>> On Tue, 2016-07-12 at 22:01 -0400, Mike Snitzer wrote:
>>> On Tue, Jul 12 2016 at  6:22pm -0400,
>>> Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
>>>> On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:
>>>>>
>>  :
>>> Thanks for putting this summary together.  Unfortunately none of the DM
>>> changes can be queued for 4.8 until Jens takes the 2 block core patches:
>>> https://patchwork.kernel.org/patch/9196021/
>>> https://patchwork.kernel.org/patch/9196019/
>>>
>>> Not sure what the hold up and/or issue is with them.  But I've asked
>>> twice (and implicilty a 3rd time here).  Hopefully they land in time for
>>> 4.8.
>>
>> Hi Jens,
>>
>> Can you take the two patches above?  These patches add QUEUE_FLAG_DAX and its
>> sysfs show func.  They allow device-mapper to set dax-capability based on its
>> configuration, and also allow user applications to check dax-capability via
>> sysfs.
>
> Hi Jens,
>
> As I shared with you before, here are the 2 patches I staged in
> linux-next via linux-dm.git's 'for-next' (same as in the patchwork
> references above) -- these were staged just so we had linux-next
> coverage until you picked them up.
>
> Please pick these up for 4.8 -- the changes are straightforward, are
> required for DM's DAX support, and Dan Williams has also acked them:
>
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=6c00531c6affa42b165df73a1eac3289bc45f4c4
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=51d1a5abcd0e88cb4528f35643245ea59cf234f1
>
> Feel free to pick them up from patchwork or cherry-pick from
> linux-dm.git

Added for 4.8, thanks!
diff mbox

Patch

diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index 48f1c01..8925f6a 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -308,6 +308,30 @@  static int stripe_map(struct dm_target *ti, struct bio *bio)
 	return DM_MAPIO_REMAPPED;
 }
 
+static long stripe_direct_access(struct dm_target *ti, sector_t sector,
+		void __pmem **kaddr, pfn_t *pfn, long size)
+{
+	struct stripe_c *sc;
+	struct block_device *bdev;
+	uint32_t stripe;
+	struct blk_dax_ctl dax = {
+		.size = size,
+	};
+	long ret;
+
+	sc = ti->private;
+	stripe_map_sector(sc, sector, &stripe, &dax.sector);
+
+	dax.sector += sc->stripe[stripe].physical_start;
+	bdev = sc->stripe[stripe].dev->bdev;
+
+	ret = bdev_direct_access(bdev, &dax);
+	*kaddr = dax.addr;
+	*pfn = dax.pfn;
+
+	return ret;
+}
+
 /*
  * Stripe status:
  *
@@ -425,6 +449,7 @@  static struct target_type stripe_target = {
 	.status = stripe_status,
 	.iterate_devices = stripe_iterate_devices,
 	.io_hints = stripe_io_hints,
+	.direct_access = stripe_direct_access,
 };
 
 int __init dm_stripe_init(void)