diff mbox

DM MULTIPATH: Allow dm to send larger request if underlying device set to larger max_sectors value

Message ID F200AD42A082BC4088B232D13A916A8907E5DD64@SACEXCMBX02-PRD.hq.netapp.com (mailing list archive)
State Deferred, archived
Headers show

Commit Message

Chauhan, Vijay July 8, 2012, 5:59 p.m. UTC
Even though underlying paths are set with larger value for max_sectors, dm 
sets 1024(i.e 512KB) for max_sectors as default. max_sectors for dm 
device can be reset through sysfs but any time map is updated, max_sectors
is again set back to default. This patch gets the minimum of max_sectors from 
physical paths and sets it to dm device.


Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Reviewed-by: Bob Stankey <Robert.Stankey@netapp.com>

---
--

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Alasdair G Kergon July 9, 2012, 1:01 a.m. UTC | #1
Why is this separate from dm_set_device_limits?
Why is this not handled by blk-settings.c?
Should max_sectors be preserved across a table reload?

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Chauhan, Vijay July 9, 2012, 12:34 p.m. UTC | #2
Hi Alasdair,

On July 09, 2012 6:32 AM, Alasdair Wrote:
>
>Why is this separate from dm_set_device_limits?
Ideally it should have been taken care in dm_set_device_limits but to find 
minimum of max_sectors value among physical paths in dm_set_device_limits  seems tricky. 
I will look into it to use dm_set_device_limits.

>Why is this not handled by blk-settings.c?
blk_stack_limits do compare max_sectors between dm & physical path and minimum 
between two values is then set to max_sectors of dm. But if underlying paths 
are set with higher value, dm's max_sectors will be always set to default being 
minimum. Rather objective is to set minimum of max_sectors among underlying paths but 
not exceeding dm's max_hw_max_sectors

>Should max_sectors be preserved across a table reload?
Intentions is to set max_sectors to maximum capable when dm device is created. Even if 
we reset max_sectors_kb from sysfs, restart of multipathd would again reset it to default. 
Not sure if I answered your question.



Thanks,
Vijay

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer July 9, 2012, 1 p.m. UTC | #3
On Sun, Jul 08 2012 at  1:59pm -0400,
Chauhan, Vijay <Vijay.Chauhan@netapp.com> wrote:

> Even though underlying paths are set with larger value for max_sectors, dm 
> sets 1024(i.e 512KB) for max_sectors as default. max_sectors for dm 
> device can be reset through sysfs but any time map is updated, max_sectors
> is again set back to default. This patch gets the minimum of max_sectors from 
> physical paths and sets it to dm device.

There shouldn't be any need for additional DM overrides for max_sectors.

DM will stack the limits for all underlying devices each table reload
(via dm_calculate_queue_limits).  And max_sectors is properly stacked in
the block layer's bdev_stack_limits (called by dm_set_device_limits).

So is something resetting max_sectors with sysfs?  multipathd?

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer July 9, 2012, 1:16 p.m. UTC | #4
On Mon, Jul 09 2012 at  9:00am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Sun, Jul 08 2012 at  1:59pm -0400,
> Chauhan, Vijay <Vijay.Chauhan@netapp.com> wrote:
> 
> > Even though underlying paths are set with larger value for max_sectors, dm 
> > sets 1024(i.e 512KB) for max_sectors as default. max_sectors for dm 
> > device can be reset through sysfs but any time map is updated, max_sectors
> > is again set back to default. This patch gets the minimum of max_sectors from 
> > physical paths and sets it to dm device.
> 
> There shouldn't be any need for additional DM overrides for max_sectors.
> 
> DM will stack the limits for all underlying devices each table reload
> (via dm_calculate_queue_limits).  And max_sectors is properly stacked in
> the block layer's bdev_stack_limits (called by dm_set_device_limits).
> 
> So is something resetting max_sectors with sysfs?  multipathd?

BLK_DEF_MAX_SECTORS = 1024
blk_set_stacking_limits: lim->max_sectors = BLK_DEF_MAX_SECTORS

But that just establishes the default, the stacking done by
blk_stack_limits will reduce 'max_sectors' accordingly based on the
underlying paths' max_sectors.

I can clearly see that max_sectors is reduced according to the
underlying device(s):

# multipath -ll
mpathe (36003005700ec1890167a7e5953effb87) dm-5 LSI,RAID 5/6 SAS 6G
size=465G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:2:4:0 sde 8:64 active ready running

# cat /sys/block/sde/queue/max_sectors_kb 
240

# cat /sys/block/dm-5/queue/max_sectors_kb 
240

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer July 9, 2012, 1:40 p.m. UTC | #5
On Mon, Jul 09 2012 at  9:16am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Jul 09 2012 at  9:00am -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Sun, Jul 08 2012 at  1:59pm -0400,
> > Chauhan, Vijay <Vijay.Chauhan@netapp.com> wrote:
> > 
> > > Even though underlying paths are set with larger value for max_sectors, dm 
> > > sets 1024(i.e 512KB) for max_sectors as default. max_sectors for dm 
> > > device can be reset through sysfs but any time map is updated, max_sectors
> > > is again set back to default. This patch gets the minimum of max_sectors from 
> > > physical paths and sets it to dm device.
> > 
> > There shouldn't be any need for additional DM overrides for max_sectors.
> > 
> > DM will stack the limits for all underlying devices each table reload
> > (via dm_calculate_queue_limits).  And max_sectors is properly stacked in
> > the block layer's bdev_stack_limits (called by dm_set_device_limits).
> > 
> > So is something resetting max_sectors with sysfs?  multipathd?
> 
> BLK_DEF_MAX_SECTORS = 1024
> blk_set_stacking_limits: lim->max_sectors = BLK_DEF_MAX_SECTORS
> 
> But that just establishes the default, the stacking done by
> blk_stack_limits will reduce 'max_sectors' accordingly based on the
> underlying paths' max_sectors.
> 
> I can clearly see that max_sectors is reduced according to the
> underlying device(s):

Ah, but you were saying max_hw_sectors and max_sectors may be larger
than 1024 and that you'd like to have the multipth device's max-sectors
reflect the larger values (not be capped by the block layer's
BLK_DEF_MAX_SECTORS).

Very interesting case that we haven't seen raised before.  This will
require block layer changes (DM will then get the change for free).

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

--- linux-3.5-rc5-orig/drivers/md/dm-table.c	2012-07-07 11:39:17.000000000 +0530
+++ linux-3.5-rc5/drivers/md/dm-table.c	2012-07-09 00:52:37.000000000 +0530
@@ -549,6 +549,18 @@  int dm_set_device_limits(struct dm_targe
 }
 EXPORT_SYMBOL_GPL(dm_set_device_limits);
 
+int dm_device_max_sectors(struct dm_target *ti, struct dm_dev *dev,
+			sector_t start, sector_t len, void *data)
+{
+	unsigned int *max_sectors = data;
+	struct block_device *bdev = dev->bdev;
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	*max_sectors = min_not_zero(*max_sectors, q->limits.max_sectors);
+
+	return 0;
+}
+
 /*
  * Decrement a device's use count and remove it if necessary.
  */
@@ -692,6 +704,7 @@  static int validate_hardware_logical_blo
 	struct dm_target *uninitialized_var(ti);
 	struct queue_limits ti_limits;
 	unsigned i = 0;
+	unsigned int max_sectors = 0;
 
 	/*
 	 * Check each entry in the table in turn.
@@ -706,6 +719,15 @@  static int validate_hardware_logical_blo
 			ti->type->iterate_devices(ti, dm_set_device_limits,
 						  &ti_limits);
 
+		/* Find minimum of max_sectors from target devices */
+		if (ti->type->iterate_devices) {
+			ti->type->iterate_devices(ti, dm_device_max_sectors,
+						  &max_sectors);
+			limits->max_sectors = min_t(unsigned int,
+						    ti_limits.max_hw_sectors,
+						    max_sectors);
+		}
+
 		/*
 		 * If the remaining sectors fall entirely within this
 		 * table entry are they compatible with its logical_block_size?