
dm mpath selector: more evenly distribute ties

Message ID 20180119230737.133596-1-khazhy@google.com (mailing list archive)
State Accepted, archived
Delegated to: Mike Snitzer

Commit Message

Khazhy Kumykov Jan. 19, 2018, 11:07 p.m. UTC
Move the last used path to the end of the list (least preferred) so that
ties are more evenly distributed.

For example, in a case with three paths where one is slower than the
others, the remaining two would be used unevenly if they tie. This is
due to the rotation not being a truly fair distribution.

Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b are 'tied'
Three possible rotations:
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
...

So 'a' is used 2x more than 'b', although they should be used evenly.

With this change, the most recently used path is always the least
preferred, removing this bias and resulting in an even distribution.
(a, b, c) -> best path 'a'
(b, c, a) -> best path 'b'
(c, a, b) -> best path 'a'
(c, b, a) -> best path 'b'
...

Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
 drivers/md/dm-queue-length.c | 6 +++---
 drivers/md/dm-service-time.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

Martin Wilck Jan. 24, 2018, 10:57 a.m. UTC | #1
On Fri, 2018-01-19 at 15:07 -0800, Khazhismel Kumykov wrote:
> Move the last used path to the end of the list (least preferred) so
> that
> ties are more evenly distributed.
> 
> For example, in case with three paths with one that is slower than
> others, the remaining two would be unevenly used if they tie. This is
> due to the rotation not being a truly fair distribution.
> 
> Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b are
> 'tied'
> Three possible rotations:
> (a, b, c) -> best path 'a'
> (b, c, a) -> best path 'b'
> (c, a, b) -> best path 'a'
> (a, b, c) -> best path 'a'
> (b, c, a) -> best path 'b'
> (c, a, b) -> best path 'a'
> ...


This happens only if a and b actually have the same weight (e.g. queue
length for the queue-length selector). If 'a' really receives more IO,
its queue grows, and the selector will start preferring 'b', so the
effect should level out automatically with the current code as soon as
you have real IO going on. But maybe I haven't grasped what you're
referring to as "tied".

OTOH, if the "best" path has much lower queue length than the other
paths for whatever reason, your pushing it to the tail will require a
full list walk with every new call of the selector. I see that as a
small disadvantage of your approach.

Regards
Martin

> 
> So 'a' is used 2x more than 'b', although they should be used evenly.
> 
> With this change, the most recently used path is always the least
> preferred, removing this bias and resulting in an even distribution.
> (a, b, c) -> best path 'a'
> (b, c, a) -> best path 'b'
> (c, a, b) -> best path 'a'
> (c, b, a) -> best path 'b'
> ...
> 
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
>  drivers/md/dm-queue-length.c | 6 +++---
>  drivers/md/dm-service-time.c | 6 +++---
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-
> length.c
> index 23f178641794..969c4f1a3633 100644
> --- a/drivers/md/dm-queue-length.c
> +++ b/drivers/md/dm-queue-length.c
> @@ -195,9 +195,6 @@ static struct dm_path *ql_select_path(struct
> path_selector *ps, size_t nr_bytes)
>  	if (list_empty(&s->valid_paths))
>  		goto out;
>  
> -	/* Change preferred (first in list) path to evenly balance.
> */
> -	list_move_tail(s->valid_paths.next, &s->valid_paths);
> -
>  	list_for_each_entry(pi, &s->valid_paths, list) {
>  		if (!best ||
>  		    (atomic_read(&pi->qlen) < atomic_read(&best-
> >qlen)))
> @@ -210,6 +207,9 @@ static struct dm_path *ql_select_path(struct
> path_selector *ps, size_t nr_bytes)
>  	if (!best)
>  		goto out;
>  
> +	/* Move most recently used to least preferred to evenly
> balance. */
> +	list_move_tail(&best->list, &s->valid_paths);
> +
>  	ret = best->path;
>  out:
>  	spin_unlock_irqrestore(&s->lock, flags);
> diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-
> time.c
> index 7b8642045c55..f006a9005593 100644
> --- a/drivers/md/dm-service-time.c
> +++ b/drivers/md/dm-service-time.c
> @@ -282,9 +282,6 @@ static struct dm_path *st_select_path(struct
> path_selector *ps, size_t nr_bytes)
>  	if (list_empty(&s->valid_paths))
>  		goto out;
>  
> -	/* Change preferred (first in list) path to evenly balance.
> */
> -	list_move_tail(s->valid_paths.next, &s->valid_paths);
> -
>  	list_for_each_entry(pi, &s->valid_paths, list)
>  		if (!best || (st_compare_load(pi, best, nr_bytes) <
> 0))
>  			best = pi;
> @@ -292,6 +289,9 @@ static struct dm_path *st_select_path(struct
> path_selector *ps, size_t nr_bytes)
>  	if (!best)
>  		goto out;
>  
> +	/* Move most recently used to least preferred to evenly
> balance. */
> +	list_move_tail(&best->list, &s->valid_paths);
> +
>  	ret = best->path;
>  out:
>  	spin_unlock_irqrestore(&s->lock, flags);
Khazhy Kumykov Jan. 24, 2018, 6:44 p.m. UTC | #2
On Wed, Jan 24, 2018 at 2:57 AM, Martin Wilck <mwilck@suse.com> wrote:
> On Fri, 2018-01-19 at 15:07 -0800, Khazhismel Kumykov wrote:
>> Move the last used path to the end of the list (least preferred) so
>> that
>> ties are more evenly distributed.
>>
>> For example, in case with three paths with one that is slower than
>> others, the remaining two would be unevenly used if they tie. This is
>> due to the rotation not being a truly fair distribution.
>>
>> Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b are
>> 'tied'
>> Three possible rotations:
>> (a, b, c) -> best path 'a'
>> (b, c, a) -> best path 'b'
>> (c, a, b) -> best path 'a'
>> (a, b, c) -> best path 'a'
>> (b, c, a) -> best path 'b'
>> (c, a, b) -> best path 'a'
>> ...
>
>
> This happens only if a and b actually have the same weight (e.g. queue
> length for the queue-length selector). If 'a' really receives more IO,
> its queue grows, and the selector will start preferring 'b', so the
> effect should level out automatically with the current code as soon as
> you have real IO going on. But maybe I haven't grasped what you're
> referring to as "tied".
Yes, for "tied" I'm referring to paths with the same weight. As queue
length grows it does tend to level out, but in the case where queue
length doesn't grow (in this example I'm imagining 2 concurrent
requests to the device) the bias does show and the selectors really do
send 'a' 2x more requests than 'b' (when 'c' is much slower and 'a'
and 'b' are ~same speed).

>
> OTOH, if the "best" path has much lower queue length than the other
> paths for whatever reason, your pushing it to the tail will require a
> full list walk with every new call of the selector. I see that as a
> small disadvantage of your approach.
I see, with best at the tail, we would not see as much benefit from
queue-length's check for a path with no outstanding IOs. In
service-time no such short-circuit exists, so I don't believe anything
changes there. Am I understanding this correctly?
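
For reference, the queue-length check referred to here is the early
break in ql_select_path()'s search loop; roughly (a sketch from memory
of this kernel version, not a verbatim quote):

	list_for_each_entry(pi, &s->valid_paths, list) {
		if (!best ||
		    (atomic_read(&pi->qlen) < atomic_read(&best->qlen)))
			best = pi;

		/* An idle path cannot be beaten, so stop searching. */
		if (!atomic_read(&best->qlen))
			break;
	}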

>
> Regards
> Martin
>

Thanks for your comments,
Khazhy

>>
>> So 'a' is used 2x more than 'b', although they should be used evenly.
>>
>> With this change, the most recently used path is always the least
>> preferred, removing this bias and resulting in an even distribution.
>> (a, b, c) -> best path 'a'
>> (b, c, a) -> best path 'b'
>> (c, a, b) -> best path 'a'
>> (c, b, a) -> best path 'b'
>> ...
>>
>> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
>> ---
>>  drivers/md/dm-queue-length.c | 6 +++---
>>  drivers/md/dm-service-time.c | 6 +++---
>>  2 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-
>> length.c
>> index 23f178641794..969c4f1a3633 100644
>> --- a/drivers/md/dm-queue-length.c
>> +++ b/drivers/md/dm-queue-length.c
>> @@ -195,9 +195,6 @@ static struct dm_path *ql_select_path(struct
>> path_selector *ps, size_t nr_bytes)
>>       if (list_empty(&s->valid_paths))
>>               goto out;
>>
>> -     /* Change preferred (first in list) path to evenly balance.
>> */
>> -     list_move_tail(s->valid_paths.next, &s->valid_paths);
>> -
>>       list_for_each_entry(pi, &s->valid_paths, list) {
>>               if (!best ||
>>                   (atomic_read(&pi->qlen) < atomic_read(&best-
>> >qlen)))
>> @@ -210,6 +207,9 @@ static struct dm_path *ql_select_path(struct
>> path_selector *ps, size_t nr_bytes)
>>       if (!best)
>>               goto out;
>>
>> +     /* Move most recently used to least preferred to evenly
>> balance. */
>> +     list_move_tail(&best->list, &s->valid_paths);
>> +
>>       ret = best->path;
>>  out:
>>       spin_unlock_irqrestore(&s->lock, flags);
>> diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-
>> time.c
>> index 7b8642045c55..f006a9005593 100644
>> --- a/drivers/md/dm-service-time.c
>> +++ b/drivers/md/dm-service-time.c
>> @@ -282,9 +282,6 @@ static struct dm_path *st_select_path(struct
>> path_selector *ps, size_t nr_bytes)
>>       if (list_empty(&s->valid_paths))
>>               goto out;
>>
>> -     /* Change preferred (first in list) path to evenly balance.
>> */
>> -     list_move_tail(s->valid_paths.next, &s->valid_paths);
>> -
>>       list_for_each_entry(pi, &s->valid_paths, list)
>>               if (!best || (st_compare_load(pi, best, nr_bytes) <
>> 0))
>>                       best = pi;
>> @@ -292,6 +289,9 @@ static struct dm_path *st_select_path(struct
>> path_selector *ps, size_t nr_bytes)
>>       if (!best)
>>               goto out;
>>
>> +     /* Move most recently used to least preferred to evenly
>> balance. */
>> +     list_move_tail(&best->list, &s->valid_paths);
>> +
>>       ret = best->path;
>>  out:
>>       spin_unlock_irqrestore(&s->lock, flags);
>
> --
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
>
Martin Wilck Jan. 24, 2018, 7:09 p.m. UTC | #3
On Wed, 2018-01-24 at 10:44 -0800, Khazhismel Kumykov wrote:
> On Wed, Jan 24, 2018 at 2:57 AM, Martin Wilck <mwilck@suse.com>
> wrote:
> > On Fri, 2018-01-19 at 15:07 -0800, Khazhismel Kumykov wrote:
> > > Move the last used path to the end of the list (least preferred)
> > > so
> > > that
> > > ties are more evenly distributed.
> > > 
> > > For example, in case with three paths with one that is slower
> > > than
> > > others, the remaining two would be unevenly used if they tie.
> > > This is
> > > due to the rotation not being a truly fair distribution.
> > > 
> > > Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b are
> > > 'tied'
> > > Three possible rotations:
> > > (a, b, c) -> best path 'a'
> > > (b, c, a) -> best path 'b'
> > > (c, a, b) -> best path 'a'
> > > (a, b, c) -> best path 'a'
> > > (b, c, a) -> best path 'b'
> > > (c, a, b) -> best path 'a'
> > > ...
> > 
> > 
> > This happens only if a and b actually have the same weight (e.g.
> > queue
> > length for the queue-length selector). If 'a' really receives more
> > IO,
> > its queue grows, and the selector will start preferring 'b', so the
> > effect should level out automatically with the current code as soon
> > as
> > you have real IO going on. But maybe I haven't grasped what you're
> > referring to as "tied".
> 
> Yes, for "tied" I'm referring to paths with the same weight. As queue
> length grows it does tend to level out, but in the case where queue
> length doesn't grow (in this example I'm imagining 2 concurrent
> requests to the device) the bias does show and the selectors really
> do
> send 'a' 2x more requests than 'b' (when 'c' is much slower and 'a'
> and 'b' are ~same speed).

Have you actually observed this? I find the idea pretty academic that
two paths would be walking "tied" this way. In practice, under IO load,
I'd expect queue lengths (and service-times, for that matter) to
fluctuate enough to prevent this effect from being measurable. But of
course, I may be wrong. If you really did observe this, the next
question would be: does it hurt performance to an extent that can be
noticed/measured? I reckon that if 'a' got saturated under the load,
hurting performance, its queue length would grow quickly and 'b' would
automatically get preferred.

> > OTOH, if the "best" path has much lower queue length than the other
> > paths for whatever reason, your pushing it to the tail will require
> > a
> > full list walk with every new call of the selector. I see that as a
> > small disadvantage of your approach.
> 
> I see, with best at the tail, we would not see as much benefit from
> the check if a path has no IOs on it in queue-length. In service-time
> no such short circuit exists, so I don't believe anything changes
> there. Am I understanding this correctly?

Forget that. I was confused. We're walking the entire list every time
anyway, except for the queue-length short-circuit on a path with no
outstanding IOs. I find it strange to put the best candidate at the
tail every time, but I can't prove that it really hurts.

Martin
Khazhy Kumykov Jan. 24, 2018, 7:41 p.m. UTC | #4
On Wed, Jan 24, 2018 at 11:09 AM, Martin Wilck <mwilck@suse.com> wrote:
> On Wed, 2018-01-24 at 10:44 -0800, Khazhismel Kumykov wrote:
>> On Wed, Jan 24, 2018 at 2:57 AM, Martin Wilck <mwilck@suse.com>
>> wrote:
>> > On Fri, 2018-01-19 at 15:07 -0800, Khazhismel Kumykov wrote:
>> > > Move the last used path to the end of the list (least preferred)
>> > > so
>> > > that
>> > > ties are more evenly distributed.
>> > >
>> > > For example, in case with three paths with one that is slower
>> > > than
>> > > others, the remaining two would be unevenly used if they tie.
>> > > This is
>> > > due to the rotation not being a truly fair distribution.
>> > >
>> > > Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b are
>> > > 'tied'
>> > > Three possible rotations:
>> > > (a, b, c) -> best path 'a'
>> > > (b, c, a) -> best path 'b'
>> > > (c, a, b) -> best path 'a'
>> > > (a, b, c) -> best path 'a'
>> > > (b, c, a) -> best path 'b'
>> > > (c, a, b) -> best path 'a'
>> > > ...
>> >
>> >
>> > This happens only if a and b actually have the same weight (e.g.
>> > queue
>> > length for the queue-length selector). If 'a' really receives more
>> > IO,
>> > its queue grows, and the selector will start preferring 'b', so the
>> > effect should level out automatically with the current code as soon
>> > as
>> > you have real IO going on. But maybe I haven't grasped what you're
>> > referring to as "tied".
>>
>> Yes, for "tied" I'm referring to paths with the same weight. As queue
>> length grows it does tend to level out, but in the case where queue
>> length doesn't grow (in this example I'm imagining 2 concurrent
>> requests to the device) the bias does show and the selectors really
>> do
>> send 'a' 2x more requests than 'b' (when 'c' is much slower and 'a'
>> and 'b' are ~same speed).
>
> Have you actually observed this? I find the idea pretty academic that
> two paths would be walking "tied" this way. In practice, under IO load,
> I'd expect queue lengths (and service-times, for that matter) to
> fluctuate enough to prevent this effect from being measurable. But of
> course, I may be wrong. If you really did observe this, the next
> question would be: does it hurt performance to an extent that can be
> noticed/measured? I reckon that if 'a' got saturated under the load,
> hurting performance, its queue length would grow quickly and 'b' would
> automatically get preferred.
This is fairly simple to observe - start two sync reader threads
against a device with 3 backing paths, with one path taking longer on
average to complete requests than the other two. One of the 'faster'
paths will be used ~2x more than the other. Perhaps not that common a
situation, but it is a real one. The bias seemed simple to remove, so
that the two (or N) paths would be used equally.
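
A minimal user-space sketch of that kind of reproducer (the device
name is a placeholder for a dm-multipath device; O_DIRECT is used so
the reads actually reach the paths rather than the page cache):

/* Build: cc -o mp-readers mp-readers.c -lpthread */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DEV		"/dev/mapper/mpatha"	/* placeholder device name */
#define BLK		4096
#define NREADERS	2			/* "two sync reader threads" */
#define NREQS		100000

static void *reader(void *arg)
{
	void *buf;
	long i;
	int fd;

	(void)arg;
	fd = open(DEV, O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror(DEV);
		return NULL;
	}
	if (posix_memalign(&buf, BLK, BLK)) {
		close(fd);
		return NULL;
	}
	/* One synchronous read at a time, so at most NREADERS IOs are in
	 * flight against the multipath device. */
	for (i = 0; i < NREQS; i++) {
		off_t off = (off_t)(random() % 25600) * BLK;

		if (pread(fd, buf, BLK, off) < 0) {
			perror("pread");
			break;
		}
	}
	free(buf);
	close(fd);
	return NULL;
}

int main(void)
{
	pthread_t tids[NREADERS];
	int i;

	for (i = 0; i < NREADERS; i++)
		pthread_create(&tids[i], NULL, reader, NULL);
	for (i = 0; i < NREADERS; i++)
		pthread_join(tids[i], NULL);
	return 0;
}

Per-path usage can then be compared with iostat on the underlying SCSI
devices while it runs.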

I don't see a downside to distributing more evenly in this case,
although I can't say I've directly observed performance downsides from
a biased distribution (aside from the biased distribution itself being
a downside).
The bias in question is inherent to the order in which paths are added
to the selector (and to which path is 'always bad'), so if 'a' is
saturated because of this and slows down, it will continue to be
preferred once it recovers, whereas with an even distribution it would
not be.

Khazhy
Martin Wilck Jan. 24, 2018, 8 p.m. UTC | #5
On Wed, 2018-01-24 at 11:41 -0800, Khazhismel Kumykov wrote:
> On Wed, Jan 24, 2018 at 11:09 AM, Martin Wilck <mwilck@suse.com>
> wrote:
> > On Wed, 2018-01-24 at 10:44 -0800, Khazhismel Kumykov wrote:
> > > On Wed, Jan 24, 2018 at 2:57 AM, Martin Wilck <mwilck@suse.com>
> > > wrote:
> > > > On Fri, 2018-01-19 at 15:07 -0800, Khazhismel Kumykov wrote:
> > > > > Move the last used path to the end of the list (least
> > > > > preferred)
> > > > > so
> > > > > that
> > > > > ties are more evenly distributed.
> > > > > 
> > > > > For example, in case with three paths with one that is slower
> > > > > than
> > > > > others, the remaining two would be unevenly used if they tie.
> > > > > This is
> > > > > due to the rotation not being a truly fair distribution.
> > > > > 
> > > > > Illustrated: paths a, b, c, 'c' has 1 outstanding IO, a and b
> > > > > are
> > > > > 'tied'
> > > > > Three possible rotations:
> > > > > (a, b, c) -> best path 'a'
> > > > > (b, c, a) -> best path 'b'
> > > > > (c, a, b) -> best path 'a'
> > > > > (a, b, c) -> best path 'a'
> > > > > (b, c, a) -> best path 'b'
> > > > > (c, a, b) -> best path 'a'
> > > > > ...
> > > > 
> > > > 
> > > > This happens only if a and b actually have the same weight
> > > > (e.g.
> > > > queue
> > > > length for the queue-length selector). If 'a' really receives
> > > > more
> > > > IO,
> > > > its queue grows, and the selector will start preferring 'b', so
> > > > the
> > > > effect should level out automatically with the current code as
> > > > soon
> > > > as
> > > > you have real IO going on. But maybe I haven't grasped what
> > > > you're
> > > > referring to as "tied".
> > > 
> > > Yes, for "tied" I'm referring to paths with the same weight. As
> > > queue
> > > length grows it does tend to level out, but in the case where
> > > queue
> > > length doesn't grow (in this example I'm imagining 2 concurrent
> > > requests to the device) the bias does show and the selectors
> > > really
> > > do
> > > send 'a' 2x more requests than 'b' (when 'c' is much slower and
> > > 'a'
> > > and 'b' are ~same speed).
> > 
> > Have you actually observed this? I find the idea pretty academic
> > that
> > two paths would be walking "tied" this way. In practice, under IO
> > load,
> > I'd expect queue lengths (and service-times, for that matter) to
> > fluctuate enough to prevent this effect from being measurable. But of
> > course, I may be wrong. If you really did observe this, the next
> > question would be: does it hurt performance to an extent that can be
> > noticed/measured? I reckon that if 'a' got saturated under the
> > load,
> > hurting performance, its queue length would grow quickly and 'b'
> > would
> > automatically get preferred.
> 
> This is fairly simple to observe - start two sync reader threads
> against a device with 3 backing paths, with one path taking longer on
> average to complete requests than the other two. One of the 'faster'
> paths will be used ~2x more than the other. Perhaps not that common a
> situation, but it is a real one. The bias seemed simple to remove, so
> that the two (or N) paths would be used equally.
> 
> I don't see a downside to more evenly distributing in this case,
> although I can't say I've directly observed performance downsides for
> a biased distribution (aside from the distribution being biased
> itself
> being a downside).
> The bias of note is inherent to the order paths are added to the
> selector (and which path is 'always bad'), so if 'a' is saturated due
> to this, then slows, once it recovers it will continue to be
> preferred, versus in an even distribution.

Well, the expectation is indeed that load is spread equally, and I can
also see no downside. So:

Reviewed-by: Martin Wilck <mwilck@suse.com>

Patch

diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-length.c
index 23f178641794..969c4f1a3633 100644
--- a/drivers/md/dm-queue-length.c
+++ b/drivers/md/dm-queue-length.c
@@ -195,9 +195,6 @@  static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
 	if (list_empty(&s->valid_paths))
 		goto out;
 
-	/* Change preferred (first in list) path to evenly balance. */
-	list_move_tail(s->valid_paths.next, &s->valid_paths);
-
 	list_for_each_entry(pi, &s->valid_paths, list) {
 		if (!best ||
 		    (atomic_read(&pi->qlen) < atomic_read(&best->qlen)))
@@ -210,6 +207,9 @@  static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
 	if (!best)
 		goto out;
 
+	/* Move most recently used to least preferred to evenly balance. */
+	list_move_tail(&best->list, &s->valid_paths);
+
 	ret = best->path;
 out:
 	spin_unlock_irqrestore(&s->lock, flags);
diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-time.c
index 7b8642045c55..f006a9005593 100644
--- a/drivers/md/dm-service-time.c
+++ b/drivers/md/dm-service-time.c
@@ -282,9 +282,6 @@  static struct dm_path *st_select_path(struct path_selector *ps, size_t nr_bytes)
 	if (list_empty(&s->valid_paths))
 		goto out;
 
-	/* Change preferred (first in list) path to evenly balance. */
-	list_move_tail(s->valid_paths.next, &s->valid_paths);
-
 	list_for_each_entry(pi, &s->valid_paths, list)
 		if (!best || (st_compare_load(pi, best, nr_bytes) < 0))
 			best = pi;
@@ -292,6 +289,9 @@  static struct dm_path *st_select_path(struct path_selector *ps, size_t nr_bytes)
 	if (!best)
 		goto out;
 
+	/* Move most recently used to least preferred to evenly balance. */
+	list_move_tail(&best->list, &s->valid_paths);
+
 	ret = best->path;
 out:
 	spin_unlock_irqrestore(&s->lock, flags);