[1/2] libmultipath: hwhandler auto-detection for ALUA

Message ID: 20180327215053.3631-2-mwilck@suse.com
State: Not Applicable, archived
Delegated to: Christophe Varoqui

Commit Message

Martin Wilck March 27, 2018, 9:50 p.m. UTC
If the hardware handler isn't explicitly set, infer ALUA support
from the pp->tpgs attribute. Likewise, if ALUA is selected, but
not supported by the hardware, fall back to no hardware handler.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/propsel.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Comments

Benjamin Marzinski April 3, 2018, 8:31 p.m. UTC | #1
On Tue, Mar 27, 2018 at 11:50:52PM +0200, Martin Wilck wrote:
> If the hardware handler isn't explicitly set, infer ALUA support
> from the pp->tpgs attribute. Likewise, if ALUA is selected, but
> not supported by the hardware, fall back to no hardware handler.

Weren't you worried before about temporary ALUA failures? If you had a
temporary failure while configuring a device that you explicitly set to
be ALUA, then this would cause the device to be misconfigured? If the
hardware handler isn't set, inferring ALUA is fine. But what is the case
where we want to say that a device that is explicitly set to ALUA
shouldn't actually be ALUA?  It seems like if there is some uncertainty,
we should just not set the hardware handler, and allow multipath to
infer it via the pp->tpgs value.

I'm not strongly against this patch. I just don't see the value in
overriding an explicit configuration, if we believe that temporary
failures are possible.

-Ben


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck April 3, 2018, 8:53 p.m. UTC | #2
On Tue, 2018-04-03 at 15:31 -0500, Benjamin Marzinski wrote:
> On Tue, Mar 27, 2018 at 11:50:52PM +0200, Martin Wilck wrote:
> > If the hardware handler isn't explicitly set, infer ALUA support
> > from the pp->tpgs attribute. Likewise, if ALUA is selected, but
> > not supported by the hardware, fall back to no hardware handler.
> 
> Weren't you worried before about temporary ALUA failures? If you had
> a
> temporary failure while configuring a device that you explicitly set
> to
> be ALUA, then this would cause the device to be misconfigured? 

I believe that if TPGS is 0, the device will never be able to support
ALUA. The kernel also looks at the TPGS bits and won't try ALUA if they
are unset. Once the device is configured and actual ALUA RTPG/STPG
calls are performed, they may fail for a variety of temporary reasons -
I wanted to avoid resetting the prio algorithm to "const" for such
cases. That's my understanding, correct me if I'm wrong.
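
For reference, the TPGS field lives in byte 5, bits 5:4, of the standard
INQUIRY data. Below is a minimal sketch of extracting it. The helper
inquiry_tpgs() and the enum names are made up for illustration; they are
not the libmultipath helpers, and pp->tpgs in multipath-tools may be
derived with additional checks.

```c
#include <stddef.h>

/* TPGS encoding in standard INQUIRY data, byte 5, bits 5:4 (SPC-3). */
enum tpgs_mode {
	TPGS_MODE_NONE     = 0x0,	/* no asymmetric access support */
	TPGS_MODE_IMPLICIT = 0x1,	/* implicit ALUA only */
	TPGS_MODE_EXPLICIT = 0x2,	/* explicit ALUA only */
	TPGS_MODE_BOTH     = 0x3,	/* implicit and explicit ALUA */
};

/*
 * Hypothetical helper: return the TPGS field of a standard INQUIRY
 * response, or -1 if the buffer is missing or too short to contain
 * byte 5.
 */
static int inquiry_tpgs(const unsigned char *inq, size_t len)
{
	if (inq == NULL || len < 6)
		return -1;
	return (inq[5] >> 4) & 0x3;
}
```

A result of TPGS_MODE_NONE is the "TPGS is 0" case: the device does not
claim ALUA support at all.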

> If the
> hardware handler isn't set, inferring ALUA is fine. But what is the
> case
> where we want to say that a device that is explicitly set to ALUA
> shouldn't actually be ALUA?  It seems like if there is some
> uncertainty,
> we should just not set the hardware handler, and allow multipath to
> infer it via the pp->tpgs value.
> 
> I'm not strongly against this patch. I just don't see the value in
> overriding an explicit configuration, if we believe that temporary
> failures are possible.

That would be fine if we didn't have any explicit "hardware_handler
alua" settings in the hardcoded hwtable any more, or at least if we're 
positive that those devices where we have "hardware_handler alua"
really support it.

We can also adopt the philosophy of "detect_prio" and "detect_checker",
add an additional config file option "detect_hwhandler", and look at
tpgs only if the latter is set (which would be the default). Like
detect_prio, users could then enforce their config file settings with
"detect_hwhandler no".

I was hoping we could find a simpler approach, without yet another
rarely-used config option.
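
A rough sketch of how such a switch could gate the TPGS logic is shown
here. Everything in it is hypothetical: "detect_hwhandler" does not
exist as an option, choose_hwhandler() is an invented name, and "0" is
assumed to be the spelling of "no hardware handler".

```c
#include <stdbool.h>
#include <string.h>

#define HWH_NONE "0"		/* assumed value for "no hardware handler" */
#define HWH_ALUA "1 alua"	/* same string the patch uses for alua_name */

/*
 * Hypothetical selection helper.
 *   configured   - handler from config file or hwtable (or HWH_NONE)
 *   from_default - true if no config source set the handler explicitly
 *   all_tpgs     - true if every path reported TPGS > 0
 *   detect       - the proposed "detect_hwhandler" switch
 */
static const char *choose_hwhandler(const char *configured, bool from_default,
				    bool all_tpgs, bool detect)
{
	if (!detect)
		return configured;	/* "detect_hwhandler no": config wins */
	if (all_tpgs && from_default && !strcmp(configured, HWH_NONE))
		return HWH_ALUA;	/* nothing configured: infer ALUA */
	if (!all_tpgs && !strcmp(configured, HWH_ALUA))
		return HWH_NONE;	/* ALUA requested but not supported */
	return configured;
}
```

With detect set to true this reproduces the two branches of the patch;
with detect set to false the configured value is returned untouched.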

Btw, at SUSE we solved our problem with the controller at hand by
simply removing "hardware_handler alua" and "prio alua" from the IBM
IPR entry. If the scsi_dh_alua module is loaded early (default on
SUSE), this results in ALUA hwhandler and sysfs prio being used for IPR
controllers that do support ALUA, and no hwhandler / const prio =
PRIO_UNDEF for those that don't. I'm not sure if that simple solution
suits upstream, because upstream doesn't enforce early loading of
device handler modules.

Regards,
Martin


Benjamin Marzinski April 3, 2018, 9:29 p.m. UTC | #3
On Tue, Apr 03, 2018 at 10:53:29PM +0200, Martin Wilck wrote:
> On Tue, 2018-04-03 at 15:31 -0500, Benjamin Marzinski wrote:
> > On Tue, Mar 27, 2018 at 11:50:52PM +0200, Martin Wilck wrote:
> > > If the hardware handler isn't explicitly set, infer ALUA support
> > > from the pp->tpgs attribute. Likewise, if ALUA is selected, but
> > > not supported by the hardware, fall back to no hardware handler.
> > 
> > Weren't you worried before about temporary ALUA failures? If you had
> > a
> > temporary failure while configuring a device that you explicitly set
> > to
> > be ALUA, then this would cause the device to be misconfigured? 
> 
> I believe that if TPGS is 0, the device will never be able to support
> ALUA. The kernel also looks at the TPGS bits and won't try ALUA if they
> are unset. Once the device is configured and actual ALUA RTPG/STPG
> calls are performed, they may fail for a variety of temporary reasons -
> I wanted to avoid resetting the prio algorithm to "const" for such
> cases. That's my understanding, correct me if I'm wrong.

Devices that were not correctly supporting ALUA returned > 0 for
get_target_port_group_support, so detect_alua actually does all the work
necessary to verify that it can get a priority. Without doing this,
multiple devices that didn't support ALUA were being detected as
supporting ALUA.

> 
> > If the
> > hardware handler isn't set, inferring ALUA is fine. But what is the
> > case
> > where we want to say that a device that is explicitly set to ALUA
> > shouldn't actually be ALUA?  It seems like if there is some
> > uncertainty,
> > we should just not set the hardware handler, and allow multipath to
> > infer it via the pp->tpgs value.
> > 
> > I'm not strongly against this patch. I just don't see the value in
> > overriding an explicit configuration, if we believe that temporary
> > failures are possible.
> 
> That would be fine if we didn't have any explicit "hardware_handler
> alua" settings in the hardcoded hwtable any more, or at least if we're 
> positive that those devices where we have "hardware_handler alua"
> really support it.
> 
> We can also adopt the philosophy of "detect_prio" and "detect_checker",
> add an additional config file option "detect_hwhandler", and look at
> tpgs only if the latter is set (which would be the default). Like
> detect_prio, users could then enforce their config file settings with
> "detect_hwhandler no".
> 
> I was hoping we could find a simpler approach, without yet another
> rarely-used config option.
> 
> Btw, at SUSE we solved our problem with the controller at hand by
> simply removing "hardware_handler alua" and "prio alua" from the IBM
> IPR entry. If the scsi_dh_alua module is loaded early (default on
> SUSE), this results in ALUA hwhandler and sysfs prio being used for IPR
> controllers that do support ALUA, and no hwhandler / const prio =
> PRIO_UNDEF for those that don't. I'm not sure if that simple solution
> suits upstream, because upstream doesn't enforce early loading of
> device handler modules.

By using retain_attached_hwhandler at all, we are implicitly requiring
the scsi_dh_alua module to be loaded before devices with indeterminate
configurations are discovered for them to work correctly, right? For
instance, commit 715c48d93dd00930534ce6a55d0e3705466df5d6 did this for
netapp devices, and that was in 2013. I don't see how this is different.

-Ben

Hannes Reinecke April 4, 2018, 6:38 a.m. UTC | #4
On Tue, 3 Apr 2018 15:31:32 -0500
"Benjamin Marzinski" <bmarzins@redhat.com> wrote:

> On Tue, Mar 27, 2018 at 11:50:52PM +0200, Martin Wilck wrote:
> > If the hardware handler isn't explicitly set, infer ALUA support
> > from the pp->tpgs attribute. Likewise, if ALUA is selected, but
> > not supported by the hardware, fall back to no hardware handler.  
> 
> Weren't you worried before about temporary ALUA failures? If you had a
> temporary failure while configuring a device that you explicitly set
> to be ALUA, then this would cause the device to be misconfigured? If
> the hardware handler isn't set, inferring ALUA is fine. But what is
> the case where we want to say that a device that is explicitly set to
> ALUA shouldn't actually be ALUA?  It seems like if there is some
> uncertainty, we should just not set the hardware handler, and allow
> multipath to infer it via the pp->tpgs value.
> 
> I'm not strongly against this patch. I just don't see the value in
> overriding an explicit configuration, if we believe that temporary
> failures are possible.
> 
We _do_ have a definitive guide, namely the TPGS bit.
If that isn't set it's pretty much pointless to try alua, regardless
what the configuration says.
If it's set but ALUA configuration fails we do have an error.
If it's not set and ALUA configuration fails then it's 'just' a
misconfiguration.
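
Spelled out as a small sketch (the enum and function names are invented
for illustration, nothing like this exists in the tree), the distinction
is:

```c
#include <stdbool.h>

enum alua_attach_verdict {
	ALUA_ATTACH_OK,			/* handler attached fine */
	ALUA_ATTACH_ERROR,		/* device claims TPGS, attach failed */
	ALUA_ATTACH_MISCONFIGURED,	/* device never claimed TPGS */
};

/*
 * Hypothetical classifier: a failed ALUA attach is only a genuine error
 * if the device advertised TPGS in the first place. Otherwise the
 * configuration asked for something the device never promised.
 */
static enum alua_attach_verdict
classify_alua_attach(bool tpgs_set, bool attach_failed)
{
	if (!attach_failed)
		return ALUA_ATTACH_OK;
	return tpgs_set ? ALUA_ATTACH_ERROR : ALUA_ATTACH_MISCONFIGURED;
}
```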

Which is precisely what bit us with the IPR controller; all devices
appear as 'IPR', but the TPGS bit is set only for some configurations.

And as the hardware handler was set to 'ALUA' the hardware handler
always tried to attach, but failed for those devices which did not
support ALUA.

_And_ as we don't have a distinction between 'configuration error' and
'hardware failure' these devices failed to set up, and booting would
stop.

So this patch is just how to handle devices which are configured to use
the ALUA hardware handler, but which do not have the TPGS bit set.
For these devices attaching ALUA _will_ fail, but that's _actually_
expected, as the devices never claimed to support alua.

Hence I'm perfectly fine with this patch.

Cheers,

Hannes


Martin Wilck April 4, 2018, 8:04 a.m. UTC | #5
On Tue, 2018-04-03 at 16:29 -0500, Benjamin Marzinski wrote:
> On Tue, Apr 03, 2018 at 10:53:29PM +0200, Martin Wilck wrote:
> > On Tue, 2018-04-03 at 15:31 -0500, Benjamin Marzinski wrote:
> > > On Tue, Mar 27, 2018 at 11:50:52PM +0200, Martin Wilck wrote:
> > > > If the hardware handler isn't explicitly set, infer ALUA
> > > > support
> > > > from the pp->tpgs attribute. Likewise, if ALUA is selected, but
> > > > not supported by the hardware, fall back to no hardware
> > > > handler.
> > > 
> > > Weren't you worried before about temporary ALUA failures? If you
> > > had
> > > a
> > > temporary failure while configuring a device that you explicitly
> > > set
> > > to
> > > be ALUA, then this would cause the device to be misconfigured? 
> > 
> > > I believe that if TPGS is 0, the device will never be able to
> > support
> > ALUA. The kernel also looks at the TPGS bits and won't try ALUA if
> > they
> > are unset. Once the device is configured and actual ALUA RTPG/STPG
> > calls are performed, they may fail for a variety of temporary
> > reasons -
> > I wanted to avoid resetting the prio algorithm to "const" for such
> > cases. That's my understanding, correct me if I'm wrong.
> 
> > Devices that were not correctly supporting ALUA returned > 0 for
> > get_target_port_group_support, so detect_alua actually does all the
> > work
> > necessary to verify that it can get a priority. Without doing this,
> > multiple devices that didn't support ALUA were being detected as
> > supporting ALUA.

So, detect_alua() tests TPGS *and* tries an actual ALUA call, and sets
pp->tpgs to anything other than TPGS_NONE only if the latter is
successful. That's fine. My patch was looking at pp->tpgs, so it was
implicitly using this logic of detect_alua(). But does that guarantee
that future alua->getprio() calls will never fail at some later point
in time?

Maybe I misunderstood your original proposition. What I'm saying is
that resetting the prio algorithm from "alua" to "const" because of an
error code in get_prio() is wrong, because that error code may be
transient.
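
To illustrate the behaviour I have in mind, here is a minimal sketch in
which a failing priority callback leaves the selected algorithm alone and
only marks the priority as unknown for this round. All names
(sketch_path, sketch_update_prio(), SKETCH_PRIO_UNDEF) are invented for
the example and are not the libmultipath structures or API.

```c
#include <stddef.h>

#define SKETCH_PRIO_UNDEF (-1)	/* stand-in for "priority unknown" */

struct sketch_path {
	const char *prio_name;		/* selected algorithm, e.g. "alua" */
	int priority;			/* last computed priority */
	unsigned int prio_errors;	/* consecutive failures */
};

/* Callback type: returns a priority >= 0, or a negative error code. */
typedef int (*sketch_getprio_fn)(void *ctx);

/*
 * On error, record the failure and report an undefined priority, but do
 * not touch prio_name: the error may be transient, and the next call
 * may succeed with the same algorithm.
 */
static void sketch_update_prio(struct sketch_path *pp,
			       sketch_getprio_fn getprio, void *ctx)
{
	int prio = getprio(ctx);

	if (prio < 0) {
		pp->priority = SKETCH_PRIO_UNDEF;
		pp->prio_errors++;
		return;
	}
	pp->priority = prio;
	pp->prio_errors = 0;
}

/* Stand-in callback for experiments: ctx points at the int to return. */
static int sketch_getprio_from_int(void *ctx)
{
	return *(int *)ctx;
}
```

A caller can still use prio_errors to log or escalate persistent
failures without ever downgrading the algorithm to "const".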

If we give "hardware_handler" config options preference over ALUA
autodetection, and thus enforce hwhandler "1 alua" on such devices that
have no ALUA support, domap() is guaranteed to fail, because the kernel
refuses to set up a map with a given hwhandler if any device doesn't
support that handler.

> By using retain_attached_hwhandler at all, we are implicitly
> requiring
> the scsi_dh_alua module to be loaded before devices with
> indeterminate
> configurations are discovered for them to work correctly, right? For
> instance, commit 715c48d93dd00930534ce6a55d0e3705466df5d6 did this
> for
> netapp devices, and that was in 2013. I don't see how this is
> different.

You're right, we are "implicitly requiring" this sort-of, but we have
no code that enforces the early loading of the device handlers. We
should be shipping a modules-load.d file, or a modprobe.d softdep, or
something similar that would enforce this setting if we _really_ depend
on it. "Implicit requirements" are bad. We should either make the
requirement explicit, or not hard-depend on it. So far I was thinking
the latter. After all, SCSI device-handler support is configurable in
the kernel.

Regards,
Martin
Martin Wilck April 12, 2018, 3:43 p.m. UTC | #6
Hi Ben,

On Wed, 2018-04-04 at 10:04 +0200, Martin Wilck wrote:
> On Tue, 2018-04-03 at 16:29 -0500, Benjamin Marzinski wrote:

> > > I believe that if TPGS is 0, the device will never be able to
> > > support
> > > ALUA. The kernel also looks at the TPGS bits and won't try ALUA
> > > if
> > > they
> > > are unset. Once the device is configured and actual ALUA
> > > RTPG/STPG
> > > calls are performed, they may fail for a variety of temporary
> > > reasons -
> > > I wanted to avoid resetting the prio algorithm to "const" for
> > > such
> > > cases. That's my understanding, correct me if I'm wrong.
> > 
> > > Devices that were not correctly supporting ALUA returned > 0 for
> > > get_target_port_group_support, so detect_alua actually does all the
> > > work
> > > necessary to verify that it can get a priority. Without doing this,
> > > multiple devices that didn't support ALUA were being detected as
> > > supporting ALUA.
> 
> So, detect_alua() tests TPGS *and* tries an actual ALUA call, and
> sets
> pp->tpgs to anything other than TPGS_NONE only if the latter is
> successful. That's fine. My patch was looking at pp->tpgs, so it was
> implicitly using this logic of detect_alua(). But does that guarantee
> that future alua->getprio() calls will never fail at some later point
> in time?
> 
> Maybe I misunderstood your original proposition. What I'm saying is
> that resetting the prio algorithm from "alua" to "const" because of
> an
> error code in get_prio() is wrong, because that error code may be
> transient.
> 
> If we give "hardware_handler" config options preference over ALUA
> autodetection, and thus enforce hwhandler "1 alua" on such devices
> that
> have no ALUA support, domap() is guaranteed to fail, because the
> kernel
> refuses to set up a map with a given hwhandler if any device doesn't
> support that handler.
> 
> > By using retain_attached_hwhandler at all, we are implicitly
> > requiring
> > the scsi_dh_alua module to be loaded before devices with
> > indeterminate
> > configurations are discovered for them to work correctly, right?
> > For
> > instance, commit 715c48d93dd00930534ce6a55d0e3705466df5d6 did this
> > for
> > netapp devices, and that was in 2013. I don't see how this is
> > different.
> 
> You're right, we are "implicitly requiring" this sort-of, but we have
> no code that enforces the early loading of the device handlers. We
> should be shipping a modules-load.d file, or a modprobe.d softdep, or
> something similar that would enforce this setting if we _really_
> depend
> on it. "Implicit requirements" are bad. We should either make the
> requirement explicit, or not hard-depend on it. So far I was thinking
> the latter. After all, SCSI device-handler support is configurable in
> the kernel.

I'm unsure what to do. Do you still reject my patch? Or have you been
convinced by Hannes and my arguments? 
Or are you requesting changes? If yes, what? 

Regards,
Martin
Benjamin Marzinski April 12, 2018, 7:49 p.m. UTC | #7
On Thu, Apr 12, 2018 at 05:43:39PM +0200, Martin Wilck wrote:
> Hi Ben,
> 
> I'm unsure what to do. Do you still reject my patch? Or have you been
> convinced by Hannes and my arguments? 
> Or are you requesting changes? If yes, what? 

I still feel that it's better to make the default config const for
devices that may or may not be ALUA, and let detect_alua figure it out,
rather than allowing multipathd to override a specifically requested
ALUA hardware handler. This is especially true if
get_target_port_group_support() and get_target_port_group() succeed, but
get_asymmetric_access_state() fails in detect_alua().  But I don't think
that transient ALUA errors like this are very likely during multipath
creation, so I'm not going to reject the patch.

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

Patch

diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 93974a482336..dc24450eb775 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -43,10 +43,13 @@  do {									\
 		goto out;						\
 	}								\
 } while(0)
+
+static char default_origin[] = "(setting: multipath internal)";
+
 #define do_default(dest, value)						\
 do {									\
 	dest = value;							\
-	origin = "(setting: multipath internal)";			\
+	origin = default_origin;					\
 } while(0)
 
 #define mp_set_mpe(var)							\
@@ -373,16 +376,20 @@  static int get_dh_state(struct path *pp, char *value, size_t value_len)
 
 int select_hwhandler(struct config *conf, struct multipath *mp)
 {
-	char *origin;
+	const char *origin;
 	struct path *pp;
 	/* dh_state is no longer than "detached" */
 	char handler[12];
+	static char alua_name[] = "1 alua";
+	static const char tpgs_origin[]= "(setting: autodetected from TPGS)";
 	char *dh_state;
 	int i;
+	bool all_tpgs = true;
 
 	dh_state = &handler[2];
 	if (mp->retain_hwhandler != RETAIN_HWHANDLER_OFF) {
 		vector_foreach_slot(mp->paths, pp, i) {
+			all_tpgs = all_tpgs && (pp->tpgs > 0);
 			if (get_dh_state(pp, dh_state, sizeof(handler) - 2) > 0
 			    && strcmp(dh_state, "detached")) {
 				memcpy(handler, "1 ", 2);
@@ -397,6 +404,14 @@  int select_hwhandler(struct config *conf, struct multipath *mp)
 	mp_set_conf(hwhandler);
 	mp_set_default(hwhandler, DEFAULT_HWHANDLER);
 out:
+	if (all_tpgs && !strcmp(mp->hwhandler, DEFAULT_HWHANDLER) &&
+		origin == default_origin) {
+		mp->hwhandler = alua_name;
+		origin = tpgs_origin;
+	} else if (!all_tpgs && !strcmp(mp->hwhandler, alua_name)) {
+		mp->hwhandler = DEFAULT_HWHANDLER;
+		origin = tpgs_origin;
+	}
 	mp->hwhandler = STRDUP(mp->hwhandler);
 	condlog(3, "%s: hardware_handler = \"%s\" %s", mp->alias, mp->hwhandler,
 		origin);