diff mbox series

[17/21] libmultipath: pathinfo: don't blank wwid if checker fails

Message ID 20181011222707.3631-18-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show
Series libmultipath: checkers overhaul | expand

Commit Message

Martin Wilck Oct. 11, 2018, 10:27 p.m. UTC
Blanking a WWID is a dangerous operation. E.g. configure() would
consider the path in question as invalid and orphan it if the
WWID is blank. Don't do this for possibly transient checker failures.
Moreover, we try to determine WWID even if path_offline returns
PATH_DOWN in the first place, so why should we not if the checker
has a problem?

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/discovery.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Benjamin Marzinski Oct. 25, 2018, 9:50 p.m. UTC | #1
On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote:
> Blanking a WWID is a dangerous operation. E.g. configure() would
> consider the path in question as invalid and orphan it if the
> WWID is blank. Don't do this for possibly transient checker failures.
> Moreover, we try to determine WWID even if path_offline returns
> PATH_DOWN in the first place, so why should we not if the checker
> has a problem?
> 

Is PATH_WILD a transient error?  Clearly we have cases where
PATH_UNCHECKED happens because of a transient error. What's the case
that you are worried about where we temporarily get a PATH_WILD, but we
think we should continue will setting up the path?

> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/discovery.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 3550c3a7..467ece7a 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -1944,9 +1944,6 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
>  		if (path_state == PATH_UP) {
>  			pp->chkrstate = pp->state = get_state(pp, conf, 0,
>  							      path_state);
> -			if (pp->state == PATH_UNCHECKED ||
> -			    pp->state == PATH_WILD)
> -				goto blank;
>  			if (pp->state == PATH_TIMEOUT)
>  				pp->state = PATH_DOWN;
>  			if (pp->state == PATH_UP && !pp->size) {
> -- 
> 2.19.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Oct. 26, 2018, 9:16 a.m. UTC | #2
On Thu, 2018-10-25 at 16:50 -0500,  Benjamin Marzinski  wrote:
> On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote:
> > Blanking a WWID is a dangerous operation. E.g. configure() would
> > consider the path in question as invalid and orphan it if the
> > WWID is blank. Don't do this for possibly transient checker
> > failures.
> > Moreover, we try to determine WWID even if path_offline returns
> > PATH_DOWN in the first place, so why should we not if the checker
> > has a problem?
> > 
> 
> Is PATH_WILD a transient error?  Clearly we have cases where
> PATH_UNCHECKED happens because of a transient error. What's the case
> that you are worried about where we temporarily get a PATH_WILD, but
> we
> think we should continue will setting up the path?

My mentioning of a "transient" error condition is misleading. 

But do I think that the patch is correct. In particular with the
previous changes in the series, PATH_WILD would mean that the checker
couldn't do it's job and simply has no information whether or not the
path is good. IMO it is wrong not to try to obtain a WWID, or even
blank an already retrieved WWID, in this case.

Do you disagree?

Regards,
Martin
Benjamin Marzinski Oct. 26, 2018, 3:11 p.m. UTC | #3
On Fri, Oct 26, 2018 at 11:16:23AM +0200, Martin Wilck wrote:
> On Thu, 2018-10-25 at 16:50 -0500,  Benjamin Marzinski  wrote:
> > On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote:
> > > Blanking a WWID is a dangerous operation. E.g. configure() would
> > > consider the path in question as invalid and orphan it if the
> > > WWID is blank. Don't do this for possibly transient checker
> > > failures.
> > > Moreover, we try to determine WWID even if path_offline returns
> > > PATH_DOWN in the first place, so why should we not if the checker
> > > has a problem?
> > > 
> > 
> > Is PATH_WILD a transient error?  Clearly we have cases where
> > PATH_UNCHECKED happens because of a transient error. What's the case
> > that you are worried about where we temporarily get a PATH_WILD, but
> > we
> > think we should continue will setting up the path?
> 
> My mentioning of a "transient" error condition is misleading. 
> 
> But do I think that the patch is correct. In particular with the
> previous changes in the series, PATH_WILD would mean that the checker
> couldn't do it's job and simply has no information whether or not the
> path is good. IMO it is wrong not to try to obtain a WWID, or even
> blank an already retrieved WWID, in this case.
> 
> Do you disagree?

I guess the question is "Should we create a multipath device if we think
that it's likely misconfigured?" If there really are multiple paths,
then it makes sense that we should create the multipath device.  On the
other hand, if the checker is misconfigured and a path fails, multipathd
may not be able to restore it, and users might not find that out until
they actually have the path fail, if they aren't paying attention.

I guess I don't really have any strong feelings against your approach.

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

> 
> Regards,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 3550c3a7..467ece7a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1944,9 +1944,6 @@  int pathinfo(struct path *pp, struct config *conf, int mask)
 		if (path_state == PATH_UP) {
 			pp->chkrstate = pp->state = get_state(pp, conf, 0,
 							      path_state);
-			if (pp->state == PATH_UNCHECKED ||
-			    pp->state == PATH_WILD)
-				goto blank;
 			if (pp->state == PATH_TIMEOUT)
 				pp->state = PATH_DOWN;
 			if (pp->state == PATH_UP && !pp->size) {