Message ID | 20181011222707.3631-18-mwilck@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | christophe varoqui |
Headers | show |
Series | libmultipath: checkers overhaul | expand |
On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote: > Blanking a WWID is a dangerous operation. E.g. configure() would > consider the path in question as invalid and orphan it if the > WWID is blank. Don't do this for possibly transient checker failures. > Moreover, we try to determine WWID even if path_offline returns > PATH_DOWN in the first place, so why should we not if the checker > has a problem? > Is PATH_WILD a transient error? Clearly we have cases where PATH_UNCHECKED happens because of a transient error. What's the case that you are worried about where we temporarily get a PATH_WILD, but we think we should continue will setting up the path? > Signed-off-by: Martin Wilck <mwilck@suse.com> > --- > libmultipath/discovery.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c > index 3550c3a7..467ece7a 100644 > --- a/libmultipath/discovery.c > +++ b/libmultipath/discovery.c > @@ -1944,9 +1944,6 @@ int pathinfo(struct path *pp, struct config *conf, int mask) > if (path_state == PATH_UP) { > pp->chkrstate = pp->state = get_state(pp, conf, 0, > path_state); > - if (pp->state == PATH_UNCHECKED || > - pp->state == PATH_WILD) > - goto blank; > if (pp->state == PATH_TIMEOUT) > pp->state = PATH_DOWN; > if (pp->state == PATH_UP && !pp->size) { > -- > 2.19.0 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 2018-10-25 at 16:50 -0500, Benjamin Marzinski wrote: > On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote: > > Blanking a WWID is a dangerous operation. E.g. configure() would > > consider the path in question as invalid and orphan it if the > > WWID is blank. Don't do this for possibly transient checker > > failures. > > Moreover, we try to determine WWID even if path_offline returns > > PATH_DOWN in the first place, so why should we not if the checker > > has a problem? > > > > Is PATH_WILD a transient error? Clearly we have cases where > PATH_UNCHECKED happens because of a transient error. What's the case > that you are worried about where we temporarily get a PATH_WILD, but > we > think we should continue will setting up the path? My mentioning of a "transient" error condition is misleading. But do I think that the patch is correct. In particular with the previous changes in the series, PATH_WILD would mean that the checker couldn't do it's job and simply has no information whether or not the path is good. IMO it is wrong not to try to obtain a WWID, or even blank an already retrieved WWID, in this case. Do you disagree? Regards, Martin
On Fri, Oct 26, 2018 at 11:16:23AM +0200, Martin Wilck wrote: > On Thu, 2018-10-25 at 16:50 -0500, Benjamin Marzinski wrote: > > On Fri, Oct 12, 2018 at 12:27:03AM +0200, Martin Wilck wrote: > > > Blanking a WWID is a dangerous operation. E.g. configure() would > > > consider the path in question as invalid and orphan it if the > > > WWID is blank. Don't do this for possibly transient checker > > > failures. > > > Moreover, we try to determine WWID even if path_offline returns > > > PATH_DOWN in the first place, so why should we not if the checker > > > has a problem? > > > > > > > Is PATH_WILD a transient error? Clearly we have cases where > > PATH_UNCHECKED happens because of a transient error. What's the case > > that you are worried about where we temporarily get a PATH_WILD, but > > we > > think we should continue will setting up the path? > > My mentioning of a "transient" error condition is misleading. > > But do I think that the patch is correct. In particular with the > previous changes in the series, PATH_WILD would mean that the checker > couldn't do it's job and simply has no information whether or not the > path is good. IMO it is wrong not to try to obtain a WWID, or even > blank an already retrieved WWID, in this case. > > Do you disagree? I guess the question is "Should we create a multipath device if we think that it's likely misconfigured?" If there really are multiple paths, then it makes sense that we should create the multipath device. On the other hand, if the checker is misconfigured and a path fails, multipathd may not be able to restore it, and users might not find that out until they actually have the path fail, if they aren't paying attention. I guess I don't really have any strong feelings against your approach. Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > > Regards, > Martin > > -- > Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107 > SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c index 3550c3a7..467ece7a 100644 --- a/libmultipath/discovery.c +++ b/libmultipath/discovery.c @@ -1944,9 +1944,6 @@ int pathinfo(struct path *pp, struct config *conf, int mask) if (path_state == PATH_UP) { pp->chkrstate = pp->state = get_state(pp, conf, 0, path_state); - if (pp->state == PATH_UNCHECKED || - pp->state == PATH_WILD) - goto blank; if (pp->state == PATH_TIMEOUT) pp->state = PATH_DOWN; if (pp->state == PATH_UP && !pp->size) {
Blanking a WWID is a dangerous operation. E.g. configure() would consider the path in question as invalid and orphan it if the WWID is blank. Don't do this for possibly transient checker failures. Moreover, we try to determine WWID even if path_offline returns PATH_DOWN in the first place, so why should we not if the checker has a problem? Signed-off-by: Martin Wilck <mwilck@suse.com> --- libmultipath/discovery.c | 3 --- 1 file changed, 3 deletions(-)