[20/31] 11-dm-mpath.rules: don't set READY->ACTIVATION

Message ID 20170902223900.7339-21-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui

Commit Message

Martin Wilck Sept. 2, 2017, 10:38 p.m. UTC
The fact alone that a map changes from not ready to ready does
not imply that it is activating.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipath/11-dm-mpath.rules | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Benjamin Marzinski Sept. 13, 2017, 9:19 p.m. UTC | #1
On Sun, Sep 03, 2017 at 12:38:49AM +0200, Martin Wilck wrote:
> The fact alone that a map changes from not ready to ready does
> not imply that it is activating.

NAK on this one and
[PATCH 21/31] 11-dm-mpath.rules: Remember DM_ACTIVATION

This breaks things. It's often the case that there are devices stacked
on top of multipath.  When multipath loses all of its paths, if it
doesn't have queue_if_no_path set, it can fail IO up to these higher
devices, forcing them to react to the failure.  There is no way that I
know of to check in udev whether this has occurred. The only safe
course of action is to assume the worst and notify lvm that things are
better now.
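
For context, "notify lvm" here means leaving DM_ACTIVATION set on the
uevent so that LVM's rules treat the map as newly activated and rescan
it. Very roughly, the idea on the LVM side is something like the
following (an illustrative paraphrase, not the literal text of
69-dm-lvm-metad.rules; the pvscan invocation is an assumption):

# Illustrative only: on DM_ACTIVATION=1, re-read LVM metadata from the
# device so a previously missing PV becomes visible again.
ENV{DM_ACTIVATION}=="1", ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="1", \
	RUN+="/usr/sbin/lvm pvscan --cache --activate ay $env{DEVNAME}"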

A simple reproducer of the problem is something like:

- create 2 multipath devices (call them mpatha and mpathb)
- set up raid1 on top of them

# pvcreate /dev/mapper/mpatha
# pvcreate /dev/mapper/mpathb
# vgcreate test /dev/mapper/mpathb /dev/mapper/mpatha
# lvcreate --type raid1 -m 1 -L 1G -n raid test

- verify that everything is o.k.

# dd if=/dev/zero of=/dev/test/raid bs=1M count=100
# multipath -ll
# pvs
# vgs
# lvs

- remove all of the path devices from mpatha (one way to do this is sketched after the command listings below)
- verify that the raid device is in partial mode

# dd if=/dev/zero of=/dev/test/raid bs=1M count=100 
# multipath -ll
# pvs
# vgs
# lvs

- add the path devices back to the system
- check if the raid device has recovered

# multipath -ll
# pvs
# vgs
# lvs
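
One hypothetical way to perform the "remove the paths / add them back"
steps above, assuming mpatha's two paths are plain SCSI disks (the
device names sdb/sdc and the host numbers are placeholders; adjust them
to the actual setup):

- drop both path devices of mpatha

# echo 1 > /sys/block/sdb/device/delete
# echo 1 > /sys/block/sdc/device/delete

- later, rescan the SCSI hosts to bring the paths back

# echo "- - -" > /sys/class/scsi_host/host2/scan
# echo "- - -" > /sys/class/scsi_host/host3/scan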

With these patches, the raid device remains in partial mode, and the
mpatha pv is still listed as unknown. If I revert these two patches,
everything works fine again.

-Ben

> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipath/11-dm-mpath.rules | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
> index 0be22ae4..3f47744f 100644
> --- a/multipath/11-dm-mpath.rules
> +++ b/multipath/11-dm-mpath.rules
> @@ -64,8 +64,7 @@ ENV{MPATH_DEVICE_READY}=="0", ENV{.MPATH_DEVICE_READY_OLD}=="1",\
>  ENV{MPATH_DEVICE_READY}=="0", ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="1"
>  ENV{MPATH_DEVICE_READY}!="0", ENV{.MPATH_DEVICE_READY_OLD}=="0",\
>  	ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}",\
> -	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}="",\
> -	ENV{DM_ACTIVATION}="1"
> +	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=""
>  
>  LABEL="scan_import"
>  ENV{DM_NOSCAN}!="1", GOTO="mpath_end"
> -- 
> 2.14.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Sept. 13, 2017, 9:33 p.m. UTC | #2
On Wed, 2017-09-13 at 16:19 -0500, Benjamin Marzinski wrote:
> On Sun, Sep 03, 2017 at 12:38:49AM +0200, Martin Wilck wrote:
> > The fact alone that a map changes from not ready to ready does
> > not imply that it is activating.
> 
> NAK on this one and
> [PATCH 21/31] 11-dm-mpath.rules: Remember DM_ACTIVATION
> 
> This breaks things. It's often the case that there are devices stacked
> on top of multipath.  When multipath loses all of its paths, if it
> doesn't have queue_if_no_path set, it can fail IO up to these higher
> devices, forcing them to react to the failure.  There is no way that I
> know of to check in udev whether this has occurred. The only safe
> course of action is to assume the worst and notify lvm that things are
> better now.

You are right, thanks. I didn't consider the case without queueing.
I see no problem with just skipping these two.

Best,
Martin

> 
> A simple reproducer of the problem is something like:
> 
> - create 2 multipath devices (call them mpatha and mpathb)
> - set up raid1 on top of them
> 
> # pvcreate /dev/mapper/mpatha
> # pvcreate /dev/mapper/mpathb
> # vgcreate test /dev/mapper/mpathb /dev/mapper/mpatha
> # lvcreate --type raid1 -m 1 -L 1G -n raid test
> 
> - verify that everything is o.k.
> 
> # dd if=/dev/zero of=/dev/test/raid bs=1M count=100
> # multipath -ll
> # pvs
> # vgs
> # lvs
> 
> - remove all of the path devices from mpatha
> - verify that the raid device is in partial mode
> 
> # dd if=/dev/zero of=/dev/test/raid bs=1M count=100 
> # multipath -ll
> # pvs
> # vgs
> # lvs
> 
> - add the path devices back to the system
> - check if the raid device has recovered
> 
> # multipath -ll
> # pvs
> # vgs
> # lvs
> 
> With these patches, the raid device remains in partial mode, and the
> mpatha pv is still listed as unknown. If I revert these two patches,
> everything works fine again.
> 
> -Ben
> 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  multipath/11-dm-mpath.rules | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
> > index 0be22ae4..3f47744f 100644
> > --- a/multipath/11-dm-mpath.rules
> > +++ b/multipath/11-dm-mpath.rules
> > @@ -64,8 +64,7 @@ ENV{MPATH_DEVICE_READY}=="0", ENV{.MPATH_DEVICE_READY_OLD}=="1",\
> >  ENV{MPATH_DEVICE_READY}=="0", ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="1"
> >  ENV{MPATH_DEVICE_READY}!="0", ENV{.MPATH_DEVICE_READY_OLD}=="0",\
> >  	ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}",\
> > -	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}="",\
> > -	ENV{DM_ACTIVATION}="1"
> > +	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=""
> >  
> >  LABEL="scan_import"
> >  ENV{DM_NOSCAN}!="1", GOTO="mpath_end"
> > -- 
> > 2.14.0
> 
>
Martin Wilck Sept. 14, 2017, 12:48 p.m. UTC | #3
On Wed, 2017-09-13 at 16:19 -0500, Benjamin Marzinski wrote:
> On Sun, Sep 03, 2017 at 12:38:49AM +0200, Martin Wilck wrote:
> > The fact alone that a map changes from not ready to ready does
> > not imply that it is activating.
> 
> NAK on this one and
> [PATCH 21/31] 11-dm-mpath.rules: Remember DM_ACTIVATION
> 
> This breaks things. It's often the case that there are devices stacked
> on top of multipath.  When multipath loses all of its paths, if it
> doesn't have queue_if_no_path set, it can fail IO up to these higher
> devices, forcing them to react to the failure.

Thinking about this again, most of the stuff we do in the
multipath-related udev rules is for the queueing case. Without
"queue_if_no_path", all that checking whether or not the map will
sustain IO is useless; it would be better to simply pass the device on
to upper layers, which will try to probe it and fail, which is the
intended behavior in a non-queueing setup. Perhaps we should check
"queue_if_no_path" early on and, if it is not set, skip most of the
stuff we otherwise do for the queueing case?
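
To illustrate, such an early check might look roughly like this in
11-dm-mpath.rules (an untested sketch: the dmsetup call, its path, and
jumping to "scan_import" are assumptions, and whether invoking dmsetup
from this rule is safe at all would need to be verified):

# Untested sketch: if the live multipath table does not contain
# queue_if_no_path, skip the ready/not-ready bookkeeping and let upper
# layers probe the device as usual.
PROGRAM=="/sbin/dmsetup table -j %M -m %m", RESULT=="*queue_if_no_path*", \
	ENV{.MPATH_QUEUEING}="1"
ENV{.MPATH_QUEUEING}!="1", GOTO="scan_import"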

If we did this, maybe even my patches 20/21 might be worth
reconsideration for the "queue_if_no_path" case only?

Cheers,
Martin

Patch

diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
index 0be22ae4..3f47744f 100644
--- a/multipath/11-dm-mpath.rules
+++ b/multipath/11-dm-mpath.rules
@@ -64,8 +64,7 @@  ENV{MPATH_DEVICE_READY}=="0", ENV{.MPATH_DEVICE_READY_OLD}=="1",\
 ENV{MPATH_DEVICE_READY}=="0", ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="1"
 ENV{MPATH_DEVICE_READY}!="0", ENV{.MPATH_DEVICE_READY_OLD}=="0",\
 	ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}",\
-	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}="",\
-	ENV{DM_ACTIVATION}="1"
+	ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=""
 
 LABEL="scan_import"
 ENV{DM_NOSCAN}!="1", GOTO="mpath_end"