diff mbox series

[4/6] 11-dm-mpath.rules: handle reloads during coldplug events

Message ID 20240205124638.17877-5-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: Mike Snitzer
Headers show
Series multipath-tools: udev rules and service improvements | expand

Commit Message

Martin Wilck Feb. 5, 2024, 12:46 p.m. UTC
If a map reload happens while udev is processing rules for a coldplug
event, DM_SUSPENDED may be set if the respective test in 10-dm.rules
happens while the device is suspened. This will cause the rules for all
higher block device layers to be skipped. Record this situation in an udev
property. The reload operation will trigger another "change" uevent later,
which would normally be treated as a reload, and be ignored without
rescanning the device. If a previous "coldplug while suspended" situation is
detected, perform a full device rescan instead.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipath/11-dm-mpath.rules | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

Comments

Benjamin Marzinski Feb. 9, 2024, 1:03 a.m. UTC | #1
On Mon, Feb 05, 2024 at 01:46:36PM +0100, Martin Wilck wrote:
> If a map reload happens while udev is processing rules for a coldplug
> event, DM_SUSPENDED may be set if the respective test in 10-dm.rules
> happens while the device is suspened. This will cause the rules for all
> higher block device layers to be skipped. Record this situation in an udev
> property. The reload operation will trigger another "change" uevent later,
> which would normally be treated as a reload, and be ignored without
> rescanning the device. If a previous "coldplug while suspended" situation is
> detected, perform a full device rescan instead.

For what it's worth, this seems reasonable. But I do see the issue you
pointed out where we can't trust DM_SUSPENDED.  Perhaps we could track
in multipathd if an add event occured for a path between when we issued
a reload, and when we got the change event (unfortunately, change events
can occur for things other than reloads that multipathd triggers, and
the only way I know to accurately associate a uevent with a reload is by
using dm udev cookies. We have those turned off in multipathd and I
think it would be a good bit of work to turn them back on without
causing lots of waiting with locks held in multipathd). 
 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipath/11-dm-mpath.rules | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
> index 8fc4a6f..2706809 100644
> --- a/multipath/11-dm-mpath.rules
> +++ b/multipath/11-dm-mpath.rules
> @@ -9,8 +9,13 @@ ENV{DM_ACTION}=="PATH_*", GOTO="check_ready"
>  
>  ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG"
>  ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN"
> -GOTO="scan_import"
>  
> +# Coldplug event while device is suspended (e.g. during a reload)
> +ACTION=="add", ENV{DM_ACTIVATION}=="1", ENV{DM_SUSPENDED}=="1", \
> +	PROGRAM="/bin/logger -t 11-dm-mpath.rules -p daemon.warning \"Coldplug event for suspended device\"", \
> +	ENV{DM_COLDPLUG_SUSPENDED}="1"
> +
> +GOTO="scan_import"
>  LABEL="check_ready"
>  
>  IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD"
> @@ -53,6 +58,16 @@ ENV{DM_ACTION}=="PATH_FAILED", GOTO="mpath_action"
>  ENV{MPATH_DEVICE_READY}="1"
>  
>  LABEL="mpath_action"
> +
> +# A previous coldplug event occured while the device was suspended.
> +# Activation might have been partially skipped. Activate the device now,
> +# i.e. disable the MPATH_UNCHANGED logic and set DM_ACTIVATION=1.
> +IMPORT{db}="DM_COLDPLUG_SUSPENDED"
> +ENV{DM_COLDPLUG_SUSPENDED}=="1", ENV{DM_SUSPENDED}!="1", \
> +	ENV{DM_ACTIVATION}="1", ENV{MPATH_UNCHANGED}="0", \
> +	PROGRAM="/bin/logger -t 11-dm-mpath.rules -p daemon.notice \"Forcing activation of previously suspended device\"", \
> +	GOTO="force_activation"
> +
>  # DM_SUBSYSTEM_UDEV_FLAG0 is the "RELOAD" flag for multipath subsystem.
>  # Drop the DM_ACTIVATION flag here as mpath reloads tables if any of its
>  # paths are lost/recovered. For any stack above the mpath device, this is not
> @@ -67,6 +82,8 @@ ENV{DM_SUBSYSTEM_UDEV_FLAG0}=="1", \
>  ENV{DM_ACTION}=="PATH_FAILED|PATH_REINSTATED", \
>  	ENV{DM_ACTIVATION}="0", ENV{MPATH_UNCHANGED}="1"
>  
> +LABEL="force_activation"
> +
>  # Do not initiate scanning if no path is available,
>  # otherwise there would be a hang or IO error on access.
>  # We'd like to avoid this, especially within udev processing.
> @@ -103,6 +120,9 @@ IMPORT{db}="ID_PART_GPT_AUTO_ROOT"
>  
>  LABEL="import_end"
>  
> +# Reset previous DM_COLDPLUG_SUSPENDED if activation happens now
> +ENV{DM_SUSPENDED}!="1", ENV{DM_ACTIVATION}=="1", ENV{DM_COLDPLUG_SUSPENDED}=""
> +
>  # Multipath maps should take precedence over their members.
>  ENV{DM_UDEV_LOW_PRIORITY_FLAG}!="1", OPTIONS+="link_priority=50"
>  
> -- 
> 2.43.0
Martin Wilck Feb. 9, 2024, 3:53 p.m. UTC | #2
On Thu, 2024-02-08 at 20:03 -0500, Benjamin Marzinski wrote:
> On Mon, Feb 05, 2024 at 01:46:36PM +0100, Martin Wilck wrote:
> > If a map reload happens while udev is processing rules for a
> > coldplug
> > event, DM_SUSPENDED may be set if the respective test in 10-
> > dm.rules
> > happens while the device is suspened. This will cause the rules for
> > all
> > higher block device layers to be skipped. Record this situation in
> > an udev
> > property. The reload operation will trigger another "change" uevent
> > later,
> > which would normally be treated as a reload, and be ignored without
> > rescanning the device. If a previous "coldplug while suspended"
> > situation is
> > detected, perform a full device rescan instead.
> 
> For what it's worth, this seems reasonable. 

If that's so, I'd appreciate a Reviewed-by: :-)
See below.

> But I do see the issue you
> pointed out where we can't trust DM_SUSPENDED.

That was my suspicion originally, but the actual problem that was
observed was different, see my recent response to 1/6.

Yes, it can happen any time that a device gets suspended between the
DM_SUSPENDED check and an attempt to do actual I/O in later udev rules,
but that is much less likely than the other case. So far I haven't been
able to actually observe it.

Btw am I understanding correctly that you don't like the idea of going
back to exclusive locking?

>   Perhaps we could track
> in multipathd if an add event occured for a path between when we
> issued
> a reload, and when we got the change event (unfortunately, change
> events
> can occur for things other than reloads that multipathd triggers, and
> the only way I know to accurately associate a uevent with a reload is
> by
> using dm udev cookies. We have those turned off in multipathd and I
> think it would be a good bit of work to turn them back on without
> causing lots of waiting with locks held in multipathd). 

IMO doing this right in multipathd will be hard. We must be careful not
to trigger too many additional uevents just because something *might*
have gone wrong during udev handling (most of the time all is well,
even if a reload happens). I believe to do this properly we must listen
for *kernel* uevents, too, and that's something we've been trying to
avoid for good reason.

I had a different idea: to track a "reload count" for a map somehow
(e.g. in a file under /run/multipath) and checking that at the
beginning and end of uevent handling in order to see if a reload
occurred while the uevent was being processed (which would be detected
by seeing a different the reload count change during the uevent).

For now, I hope that this change plus my latest addition will make the
practical issue go away. All else can be discussed later.
We haven't got any reports about this sort of race for years, so it's
safe to assume that happens very rarely.

Regards
Martin
Benjamin Marzinski Feb. 9, 2024, 4:13 p.m. UTC | #3
On Fri, Feb 09, 2024 at 04:53:50PM +0100, Martin Wilck wrote:
> On Thu, 2024-02-08 at 20:03 -0500, Benjamin Marzinski wrote:
> > On Mon, Feb 05, 2024 at 01:46:36PM +0100, Martin Wilck wrote:
> > > If a map reload happens while udev is processing rules for a
> > > coldplug
> > > event, DM_SUSPENDED may be set if the respective test in 10-
> > > dm.rules
> > > happens while the device is suspened. This will cause the rules for
> > > all
> > > higher block device layers to be skipped. Record this situation in
> > > an udev
> > > property. The reload operation will trigger another "change" uevent
> > > later,
> > > which would normally be treated as a reload, and be ignored without
> > > rescanning the device. If a previous "coldplug while suspended"
> > > situation is
> > > detected, perform a full device rescan instead.
> > 
> > For what it's worth, this seems reasonable. 
> 
> If that's so, I'd appreciate a Reviewed-by: :-)
> See below.

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

> 
> > But I do see the issue you
> > pointed out where we can't trust DM_SUSPENDED.
> 
> That was my suspicion originally, but the actual problem that was
> observed was different, see my recent response to 1/6.
> 
> Yes, it can happen any time that a device gets suspended between the
> DM_SUSPENDED check and an attempt to do actual I/O in later udev rules,
> but that is much less likely than the other case. So far I haven't been
> able to actually observe it.
> 
> Btw am I understanding correctly that you don't like the idea of going
> back to exclusive locking?
> 
> >   Perhaps we could track
> > in multipathd if an add event occured for a path between when we
> > issued
> > a reload, and when we got the change event (unfortunately, change
> > events
> > can occur for things other than reloads that multipathd triggers, and
> > the only way I know to accurately associate a uevent with a reload is
> > by
> > using dm udev cookies. We have those turned off in multipathd and I
> > think it would be a good bit of work to turn them back on without
> > causing lots of waiting with locks held in multipathd). 
> 
> IMO doing this right in multipathd will be hard. We must be careful not
> to trigger too many additional uevents just because something *might*
> have gone wrong during udev handling (most of the time all is well,
> even if a reload happens). I believe to do this properly we must listen
> for *kernel* uevents, too, and that's something we've been trying to
> avoid for good reason.
> 
> I had a different idea: to track a "reload count" for a map somehow
> (e.g. in a file under /run/multipath) and checking that at the
> beginning and end of uevent handling in order to see if a reload
> occurred while the uevent was being processed (which would be detected
> by seeing a different the reload count change during the uevent).

That sounds much easier than messing with udev cookies.

-Ben

> For now, I hope that this change plus my latest addition will make the
> practical issue go away. All else can be discussed later.
> We haven't got any reports about this sort of race for years, so it's
> safe to assume that happens very rarely.
> 
> Regards
> Martin
Benjamin Marzinski Feb. 9, 2024, 4:15 p.m. UTC | #4
On Fri, Feb 09, 2024 at 04:53:50PM +0100, Martin Wilck wrote:
> On Thu, 2024-02-08 at 20:03 -0500, Benjamin Marzinski wrote:
> > On Mon, Feb 05, 2024 at 01:46:36PM +0100, Martin Wilck wrote:
> > > If a map reload happens while udev is processing rules for a
> > > coldplug
> > > event, DM_SUSPENDED may be set if the respective test in 10-
> > > dm.rules
> > > happens while the device is suspened. This will cause the rules for
> > > all
> > > higher block device layers to be skipped. Record this situation in
> > > an udev
> > > property. The reload operation will trigger another "change" uevent
> > > later,
> > > which would normally be treated as a reload, and be ignored without
> > > rescanning the device. If a previous "coldplug while suspended"
> > > situation is
> > > detected, perform a full device rescan instead.
> > 
> > For what it's worth, this seems reasonable. 
> 
> If that's so, I'd appreciate a Reviewed-by: :-)
> See below.
> 
> > But I do see the issue you
> > pointed out where we can't trust DM_SUSPENDED.
> 
> That was my suspicion originally, but the actual problem that was
> observed was different, see my recent response to 1/6.
> 
> Yes, it can happen any time that a device gets suspended between the
> DM_SUSPENDED check and an attempt to do actual I/O in later udev rules,
> but that is much less likely than the other case. So far I haven't been
> able to actually observe it.
> 
> Btw am I understanding correctly that you don't like the idea of going
> back to exclusive locking?

Oh, and yeah, I'd rather avoid exclusive locking if possible, just out
of a hunch that it could lead to other unexpected problems.

-Ben

> 
> >   Perhaps we could track
> > in multipathd if an add event occured for a path between when we
> > issued
> > a reload, and when we got the change event (unfortunately, change
> > events
> > can occur for things other than reloads that multipathd triggers, and
> > the only way I know to accurately associate a uevent with a reload is
> > by
> > using dm udev cookies. We have those turned off in multipathd and I
> > think it would be a good bit of work to turn them back on without
> > causing lots of waiting with locks held in multipathd). 
> 
> IMO doing this right in multipathd will be hard. We must be careful not
> to trigger too many additional uevents just because something *might*
> have gone wrong during udev handling (most of the time all is well,
> even if a reload happens). I believe to do this properly we must listen
> for *kernel* uevents, too, and that's something we've been trying to
> avoid for good reason.
> 
> I had a different idea: to track a "reload count" for a map somehow
> (e.g. in a file under /run/multipath) and checking that at the
> beginning and end of uevent handling in order to see if a reload
> occurred while the uevent was being processed (which would be detected
> by seeing a different the reload count change during the uevent).
> 
> For now, I hope that this change plus my latest addition will make the
> practical issue go away. All else can be discussed later.
> We haven't got any reports about this sort of race for years, so it's
> safe to assume that happens very rarely.
> 
> Regards
> Martin
diff mbox series

Patch

diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
index 8fc4a6f..2706809 100644
--- a/multipath/11-dm-mpath.rules
+++ b/multipath/11-dm-mpath.rules
@@ -9,8 +9,13 @@  ENV{DM_ACTION}=="PATH_*", GOTO="check_ready"
 
 ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG"
 ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN"
-GOTO="scan_import"
 
+# Coldplug event while device is suspended (e.g. during a reload)
+ACTION=="add", ENV{DM_ACTIVATION}=="1", ENV{DM_SUSPENDED}=="1", \
+	PROGRAM="/bin/logger -t 11-dm-mpath.rules -p daemon.warning \"Coldplug event for suspended device\"", \
+	ENV{DM_COLDPLUG_SUSPENDED}="1"
+
+GOTO="scan_import"
 LABEL="check_ready"
 
 IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD"
@@ -53,6 +58,16 @@  ENV{DM_ACTION}=="PATH_FAILED", GOTO="mpath_action"
 ENV{MPATH_DEVICE_READY}="1"
 
 LABEL="mpath_action"
+
+# A previous coldplug event occured while the device was suspended.
+# Activation might have been partially skipped. Activate the device now,
+# i.e. disable the MPATH_UNCHANGED logic and set DM_ACTIVATION=1.
+IMPORT{db}="DM_COLDPLUG_SUSPENDED"
+ENV{DM_COLDPLUG_SUSPENDED}=="1", ENV{DM_SUSPENDED}!="1", \
+	ENV{DM_ACTIVATION}="1", ENV{MPATH_UNCHANGED}="0", \
+	PROGRAM="/bin/logger -t 11-dm-mpath.rules -p daemon.notice \"Forcing activation of previously suspended device\"", \
+	GOTO="force_activation"
+
 # DM_SUBSYSTEM_UDEV_FLAG0 is the "RELOAD" flag for multipath subsystem.
 # Drop the DM_ACTIVATION flag here as mpath reloads tables if any of its
 # paths are lost/recovered. For any stack above the mpath device, this is not
@@ -67,6 +82,8 @@  ENV{DM_SUBSYSTEM_UDEV_FLAG0}=="1", \
 ENV{DM_ACTION}=="PATH_FAILED|PATH_REINSTATED", \
 	ENV{DM_ACTIVATION}="0", ENV{MPATH_UNCHANGED}="1"
 
+LABEL="force_activation"
+
 # Do not initiate scanning if no path is available,
 # otherwise there would be a hang or IO error on access.
 # We'd like to avoid this, especially within udev processing.
@@ -103,6 +120,9 @@  IMPORT{db}="ID_PART_GPT_AUTO_ROOT"
 
 LABEL="import_end"
 
+# Reset previous DM_COLDPLUG_SUSPENDED if activation happens now
+ENV{DM_SUSPENDED}!="1", ENV{DM_ACTIVATION}=="1", ENV{DM_COLDPLUG_SUSPENDED}=""
+
 # Multipath maps should take precedence over their members.
 ENV{DM_UDEV_LOW_PRIORITY_FLAG}!="1", OPTIONS+="link_priority=50"