
[4/4] libmultipath: trigger uevents for partitions, too

Message ID 20190709073909.32112-5-martin.wilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Series multipath-patches for 0.8.2

Commit Message

Martin Wilck July 9, 2019, 7:40 a.m. UTC
From: Martin Wilck <mwilck@suse.com>

We have added code to re-trigger uevents in various cases where
the status of a device changes from multipath to non-multipath,
or vice-versa. When multipathd triggers uevents for paths of a map
because the status of the map has changed, it needs to trigger
events for the partitions of the path devices, too. The kernel
doesn't do this automatically.

Fixes: c5023200 libmultipath: indicate wwid failure in dm_addmap_create()
Fixes: e5d3c3a0 libmultipath: trigger change uevent on new device creation
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/configure.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

Comments

Roger Heflin July 9, 2019, 4:40 p.m. UTC | #1
We have observed the following behavior:

When the host boots up, a uuid symbolic link is created pointing at
/dev/sda1 (device for /boot)

Multipath starts up and creates a multipath device to manage /dev/sda,
and a udev rule deletes /dev/sda1, invalidating the symbolic link.

The symbolic link does not appear to get re-created to point to the
new multipath device, which leads one to suspect that no event is
emitted when the multipath device is created.

When /boot attempts to mount by uuid, it fails because the symbolic
link still points at /dev/sda1, which no longer exists in /dev.

Right now our workaround is to manually blacklist the device that
/boot is on.

Would this fix for triggering events on the multipath device partitions
be expected to correct the above issue?
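
If I am reading the patch right, it roughly amounts to writing the
action into each partition's uevent file, i.e. something like this for
each partition of the path device:

  echo change > /sys/class/block/sda1/uevent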

If so, I will submit a ticket to our OS vendor to have it backported
into the versions that are affected by the issue.

Thanks.

On Tue, Jul 9, 2019 at 2:42 AM Martin Wilck <Martin.Wilck@suse.com> wrote:
>
> From: Martin Wilck <mwilck@suse.com>
>
> We have added code to re-trigger uevents in various cases where
> the status of a device changes from multipath to non-multipath,
> or vice-versa. When multipathd triggers uevents for paths of a map
> because the status of the map has changed, it needs to trigger
> events for the partitions of the path devices, too. The kernel
> doesn't do this automatically.
>
> Fixes: c5023200 libmultipath: indicate wwid failure in dm_addmap_create()
> Fixes: e5d3c3a0 libmultipath: trigger change uevent on new device creation
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/configure.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/libmultipath/configure.c b/libmultipath/configure.c
> index b09ef529..4cdf1363 100644
> --- a/libmultipath/configure.c
> +++ b/libmultipath/configure.c
> @@ -519,6 +519,42 @@ trigger_udev_change(const struct multipath *mpp)
>         udev_device_unref(udd);
>  }
>
> +static void trigger_partitions_udev_change(struct udev_device *dev,
> +                                          const char *action, int len)
> +{
> +       struct udev_enumerate *part_enum;
> +       struct udev_list_entry *item;
> +
> +       part_enum = udev_enumerate_new(udev);
> +       if (!part_enum)
> +               return;
> +
> +       if (udev_enumerate_add_match_parent(part_enum, dev) < 0 ||
> +           udev_enumerate_add_match_subsystem(part_enum, "block") < 0 ||
> +           udev_enumerate_scan_devices(part_enum) < 0)
> +               goto unref;
> +
> +       udev_list_entry_foreach(item,
> +                               udev_enumerate_get_list_entry(part_enum)) {
> +               const char *syspath;
> +               struct udev_device *part;
> +
> +               syspath = udev_list_entry_get_name(item);
> +               part = udev_device_new_from_syspath(udev, syspath);
> +               if (!part)
> +                       continue;
> +
> +               if (!strcmp("partition", udev_device_get_devtype(part))) {
> +                       condlog(4, "%s: triggering %s event for %s", __func__,
> +                               action, syspath);
> +                       sysfs_attr_set_value(part, "uevent", action, len);
> +               }
> +               udev_device_unref(part);
> +       }
> +unref:
> +       udev_enumerate_unref(part_enum);
> +}
> +
>  void
>  trigger_paths_udev_change(struct multipath *mpp, bool is_mpath)
>  {
> @@ -569,6 +605,8 @@ trigger_paths_udev_change(struct multipath *mpp, bool is_mpath)
>                                 action, pp->dev, is_mpath ? "" : "no ");
>                         sysfs_attr_set_value(pp->udev, "uevent",
>                                              action, strlen(action));
> +                       trigger_partitions_udev_change(pp->udev, action,
> +                                                      strlen(action));
>                 }
>         }
>
> --
> 2.22.0
>
>

Martin Wilck July 10, 2019, 10:41 a.m. UTC | #2
On Tue, 2019-07-09 at 11:40 -0500, Roger Heflin wrote:
> We have an observed behavior as follows:
> 
> When the host boots up, a uuid symbolic link is created pointing at
> /dev/sda1 (device for /boot)
> 
> Multipath starts up and creates an multipath device to manage
> /dev/sda
> and a udev rule deletes /dev/sda1 invalidating the symbolic link.

I suppose you are talking about 68-del-part-nodes.rules. Note that the
rules in this file can be deactivated by setting
ENV{DONT_DEL_PART_NODES}="1" in an early udev rule.
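
Something along these lines should work (the file name is just an
example; it only needs to sort before 68-del-part-nodes.rules):

  # /etc/udev/rules.d/60-keep-part-nodes.rules
  SUBSYSTEM=="block", ENV{DONT_DEL_PART_NODES}="1"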

Also, the rule only removes partitions for devices that have been
detected as being eligible for multipathing.

> The symbolic link does not appear to get re-created to point to the
> new multipath device which would lead one to suspect that there was
> no
> event happening for when the multipath device is created.

That's very unlikely. You should verify that the multipath device (for
sda) is created. My patch here relates only to the case where creating
the multipath device *fails*.

If the creation of the multipath device succeeded, my patch would have
no effect on your system. In that case you should also verify that the
kpartx devices on top of it were created. I advise booting with
"udev.log-priority=debug" and checking what udev is actually doing.

> Right now our solution is a manual solution of blacklisting the
> device
> with /boot on it.
>
> Would this fix triggering an event on the multipath device partitions
> be expected to correct the above issue?

Maybe. I don't know enough details about your configuration to tell.
But if this is a device that should not be multipathed, from my point
of view, proper blacklisting via multipath.conf is the recommended way
to handle this problem. 

You can also use "find_multipaths"; please check the documentation.
Note also that since 0.7.8, blacklisting "by protocol" is supported,
which allows you to blacklist e.g. local SATA disks with a single
statement.
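
For example, something like this (see multipath.conf(5) for the exact
protocol strings your version supports; "scsi:ata" is just an example):

  blacklist {
          protocol "scsi:ata"
  }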

Regards,
Martin


Roger Heflin July 10, 2019, 5:42 p.m. UTC | #3
On Wed, Jul 10, 2019 at 5:49 AM Martin Wilck <Martin.Wilck@suse.com> wrote:
>
> On Tue, 2019-07-09 at 11:40 -0500, Roger Heflin wrote:
> > We have an observed behavior as follows:
> >
> > When the host boots up, a uuid symbolic link is created pointing at
> > /dev/sda1 (device for /boot)
> >
> > Multipath starts up and creates an multipath device to manage
> > /dev/sda
> > and a udev rule deletes /dev/sda1 invalidating the symbolic link.
>
> I suppose you are talking about 68-del-part-nodes.rules. Note that the
> rules in this file can be deactivated by setting
> ENV{DONT_DEL_PART_NODES}="1" in an early udev rule.

The OS we have uses 62-multipath and does not have a global override like that.

I am looking at my notes on the issue, and it was this:
rootvg gets started directly on /dev/sda2, and then multipath starts up,
attempts to manage the device, and deletes the partition /dev/sda1,
making the by-uuid link invalid; multipath then fails to create the
map with "map in use", because the lv's for rootvg are live on
/dev/sda2 directly. So it does sound like your fix would correct this
issue, since on multipath's failure to manage the device it would
recreate the /dev/sda1 device. There appears to be some race condition
in the initramfs/systemd where sometimes rootvg gets started before
multipath has managed the device, causing the partition to be deleted
(we have multipath in the initramfs, and that was confirmed). All of
our other vg's don't have this issue, as we are using rd.lvm.vg= such
that only the rootvg gets turned on early.

>
> Also, the rule only removes partitions for devices that have been
> detected as being eligible for multipathing.
>
> > The symbolic link does not appear to get re-created to point to the
> > new multipath device which would lead one to suspect that there was
> > no
> > event happening for when the multipath device is created.
>
> That's very unlikely. You should verify that the multipath device (for
> sda) is created. My patch here relates only to the case where creating
> the multipath device *fails*.
>
?
>
> Maybe. I don't know enough details about your configuration to tell.
> But if this is a device that should not be multipathed, from my point
> of view, proper blacklisting via multipath.conf is the recommended way
> to handle this problem.
>
> You can also use "find_multipaths"; please check the documentation.
> Note also that since 0.7.8, blacklisting "by protocol" is possible,
> which makes it possible e.g. to blacklist local SATA disks with a
> simple statement.
>
We intentionally have find_multipaths set to allow a single path. The
issue affects a number of VMs, and using multipath for everything
allows us to avoid separate work instructions/scripts for VMs vs.
physical hosts. It also lets us use multipath's I/O retries to work
around short-term external vmhost and storage issues without having to
identify which nodes were affected and reboot them all (they just pause
and continue once the issue is fixed). It is a very large environment,
things happen in different sections of it, and we have been tweaking
various configuration settings to get less trouble and more stability
when they do. The environment has >5000 Linux VMs and >5000 physical
Linux hosts.

> Martin
>

Martin Wilck July 10, 2019, 8:48 p.m. UTC | #4
On Wed, 2019-07-10 at 12:42 -0500, Roger Heflin wrote:
> 
> The OS we have uses 62-multipath and does not have a global override
> like that.
> 
> I am looking at my notes on the issue and it was this:
> rootvg gets started directly on /dev/sda2 and then multipath starts
> up
> and attempts to mange it and deletes the partition on /dev/sda1
> causing the by-uuid link to be invalid but multipath fails to create
> the device with "map in use" because the lv's for rootvg are live on
> /dev/sda2 directly.   So it does sound like your fix would would
> correct this issue since on the multipath failure to manage it would
> recreate the /dev/sda1 device.  

Yes, it's certainly worth a try. This is the case the patch was made
for. You must make sure that your distro contains the commits it builds
upon (c5023200, e5d3c3a0). All this is fairly recent stuff.
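
If your vendor tree has been rebased, grepping for the subject lines is
more reliable than the hashes, e.g.:

  git log --oneline libmultipath/ | grep -E 'indicate wwid failure|trigger change uevent'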

> There appears to be some race
> condition in the initramfs/systemd where sometimes rootvg gets
> started
> before multipath has managed the device causing the partition to be
> deleted (we have multipath is the initramfs, and that was confirmed).

That should never happen. The multipath udev rules set SYSTEMD_READY=0
on the devices, so LVM should refrain from grabbing them. Maybe
multipath_component_detection is off in your LVM2 setup?
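
I.e., in /etc/lvm/lvm.conf, you would want something like:

  devices {
          multipath_component_detection = 1
  }

(1 should be the upstream default, as far as I know.)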

> All of our other vg's dont have this issue as we are using the
> rd.lvm.vg= such that only the rootvg gets turned on early.
> 
> > Also, the rule only removes partitions for devices that have been
> > detected as being eligible for multipathing.
> > 
> > > The symbolic link does not appear to get re-created to point to
> > > the
> > > new multipath device which would lead one to suspect that there
> > > was
> > > no
> > > event happening for when the multipath device is created.
> > 
> > That's very unlikely. You should verify that the multipath device
> > (for
> > sda) is created. My patch here relates only to the case where
> > creating
> > the multipath device *fails*.
> > 
> ?
> > Maybe. I don't know enough details about your configuration to
> > tell.
> > But if this is a device that should not be multipathed, from my
> > point
> > of view, proper blacklisting via multipath.conf is the recommended
> > way
> > to handle this problem.
> > 
> > You can also use "find_multipaths"; please check the documentation.
> > Note also that since 0.7.8, blacklisting "by protocol" is possible,
> > which makes it possible e.g. to blacklist local SATA disks with a
> > simple statement.
> > 
> We intentionally have find_multipaths set to allow a single path. 

But the map which is causing the problem does have multiple paths, if I
understand correctly? Not all past versions of multipath-tools behaved
consistently with "find_multipaths yes"; if you have such a version,
that might even explain the race you were talking about. But this is
speculation. The behavior depends on which commits exactly are in your
code base.

Anyway, I can't say much about it without knowing all the details. It
seems that you should contact your distro vendor.

Martin



Patch

diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index b09ef529..4cdf1363 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -519,6 +519,42 @@  trigger_udev_change(const struct multipath *mpp)
 	udev_device_unref(udd);
 }
 
+static void trigger_partitions_udev_change(struct udev_device *dev,
+					   const char *action, int len)
+{
+	struct udev_enumerate *part_enum;
+	struct udev_list_entry *item;
+
+	part_enum = udev_enumerate_new(udev);
+	if (!part_enum)
+		return;
+
+	if (udev_enumerate_add_match_parent(part_enum, dev) < 0 ||
+	    udev_enumerate_add_match_subsystem(part_enum, "block") < 0 ||
+	    udev_enumerate_scan_devices(part_enum) < 0)
+		goto unref;
+
+	udev_list_entry_foreach(item,
+				udev_enumerate_get_list_entry(part_enum)) {
+		const char *syspath;
+		struct udev_device *part;
+
+		syspath = udev_list_entry_get_name(item);
+		part = udev_device_new_from_syspath(udev, syspath);
+		if (!part)
+			continue;
+
+		if (!strcmp("partition", udev_device_get_devtype(part))) {
+			condlog(4, "%s: triggering %s event for %s", __func__,
+				action, syspath);
+			sysfs_attr_set_value(part, "uevent", action, len);
+		}
+		udev_device_unref(part);
+	}
+unref:
+	udev_enumerate_unref(part_enum);
+}
+
 void
 trigger_paths_udev_change(struct multipath *mpp, bool is_mpath)
 {
@@ -569,6 +605,8 @@  trigger_paths_udev_change(struct multipath *mpp, bool is_mpath)
 				action, pp->dev, is_mpath ? "" : "no ");
 			sysfs_attr_set_value(pp->udev, "uevent",
 					     action, strlen(action));
+			trigger_partitions_udev_change(pp->udev, action,
+						       strlen(action));
 		}
 	}