Message ID | 20240205124638.17877-2-mwilck@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Mike Snitzer |
Headers | show |
Series | multipath-tools: udev rules and service improvements | expand |
On Mon, Feb 05, 2024 at 01:46:33PM +0100, Martin Wilck wrote: > DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN may be already set > from previous rules, e.g. if the device is suspended. Make sure > we don't overwrite them. > > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY are only > used in this file, and not used in the scan_import code path. > > Signed-off-by: Martin Wilck <mwilck@suse.com> > --- > multipath/11-dm-mpath.rules | 15 +++++++++++---- > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules > index c339f52..2c4d006 100644 > --- a/multipath/11-dm-mpath.rules > +++ b/multipath/11-dm-mpath.rules > @@ -2,12 +2,19 @@ ACTION!="add|change", GOTO="mpath_end" > ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="mpath_end" > ENV{DM_UUID}!="mpath-?*", GOTO="mpath_end" > > -IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" > -IMPORT{db}="MPATH_DEVICE_READY" > - > # If this uevent didn't come from dm, don't try to update the > # device state > -ENV{DM_COOKIE}!="?*", ENV{DM_ACTION}!="PATH_*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG", IMPORT{db}="DM_NOSCAN", GOTO="scan_import" > +ENV{DM_COOKIE}=="?*", GOTO="check_ready" > +ENV{DM_ACTION}=="PATH_*", GOTO="check_ready" > + > +ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG" > +ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN" > +GOTO="scan_import" > + If we do this, don't we forget the values for DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY whenever we get a non-dm uevent? If we skip importing them for a uevent, they're dropped from the database, which means that on the next dm-originated uevent we won't be able to get the old values. right? -Ben > +LABEL="check_ready" > + > +IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" > +IMPORT{db}="MPATH_DEVICE_READY" > > ENV{.MPATH_DEVICE_READY_OLD}="$env{MPATH_DEVICE_READY}" > > -- > 2.43.0
On Mon, 2024-02-05 at 15:44 -0500, Benjamin Marzinski wrote: > On Mon, Feb 05, 2024 at 01:46:33PM +0100, Martin Wilck wrote: > > DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN may be already set > > from previous rules, e.g. if the device is suspended. Make sure > > we don't overwrite them. > > > > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY are only > > used in this file, and not used in the scan_import code path. > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > --- > > multipath/11-dm-mpath.rules | 15 +++++++++++---- > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm- > > mpath.rules > > index c339f52..2c4d006 100644 > > --- a/multipath/11-dm-mpath.rules > > +++ b/multipath/11-dm-mpath.rules > > @@ -2,12 +2,19 @@ ACTION!="add|change", GOTO="mpath_end" > > ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="mpath_end" > > ENV{DM_UUID}!="mpath-?*", GOTO="mpath_end" > > > > -IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" > > -IMPORT{db}="MPATH_DEVICE_READY" > > - > > # If this uevent didn't come from dm, don't try to update the > > # device state > > -ENV{DM_COOKIE}!="?*", ENV{DM_ACTION}!="PATH_*", > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG", > > IMPORT{db}="DM_NOSCAN", GOTO="scan_import" > > +ENV{DM_COOKIE}=="?*", GOTO="check_ready" > > +ENV{DM_ACTION}=="PATH_*", GOTO="check_ready" > > + > > +ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG" > > +ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN" > > +GOTO="scan_import" > > + > > If we do this, don't we forget the values for > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY whenever we > get a > non-dm uevent? If we skip importing them for a uevent, they're > dropped > from the database, which means that on the next dm-originated uevent > we > won't be able to get the old values. right? Right, I didn't consider that. It makes sense for MPATH_DEVICE_READY. However, while at it, I wonder about our rationale for saving DM_UDEV_DISABLE_OTHER_RULES_FLAG in DM_DISABLE_OTHER_RULES_FLAG_OLD. In 10-dm.rules, DM_DISABLE_OTHER_RULES_FLAG changes its value only - in DM_UDEV_PRIMARY_SOURCE_FLAG=1 events, according to the value of DM_SUSPENDED - for DISK_RO=1 events (switches the flag on) (11-dm-lvm.rules has some additional logic that doesn't matter for multipath). For all other events, the flag is restored from the udev db in 10- dm.rules. Ignoring DISK_RO, it always has the value that DM_SUSPENDED had in the last DM_UDEV_PRIMARY_SOURCE_FLAG=1 (aka map reload) event. The logic in 11-dm-mpath.rules adds a check for unusable arrays on top, setting DM_DISABLE_OTHER_RULES_FLAG if MPATH_DEVICE_READY switches from 1 to 0. In this case we save the previous value in DM_DISABLE_OTHER_RULES_FLAG_OLD, and restore it from there in a later event if MPATH_DEVICE_READY switches back from 0 to 1. IMO the following can happen: 1. an event arrives while DM_SUSPENDED=1, causing DM_DISABLE_OTHER_RULES_FLAG=1 to be set 2. 11-dm-mpath.rules sees no paths and saves DM_DISABLE_OTHER_RULES_FLAG=1 to DM_DISABLE_OTHER_RULES_FLAG_OLD 3. an event arrives after the device has resumed, DM_SUSPENDED and DM_DISABLE_OTHER_RULES_FLAG are cleared in 10-dm.rules 4. 11-dm-mpath.rules sees MPATH_DEVICE_READY=1 and restores DM_DISABLE_OTHER_RULES_FLAG, setting it to 1 ... and this would be wrong, no? It seems to me that we should not save DM_DISABLE_OTHER_RULES_FLAG_OLD between uevents. We must set DM_DISABLE_OTHER_RULES_FLAG=1 to avoid upper layer probing if there are no paths, but I suppose we should restore the previous value after udev rule execution, e.g. in 99-dm- mpath.rules: ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=="?*", \ ENV{DM_DISABLE_OTHER_RULES_FLAG}=$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}, \ ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}="" Am I confusing stuff here? Unfortunately, testing of my patch series has shown that it's an improvement, but it isn't sufficient for completely avoiding races between multipathd and udev. DM_SUSPENDED=1 can be avoided, but it's still possible that the device gets suspended after the udev rules test for supended state and before they run kpartx, blkid, pvscan, or other similar commands. I am quite clueless about this right now, and would be grateful for ideas. Re-implementing LOCK_EX locking would be a possibility for systemd >= 250, as noted in the cover letter of the series. But for systemd <= 249, I don't see good options. We can't use LOCK_EX, because when we release the lock, we have no means to know whether or not a race with udev occurred (iow, whether udev tried to access the device while we held the lock, failed, and dropped the event). Thus we'd need to assume that this was the case, and force a reload after each reload, which is obvious nonsense. We also have no means to know whether the full set of rules has ever been run by udev for the device at hand, because we don't know the set of rules that follow after 11-dm- mpath.rules. Thanks, Martin
On Tue, Feb 06, 2024 at 11:50:46AM +0100, Martin Wilck wrote: > On Mon, 2024-02-05 at 15:44 -0500, Benjamin Marzinski wrote: > > On Mon, Feb 05, 2024 at 01:46:33PM +0100, Martin Wilck wrote: > > > DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN may be already set > > > from previous rules, e.g. if the device is suspended. Make sure > > > we don't overwrite them. > > > > > > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY are only > > > used in this file, and not used in the scan_import code path. > > > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > > --- > > > multipath/11-dm-mpath.rules | 15 +++++++++++---- > > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > > > diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm- > > > mpath.rules > > > index c339f52..2c4d006 100644 > > > --- a/multipath/11-dm-mpath.rules > > > +++ b/multipath/11-dm-mpath.rules > > > @@ -2,12 +2,19 @@ ACTION!="add|change", GOTO="mpath_end" > > > ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="mpath_end" > > > ENV{DM_UUID}!="mpath-?*", GOTO="mpath_end" > > > > > > -IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" > > > -IMPORT{db}="MPATH_DEVICE_READY" > > > - > > > # If this uevent didn't come from dm, don't try to update the > > > # device state > > > -ENV{DM_COOKIE}!="?*", ENV{DM_ACTION}!="PATH_*", > > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG", > > > IMPORT{db}="DM_NOSCAN", GOTO="scan_import" > > > +ENV{DM_COOKIE}=="?*", GOTO="check_ready" > > > +ENV{DM_ACTION}=="PATH_*", GOTO="check_ready" > > > + > > > +ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", > > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG" > > > +ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN" > > > +GOTO="scan_import" > > > + > > > > If we do this, don't we forget the values for > > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY whenever we > > get a > > non-dm uevent? If we skip importing them for a uevent, they're > > dropped > > from the database, which means that on the next dm-originated uevent > > we > > won't be able to get the old values. right? > > Right, I didn't consider that. It makes sense for MPATH_DEVICE_READY. > > However, while at it, I wonder about our rationale for saving > DM_UDEV_DISABLE_OTHER_RULES_FLAG in DM_DISABLE_OTHER_RULES_FLAG_OLD. > > In 10-dm.rules, DM_DISABLE_OTHER_RULES_FLAG changes its value only > - in DM_UDEV_PRIMARY_SOURCE_FLAG=1 events, according to the value of > DM_SUSPENDED > - for DISK_RO=1 events (switches the flag on) > > (11-dm-lvm.rules has some additional logic that doesn't matter for > multipath). > > For all other events, the flag is restored from the udev db in 10- > dm.rules. Ignoring DISK_RO, it always has the value that DM_SUSPENDED > had in the last DM_UDEV_PRIMARY_SOURCE_FLAG=1 (aka map reload) event. > > The logic in 11-dm-mpath.rules adds a check for unusable arrays > on top, setting DM_DISABLE_OTHER_RULES_FLAG if MPATH_DEVICE_READY > switches from 1 to 0. In this case we save the previous value in > DM_DISABLE_OTHER_RULES_FLAG_OLD, and restore it from there in a later > event if MPATH_DEVICE_READY switches back from 0 to 1. > > IMO the following can happen: > > 1. an event arrives while DM_SUSPENDED=1, causing > DM_DISABLE_OTHER_RULES_FLAG=1 to be set > 2. 11-dm-mpath.rules sees no paths and saves > DM_DISABLE_OTHER_RULES_FLAG=1 to DM_DISABLE_OTHER_RULES_FLAG_OLD > 3. an event arrives after the device has resumed, DM_SUSPENDED and > DM_DISABLE_OTHER_RULES_FLAG are cleared in 10-dm.rules > 4. 11-dm-mpath.rules sees MPATH_DEVICE_READY=1 and restores > DM_DISABLE_OTHER_RULES_FLAG, setting it to 1 > > ... and this would be wrong, no? > > It seems to me that we should not save DM_DISABLE_OTHER_RULES_FLAG_OLD > between uevents. We must set DM_DISABLE_OTHER_RULES_FLAG=1 to avoid > upper layer probing if there are no paths, but I suppose we should > restore the previous value after udev rule execution, e.g. in 99-dm- > mpath.rules: > > ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=="?*", \ > ENV{DM_DISABLE_OTHER_RULES_FLAG}=$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}, \ > ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}="" > > Am I confusing stuff here? Nope. That makes a lot of sense. > Unfortunately, testing of my patch series has shown that it's an > improvement, but it isn't sufficient for completely avoiding races > between multipathd and udev. DM_SUSPENDED=1 can be avoided, but it's > still possible that the device gets suspended after the udev rules test > for supended state and before they run kpartx, blkid, pvscan, or other > similar commands. > > I am quite clueless about this right now, and would be grateful for > ideas. Re-implementing LOCK_EX locking would be a possibility for > systemd >= 250, as noted in the cover letter of the series. But for > systemd <= 249, I don't see good options. We can't use LOCK_EX, because > when we release the lock, we have no means to know whether or not a > race with udev occurred (iow, whether udev tried to access the device > while we held the lock, failed, and dropped the event). Thus we'd need > to assume that this was the case, and force a reload after each reload, > which is obvious nonsense. We also have no means to know whether the > full set of rules has ever been run by udev for the device at hand, > because we don't know the set of rules that follow after 11-dm- > mpath.rules. multipathd already refuses to reload newly created devices before it sees the creation uevents to try to avoid this. I assume that the problem you are seeing is on the coldplug after the pivot root, where the devices already exist, or am I confused here. I wonder if we can do something similar where we would ideally delay all device reloads until after the coldplug. The problem is figuring out when it's finished. -Ben > Thanks, > Martin
On Thu, 2024-02-08 at 19:30 -0500, Benjamin Marzinski wrote: > On Tue, Feb 06, 2024 at 11:50:46AM +0100, Martin Wilck wrote: > > > > Unfortunately, testing of my patch series has shown that it's an > > improvement, but it isn't sufficient for completely avoiding races > > between multipathd and udev. DM_SUSPENDED=1 can be avoided, but > > it's > > still possible that the device gets suspended after the udev rules > > test > > for supended state and before they run kpartx, blkid, pvscan, or > > other > > similar commands. > > > > I am quite clueless about this right now, and would be grateful for > > ideas. Re-implementing LOCK_EX locking would be a possibility for > > systemd >= 250, as noted in the cover letter of the series. But for > > systemd <= 249, I don't see good options. We can't use LOCK_EX, > > because > > when we release the lock, we have no means to know whether or not a > > race with udev occurred (iow, whether udev tried to access the > > device > > while we held the lock, failed, and dropped the event). Thus we'd > > need > > to assume that this was the case, and force a reload after each > > reload, > > which is obvious nonsense. We also have no means to know whether > > the > > full set of rules has ever been run by udev for the device at hand, > > because we don't know the set of rules that follow after 11-dm- > > mpath.rules. > > multipathd already refuses to reload newly created devices before it > sees the creation uevents to try to avoid this. Right, thanks for reminding me about that. > I assume that the > problem you are seeing is on the coldplug after the pivot root, where > the devices already exist, or am I confused here. No, that's exactly the situation in which the problem is observed. > I wonder if we can do > something similar where we would ideally delay all device reloads > until > after the coldplug. The problem is figuring out when it's finished. The current state of affairs is that patch 4 in my series has avoided the issue on coldplug events, greatly reducing the frequency of errors on the test sysstem. There's a variant that needs an additional fix: 1 map set up in initrd 2 creation uevent event is observed by multipathd 3 switch root 4 path addition, map reload 5 change event 6 path addition, another map reload (5-6 can repeat) 7 DM_SUSPENDED is observed by udev while handling the change event, and saved as DM_UDEV_DISABLE_OTHER_RULES_FLAG 8 Coldplug event, DM_UDEV_DISABLE_OTHER_RULES_FLAG is restored from udev db => pvscan is skipped I've created another patch to fix this situation, and am now waiting for for test feedback. I'll post about this in more detail later. In short, I believe 10-dm.rules should differentiate DM_UDEV_DISABLE_OTHER_RULES_FLAG originating from lvm itself and DM_SUSPENDED. Regards, Martin
diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules index c339f52..2c4d006 100644 --- a/multipath/11-dm-mpath.rules +++ b/multipath/11-dm-mpath.rules @@ -2,12 +2,19 @@ ACTION!="add|change", GOTO="mpath_end" ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="mpath_end" ENV{DM_UUID}!="mpath-?*", GOTO="mpath_end" -IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" -IMPORT{db}="MPATH_DEVICE_READY" - # If this uevent didn't come from dm, don't try to update the # device state -ENV{DM_COOKIE}!="?*", ENV{DM_ACTION}!="PATH_*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG", IMPORT{db}="DM_NOSCAN", GOTO="scan_import" +ENV{DM_COOKIE}=="?*", GOTO="check_ready" +ENV{DM_ACTION}=="PATH_*", GOTO="check_ready" + +ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*", IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG" +ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN" +GOTO="scan_import" + +LABEL="check_ready" + +IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD" +IMPORT{db}="MPATH_DEVICE_READY" ENV{.MPATH_DEVICE_READY_OLD}="$env{MPATH_DEVICE_READY}"
DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN may be already set from previous rules, e.g. if the device is suspended. Make sure we don't overwrite them. DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY are only used in this file, and not used in the scan_import code path. Signed-off-by: Martin Wilck <mwilck@suse.com> --- multipath/11-dm-mpath.rules | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)