Message ID | 20231127215418.271708-1-bmarzins@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Benjamin Marzinski |
Series | multipathd: Make sure to disable queueing if recovery has failed. |
How long does recovery take? I am unclear on when the explicitly set queue_if_no_path is being overridden, and why disabling it is useful.

Typically we are setting queue_if_no_path to forever with the intent that it will survive longer storage and/or disk issues without returning an error to the application and/or corrupting fses if the storage issue can be fixed.

We generally expect it to recover when the storage comes back, but the storage could be experiencing significant issues for a significant period of time (>10 minutes). Since the storage has to be fixed to get things working again, there is a lot of negative value when an error gets returned, requiring manual recovery steps (fsck, loss of data).

We also manually disable queueing if we need to remove the mpath devices (paths are already gone, as they were non-responsive >24 hours and removed via tmo_timeout), and/or forcibly reboot the nodes when we determine storage is not coming back.

On Mon, Nov 27, 2023 at 3:54 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> If a multipath device has no_path_retry set to a number and has lost all paths, gone into recovery mode, and timed out, it will disable queue_if_no_paths. After that, if one of those failed paths is removed, when the device is reloaded, queue_if_no_paths will be re-enabled. When set_no_path_retry() is then called to update the queueing state, it will not disable queue_if_no_paths, since the device is still in the recovery state, so it believes no work needs to be done. The device will remain in the recovery state, with retry_ticks at 0, and queueing enabled, even though there are no usable paths.
>
> To fix this, in set_no_path_retry(), if no_path_retry is set to a number and the device is queueing but it is in recovery mode and out of retries with no usable paths, manually disable queue_if_no_path.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/structs_vec.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 0e8a46e7..3cb23c73 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
>  			    !mpp->in_recovery)
>  				dm_queue_if_no_path(mpp->alias, 1);
>  			leave_recovery_mode(mpp);
> -		} else if (pathcount(mpp, PATH_PENDING) == 0)
> +		} else if (pathcount(mpp, PATH_PENDING) == 0) {
> +			/*
> +			 * If in_recovery is set, enter_recovery_mode does
> +			 * nothing. If the device is already in recovery
> +			 * mode and has already timed out, manually call
> +			 * dm_queue_if_no_path to stop it from queueing.
> +			 */
> +			if ((!mpp->features || is_queueing) &&
> +			    mpp->in_recovery && mpp->retry_tick == 0)
> +				dm_queue_if_no_path(mpp->alias, 0);
> +			enter_recovery_mode(mpp);
> +		}
> 		break;
> 	}
> }
> --
> 2.41.0
On Mon, Nov 27, 2023 at 04:07:46PM -0600, Roger Heflin wrote:
> How long does recovery take? I am unclear on when the explicitly set queue_if_no_path is being overridden, and why disabling it is useful.

To be clear: this fix shouldn't cause multipath to disable queueing in cases where it wasn't configured to do so. Previously, we were not always respecting no_path_retry. This fix makes sure that we are.

This change only applies to the case where no_path_retry is set to a specific number of retries. When configuring multipath, if no_path_retry is set, it always takes precedence over "queue_if_no_path" set in the features line. So if a multipath device isn't configured to queue for a limited number of retries and then disable queueing, the fix changes nothing.

When no_path_retry is set to a number of retries and the last path is lost, the device goes into recovery mode. When a device goes into recovery mode, multipathd sets a counter to the value of no_path_retry. After every loop of the path checker through all the devices, that counter ticks down. If the counter hits zero, multipathd disables queueing on the device. When a path becomes usable, the device leaves recovery mode, stopping the countdown or restoring queueing as necessary.

The problem that this patch fixes happens if the multipath device is reloaded after that count has hit zero and queueing has been disabled. When reloading a device with no_path_retry set to anything other than "fail", multipathd will automatically enable queueing. After reloading the device, multipathd checks the number of usable paths. If there are no usable paths, it tries to enter recovery mode. However, in this case the device is already in recovery mode, so nothing happens. And since the countdown has already completed, nothing will ever happen to disable queueing.

Now you are left with a device stuck indefinitely in recovery mode with queueing enabled and no usable paths, even though you configured it to queue only for a limited number of retries. This is bad.

The solution the patch implements is to check whether you are in this situation:
1. The device has no_path_retry set to a number of retries.
2. The device has no usable paths.
3. The device is already in recovery mode.
4. The retry counter for the device has already reached 0.
5. The device is currently queueing IO.

In this case we clearly shouldn't be queueing, and calling enter_recovery_mode() will do nothing. So we just tell the device to disable queueing.

The change I made is in set_no_path_retry(), which does get called in other cases besides after a device reload. However, if we are ever in the situation outlined above, we should not be queueing, so it is correct to disable it. The code I added in this patch is the mirror of the code directly above it in set_no_path_retry(). Here is that code:

	if (count_active_paths(mpp) > 0) {
		/*
		 * If in_recovery is set, leave_recovery_mode() takes
		 * care of dm_queue_if_no_path. Otherwise, do it here.
		 */
		if ((!mpp->features || !is_queueing) &&
		    !mpp->in_recovery)
			dm_queue_if_no_path(mpp->alias, 1);
		leave_recovery_mode(mpp);

In that case, we should be queueing but we're not, and since we aren't in recovery mode, calling leave_recovery_mode() doesn't do anything. Instead we just directly enable queueing.

> Typically we are setting queue_if_no_path to forever with the intent that it will survive longer storage and/or disk issues without returning an error to the application and/or corrupting fses if the storage issue can be fixed.
>
> We generally expect it to recover when the storage comes back, but the storage could be experiencing significant issues for a significant period of time (>10 minutes). Since the storage has to be fixed to get things working again, there is a lot of negative value when an error gets returned, requiring manual recovery steps (fsck, loss of data).
>
> We also manually disable queueing if we need to remove the mpath devices (paths are already gone, as they were non-responsive >24 hours and removed via tmo_timeout), and/or forcibly reboot the nodes when we determine storage is not coming back.

If you are setting no_path_retry to "queue" (the recommended way) or adding "queue_if_no_path" to the features line while leaving no_path_retry unset (the deprecated way; see the man page), this change will not affect you. You will never get to this code in set_no_path_retry().

Does that clear things up?

-Ben
Thank you. That makes it clear. Yes, we are setting it to "queue", so it won't change behavior with that setting.

It also makes it clear that if we, say, set no_path_retry to 100000, then once all paths fail and it goes into recovery mode, queueing would be disabled.

On Tue, Nov 28, 2023 at 11:13 AM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> To be clear: this fix shouldn't cause multipath to disable queueing in cases where it wasn't configured to do so. Previously, we were not always respecting no_path_retry. This fix makes sure that we are.
>
> This change only applies to the case where no_path_retry is set to a specific number of retries. When configuring multipath, if no_path_retry is set, it always takes precedence over "queue_if_no_path" set in the features line. So if a multipath device isn't configured to queue for a limited number of retries and then disable queueing, the fix changes nothing.
On Tue, Nov 28, 2023 at 11:59:55AM -0600, Roger Heflin wrote:
> Thank you. That makes it clear. Yes, we are setting it to "queue", so it won't change behavior with that setting.
>
> It also makes it clear that if we, say, set no_path_retry to 100000, then once all paths fail and it goes into recovery mode, queueing would be disabled.

Well, it wouldn't disable queueing until it had been in recovery mode for over 100000 path-checker retries. At a retry every 5 seconds, that's getting close to 6 days. And before those 100000 retries had passed, if the device got reloaded, queueing would still not get disabled with my patch applied, even if there were no usable paths. The change only applies to devices in recovery mode that have run out of retries.

If you've noticed something that I've overlooked, and this change could disable queueing when it shouldn't, please let me know.

-Ben
On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> If a multipath device has no_path_retry set to a number and has lost all paths, gone into recovery mode, and timed out, it will disable queue_if_no_paths. After that, if one of those failed paths is removed, when the device is reloaded, queue_if_no_paths will be re-enabled. When set_no_path_retry() is then called to update the queueing state, it will not disable queue_if_no_paths, since the device is still in the recovery state, so it believes no work needs to be done. The device will remain in the recovery state, with retry_ticks at 0, and queueing enabled, even though there are no usable paths.
>
> To fix this, in set_no_path_retry(), if no_path_retry is set to a number and the device is queueing but it is in recovery mode and out of retries with no usable paths, manually disable queue_if_no_path.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Thanks, this makes sense. I am just wondering if this logic should be moved to enter_recovery_mode() / leave_recovery_mode() instead. That pair of functions is also called from update_queue_mode{add,del}_path(), where the same reasoning would apply, AFAICS. Am I overlooking something?

Regards
Martin
On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> Thanks, this makes sense. I am just wondering if this logic should be moved to enter_recovery_mode() / leave_recovery_mode() instead. That pair of functions is also called from update_queue_mode{add,del}_path(), where the same reasoning would apply, AFAICS. Am I overlooking something?

In general, I think it's safe to only do this in set_no_path_retry(). The functions that call update_queue_mode{add,del}_path() are:

  fail_path()
  reinstate_path()
  update_multipath()
  io_err_stat_handle_pathfail()

fail_path() and reinstate_path() will always call the update_queue_mode_* function after calling set_no_path_retry(), so the device will already be in the correct state. update_multipath() usually will as well (except when it's called by a command-line handler); furthermore, it will only call update_queue_mode_del_path() when it is actually failing a path, so the multipath device shouldn't already be in recovery mode. The same goes for io_err_stat_handle_pathfail().

Also, we need to make sure that mpp->features is up to date before doing these checks, so that we know what the current queueing state is. io_err_stat_handle_pathfail() doesn't update mpp->features before calling update_queue_mode_del_path(), so it could end up doing the wrong thing. And set_no_path_retry() will be called (after updating mpp->features) the next time the path checker completes for a path in the device, so devices won't remain in a bad state for long, no matter what.

What looking at this did make me realize is that cli_restore_queueing() and cli_restore_all_queueing() don't update the device strings before calling set_no_path_retry(). Thinking about this, it seems like they should call update_multipath(vecs, mpp->alias, 1) instead, to trigger the set_no_path_retry() call after updating mpp->features.

I'll send another patch to fix cli_restore_queueing() and cli_restore_all_queueing(). If you think it's necessary, I can also send one that makes io_err_stat_handle_pathfail() call update_multipath() and then moves the code that manually changes the queueing state into enter_recovery_mode() and leave_recovery_mode(), instead of doing it in set_no_path_retry().

-Ben
On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> > On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > > [commit message trimmed]
> >
> > Thanks, this makes sense. I am just wondering if this logic should
> > be moved to enter_recovery_mode() / leave_recovery_mode() instead.
> > That pair of functions is also called from
> > update_queue_mode_{add,del}_path(), where the same reasoning would
> > apply, afaics. Am I overlooking something?
>
> In general, I think it's safe to only do this in set_no_path_retry().
> The functions that call update_queue_mode_{add,del}_path() are:
>   fail_path()
>   reinstate_path()
>   update_multipath()
>   io_err_stat_handle_pathfail()
>
> fail_path() and reinstate_path() will always call the
> update_queue_mode_* function after calling set_no_path_retry(), so
> the device will already be in the correct state. update_multipath()
> usually will as well (except when it's called by a command line
> handler), and further, it will only call update_queue_mode_del_path()
> when it is actually failing a path, so the multipath device shouldn't
> already be in recovery mode. The same goes for
> io_err_stat_handle_pathfail().
>
> Also we need to make sure that mpp->features is up to date before
> doing these checks, so that we know what the current queueing state
> is. io_err_stat_handle_pathfail() doesn't update mpp->features before
> calling update_queue_mode_del_path(), so it could end up doing the
> wrong thing.
>
> And set_no_path_retry() will be called (after updating mpp->features)
> the next time the path checker completes for a path in the device, so
> devices won't remain in a bad state long, no matter what.
>
> What looking at this did make me realize is that
> cli_restore_queueing() and cli_restore_all_queueing() don't update
> the device strings before calling set_no_path_retry(). Thinking about
> this, it seems like they should call update_multipath(vecs,
> mpp->alias, 1) instead, to trigger the set_no_path_retry() call after
> updating mpp->features.
>
> I'll send another patch to fix cli_restore_queueing() and
> cli_restore_all_queueing(). If you think it's necessary, I can also
> send one to make io_err_stat_handle_pathfail() call
> update_multipath() and then move the code to manually change the
> queueing state to enter_recovery_mode() and leave_recovery_mode(),
> instead of doing it in set_no_path_retry().

Thanks for working out all the different cases. Personally, I would
like to have this logic in one place, if possible. I'll leave it to
your judgement, as you've taken a closer look than myself.

I'm not sure what you mean with "mpp->features is up to date"; are you
considering some entity applying changes to the mpp features without
involving multipathd, e.g. by using direct DM ioctls or dmsetup?
Regards
Martin
On Mon, Dec 04, 2023 at 09:43:58PM +0100, Martin Wilck wrote:
> On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> > Also we need to make sure that mpp->features is up to date before
> > doing these checks, so that we know what the current queueing state
> > is. io_err_stat_handle_pathfail() doesn't update mpp->features
> > before calling update_queue_mode_del_path(), so it could end up
> > doing the wrong thing.
> > [...]
>
> I'm not sure what you mean with "mpp->features is up to date"; are
> you considering some entity applying changes to the mpp features
> without involving multipathd, e.g. by using direct DM ioctls or
> dmsetup?

Or right after multipathd calls dm_queue_if_no_path(). We don't
immediately update mpp->features to reflect the changed
queue_if_no_paths setting of the device. So there are windows after
someone, including multipathd, has changed the queue_if_no_paths
setting of the device, during which checking mpp->features won't give
the correct answer unless we first reread the device table. Obviously,
if something other than multipathd has changed this, then there's
always the possibility of a race, but we can make sure that we don't
race with ourselves.

For instance, the way things are right now, running:

# multipathd disablequeueing maps; multipathd restorequeueing maps

will occasionally leave the devices not queueing until the next time a
path in them is checked, and set_no_path_retry() actually restores
queueing. But I'm actually going to deal with this differently. I'll
send a patchset shortly.
-Ben
On Mon, 2023-12-04 at 18:01 -0500, Benjamin Marzinski wrote:
> Or right after multipathd calls dm_queue_if_no_path(). We don't
> immediately update mpp->features to reflect the changed
> queue_if_no_paths setting of the device. So there are windows after
> someone, including multipathd, has changed the queue_if_no_paths
> setting of the device, during which checking mpp->features won't
> give the correct answer unless we first reread the device table.
> [...]
>
> For instance, the way things are right now, running:
>
> # multipathd disablequeueing maps; multipathd restorequeueing maps
>
> will occasionally leave the devices not queueing until the next time
> a path in them is checked, and set_no_path_retry() actually restores
> queueing. But I'm actually going to deal with this differently. I'll
> send a patchset shortly.

You are right, I see. It's actually an inconsistency that I would like
to get rid of. I suggest that we ditch mpp->features entirely, and
just keep track of the map's features in an abstract representation
using booleans and ints. We should never apply kernel status changes
that are inconsistent with our own representation of the map's
features, IMO.

Along a similar line of reasoning, let me emphasize again why I would
like to have the logic for enabling or disabling queueing in one
place. Your previous post nicely distinguished the various cases and
call chains which change the queueing and recovery mode, but it would
be better if that kind of assessment wasn't necessary, and instead it
was possible to inspect the code in just a block of (say) 50 code
lines that would decide whether or not a given map should be queueing
and/or in recovery mode.

It's an unfortunate property of the multipath-tools code base that
important logic is dispersed over multiple functions and source files,
making a correct assessment of the code flow difficult and cumbersome.
It can't be completely avoided given the complexity of the subject,
but it would be nice to try.

I'm looking forward to your new patch set ;-)

Thanks,
Martin
On Tue, Dec 05, 2023 at 03:57:13PM +0100, Martin Wilck wrote:
> You are right, I see. It's actually an inconsistency that I would
> like to get rid of. I suggest that we ditch mpp->features entirely,
> and just keep track of the map's features in an abstract
> representation using booleans and ints. We should never apply kernel
> status changes that are inconsistent with our own representation of
> the map's features, IMO.

I don't agree with ditching it entirely. The existence of
mpp->features allows people to use a kernel feature with a version of
the userspace tools that doesn't know about it. This means that if a
new feature is added to the kernel, the existing multipath userspace
tools can be used to test it.

We need to provide users the option of setting features, and for
backwards compatibility, we need to accept strings in the features
line that we also have other options to set. And we still need to
build up a features line to pass into the kernel. So we will still
need the code to reconcile the features line with other options. I'm
not sure we would gain much by converting it to a set of variables and
then back again.

The one case where we really need to track our current value
separately from the features line is queue_if_no_path, and it does
make sense to track that as a variable, so we can easily update it
when we make changes to it. But even there, the current value is not
necessarily the value we want to put in the features line we pass to
the kernel. If we aren't queueing, and then add a working path, when
we reload the device we want to use the configured no_path_retry
value, and not our current queueing state.

One thing that would be nice would be tweaking the queue_if_no_path
value in the features string that we pass to the kernel based on the
current queueing state and the current path states. If we are
reloading a device, and we know that all the paths in that device are
down, and we currently aren't queueing, we shouldn't pass in a
features string that tells the kernel to enable queueing, which is
what we currently do. The patch I posted which started this
conversation exists because we don't do this, and we need to fix the
queueing state afterwards.

> Along a similar line of reasoning, let me emphasize again why I
> would like to have the logic for enabling or disabling queueing in
> one place. [...]
>
> I'm looking forward to your new patch set ;-)

Unfortunately, my new patchset doesn't change the first patch. It just
builds on it to clean up how the interactive client commands call
things, to reduce the callers of some of our functions, and to keep
the client commands from doing more work than they should be. After
the changes, the only things calling set_no_path_retry() are functions
that are updating the multipath device due to external events, and the
check_path code.

The update_queue_mode_{add,del}_path() functions should be (and, based
on my code examination, are) able to assume a device in the correct
queueing and recovery state, and just do the work of entering or
leaving recovery mode, based on a change in the number of active
paths. If we move the code that checks whether our queueing state is
wrong out of set_no_path_retry() into
update_queue_mode_{add,del}_path(), then we add a requirement that
callers of those functions refresh the current queueing state.
Otherwise we can't actually guarantee that they do the right thing.
With the current callers, this doesn't really matter, because we don't
need them to fix the queueing state. But why not keep the code that
fixes the queueing state in a function that's called when it could be
wrong, and also is (after my new patchset) only called after we have
refreshed the state, so it will do the right thing.

So, I would prefer if we go with my first patch, as is. I'm open to
the idea of moving this code into update_queue_mode_{add,del}_path()
as part of a future patchset where we switch to tracking the current
queueing state outside of mpp->features, and updating it whenever we
actually enable or disable queueing. I can work on this if you want.

-Ben
On Tue, 2023-12-05 at 14:28 -0500, Benjamin Marzinski wrote:
> I don't agree with ditching it entirely. The existence of
> mpp->features allows people to use a kernel feature with a version
> of the userspace tools that doesn't know about it. [...] I'm not
> sure we would gain much by converting it to a set of variables and
> then back again.

Fair enough, but consistency is very important. Let's work towards
never setting anything in the kernel that doesn't match our notion of
the desired state of a map.

> One thing that would be nice would be tweaking the queue_if_no_path
> value in the features string that we pass to the kernel based on the
> current queueing state and the current path states. If we are
> reloading a device, and we know that all the paths in that device
> are down, and we currently aren't queueing, we shouldn't pass in a
> features string that tells the kernel to enable queueing, which is
> what we currently do.

Agreed.

> The update_queue_mode_{add,del}_path() functions should be (and,
> based on my code examination, are) able to assume a device in the
> correct queueing and recovery state, and just do the work of
> entering or leaving recovery mode, based on a change in the number
> of active paths.

OK. Perhaps not what I was hoping for, but better than what we have
now. Maybe you can add a few comment lines to explain the various
cases, lest your analysis be forgotten.

> So, I would prefer if we go with my first patch, as is. I'm open to
> the idea of moving this code into update_queue_mode_{add,del}_path()
> as part of a future patchset where we switch to tracking the current
> queueing state outside of mpp->features, and updating it whenever we
> actually enable or disable queueing. I can work on this if you want.

I'm fine with leaving your first patch as is, perhaps with some
comments. Anyway, please resubmit it as part of your new set.

Regards
Martin
diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index 0e8a46e7..3cb23c73 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
 			    !mpp->in_recovery)
 				dm_queue_if_no_path(mpp->alias, 1);
 			leave_recovery_mode(mpp);
-		} else if (pathcount(mpp, PATH_PENDING) == 0)
+		} else if (pathcount(mpp, PATH_PENDING) == 0) {
+			/*
+			 * If in_recovery is set, enter_recovery_mode does
+			 * nothing. If the device is already in recovery
+			 * mode and has already timed out, manually call
+			 * dm_queue_if_no_path to stop it from queueing.
+			 */
+			if ((!mpp->features || is_queueing) &&
+			    mpp->in_recovery && mpp->retry_tick == 0)
+				dm_queue_if_no_path(mpp->alias, 0);
 			enter_recovery_mode(mpp);
+		}
 		break;
 	}
 }
If a multipath device has no_path_retry set to a number and has lost
all paths, gone into recovery mode, and timed out, it will disable
queue_if_no_paths. After that, if one of those failed paths is
removed, when the device is reloaded, queue_if_no_paths will be
re-enabled. When set_no_path_retry() is then called to update the
queueing state, it will not disable queue_if_no_paths, since the
device is still in the recovery state, so it believes no work needs to
be done. The device will remain in the recovery state, with
retry_ticks at 0, and queueing enabled, even though there are no
usable paths.

To fix this, in set_no_path_retry(), if no_path_retry is set to a
number and the device is queueing but it is in recovery mode and out
of retries with no usable paths, manually disable queue_if_no_path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs_vec.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)