diff mbox series

[RFC] dm ioctl: fix erroneous EINVAL when signaled

Message ID 20240717185640.1026114-1-khazhy@google.com (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer
Headers show
Series [RFC] dm ioctl: fix erroneous EINVAL when signaled | expand

Commit Message

Khazhy Kumykov July 17, 2024, 6:56 p.m. UTC
do_resume when loading a new map first calls dm_suspend, which could
silently fail. When we proceeded to dm_swap_table, we would bail out
with EINVAL. Instead, restore new_map and return ERESTARTSYS when
signaled.

Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
 drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)


RFC as I am rather unfamiliar with the locking semantics here - whether
we do need to re-grab hash_lock to write to an hc we previously grabbed,
and whether the device becoming unhashed while we're in this function is
really something that needs to be checked. However, this patch does fix
the issue we were seeing - we'd get EINVAL when thread in ioctl was
signaled.

Comments

Mikulas Patocka July 17, 2024, 7:44 p.m. UTC | #1
Hi

I am wondering why does do_resume need to call dm_suspend at all. Does 
anyone here remember why is this code path needed?

Mikulas



On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:

> do_resume when loading a new map first calls dm_suspend, which could
> silently fail. When we proceeded to dm_swap_table, we would bail out
> with EINVAL. Instead, restore new_map and return ERESTARTSYS when
> signaled.
> 
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
>  drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> 
> RFC as I am rather unfamiliar with the locking semantics here - whether
> we do need to re-grab hash_lock to write to an hc we previously grabbed,
> and whether the device becoming unhashed while we're in this function is
> really something that needs to be checked. However, this patch does fix
> the issue we were seeing - we'd get EINVAL when thread in ioctl was
> signaled.
> 
> 
> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> index c2c07bfa6471..b81650c6d096 100644
> --- a/drivers/md/dm-ioctl.c
> +++ b/drivers/md/dm-ioctl.c
> @@ -1181,8 +1181,22 @@ static int do_resume(struct dm_ioctl *param)
>  			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
>  		if (param->flags & DM_NOFLUSH_FLAG)
>  			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> -		if (!dm_suspended_md(md))
> -			dm_suspend(md, suspend_flags);
> +		if (!dm_suspended_md(md)) {
> +			r = dm_suspend(md, suspend_flags);
> +			if (r == -EINTR)
> +				r = -ERESTARTSYS;
> +			if (r) {
> +				down_write(&_hash_lock);
> +				hc = dm_get_mdptr(md);
> +				if (!hc)
> +					r = -ENXIO;
> +				else
> +					hc->new_map = new_map;
> +				up_write(&_hash_lock);
> +				dm_put(md);
> +				return r;
> +			}
> +		}
>  
>  		old_size = dm_get_size(md);
>  		old_map = dm_swap_table(md, new_map);
> -- 
> 2.45.2.993.g49e7a77208-goog
>
Khazhy Kumykov July 17, 2024, 7:52 p.m. UTC | #2
On Wed, Jul 17, 2024 at 12:45 PM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> Hi
>
> I am wondering why does do_resume need to call dm_suspend at all. Does
> anyone here remember why is this code path needed?

In our case, we have a sequence with load_table followed by a resume,
with no suspend first. The resume path suspends if needed, swaps
tables, then resumes. Removing the suspend here would break existing
userspace, I'd imagine. It seems like minimizing the suspended time
would also be a nice benefit.

>
> Mikulas
>
>
>
> On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
>
> > do_resume when loading a new map first calls dm_suspend, which could
> > silently fail. When we proceeded to dm_swap_table, we would bail out
> > with EINVAL. Instead, restore new_map and return ERESTARTSYS when
> > signaled.
> >
> > Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> > ---
> >  drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> >
> > RFC as I am rather unfamiliar with the locking semantics here - whether
> > we do need to re-grab hash_lock to write to an hc we previously grabbed,
> > and whether the device becoming unhashed while we're in this function is
> > really something that needs to be checked. However, this patch does fix
> > the issue we were seeing - we'd get EINVAL when thread in ioctl was
> > signaled.
> >
> >
> > diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> > index c2c07bfa6471..b81650c6d096 100644
> > --- a/drivers/md/dm-ioctl.c
> > +++ b/drivers/md/dm-ioctl.c
> > @@ -1181,8 +1181,22 @@ static int do_resume(struct dm_ioctl *param)
> >                       suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> >               if (param->flags & DM_NOFLUSH_FLAG)
> >                       suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> > -             if (!dm_suspended_md(md))
> > -                     dm_suspend(md, suspend_flags);
> > +             if (!dm_suspended_md(md)) {
> > +                     r = dm_suspend(md, suspend_flags);
> > +                     if (r == -EINTR)
> > +                             r = -ERESTARTSYS;
> > +                     if (r) {
> > +                             down_write(&_hash_lock);
> > +                             hc = dm_get_mdptr(md);
> > +                             if (!hc)
> > +                                     r = -ENXIO;
> > +                             else
> > +                                     hc->new_map = new_map;
Oh - I probably want to check if hc->new_map has become non-null in
the meantime and if so... pick a winner then put the loser? Presumably
the newest map should win if that happens / is possible. although the
concept seems suspect to me.
> > +                             up_write(&_hash_lock);
> > +                             dm_put(md);
> > +                             return r;
> > +                     }
> > +             }
> >
> >               old_size = dm_get_size(md);
> >               old_map = dm_swap_table(md, new_map);
> > --
> > 2.45.2.993.g49e7a77208-goog
> >
>
Zdenek Kabelac July 22, 2024, 9:52 a.m. UTC | #3
Dne 17. 07. 24 v 21:52 Khazhy Kumykov napsal(a):
> On Wed, Jul 17, 2024 at 12:45 PM Mikulas Patocka <mpatocka@redhat.com> wrote:
>> Hi
>>
>> I am wondering why does do_resume need to call dm_suspend at all. Does
>> anyone here remember why is this code path needed?
> In our case, we have a sequence with load_table followed by a resume,
> with no suspend first. The resume path suspends if needed, swaps
> tables, then resumes. Removing the suspend here would break existing
> userspace, I'd imagine. It seems like minimizing the suspended time
> would also be a nice benefit.


lvm2 maintainer POV

Automatic 'suspend' for resume is a kernel 'feature' that should not be 
normally used from the userspace. Userspace is supposed to call    'suspend'  
- handle error cases - eventually drop preloaded table and resume existing 
table that should work.

If userspace is using ONLY  'resume'  without calling suspend upfront - there 
are some unsolvable error cases.


So no -  'minimizing'  suspend time is NOT the main reason here. The only 
valid reason to use it is basically if you are  admin and you need to reload 
table for a device you are running from - in this case calling 'dmsetup 
suspend'  might leave your system in 'blocked' state since your rootfs will be 
'frozen/suspend' and you would have no chance to  call 'dmsetup resume'.

lvm2 app is locking itself in the RAM in this critical section so it can 
proceed with regular sequence:    'write metadata - preload DM - suspend DM  - 
commit metadata - resume DM'  which basicall all userland apps should be using.

Regards


Zdenek
diff mbox series

Patch

diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index c2c07bfa6471..b81650c6d096 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1181,8 +1181,22 @@  static int do_resume(struct dm_ioctl *param)
 			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
 		if (param->flags & DM_NOFLUSH_FLAG)
 			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
-		if (!dm_suspended_md(md))
-			dm_suspend(md, suspend_flags);
+		if (!dm_suspended_md(md)) {
+			r = dm_suspend(md, suspend_flags);
+			if (r == -EINTR)
+				r = -ERESTARTSYS;
+			if (r) {
+				down_write(&_hash_lock);
+				hc = dm_get_mdptr(md);
+				if (!hc)
+					r = -ENXIO;
+				else
+					hc->new_map = new_map;
+				up_write(&_hash_lock);
+				dm_put(md);
+				return r;
+			}
+		}
 
 		old_size = dm_get_size(md);
 		old_map = dm_swap_table(md, new_map);