diff mbox

xsplice: Don't perform multiple operations on same payload once work is scheduled.

Message ID 20160429015212.GA12857@localhost.localdomain (mailing list archive)
State New, archived
Headers show

Commit Message

Konrad Rzeszutek Wilk April 29, 2016, 1:52 a.m. UTC
On Fri, Apr 29, 2016 at 12:23:38AM +0100, Andrew Cooper wrote:
> On 28/04/2016 19:16, Konrad Rzeszutek Wilk wrote:
> > diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
> > index 1b67d39..48088ce 100644
> > --- a/xen/common/xsplice.c
> > +++ b/xen/common/xsplice.c
> > @@ -1099,6 +1099,13 @@ static void xsplice_do_action(void)
> >      data->rc = rc;
> >  }
> >  
> > +static bool_t is_work_scheduled(struct payload *data)
> 
> const struct payload *data

Yes!
> 
> Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

And of course 5 hours later I realized there is a more straightforward
for this. It follows the same idea but it piggyback on data->rc
being set by 'schedule_work' to -EAGAIN once work is scheduled:

It could even be rolled in "xsplice: Implement support for
applying/reverting/replacing patches."


From 83053e3f984b67dfae74cb64e75b8871b9d236ca Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Thu, 28 Apr 2016 21:22:49 -0400
Subject: [PATCH] xsplice: Check data->rc for -EAGAIN to guard against races.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently it is possible to:

1)  xc_xsplice_apply()
     \-> xsplice_action
        spin_lock(payload_lock)
             \- schedule_work()
                 data->rc=-EAGAIN
        spin_unlock(payload_lock);

2)  xc_xsplice_unload()
     \-> xsplice_action
        spin_lock(payload_lock)
             free_payload(data);
        spin_unlock(payload_lock);

.. all CPUs are quiesced.

3) check_for_xsplice_work()
     \-> apply_payload
        \-> arch_xsplice_apply_jmp
            BOOM
         data->rc =0

The reason is that state is in 'CHECKED' which changes to 'APPLIED'
once check_for_xsplice_work finishes (and it updates data->rc to zero).

But we have a race between 1) -> 3) where one can manipulate the payload
(as the state is in 'CHECKED' from which you can apply/revert and unload).

This patch adds a simple check on data->rc to see if it is in -EAGAIN
which means that schedule_work has been called for this payload.

If the payload aborts in check_for_xsplice_work (timed out, etc),
the data->rc will be -EBUSY -so one can still unload the payload or
retry the operation.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Roger Pau Monné <roger.pau@citrix.com>

---
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
---
---
 xen/common/xsplice.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Konrad Rzeszutek Wilk April 29, 2016, 2:15 a.m. UTC | #1
On Thu, Apr 28, 2016 at 09:52:14PM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Apr 29, 2016 at 12:23:38AM +0100, Andrew Cooper wrote:
> > On 28/04/2016 19:16, Konrad Rzeszutek Wilk wrote:
> > > diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
> > > index 1b67d39..48088ce 100644
> > > --- a/xen/common/xsplice.c
> > > +++ b/xen/common/xsplice.c
> > > @@ -1099,6 +1099,13 @@ static void xsplice_do_action(void)
> > >      data->rc = rc;
> > >  }
> > >  
> > > +static bool_t is_work_scheduled(struct payload *data)
> > 
> > const struct payload *data
> 
> Yes!
> > 
> > Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> And of course 5 hours later I realized there is a more straightforward
> for this. It follows the same idea but it piggyback on data->rc
> being set by 'schedule_work' to -EAGAIN once work is scheduled:

Err, 'schedule_work' does not set data->rc to -EAGAIN. It happens
within xsplice_do_action:

 	data->rc = -EAGAIN;                                                 
        rc = schedule_work(data, action->cmd, action->timeout);  

(for either replace, revert, or apply).

> 
> It could even be rolled in "xsplice: Implement support for
> applying/reverting/replacing patches."
> 
> 
> >From 83053e3f984b67dfae74cb64e75b8871b9d236ca Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Thu, 28 Apr 2016 21:22:49 -0400
> Subject: [PATCH] xsplice: Check data->rc for -EAGAIN to guard against races.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Currently it is possible to:
> 
> 1)  xc_xsplice_apply()
>      \-> xsplice_action
>         spin_lock(payload_lock)
>              \- schedule_work()
>                  data->rc=-EAGAIN
>         spin_unlock(payload_lock);
> 
> 2)  xc_xsplice_unload()
>      \-> xsplice_action
>         spin_lock(payload_lock)
>              free_payload(data);
>         spin_unlock(payload_lock);
> 
> .. all CPUs are quiesced.
> 
> 3) check_for_xsplice_work()
>      \-> apply_payload
>         \-> arch_xsplice_apply_jmp
>             BOOM
>          data->rc =0
> 
> The reason is that state is in 'CHECKED' which changes to 'APPLIED'
> once check_for_xsplice_work finishes (and it updates data->rc to zero).
> 
> But we have a race between 1) -> 3) where one can manipulate the payload
> (as the state is in 'CHECKED' from which you can apply/revert and unload).
> 
> This patch adds a simple check on data->rc to see if it is in -EAGAIN
> which means that schedule_work has been called for this payload.
> 
> If the payload aborts in check_for_xsplice_work (timed out, etc),
> the data->rc will be -EBUSY -so one can still unload the payload or
> retry the operation.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Reported-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> ---
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Tim Deegan <tim@xen.org>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> ---
> ---
>  xen/common/xsplice.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
> index 1b67d39..0bc7e0f 100644
> --- a/xen/common/xsplice.c
> +++ b/xen/common/xsplice.c
> @@ -1363,6 +1363,9 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
>          return PTR_ERR(data);
>      }
>  
> +    if ( data->rc == -EAGAIN ) /* schedule_work has been called. */
> +        goto out;
> +
>      switch ( action->cmd )
>      {
>      case XSPLICE_ACTION_UNLOAD:
> @@ -1423,6 +1426,7 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
>          break;
>      }
>  
> + out:
>      spin_unlock(&payload_lock);
>  
>      return rc;
> -- 
> 2.4.0
>
Jan Beulich April 29, 2016, 7:35 a.m. UTC | #2
>>> On 29.04.16 at 03:52, <konrad@kernel.org> wrote:
> --- a/xen/common/xsplice.c
> +++ b/xen/common/xsplice.c
> @@ -1363,6 +1363,9 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
>          return PTR_ERR(data);
>      }
>  
> +    if ( data->rc == -EAGAIN ) /* schedule_work has been called. */
> +        goto out;

And nothing else can set data->rc to this value, now or in the
future? Because if that's possible, you'd deny any further actions
on that payload.

> @@ -1423,6 +1426,7 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
>          break;
>      }
>  
> + out:
>      spin_unlock(&payload_lock);
>  
>      return rc;

Taking both together, and looking at patch 2 of the original series, I'm
getting the impression you return success in that case rather than e.g.
-EBUSY or indeed -EAGAIN.

Jan
Konrad Rzeszutek Wilk April 29, 2016, 7:49 a.m. UTC | #3
On Fri, Apr 29, 2016 at 01:35:08AM -0600, Jan Beulich wrote:
> >>> On 29.04.16 at 03:52, <konrad@kernel.org> wrote:
> > --- a/xen/common/xsplice.c
> > +++ b/xen/common/xsplice.c
> > @@ -1363,6 +1363,9 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
> >          return PTR_ERR(data);
> >      }
> >  
> > +    if ( data->rc == -EAGAIN ) /* schedule_work has been called. */
> > +        goto out;
> 
> And nothing else can set data->rc to this value, now or in the

Correct.

> future? Because if that's possible, you'd deny any further actions
> on that payload.

Right. If the code does change and some of the underlaying code changes
it to -EAGAIN we are in trouble.

Or rather, we can do something different (like the earlier patch that Andrew
reviewed).

> 
> > @@ -1423,6 +1426,7 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
> >          break;
> >      }
> >  
> > + out:
> >      spin_unlock(&payload_lock);
> >  
> >      return rc;
> 
> Taking both together, and looking at patch 2 of the original series, I'm
> getting the impression you return success in that case rather than e.g.
> -EBUSY or indeed -EAGAIN.

Correct.

That way xen-xsplice will continue on trying to come back and doing its operation
without exiting out.

But the more I think about it the more that sounds silly. If we have
an situation where the user requests a different state and we hadn't finished
with the one in progress - we shouldn't loop around. We should error out
and tell the admin. (The same way we error out if there is a timeout on the
patching or we can't get hold of the lock).

So based on your input I think (in the context of this patch), it ought to be:

if ( data->rc == -EAGAIN )
{
	rc = -EBUSY;
	goto out;
}


> 
> Jan
>
Jan Beulich April 29, 2016, 8:02 a.m. UTC | #4
>>> On 29.04.16 at 09:49, <konrad.wilk@oracle.com> wrote:
> On Fri, Apr 29, 2016 at 01:35:08AM -0600, Jan Beulich wrote:
>> >>> On 29.04.16 at 03:52, <konrad@kernel.org> wrote:
>> > --- a/xen/common/xsplice.c
>> > +++ b/xen/common/xsplice.c
>> > @@ -1363,6 +1363,9 @@ static int xsplice_action(xen_sysctl_xsplice_action_t 
> *action)
>> >          return PTR_ERR(data);
>> >      }
>> >  
>> > +    if ( data->rc == -EAGAIN ) /* schedule_work has been called. */
>> > +        goto out;
>> 
>> And nothing else can set data->rc to this value, now or in the
> 
> Correct.
> 
>> future? Because if that's possible, you'd deny any further actions
>> on that payload.
> 
> Right. If the code does change and some of the underlaying code changes
> it to -EAGAIN we are in trouble.
> 
> Or rather, we can do something different (like the earlier patch that Andrew
> reviewed).

Yes, I think I'd prefer that one to be used.

Jan
diff mbox

Patch

diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 1b67d39..0bc7e0f 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -1363,6 +1363,9 @@  static int xsplice_action(xen_sysctl_xsplice_action_t *action)
         return PTR_ERR(data);
     }
 
+    if ( data->rc == -EAGAIN ) /* schedule_work has been called. */
+        goto out;
+
     switch ( action->cmd )
     {
     case XSPLICE_ACTION_UNLOAD:
@@ -1423,6 +1426,7 @@  static int xsplice_action(xen_sysctl_xsplice_action_t *action)
         break;
     }
 
+ out:
     spin_unlock(&payload_lock);
 
     return rc;