diff mbox series

[for-6.1?,v2,4/7] jobs: Give Job.force_cancel more meaning

Message ID 20210726144613.954844-5-mreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series mirror: Handle errors after READY cancel | expand

Commit Message

Max Reitz July 26, 2021, 2:46 p.m. UTC
We largely have two cancel modes for jobs:

First, there is actual cancelling.  The job is terminated as soon as
possible, without trying to reach a consistent result.

Second, we have mirror in the READY state.  Technically, the job is not
really cancelled, but it just is a different completion mode.  The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.

We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled).  We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.

So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).

To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/qemu/job.h | 11 ++++++++++-
 block/backup.c     |  3 ++-
 block/mirror.c     | 24 ++++++++++++++++++------
 job.c              |  6 +++++-
 4 files changed, 35 insertions(+), 9 deletions(-)

Comments

Eric Blake July 26, 2021, 9:38 p.m. UTC | #1
On Mon, Jul 26, 2021 at 04:46:10PM +0200, Max Reitz wrote:
> We largely have two cancel modes for jobs:
> 
> First, there is actual cancelling.  The job is terminated as soon as
> possible, without trying to reach a consistent result.
> 
> Second, we have mirror in the READY state.  Technically, the job is not
> really cancelled, but it just is a different completion mode.  The job
> can still run for an indefinite amount of time while it tries to reach a
> consistent result.
> 
> We want to be able to clearly distinguish which cancel mode a job is in
> (when it has been cancelled).  We can use Job.force_cancel for this, but
> right now it only reflects cancel requests from the user with
> force=true, but clearly, jobs that do not even distinguish between
> force=false and force=true are effectively always force-cancelled.
> 
> So this patch has Job.force_cancel signify whether the job will
> terminate as soon as possible (force_cancel=true) or whether it will
> effectively remain running despite being "cancelled"
> (force_cancel=false).
> 
> To this end, we let jobs that provide JobDriver.cancel() tell the
> generic job code whether they will terminate as soon as possible or not,
> and for jobs that do not provide that method we assume they will.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  include/qemu/job.h | 11 ++++++++++-
>  block/backup.c     |  3 ++-
>  block/mirror.c     | 24 ++++++++++++++++++------
>  job.c              |  6 +++++-
>  4 files changed, 35 insertions(+), 9 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy July 27, 2021, 12:40 p.m. UTC | #2
26.07.2021 17:46, Max Reitz wrote:
> We largely have two cancel modes for jobs:
> 
> First, there is actual cancelling.  The job is terminated as soon as
> possible, without trying to reach a consistent result.
> 
> Second, we have mirror in the READY state.  Technically, the job is not
> really cancelled, but it just is a different completion mode.  The job
> can still run for an indefinite amount of time while it tries to reach a
> consistent result.
> 
> We want to be able to clearly distinguish which cancel mode a job is in
> (when it has been cancelled).  We can use Job.force_cancel for this, but
> right now it only reflects cancel requests from the user with
> force=true, but clearly, jobs that do not even distinguish between
> force=false and force=true are effectively always force-cancelled.
> 
> So this patch has Job.force_cancel signify whether the job will
> terminate as soon as possible (force_cancel=true) or whether it will
> effectively remain running despite being "cancelled"
> (force_cancel=false).
> 
> To this end, we let jobs that provide JobDriver.cancel() tell the
> generic job code whether they will terminate as soon as possible or not,
> and for jobs that do not provide that method we assume they will.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   include/qemu/job.h | 11 ++++++++++-
>   block/backup.c     |  3 ++-
>   block/mirror.c     | 24 ++++++++++++++++++------
>   job.c              |  6 +++++-
>   4 files changed, 35 insertions(+), 9 deletions(-)
> 
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 5e8edbc2c8..8aa90f7395 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -253,8 +253,17 @@ struct JobDriver {
>   
>       /**
>        * If the callback is not NULL, it will be invoked in job_cancel_async
> +     *
> +     * This function must return true if the job will be cancelled
> +     * immediately without any further I/O (mandatory if @force is
> +     * true), and false otherwise.  This lets the generic job layer
> +     * know whether a job has been truly (force-)cancelled, or whether
> +     * it is just in a special completion mode (like mirror after
> +     * READY).
> +     * (If the callback is NULL, the job is assumed to terminate
> +     * without I/O.)
>        */
> -    void (*cancel)(Job *job, bool force);
> +    bool (*cancel)(Job *job, bool force);
>   
>   
>       /** Called when the job is freed */
> diff --git a/block/backup.c b/block/backup.c
> index bd3614ce70..513e1c8a0b 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -331,11 +331,12 @@ static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
>       }
>   }
>   
> -static void backup_cancel(Job *job, bool force)
> +static bool backup_cancel(Job *job, bool force)
>   {
>       BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
>   
>       bdrv_cancel_in_flight(s->target_bs);
> +    return true;
>   }
>   
>   static const BlockJobDriver backup_job_driver = {
> diff --git a/block/mirror.c b/block/mirror.c
> index fcb7b65f93..e93631a9f6 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1087,9 +1087,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>           trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
>                                     delay_ns);
>           job_sleep_ns(&s->common.job, delay_ns);
> -        if (job_is_cancelled(&s->common.job) &&
> -            (!job_is_ready(&s->common.job) || s->common.job.force_cancel))
> -        {
> +        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {

Seems, it could it be reduced to

if (s->common.job.force_cancel) {


>               break;
>           }
>           s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> @@ -1102,7 +1100,7 @@ immediate_exit:
>            * the target is a copy of the source.
>            */
>           assert(ret < 0 ||
> -               ((s->common.job.force_cancel || !job_is_ready(&s->common.job)) &&
> +               (s->common.job.force_cancel &&

and here

>                   job_is_cancelled(&s->common.job)));
>           assert(need_drain);
>           mirror_wait_for_all_io(s);
> @@ -1188,14 +1186,27 @@ static bool mirror_drained_poll(BlockJob *job)
>       return !!s->in_flight;
>   }
>   

anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
diff mbox series

Patch

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 5e8edbc2c8..8aa90f7395 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -253,8 +253,17 @@  struct JobDriver {
 
     /**
      * If the callback is not NULL, it will be invoked in job_cancel_async
+     *
+     * This function must return true if the job will be cancelled
+     * immediately without any further I/O (mandatory if @force is
+     * true), and false otherwise.  This lets the generic job layer
+     * know whether a job has been truly (force-)cancelled, or whether
+     * it is just in a special completion mode (like mirror after
+     * READY).
+     * (If the callback is NULL, the job is assumed to terminate
+     * without I/O.)
      */
-    void (*cancel)(Job *job, bool force);
+    bool (*cancel)(Job *job, bool force);
 
 
     /** Called when the job is freed */
diff --git a/block/backup.c b/block/backup.c
index bd3614ce70..513e1c8a0b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -331,11 +331,12 @@  static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
     }
 }
 
-static void backup_cancel(Job *job, bool force)
+static bool backup_cancel(Job *job, bool force)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
 
     bdrv_cancel_in_flight(s->target_bs);
+    return true;
 }
 
 static const BlockJobDriver backup_job_driver = {
diff --git a/block/mirror.c b/block/mirror.c
index fcb7b65f93..e93631a9f6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1087,9 +1087,7 @@  static int coroutine_fn mirror_run(Job *job, Error **errp)
         trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
                                   delay_ns);
         job_sleep_ns(&s->common.job, delay_ns);
-        if (job_is_cancelled(&s->common.job) &&
-            (!job_is_ready(&s->common.job) || s->common.job.force_cancel))
-        {
+        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {
             break;
         }
         s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1102,7 +1100,7 @@  immediate_exit:
          * the target is a copy of the source.
          */
         assert(ret < 0 ||
-               ((s->common.job.force_cancel || !job_is_ready(&s->common.job)) &&
+               (s->common.job.force_cancel &&
                 job_is_cancelled(&s->common.job)));
         assert(need_drain);
         mirror_wait_for_all_io(s);
@@ -1188,14 +1186,27 @@  static bool mirror_drained_poll(BlockJob *job)
     return !!s->in_flight;
 }
 
-static void mirror_cancel(Job *job, bool force)
+static bool mirror_cancel(Job *job, bool force)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
     BlockDriverState *target = blk_bs(s->target);
 
-    if (force || !job_is_ready(job)) {
+    /*
+     * Before the job is READY, we treat any cancellation like a
+     * force-cancellation.
+     */
+    force = force || !job_is_ready(job);
+
+    if (force) {
         bdrv_cancel_in_flight(target);
     }
+    return force;
+}
+
+static bool commit_active_cancel(Job *job, bool force)
+{
+    /* Same as above in mirror_cancel() */
+    return force || !job_is_ready(job);
 }
 
 static const BlockJobDriver mirror_job_driver = {
@@ -1225,6 +1236,7 @@  static const BlockJobDriver commit_active_job_driver = {
         .abort                  = mirror_abort,
         .pause                  = mirror_pause,
         .complete               = mirror_complete,
+        .cancel                 = commit_active_cancel,
     },
     .drained_poll           = mirror_drained_poll,
 };
diff --git a/job.c b/job.c
index 9e971d64cf..e78d893a9c 100644
--- a/job.c
+++ b/job.c
@@ -719,8 +719,12 @@  static int job_finalize_single(Job *job)
 static void job_cancel_async(Job *job, bool force)
 {
     if (job->driver->cancel) {
-        job->driver->cancel(job, force);
+        force = job->driver->cancel(job, force);
+    } else {
+        /* No .cancel() means the job will behave as if force-cancelled */
+        force = true;
     }
+
     if (job->user_paused) {
         /* Do not call job_enter here, the caller will handle it.  */
         if (job->driver->user_resume) {