[v10,11/21] jobs: group together API calls under the same job lock

Message ID	20220725073855.76049-12-eesposit@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org> From: Emanuele Giuseppe Esposito <eesposit@redhat.com> To: qemu-block@nongnu.org Cc: Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, Wen Congyang <wencongyang2@huawei.com>, Xie Changlong <xiechanglong.d@gmail.com>, Markus Armbruster <armbru@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, qemu-devel@nongnu.org, Emanuele Giuseppe Esposito <eesposit@redhat.com> Subject: [PATCH v10 11/21] jobs: group together API calls under the same job lock Date: Mon, 25 Jul 2022 03:38:45 -0400 Message-Id: <20220725073855.76049-12-eesposit@redhat.com> In-Reply-To: <20220725073855.76049-1-eesposit@redhat.com> References: <20220725073855.76049-1-eesposit@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
Series	job: replace AioContext lock with job_mutex \| expand [v10,00/21] job: replace AioContext lock with job_mutex [v10,01/21] job.c: make job_mutex and job_lock/unlock() public [v10,02/21] job.h: categorize fields in struct Job [v10,03/21] job.c: API functions not used outside should be static [v10,04/21] aio-wait.h: introduce AIO_WAIT_WHILE_UNLOCKED [v10,05/21] job.c: add job_lock/unlock while keeping job.h intact [v10,06/21] job: move and update comments from blockjob.c [v10,07/21] blockjob: introduce block_job _locked() APIs [v10,08/21] jobs: add job lock in find_* functions [v10,09/21] jobs: use job locks also in the unit tests [v10,10/21] block/mirror.c: use of job helpers in drivers to avoid TOC/TOU [v10,11/21] jobs: group together API calls under the same job lock [v10,12/21] commit and mirror: create new nodes using bdrv_get_aio_context, and not the job aiocont… [v10,13/21] job: detect change of aiocontext within job coroutine [v10,14/21] jobs: protect job.aio_context with BQL and job_mutex [v10,15/21] blockjob.h: categorize fields in struct BlockJob [v10,16/21] blockjob: rename notifier callbacks as _locked [v10,17/21] blockjob: protect iostatus field in BlockJob struct [v10,18/21] job.c: enable job lock/unlock and remove Aiocontext locks [v10,19/21] block_job_query: remove atomic read [v10,20/21] blockjob: remove unused functions [v10,21/21] job: remove unused functions

Emanuele Giuseppe Esposito July 25, 2022, 7:38 a.m. UTC

Now that the API offers also _locked() functions, take advantage
of it and give also the caller control to take the lock and call
_locked functions.

This makes sense especially when we have for loops, because it
makes no sense to have:

for(job = job_next(); ...)

where each job_next() takes the lock internally.
Instead we want

JOB_LOCK_GUARD();
for(job = job_next_locked(); ...)

In addition, protect also direct field accesses, by either creating a
new critical section or widening the existing ones.

Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block.c            | 17 ++++++++++-------
 blockdev.c         | 12 +++++++++---
 blockjob.c         | 35 ++++++++++++++++++++++-------------
 job-qmp.c          |  4 +++-
 job.c              |  7 +++++--
 monitor/qmp-cmds.c |  7 +++++--
 qemu-img.c         | 37 +++++++++++++++++++++----------------
 7 files changed, 75 insertions(+), 44 deletions(-)

Vladimir Sementsov-Ogievskiy July 27, 2022, 2:50 p.m. UTC | #1

On 7/25/22 10:38, Emanuele Giuseppe Esposito wrote:
> Now that the API offers also _locked() functions, take advantage
> of it and give also the caller control to take the lock and call
> _locked functions.
> 
> This makes sense especially when we have for loops, because it
> makes no sense to have:
> 
> for(job = job_next(); ...)
> 
> where each job_next() takes the lock internally.
> Instead we want
> 
> JOB_LOCK_GUARD();
> for(job = job_next_locked(); ...)
> 
> In addition, protect also direct field accesses, by either creating a
> new critical section or widening the existing ones.
> 
> Note: at this stage, job_{lock/unlock} and job lock guard macros
> are *nop*.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block.c            | 17 ++++++++++-------
>   blockdev.c         | 12 +++++++++---
>   blockjob.c         | 35 ++++++++++++++++++++++-------------
>   job-qmp.c          |  4 +++-
>   job.c              |  7 +++++--
>   monitor/qmp-cmds.c |  7 +++++--
>   qemu-img.c         | 37 +++++++++++++++++++++----------------
>   7 files changed, 75 insertions(+), 44 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 2c00dddd80..7559965dbc 100644
> --- a/block.c
> +++ b/block.c
> @@ -4978,8 +4978,8 @@ static void bdrv_close(BlockDriverState *bs)
>   
>   void bdrv_close_all(void)
>   {
> -    assert(job_next(NULL) == NULL);
>       GLOBAL_STATE_CODE();
> +    assert(job_next(NULL) == NULL);
>   
>       /* Drop references from requests still in flight, such as canceled block
>        * jobs whose AIO context has not been polled yet */
> @@ -6165,13 +6165,16 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
>           }
>       }
>   
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> -        GSList *el;
> +    WITH_JOB_LOCK_GUARD() {
> +        for (job = block_job_next_locked(NULL); job;
> +             job = block_job_next_locked(job)) {
> +            GSList *el;
>   
> -        xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> -                           job->job.id);
> -        for (el = job->nodes; el; el = el->next) {
> -            xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> +            xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> +                                job->job.id);
> +            for (el = job->nodes; el; el = el->next) {
> +                xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> +            }
>           }
>       }
>   
> diff --git a/blockdev.c b/blockdev.c
> index 71f793c4ab..5b79093155 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -150,12 +150,15 @@ void blockdev_mark_auto_del(BlockBackend *blk)
>           return;
>       }
>   
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = block_job_next_locked(NULL); job;
> +         job = block_job_next_locked(job)) {
>           if (block_job_has_bdrv(job, blk_bs(blk))) {
>               AioContext *aio_context = job->job.aio_context;
>               aio_context_acquire(aio_context);
>   
> -            job_cancel(&job->job, false);
> +            job_cancel_locked(&job->job, false);
>   
>               aio_context_release(aio_context);
>           }
> @@ -3745,7 +3748,10 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
>       BlockJobInfoList *head = NULL, **tail = &head;
>       BlockJob *job;
>   
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = block_job_next_locked(NULL); job;
> +         job = block_job_next_locked(job)) {
>           BlockJobInfo *value;
>           AioContext *aio_context;

below block_job_query should be changed to block_job_query_locked()

>   
> diff --git a/blockjob.c b/blockjob.c
> index 0d59aba439..96fb9d9f73 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -111,8 +111,10 @@ static bool child_job_drained_poll(BdrvChild *c)
>       /* An inactive or completed job doesn't have any pending requests. Jobs
>        * with !job->busy are either already paused or have a pause point after
>        * being reentered, so no job driver code will run before they pause. */
> -    if (!job->busy || job_is_completed(job)) {
> -        return false;
> +    WITH_JOB_LOCK_GUARD() {
> +        if (!job->busy || job_is_completed_locked(job)) {
> +            return false;
> +        }
>       }
>   
>       /* Otherwise, assume that it isn't fully stopped yet, but allow the job to
> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>       job->ready_notifier.notify = block_job_event_ready;
>       job->idle_notifier.notify = block_job_on_idle;
>   
> -    notifier_list_add(&job->job.on_finalize_cancelled,
> -                      &job->finalize_cancelled_notifier);
> -    notifier_list_add(&job->job.on_finalize_completed,
> -                      &job->finalize_completed_notifier);
> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> +    WITH_JOB_LOCK_GUARD() {
> +        notifier_list_add(&job->job.on_finalize_cancelled,
> +                          &job->finalize_cancelled_notifier);
> +        notifier_list_add(&job->job.on_finalize_completed,
> +                          &job->finalize_completed_notifier);
> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> +    }
>   
>       error_setg(&job->blocker, "block device is in use by block job: %s",
>                  job_type_str(&job->job));
> @@ -558,10 +562,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
>                                           action);
>       }
>       if (action == BLOCK_ERROR_ACTION_STOP) {
> -        if (!job->job.user_paused) {
> -            job_pause(&job->job);
> -            /* make the pause user visible, which will be resumed from QMP. */
> -            job->job.user_paused = true;
> +        WITH_JOB_LOCK_GUARD() {
> +            if (!job->job.user_paused) {
> +                job_pause_locked(&job->job);
> +                /*
> +                 * make the pause user visible, which will be
> +                 * resumed from QMP.
> +                 */
> +                job->job.user_paused = true;
> +            }
>           }
>           block_job_iostatus_set_err(job, error);
>       }
> diff --git a/job-qmp.c b/job-qmp.c
> index ac11a6c23c..cfaf34ffb7 100644
> --- a/job-qmp.c
> +++ b/job-qmp.c
> @@ -194,7 +194,9 @@ JobInfoList *qmp_query_jobs(Error **errp)
>       JobInfoList *head = NULL, **tail = &head;
>       Job *job;
>   
> -    for (job = job_next(NULL); job; job = job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = job_next_locked(NULL); job; job = job_next_locked(job)) {
>           JobInfo *value;
>           AioContext *aio_context;
>   
> diff --git a/job.c b/job.c
> index ebaa4e585b..b0729e2bb2 100644
> --- a/job.c
> +++ b/job.c

Two hunks of job.c should be here, they are actually from patch 05.

> @@ -668,7 +668,7 @@ void coroutine_fn job_pause_point(Job *job)
>       job_pause_point_locked(job);
>   }
>   
> -void job_yield_locked(Job *job)
> +static void job_yield_locked(Job *job)
>   {
>       assert(job->busy);
>   
> @@ -1041,11 +1041,14 @@ static void job_completed_txn_abort_locked(Job *job)
>   /* Called with job_mutex held, but releases it temporarily */
>   static int job_prepare_locked(Job *job)
>   {
> +    int ret;
> +
>       GLOBAL_STATE_CODE();
>       if (job->ret == 0 && job->driver->prepare) {
>           job_unlock();
> -        job->ret = job->driver->prepare(job);
> +        ret = job->driver->prepare(job);
>           job_lock();
> +        job->ret = ret;
>           job_update_rc_locked(job);
>       }
>       return job->ret;
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index 1ebb89f46c..1897ed7a13 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -133,8 +133,11 @@ void qmp_cont(Error **errp)
>           blk_iostatus_reset(blk);
>       }
>   
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> -        block_job_iostatus_reset(job);
> +    WITH_JOB_LOCK_GUARD() {
> +        for (job = block_job_next_locked(NULL); job;
> +             job = block_job_next_locked(job)) {
> +            block_job_iostatus_reset_locked(job);
> +        }
>       }
>   
>       /* Continuing after completed migration. Images have been inactivated to
> diff --git a/qemu-img.c b/qemu-img.c
> index 4cf4d2423d..98c7662b0f 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -912,25 +912,30 @@ static void run_block_job(BlockJob *job, Error **errp)
>       int ret = 0;
>   
>       aio_context_acquire(aio_context);
> -    job_ref(&job->job);
> -    do {
> -        float progress = 0.0f;
> -        aio_poll(aio_context, true);
> +    WITH_JOB_LOCK_GUARD() {
> +        job_ref_locked(&job->job);
> +        do {
> +            float progress = 0.0f;
> +            job_unlock();
> +            aio_poll(aio_context, true);
> +
> +            progress_get_snapshot(&job->job.progress, &progress_current,
> +                                &progress_total);

indent broken

> +            if (progress_total) {
> +                progress = (float)progress_current / progress_total * 100.f;
> +            }
> +            qemu_progress_print(progress, 0);
> +            job_lock();
> +        } while (!job_is_ready_locked(&job->job) &&
> +                 !job_is_completed_locked(&job->job));
>   
> -        progress_get_snapshot(&job->job.progress, &progress_current,
> -                              &progress_total);
> -        if (progress_total) {
> -            progress = (float)progress_current / progress_total * 100.f;
> +        if (!job_is_completed_locked(&job->job)) {
> +            ret = job_complete_sync_locked(&job->job, errp);
> +        } else {
> +            ret = job->job.ret;
>           }
> -        qemu_progress_print(progress, 0);
> -    } while (!job_is_ready(&job->job) && !job_is_completed(&job->job));
> -
> -    if (!job_is_completed(&job->job)) {
> -        ret = job_complete_sync(&job->job, errp);
> -    } else {
> -        ret = job->job.ret;
> +        job_unref_locked(&job->job);
>       }
> -    job_unref(&job->job);
>       aio_context_release(aio_context);
>   
>       /* publish completion progress only when success */

With my notes fixed:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Kevin Wolf Aug. 4, 2022, 5:10 p.m. UTC | #2

Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
> Now that the API offers also _locked() functions, take advantage
> of it and give also the caller control to take the lock and call
> _locked functions.
> 
> This makes sense especially when we have for loops, because it
> makes no sense to have:
> 
> for(job = job_next(); ...)
> 
> where each job_next() takes the lock internally.
> Instead we want
> 
> JOB_LOCK_GUARD();
> for(job = job_next_locked(); ...)
> 
> In addition, protect also direct field accesses, by either creating a
> new critical section or widening the existing ones.

"In addition" sounds like it should be a separate patch. I was indeed
surprised when after a few for loops where you just pulled the existing
locking up a bit, I saw some hunks that add completely new locking.

> Note: at this stage, job_{lock/unlock} and job lock guard macros
> are *nop*.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  block.c            | 17 ++++++++++-------
>  blockdev.c         | 12 +++++++++---
>  blockjob.c         | 35 ++++++++++++++++++++++-------------
>  job-qmp.c          |  4 +++-
>  job.c              |  7 +++++--
>  monitor/qmp-cmds.c |  7 +++++--
>  qemu-img.c         | 37 +++++++++++++++++++++----------------
>  7 files changed, 75 insertions(+), 44 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 2c00dddd80..7559965dbc 100644
> --- a/block.c
> +++ b/block.c
> @@ -4978,8 +4978,8 @@ static void bdrv_close(BlockDriverState *bs)
>  
>  void bdrv_close_all(void)
>  {
> -    assert(job_next(NULL) == NULL);
>      GLOBAL_STATE_CODE();
> +    assert(job_next(NULL) == NULL);
>  
>      /* Drop references from requests still in flight, such as canceled block
>       * jobs whose AIO context has not been polled yet */
> @@ -6165,13 +6165,16 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
>          }
>      }
>  
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> -        GSList *el;
> +    WITH_JOB_LOCK_GUARD() {
> +        for (job = block_job_next_locked(NULL); job;
> +             job = block_job_next_locked(job)) {
> +            GSList *el;
>  
> -        xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> -                           job->job.id);
> -        for (el = job->nodes; el; el = el->next) {
> -            xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> +            xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> +                                job->job.id);
> +            for (el = job->nodes; el; el = el->next) {
> +                xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> +            }
>          }
>      }
>  
> diff --git a/blockdev.c b/blockdev.c
> index 71f793c4ab..5b79093155 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -150,12 +150,15 @@ void blockdev_mark_auto_del(BlockBackend *blk)
>          return;
>      }
>  
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = block_job_next_locked(NULL); job;
> +         job = block_job_next_locked(job)) {
>          if (block_job_has_bdrv(job, blk_bs(blk))) {

Should this be renamed to block_job_has_bdrv_locked() now?

It looks to me like it does need the locking. (Which actually makes
this patch a fix and not just an optimisation as the commit message
suggests.)

>              AioContext *aio_context = job->job.aio_context;
>              aio_context_acquire(aio_context);
>  
> -            job_cancel(&job->job, false);
> +            job_cancel_locked(&job->job, false);
>  
>              aio_context_release(aio_context);
>          }
> @@ -3745,7 +3748,10 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
>      BlockJobInfoList *head = NULL, **tail = &head;
>      BlockJob *job;
>  
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = block_job_next_locked(NULL); job;
> +         job = block_job_next_locked(job)) {
>          BlockJobInfo *value;
>          AioContext *aio_context;

More context:

        BlockJobInfo *value;
        AioContext *aio_context;

        if (block_job_is_internal(job)) {
            continue;
        }
        aio_context = block_job_get_aio_context(job);
        aio_context_acquire(aio_context);
        value = block_job_query(job, errp);
        aio_context_release(aio_context);

This should become block_job_query_locked(). (You do that in patch 18,
but it looks a bit out of place there - which is precisely because it
really belongs in this one.)

> diff --git a/blockjob.c b/blockjob.c
> index 0d59aba439..96fb9d9f73 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -111,8 +111,10 @@ static bool child_job_drained_poll(BdrvChild *c)
>      /* An inactive or completed job doesn't have any pending requests. Jobs
>       * with !job->busy are either already paused or have a pause point after
>       * being reentered, so no job driver code will run before they pause. */
> -    if (!job->busy || job_is_completed(job)) {
> -        return false;
> +    WITH_JOB_LOCK_GUARD() {
> +        if (!job->busy || job_is_completed_locked(job)) {
> +            return false;
> +        }
>      }
>  
>      /* Otherwise, assume that it isn't fully stopped yet, but allow the job to

Assuming that the job status can actually change, don't we need the
locking for the rest of the function, too? Otherwise we might call
drv->drained_poll() for a job that has already paused or completed.

Of course, this goes against the assumption that all callbacks are
called without holding the job lock. Maybe it's not a good assumption.

> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>      job->ready_notifier.notify = block_job_event_ready;
>      job->idle_notifier.notify = block_job_on_idle;
>  
> -    notifier_list_add(&job->job.on_finalize_cancelled,
> -                      &job->finalize_cancelled_notifier);
> -    notifier_list_add(&job->job.on_finalize_completed,
> -                      &job->finalize_completed_notifier);
> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> +    WITH_JOB_LOCK_GUARD() {
> +        notifier_list_add(&job->job.on_finalize_cancelled,
> +                          &job->finalize_cancelled_notifier);
> +        notifier_list_add(&job->job.on_finalize_completed,
> +                          &job->finalize_completed_notifier);
> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> +    }
>  
>      error_setg(&job->blocker, "block device is in use by block job: %s",
>                 job_type_str(&job->job));

Why is this the right scope for the lock? It looks very arbitrary to
lock only here, but not for the assignments above or the function calls
below.

Given that job_create() already puts the job in the job_list so it
becomes visible for other code, should we not keep the job lock from the
moment that we create the job until it is fully initialised?

> @@ -558,10 +562,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
>                                          action);
>      }
>      if (action == BLOCK_ERROR_ACTION_STOP) {
> -        if (!job->job.user_paused) {
> -            job_pause(&job->job);
> -            /* make the pause user visible, which will be resumed from QMP. */
> -            job->job.user_paused = true;
> +        WITH_JOB_LOCK_GUARD() {
> +            if (!job->job.user_paused) {
> +                job_pause_locked(&job->job);
> +                /*
> +                 * make the pause user visible, which will be
> +                 * resumed from QMP.
> +                 */
> +                job->job.user_paused = true;
> +            }
>          }
>          block_job_iostatus_set_err(job, error);

Why is this call not in the critical section? It accesses job->iostatus.

>      }
> diff --git a/job-qmp.c b/job-qmp.c
> index ac11a6c23c..cfaf34ffb7 100644
> --- a/job-qmp.c
> +++ b/job-qmp.c
> @@ -194,7 +194,9 @@ JobInfoList *qmp_query_jobs(Error **errp)
>      JobInfoList *head = NULL, **tail = &head;
>      Job *job;
>  
> -    for (job = job_next(NULL); job; job = job_next(job)) {
> +    JOB_LOCK_GUARD();
> +
> +    for (job = job_next_locked(NULL); job; job = job_next_locked(job)) {
>          JobInfo *value;
>          AioContext *aio_context;

Should job_query_single() be renamed to job_query_single_locked()?

> diff --git a/job.c b/job.c
> index ebaa4e585b..b0729e2bb2 100644
> --- a/job.c
> +++ b/job.c
> @@ -668,7 +668,7 @@ void coroutine_fn job_pause_point(Job *job)
>      job_pause_point_locked(job);
>  }
>  
> -void job_yield_locked(Job *job)
> +static void job_yield_locked(Job *job)
>  {
>      assert(job->busy);

It was already unused outside of job.c before this patch. Should it have
been static from the start?

> @@ -1041,11 +1041,14 @@ static void job_completed_txn_abort_locked(Job *job)
>  /* Called with job_mutex held, but releases it temporarily */
>  static int job_prepare_locked(Job *job)
>  {
> +    int ret;
> +
>      GLOBAL_STATE_CODE();
>      if (job->ret == 0 && job->driver->prepare) {
>          job_unlock();
> -        job->ret = job->driver->prepare(job);
> +        ret = job->driver->prepare(job);
>          job_lock();
> +        job->ret = ret;
>          job_update_rc_locked(job);
>      }
>      return job->ret;
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index 1ebb89f46c..1897ed7a13 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -133,8 +133,11 @@ void qmp_cont(Error **errp)
>          blk_iostatus_reset(blk);
>      }
>  
> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> -        block_job_iostatus_reset(job);
> +    WITH_JOB_LOCK_GUARD() {
> +        for (job = block_job_next_locked(NULL); job;
> +             job = block_job_next_locked(job)) {
> +            block_job_iostatus_reset_locked(job);
> +        }
>      }
>  
>      /* Continuing after completed migration. Images have been inactivated to
> diff --git a/qemu-img.c b/qemu-img.c
> index 4cf4d2423d..98c7662b0f 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -912,25 +912,30 @@ static void run_block_job(BlockJob *job, Error **errp)
>      int ret = 0;
>  
>      aio_context_acquire(aio_context);
> -    job_ref(&job->job);
> -    do {
> -        float progress = 0.0f;
> -        aio_poll(aio_context, true);
> +    WITH_JOB_LOCK_GUARD() {
> +        job_ref_locked(&job->job);
> +        do {
> +            float progress = 0.0f;
> +            job_unlock();

This might be more a question of style, but mixing WITH_JOB_LOCK_GUARD()
with manual job_unlock()/job_lock() feels dangerous. What if someone
added a break between the unlock/lock pair? The lock guard would try to
unlock a mutex that is already unlocked, which probably means an
assertion failure.

I feel we should just use manual job_lock()/job_unlock() for everything
in this function.

> +            aio_poll(aio_context, true);
> +
> +            progress_get_snapshot(&job->job.progress, &progress_current,
> +                                &progress_total);
> +            if (progress_total) {
> +                progress = (float)progress_current / progress_total * 100.f;
> +            }
> +            qemu_progress_print(progress, 0);
> +            job_lock();
> +        } while (!job_is_ready_locked(&job->job) &&
> +                 !job_is_completed_locked(&job->job));
>  
> -        progress_get_snapshot(&job->job.progress, &progress_current,
> -                              &progress_total);
> -        if (progress_total) {
> -            progress = (float)progress_current / progress_total * 100.f;
> +        if (!job_is_completed_locked(&job->job)) {
> +            ret = job_complete_sync_locked(&job->job, errp);
> +        } else {
> +            ret = job->job.ret;
>          }
> -        qemu_progress_print(progress, 0);
> -    } while (!job_is_ready(&job->job) && !job_is_completed(&job->job));
> -
> -    if (!job_is_completed(&job->job)) {
> -        ret = job_complete_sync(&job->job, errp);
> -    } else {
> -        ret = job->job.ret;
> +        job_unref_locked(&job->job);
>      }
> -    job_unref(&job->job);
>      aio_context_release(aio_context);
>  
>      /* publish completion progress only when success */

Kevin

Emanuele Giuseppe Esposito Aug. 16, 2022, 2:54 p.m. UTC | #3

Am 04/08/2022 um 19:10 schrieb Kevin Wolf:
> Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
>> Now that the API offers also _locked() functions, take advantage
>> of it and give also the caller control to take the lock and call
>> _locked functions.
>>
>> This makes sense especially when we have for loops, because it
>> makes no sense to have:
>>
>> for(job = job_next(); ...)
>>
>> where each job_next() takes the lock internally.
>> Instead we want
>>
>> JOB_LOCK_GUARD();
>> for(job = job_next_locked(); ...)
>>
>> In addition, protect also direct field accesses, by either creating a
>> new critical section or widening the existing ones.
> 
> "In addition" sounds like it should be a separate patch. I was indeed
> surprised when after a few for loops where you just pulled the existing
> locking up a bit, I saw some hunks that add completely new locking.

Would it be okay if we don't split it in two? There would be two
microscopical patches.

> 
>> Note: at this stage, job_{lock/unlock} and job lock guard macros
>> are *nop*.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>  block.c            | 17 ++++++++++-------
>>  blockdev.c         | 12 +++++++++---
>>  blockjob.c         | 35 ++++++++++++++++++++++-------------
>>  job-qmp.c          |  4 +++-
>>  job.c              |  7 +++++--
>>  monitor/qmp-cmds.c |  7 +++++--
>>  qemu-img.c         | 37 +++++++++++++++++++++----------------
>>  7 files changed, 75 insertions(+), 44 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 2c00dddd80..7559965dbc 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4978,8 +4978,8 @@ static void bdrv_close(BlockDriverState *bs)
>>  
>>  void bdrv_close_all(void)
>>  {
>> -    assert(job_next(NULL) == NULL);
>>      GLOBAL_STATE_CODE();
>> +    assert(job_next(NULL) == NULL);
>>  
>>      /* Drop references from requests still in flight, such as canceled block
>>       * jobs whose AIO context has not been polled yet */
>> @@ -6165,13 +6165,16 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
>>          }
>>      }
>>  
>> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> -        GSList *el;
>> +    WITH_JOB_LOCK_GUARD() {
>> +        for (job = block_job_next_locked(NULL); job;
>> +             job = block_job_next_locked(job)) {
>> +            GSList *el;
>>  
>> -        xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
>> -                           job->job.id);
>> -        for (el = job->nodes; el; el = el->next) {
>> -            xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
>> +            xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
>> +                                job->job.id);
>> +            for (el = job->nodes; el; el = el->next) {
>> +                xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
>> +            }
>>          }
>>      }
>>  
>> diff --git a/blockdev.c b/blockdev.c
>> index 71f793c4ab..5b79093155 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -150,12 +150,15 @@ void blockdev_mark_auto_del(BlockBackend *blk)
>>          return;
>>      }
>>  
>> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> +    JOB_LOCK_GUARD();
>> +
>> +    for (job = block_job_next_locked(NULL); job;
>> +         job = block_job_next_locked(job)) {
>>          if (block_job_has_bdrv(job, blk_bs(blk))) {
> 
> Should this be renamed to block_job_has_bdrv_locked() now?
> 
> It looks to me like it does need the locking. (Which actually makes
> this patch a fix and not just an optimisation as the commit message
> suggests.)

Nope, as GSList *nodes; is always read and written under BQL.

> 
>>              AioContext *aio_context = job->job.aio_context;
>>              aio_context_acquire(aio_context);
>>  
>> -            job_cancel(&job->job, false);
>> +            job_cancel_locked(&job->job, false);
>>  
>>              aio_context_release(aio_context);
>>          }
>> @@ -3745,7 +3748,10 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
>>      BlockJobInfoList *head = NULL, **tail = &head;
>>      BlockJob *job;
>>  
>> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> +    JOB_LOCK_GUARD();
>> +
>> +    for (job = block_job_next_locked(NULL); job;
>> +         job = block_job_next_locked(job)) {
>>          BlockJobInfo *value;
>>          AioContext *aio_context;
> 
> More context:
> 
>         BlockJobInfo *value;
>         AioContext *aio_context;
> 
>         if (block_job_is_internal(job)) {
>             continue;
>         }
>         aio_context = block_job_get_aio_context(job);
>         aio_context_acquire(aio_context);
>         value = block_job_query(job, errp);
>         aio_context_release(aio_context);
> 
> This should become block_job_query_locked(). (You do that in patch 18,
> but it looks a bit out of place there - which is precisely because it
> really belongs in this one.)

Ok
> 
>> diff --git a/blockjob.c b/blockjob.c
>> index 0d59aba439..96fb9d9f73 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -111,8 +111,10 @@ static bool child_job_drained_poll(BdrvChild *c)
>>      /* An inactive or completed job doesn't have any pending requests. Jobs
>>       * with !job->busy are either already paused or have a pause point after
>>       * being reentered, so no job driver code will run before they pause. */
>> -    if (!job->busy || job_is_completed(job)) {
>> -        return false;
>> +    WITH_JOB_LOCK_GUARD() {
>> +        if (!job->busy || job_is_completed_locked(job)) {
>> +            return false;
>> +        }
>>      }
>>  
>>      /* Otherwise, assume that it isn't fully stopped yet, but allow the job to
> 
> Assuming that the job status can actually change, don't we need the
> locking for the rest of the function, too? Otherwise we might call
> drv->drained_poll() for a job that has already paused or completed.
> 
> Of course, this goes against the assumption that all callbacks are
> called without holding the job lock. Maybe it's not a good assumption.
> 
>> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>>      job->ready_notifier.notify = block_job_event_ready;
>>      job->idle_notifier.notify = block_job_on_idle;
>>  
>> -    notifier_list_add(&job->job.on_finalize_cancelled,
>> -                      &job->finalize_cancelled_notifier);
>> -    notifier_list_add(&job->job.on_finalize_completed,
>> -                      &job->finalize_completed_notifier);
>> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
>> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
>> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
>> +    WITH_JOB_LOCK_GUARD() {
>> +        notifier_list_add(&job->job.on_finalize_cancelled,
>> +                          &job->finalize_cancelled_notifier);
>> +        notifier_list_add(&job->job.on_finalize_completed,
>> +                          &job->finalize_completed_notifier);
>> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
>> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
>> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
>> +    }
>>  
>>      error_setg(&job->blocker, "block device is in use by block job: %s",
>>                 job_type_str(&job->job));
> 
> Why is this the right scope for the lock? It looks very arbitrary to
> lock only here, but not for the assignments above or the function calls
> below.
> 
> Given that job_create() already puts the job in the job_list so it
> becomes visible for other code, should we not keep the job lock from the
> moment that we create the job until it is fully initialised?

I try to protect only what needs protection, nothing more. Otherwise
then it is not clear what are we protecting and why. According to the
split I made in job.h, things like job_type_str and whatever I did not
protect are not protected because they don't need the lock.

> 
>> @@ -558,10 +562,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
>>                                          action);
>>      }
>>      if (action == BLOCK_ERROR_ACTION_STOP) {
>> -        if (!job->job.user_paused) {
>> -            job_pause(&job->job);
>> -            /* make the pause user visible, which will be resumed from QMP. */
>> -            job->job.user_paused = true;
>> +        WITH_JOB_LOCK_GUARD() {
>> +            if (!job->job.user_paused) {
>> +                job_pause_locked(&job->job);
>> +                /*
>> +                 * make the pause user visible, which will be
>> +                 * resumed from QMP.
>> +                 */
>> +                job->job.user_paused = true;
>> +            }
>>          }
>>          block_job_iostatus_set_err(job, error);
> 
> Why is this call not in the critical section? It accesses job->iostatus.

But the blockjob is not yet "classified". Comes after.
> 
>>      }
>> diff --git a/job-qmp.c b/job-qmp.c
>> index ac11a6c23c..cfaf34ffb7 100644
>> --- a/job-qmp.c
>> +++ b/job-qmp.c
>> @@ -194,7 +194,9 @@ JobInfoList *qmp_query_jobs(Error **errp)
>>      JobInfoList *head = NULL, **tail = &head;
>>      Job *job;
>>  
>> -    for (job = job_next(NULL); job; job = job_next(job)) {
>> +    JOB_LOCK_GUARD();
>> +
>> +    for (job = job_next_locked(NULL); job; job = job_next_locked(job)) {
>>          JobInfo *value;
>>          AioContext *aio_context;
> 
> Should job_query_single() be renamed to job_query_single_locked()?

Makes sense
> 
>> diff --git a/job.c b/job.c
>> index ebaa4e585b..b0729e2bb2 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -668,7 +668,7 @@ void coroutine_fn job_pause_point(Job *job)
>>      job_pause_point_locked(job);
>>  }
>>  
>> -void job_yield_locked(Job *job)
>> +static void job_yield_locked(Job *job)
>>  {
>>      assert(job->busy);
> 
> It was already unused outside of job.c before this patch. Should it have
> been static from the start?
> 
>> @@ -1041,11 +1041,14 @@ static void job_completed_txn_abort_locked(Job *job)
>>  /* Called with job_mutex held, but releases it temporarily */
>>  static int job_prepare_locked(Job *job)
>>  {
>> +    int ret;
>> +
>>      GLOBAL_STATE_CODE();
>>      if (job->ret == 0 && job->driver->prepare) {
>>          job_unlock();
>> -        job->ret = job->driver->prepare(job);
>> +        ret = job->driver->prepare(job);
>>          job_lock();
>> +        job->ret = ret;
>>          job_update_rc_locked(job);
>>      }
>>      return job->ret;
>> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
>> index 1ebb89f46c..1897ed7a13 100644
>> --- a/monitor/qmp-cmds.c
>> +++ b/monitor/qmp-cmds.c
>> @@ -133,8 +133,11 @@ void qmp_cont(Error **errp)
>>          blk_iostatus_reset(blk);
>>      }
>>  
>> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> -        block_job_iostatus_reset(job);
>> +    WITH_JOB_LOCK_GUARD() {
>> +        for (job = block_job_next_locked(NULL); job;
>> +             job = block_job_next_locked(job)) {
>> +            block_job_iostatus_reset_locked(job);
>> +        }
>>      }
>>  
>>      /* Continuing after completed migration. Images have been inactivated to
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 4cf4d2423d..98c7662b0f 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -912,25 +912,30 @@ static void run_block_job(BlockJob *job, Error **errp)
>>      int ret = 0;
>>  
>>      aio_context_acquire(aio_context);
>> -    job_ref(&job->job);
>> -    do {
>> -        float progress = 0.0f;
>> -        aio_poll(aio_context, true);
>> +    WITH_JOB_LOCK_GUARD() {
>> +        job_ref_locked(&job->job);
>> +        do {
>> +            float progress = 0.0f;
>> +            job_unlock();
> 
> This might be more a question of style, but mixing WITH_JOB_LOCK_GUARD()
> with manual job_unlock()/job_lock() feels dangerous. What if someone
> added a break between the unlock/lock pair? The lock guard would try to
> unlock a mutex that is already unlocked, which probably means an
> assertion failure.
> 
> I feel we should just use manual job_lock()/job_unlock() for everything
> in this function.

As you wish, ok.

> 
>> +            aio_poll(aio_context, true);
>> +
>> +            progress_get_snapshot(&job->job.progress, &progress_current,
>> +                                &progress_total);
>> +            if (progress_total) {
>> +                progress = (float)progress_current / progress_total * 100.f;
>> +            }
>> +            qemu_progress_print(progress, 0);
>> +            job_lock();
>> +        } while (!job_is_ready_locked(&job->job) &&
>> +                 !job_is_completed_locked(&job->job));
>>  
>> -        progress_get_snapshot(&job->job.progress, &progress_current,
>> -                              &progress_total);
>> -        if (progress_total) {
>> -            progress = (float)progress_current / progress_total * 100.f;
>> +        if (!job_is_completed_locked(&job->job)) {
>> +            ret = job_complete_sync_locked(&job->job, errp);
>> +        } else {
>> +            ret = job->job.ret;
>>          }
>> -        qemu_progress_print(progress, 0);
>> -    } while (!job_is_ready(&job->job) && !job_is_completed(&job->job));
>> -
>> -    if (!job_is_completed(&job->job)) {
>> -        ret = job_complete_sync(&job->job, errp);
>> -    } else {
>> -        ret = job->job.ret;
>> +        job_unref_locked(&job->job);
>>      }
>> -    job_unref(&job->job);
>>      aio_context_release(aio_context);
>>  
>>      /* publish completion progress only when success */
> 

Emanuele

Kevin Wolf Aug. 17, 2022, 8:46 a.m. UTC | #4

Am 16.08.2022 um 16:54 hat Emanuele Giuseppe Esposito geschrieben:
> Am 04/08/2022 um 19:10 schrieb Kevin Wolf:
> > Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
> >> Now that the API offers also _locked() functions, take advantage
> >> of it and give also the caller control to take the lock and call
> >> _locked functions.
> >>
> >> This makes sense especially when we have for loops, because it
> >> makes no sense to have:
> >>
> >> for(job = job_next(); ...)
> >>
> >> where each job_next() takes the lock internally.
> >> Instead we want
> >>
> >> JOB_LOCK_GUARD();
> >> for(job = job_next_locked(); ...)
> >>
> >> In addition, protect also direct field accesses, by either creating a
> >> new critical section or widening the existing ones.
> > 
> > "In addition" sounds like it should be a separate patch. I was indeed
> > surprised when after a few for loops where you just pulled the existing
> > locking up a bit, I saw some hunks that add completely new locking.
> 
> Would it be okay if we don't split it in two? There would be two
> microscopical patches.

If it would be just a hunk or two, fair enough.

> >> Note: at this stage, job_{lock/unlock} and job lock guard macros
> >> are *nop*.
> >>
> >> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> >> ---
> >>  block.c            | 17 ++++++++++-------
> >>  blockdev.c         | 12 +++++++++---
> >>  blockjob.c         | 35 ++++++++++++++++++++++-------------
> >>  job-qmp.c          |  4 +++-
> >>  job.c              |  7 +++++--
> >>  monitor/qmp-cmds.c |  7 +++++--
> >>  qemu-img.c         | 37 +++++++++++++++++++++----------------
> >>  7 files changed, 75 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/block.c b/block.c
> >> index 2c00dddd80..7559965dbc 100644
> >> --- a/block.c
> >> +++ b/block.c
> >> @@ -4978,8 +4978,8 @@ static void bdrv_close(BlockDriverState *bs)
> >>  
> >>  void bdrv_close_all(void)
> >>  {
> >> -    assert(job_next(NULL) == NULL);
> >>      GLOBAL_STATE_CODE();
> >> +    assert(job_next(NULL) == NULL);
> >>  
> >>      /* Drop references from requests still in flight, such as canceled block
> >>       * jobs whose AIO context has not been polled yet */
> >> @@ -6165,13 +6165,16 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
> >>          }
> >>      }
> >>  
> >> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> >> -        GSList *el;
> >> +    WITH_JOB_LOCK_GUARD() {
> >> +        for (job = block_job_next_locked(NULL); job;
> >> +             job = block_job_next_locked(job)) {
> >> +            GSList *el;
> >>  
> >> -        xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> >> -                           job->job.id);
> >> -        for (el = job->nodes; el; el = el->next) {
> >> -            xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> >> +            xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
> >> +                                job->job.id);
> >> +            for (el = job->nodes; el; el = el->next) {
> >> +                xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
> >> +            }
> >>          }
> >>      }
> >>  
> >> diff --git a/blockdev.c b/blockdev.c
> >> index 71f793c4ab..5b79093155 100644
> >> --- a/blockdev.c
> >> +++ b/blockdev.c
> >> @@ -150,12 +150,15 @@ void blockdev_mark_auto_del(BlockBackend *blk)
> >>          return;
> >>      }
> >>  
> >> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> >> +    JOB_LOCK_GUARD();
> >> +
> >> +    for (job = block_job_next_locked(NULL); job;
> >> +         job = block_job_next_locked(job)) {
> >>          if (block_job_has_bdrv(job, blk_bs(blk))) {
> > 
> > Should this be renamed to block_job_has_bdrv_locked() now?
> > 
> > It looks to me like it does need the locking. (Which actually makes
> > this patch a fix and not just an optimisation as the commit message
> > suggests.)
> 
> Nope, as GSList *nodes; is always read and written under BQL.

Ah, right. I wonder if we should later take the lock anyway even for
fields where it's not strictly necessary to simplify the locking rules.
Having to check the rules for each field separately is kind of hard. But
let's do only the necessary things in this series.

> > 
> >>              AioContext *aio_context = job->job.aio_context;
> >>              aio_context_acquire(aio_context);
> >>  
> >> -            job_cancel(&job->job, false);
> >> +            job_cancel_locked(&job->job, false);
> >>  
> >>              aio_context_release(aio_context);
> >>          }
> >> @@ -3745,7 +3748,10 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
> >>      BlockJobInfoList *head = NULL, **tail = &head;
> >>      BlockJob *job;
> >>  
> >> -    for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> >> +    JOB_LOCK_GUARD();
> >> +
> >> +    for (job = block_job_next_locked(NULL); job;
> >> +         job = block_job_next_locked(job)) {
> >>          BlockJobInfo *value;
> >>          AioContext *aio_context;
> > 
> > More context:
> > 
> >         BlockJobInfo *value;
> >         AioContext *aio_context;
> > 
> >         if (block_job_is_internal(job)) {
> >             continue;
> >         }
> >         aio_context = block_job_get_aio_context(job);
> >         aio_context_acquire(aio_context);
> >         value = block_job_query(job, errp);
> >         aio_context_release(aio_context);
> > 
> > This should become block_job_query_locked(). (You do that in patch 18,
> > but it looks a bit out of place there - which is precisely because it
> > really belongs in this one.)
> 
> Ok
> > 
> >> diff --git a/blockjob.c b/blockjob.c
> >> index 0d59aba439..96fb9d9f73 100644
> >> --- a/blockjob.c
> >> +++ b/blockjob.c
> >> @@ -111,8 +111,10 @@ static bool child_job_drained_poll(BdrvChild *c)
> >>      /* An inactive or completed job doesn't have any pending requests. Jobs
> >>       * with !job->busy are either already paused or have a pause point after
> >>       * being reentered, so no job driver code will run before they pause. */
> >> -    if (!job->busy || job_is_completed(job)) {
> >> -        return false;
> >> +    WITH_JOB_LOCK_GUARD() {
> >> +        if (!job->busy || job_is_completed_locked(job)) {
> >> +            return false;
> >> +        }
> >>      }
> >>  
> >>      /* Otherwise, assume that it isn't fully stopped yet, but allow the job to
> > 
> > Assuming that the job status can actually change, don't we need the
> > locking for the rest of the function, too? Otherwise we might call
> > drv->drained_poll() for a job that has already paused or completed.
> > 
> > Of course, this goes against the assumption that all callbacks are
> > called without holding the job lock. Maybe it's not a good assumption.
> > 
> >> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> >>      job->ready_notifier.notify = block_job_event_ready;
> >>      job->idle_notifier.notify = block_job_on_idle;
> >>  
> >> -    notifier_list_add(&job->job.on_finalize_cancelled,
> >> -                      &job->finalize_cancelled_notifier);
> >> -    notifier_list_add(&job->job.on_finalize_completed,
> >> -                      &job->finalize_completed_notifier);
> >> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> >> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> >> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> >> +    WITH_JOB_LOCK_GUARD() {
> >> +        notifier_list_add(&job->job.on_finalize_cancelled,
> >> +                          &job->finalize_cancelled_notifier);
> >> +        notifier_list_add(&job->job.on_finalize_completed,
> >> +                          &job->finalize_completed_notifier);
> >> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> >> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> >> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> >> +    }
> >>  
> >>      error_setg(&job->blocker, "block device is in use by block job: %s",
> >>                 job_type_str(&job->job));
> > 
> > Why is this the right scope for the lock? It looks very arbitrary to
> > lock only here, but not for the assignments above or the function calls
> > below.
> > 
> > Given that job_create() already puts the job in the job_list so it
> > becomes visible for other code, should we not keep the job lock from the
> > moment that we create the job until it is fully initialised?
> 
> I try to protect only what needs protection, nothing more. Otherwise
> then it is not clear what are we protecting and why. According to the
> split I made in job.h, things like job_type_str and whatever I did not
> protect are not protected because they don't need the lock.

I think the last paragraph above explains what it would protect?

Having a half-initialised job in the job list without holding the lock
sounds dangerous to me. Maybe it's actually okay in practice because
this is GLOBAL_STATE_CODE() and we can assume that code accessing
the job list outside of the main thread probably skips over the
half-initialised job, but it's another case where relying on the BQL is
confusing when there would be a more specific lock for it.

> > 
> >> @@ -558,10 +562,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
> >>                                          action);
> >>      }
> >>      if (action == BLOCK_ERROR_ACTION_STOP) {
> >> -        if (!job->job.user_paused) {
> >> -            job_pause(&job->job);
> >> -            /* make the pause user visible, which will be resumed from QMP. */
> >> -            job->job.user_paused = true;
> >> +        WITH_JOB_LOCK_GUARD() {
> >> +            if (!job->job.user_paused) {
> >> +                job_pause_locked(&job->job);
> >> +                /*
> >> +                 * make the pause user visible, which will be
> >> +                 * resumed from QMP.
> >> +                 */
> >> +                job->job.user_paused = true;
> >> +            }
> >>          }
> >>          block_job_iostatus_set_err(job, error);
> > 
> > Why is this call not in the critical section? It accesses job->iostatus.
> 
> But the blockjob is not yet "classified". Comes after.

Ok.

Kevin

Emanuele Giuseppe Esposito Aug. 17, 2022, 9:35 a.m. UTC | #5

Am 17/08/2022 um 10:46 schrieb Kevin Wolf:
>>>> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>>>>      job->ready_notifier.notify = block_job_event_ready;
>>>>      job->idle_notifier.notify = block_job_on_idle;
>>>>  
>>>> -    notifier_list_add(&job->job.on_finalize_cancelled,
>>>> -                      &job->finalize_cancelled_notifier);
>>>> -    notifier_list_add(&job->job.on_finalize_completed,
>>>> -                      &job->finalize_completed_notifier);
>>>> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
>>>> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
>>>> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
>>>> +    WITH_JOB_LOCK_GUARD() {
>>>> +        notifier_list_add(&job->job.on_finalize_cancelled,
>>>> +                          &job->finalize_cancelled_notifier);
>>>> +        notifier_list_add(&job->job.on_finalize_completed,
>>>> +                          &job->finalize_completed_notifier);
>>>> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
>>>> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
>>>> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
>>>> +    }
>>>>  
>>>>      error_setg(&job->blocker, "block device is in use by block job: %s",
>>>>                 job_type_str(&job->job));
>>> Why is this the right scope for the lock? It looks very arbitrary to
>>> lock only here, but not for the assignments above or the function calls
>>> below.
>>>
>>> Given that job_create() already puts the job in the job_list so it
>>> becomes visible for other code, should we not keep the job lock from the
>>> moment that we create the job until it is fully initialised?
>> I try to protect only what needs protection, nothing more. Otherwise
>> then it is not clear what are we protecting and why. According to the
>> split I made in job.h, things like job_type_str and whatever I did not
>> protect are not protected because they don't need the lock.
> I think the last paragraph above explains what it would protect?
> 
> Having a half-initialised job in the job list without holding the lock
> sounds dangerous to me. Maybe it's actually okay in practice because
> this is GLOBAL_STATE_CODE() and we can assume that code accessing
> the job list outside of the main thread probably skips over the
> half-initialised job, but it's another case where relying on the BQL is
> confusing when there would be a more specific lock for it.
> 

Ok, but this would imply:
1. create job_create_locked()
2. still drop the lock when calling block_job_add_bdrv(), since I am
pretty sure it can drain. So we still split the function in two (or
maybe three, if we need to reaqiure the lock after) parts.

Is that what you had in mind?

Emanuele

Kevin Wolf Aug. 17, 2022, 9:59 a.m. UTC | #6

Am 17.08.2022 um 11:35 hat Emanuele Giuseppe Esposito geschrieben:
> 
> 
> Am 17/08/2022 um 10:46 schrieb Kevin Wolf:
> >>>> @@ -475,13 +477,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> >>>>      job->ready_notifier.notify = block_job_event_ready;
> >>>>      job->idle_notifier.notify = block_job_on_idle;
> >>>>  
> >>>> -    notifier_list_add(&job->job.on_finalize_cancelled,
> >>>> -                      &job->finalize_cancelled_notifier);
> >>>> -    notifier_list_add(&job->job.on_finalize_completed,
> >>>> -                      &job->finalize_completed_notifier);
> >>>> -    notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> >>>> -    notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> >>>> -    notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> >>>> +    WITH_JOB_LOCK_GUARD() {
> >>>> +        notifier_list_add(&job->job.on_finalize_cancelled,
> >>>> +                          &job->finalize_cancelled_notifier);
> >>>> +        notifier_list_add(&job->job.on_finalize_completed,
> >>>> +                          &job->finalize_completed_notifier);
> >>>> +        notifier_list_add(&job->job.on_pending, &job->pending_notifier);
> >>>> +        notifier_list_add(&job->job.on_ready, &job->ready_notifier);
> >>>> +        notifier_list_add(&job->job.on_idle, &job->idle_notifier);
> >>>> +    }
> >>>>  
> >>>>      error_setg(&job->blocker, "block device is in use by block job: %s",
> >>>>                 job_type_str(&job->job));
> >>> Why is this the right scope for the lock? It looks very arbitrary to
> >>> lock only here, but not for the assignments above or the function calls
> >>> below.
> >>>
> >>> Given that job_create() already puts the job in the job_list so it
> >>> becomes visible for other code, should we not keep the job lock from the
> >>> moment that we create the job until it is fully initialised?
> >> I try to protect only what needs protection, nothing more. Otherwise
> >> then it is not clear what are we protecting and why. According to the
> >> split I made in job.h, things like job_type_str and whatever I did not
> >> protect are not protected because they don't need the lock.
> > I think the last paragraph above explains what it would protect?
> > 
> > Having a half-initialised job in the job list without holding the lock
> > sounds dangerous to me. Maybe it's actually okay in practice because
> > this is GLOBAL_STATE_CODE() and we can assume that code accessing
> > the job list outside of the main thread probably skips over the
> > half-initialised job, but it's another case where relying on the BQL is
> > confusing when there would be a more specific lock for it.
> > 
> 
> Ok, but this would imply:
> 1. create job_create_locked()
> 2. still drop the lock when calling block_job_add_bdrv(), since I am
> pretty sure it can drain. So we still split the function in two (or
> maybe three, if we need to reaqiure the lock after) parts.

Ah, crap. Temporarily dropping the lock makes it useless, the incomplete
state is then still visible to the outside. So the locked section would
have to end before it.

Maybe just leave it as it is then.

Kevin

[v10,11/21] jobs: group together API calls under the same job lock

Commit Message

Comments

Patch