[RFC,2/6] job: _locked functions and public job_lock/unlock for next patch

Message ID 20210707165813.55361-3-eesposit@redhat.com (mailing list archive)
State New, archived
Series job: replace AioContext lock with job_mutex

Commit Message

Emanuele Giuseppe Esposito July 7, 2021, 4:58 p.m. UTC
Create _locked functions to make the next patch a little smaller.
Also make the locking functions public, so that they can also be
used from structures using the Job struct.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 include/qemu/job.h | 23 +++++++++++++
 job.c              | 85 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 93 insertions(+), 15 deletions(-)
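
The split between a plain function and its _locked variant can be
illustrated outside QEMU. The following is only a minimal sketch: it
uses a pthread mutex in place of QemuMutex, and DemoJob, demo_job_mutex
and every demo_* function are hypothetical names that do not exist in
the patch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's Job; not the real struct. */
typedef struct DemoJob {
    int ret;        /* protected by demo_job_mutex */
    bool cancelled; /* protected by demo_job_mutex */
} DemoJob;

/* Hypothetical stand-in for job_mutex. */
static pthread_mutex_t demo_job_mutex = PTHREAD_MUTEX_INITIALIZER;

void demo_job_lock(void)
{
    pthread_mutex_lock(&demo_job_mutex);
}

void demo_job_unlock(void)
{
    pthread_mutex_unlock(&demo_job_mutex);
}

/* Called with demo_job_mutex held. */
int demo_job_get_ret_locked(DemoJob *job)
{
    return job->ret;
}

/* Called with demo_job_mutex *not* held. */
int demo_job_get_ret(DemoJob *job)
{
    int ret;

    demo_job_lock();
    ret = demo_job_get_ret_locked(job);
    demo_job_unlock();
    return ret;
}

/*
 * A caller that needs two fields to be consistent takes the lock once
 * and reads both inside the same critical section.
 */
bool demo_job_failed_or_cancelled(DemoJob *job)
{
    bool result;

    demo_job_lock();
    result = demo_job_get_ret_locked(job) < 0 || job->cancelled;
    demo_job_unlock();
    return result;
}
```

With a non-recursive mutex, calling demo_job_get_ret() while already
holding the lock would deadlock, which is why code that takes the lock
itself needs the _locked variant.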

Comments

Stefan Hajnoczi July 8, 2021, 10:50 a.m. UTC | #1
On Wed, Jul 07, 2021 at 06:58:09PM +0200, Emanuele Giuseppe Esposito wrote:
> diff --git a/job.c b/job.c
> index 872bbebb01..96fb8e9730 100644
> --- a/job.c
> +++ b/job.c
> @@ -32,6 +32,10 @@
>  #include "trace/trace-root.h"
>  #include "qapi/qapi-events-job.h"
>  
> +/* job_mutex protects the jobs list, but also the job operations. */
> +static QemuMutex job_mutex;

It's unclear what protecting "job operations" means. I would prefer a
fine-grained per-job lock that protects the job's fields instead of a
global lock with an unclear scope.

> +
> +/* Protected by job_mutex */
>  static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
>  
>  /* Job State Transition Table */
> @@ -64,27 +68,22 @@ bool JobVerbTable[JOB_VERB__MAX][JOB_STATUS__MAX] = {
>  /* Transactional group of jobs */
>  struct JobTxn {
>  
> -    /* Is this txn being cancelled? */
> +    /* Is this txn being cancelled? Atomic.*/
>      bool aborting;

The comment says atomic but this field is not accessed using atomic
operations (at least at this point in the patch series)?

>  
> -    /* List of jobs */
> +    /* List of jobs. Protected by job_mutex. */
>      QLIST_HEAD(, Job) jobs;
>  
> -    /* Reference count */
> +    /* Reference count. Atomic. */
>      int refcnt;

Same.
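
For reference, this is roughly what it would mean for the two fields
marked "Atomic" to actually be accessed atomically. It is a sketch
under assumptions, not code from this series: it uses C11
<stdatomic.h> instead of QEMU's own qatomic_* helpers (job.c already
uses qatomic_read() for job->busy), and DemoTxn and the demo_txn_*
functions are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for JobTxn with genuinely atomic fields. */
typedef struct DemoTxn {
    atomic_bool aborting; /* is this txn being cancelled? */
    atomic_int refcnt;    /* reference count */
} DemoTxn;

void demo_txn_init(DemoTxn *txn)
{
    atomic_init(&txn->aborting, false);
    atomic_init(&txn->refcnt, 1);
}

void demo_txn_ref(DemoTxn *txn)
{
    atomic_fetch_add(&txn->refcnt, 1);
}

/* Returns true when the caller dropped the last reference. */
bool demo_txn_unref(DemoTxn *txn)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(&txn->refcnt, 1) == 1;
}

/* Returns true only for the first caller that sets the aborting flag. */
bool demo_txn_begin_abort(DemoTxn *txn)
{
    return !atomic_exchange(&txn->aborting, true);
}
```

Until every access goes through operations like these, a comment that
says "Atomic" describes an intent rather than the code.
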
Emanuele Giuseppe Esposito July 12, 2021, 8:43 a.m. UTC | #2
On 08/07/2021 12:50, Stefan Hajnoczi wrote:
> On Wed, Jul 07, 2021 at 06:58:09PM +0200, Emanuele Giuseppe Esposito wrote:
>> diff --git a/job.c b/job.c
>> index 872bbebb01..96fb8e9730 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -32,6 +32,10 @@
>>   #include "trace/trace-root.h"
>>   #include "qapi/qapi-events-job.h"
>>   
>> +/* job_mutex protects the jobs list, but also the job operations. */
>> +static QemuMutex job_mutex;
> 
> It's unclear what protecting "job operations" means. I would prefer a
> fine-grained per-job lock that protects the job's fields instead of a
> global lock with an unclear scope.

As I wrote in the cover letter, I wanted to keep things as simple as
possible with a global lock. A per-job lock is possible, but I don't
know how complex that would be.
I will try and see what I can do.

Maybe "job_mutex protects the jobs list, but also makes the job API
thread-safe"?

> 
>> +
>> +/* Protected by job_mutex */
>>   static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
>>   
>>   /* Job State Transition Table */
>> @@ -64,27 +68,22 @@ bool JobVerbTable[JOB_VERB__MAX][JOB_STATUS__MAX] = {
>>   /* Transactional group of jobs */
>>   struct JobTxn {
>>   
>> -    /* Is this txn being cancelled? */
>> +    /* Is this txn being cancelled? Atomic.*/
>>       bool aborting;
> 
> The comment says atomic but this field is not accessed using atomic
> operations (at least at this point in the patch series)?

Yes, sorry, I messed up the hunks in one or two patches. These comments
were supposed to be in patch 4, "job.h: categorize job fields". Even that
might not be ideal, though, since that patch just introduces the comments
without applying the locking/protection yet.
On the other hand, if I merge everything together in patch 5, it will be
even harder to read.

Emanuele
> 
>>   
>> -    /* List of jobs */
>> +    /* List of jobs. Protected by job_mutex. */
>>       QLIST_HEAD(, Job) jobs;
>>   
>> -    /* Reference count */
>> +    /* Reference count. Atomic. */
>>       int refcnt;
> 
> Same.
>
Stefan Hajnoczi July 13, 2021, 1:32 p.m. UTC | #3
On Mon, Jul 12, 2021 at 10:43:07AM +0200, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 08/07/2021 12:50, Stefan Hajnoczi wrote:
> > On Wed, Jul 07, 2021 at 06:58:09PM +0200, Emanuele Giuseppe Esposito wrote:
> > > diff --git a/job.c b/job.c
> > > index 872bbebb01..96fb8e9730 100644
> > > --- a/job.c
> > > +++ b/job.c
> > > @@ -32,6 +32,10 @@
> > >   #include "trace/trace-root.h"
> > >   #include "qapi/qapi-events-job.h"
> > > +/* job_mutex protects the jobs list, but also the job operations. */
> > > +static QemuMutex job_mutex;
> > 
> > It's unclear what protecting "job operations" means. I would prefer a
> > fine-grained per-job lock that protects the job's fields instead of a
> > global lock with an unclear scope.
> 
> As I wrote in the cover letter, I wanted to keep things as simple as
> possible with a global lock. A per-job lock is possible, but I don't
> know how complex that would be.
> I will try and see what I can do.
> 
> Maybe "job_mutex protects the jobs list, but also makes the job API
> thread-safe"?

That's clearer, thanks. I thought "job operations" meant the processing
that the actual block jobs do (commit, mirror, stream, backup).

> 
> > 
> > > +
> > > +/* Protected by job_mutex */
> > >   static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
> > >   /* Job State Transition Table */
> > > @@ -64,27 +68,22 @@ bool JobVerbTable[JOB_VERB__MAX][JOB_STATUS__MAX] = {
> > >   /* Transactional group of jobs */
> > >   struct JobTxn {
> > > -    /* Is this txn being cancelled? */
> > > +    /* Is this txn being cancelled? Atomic.*/
> > >       bool aborting;
> > 
> > The comment says atomic but this field is not accessed using atomic
> > operations (at least at this point in the patch series)?
> 
> Yes, sorry, I messed up the hunks in one or two patches. These comments
> were supposed to be in patch 4, "job.h: categorize job fields". Even that
> might not be ideal, though, since that patch just introduces the comments
> without applying the locking/protection yet.
> On the other hand, if I merge everything together in patch 5, it will be
> even harder to read.

The commit description can describe changes that currently have no
effect but are anticipating a later patch. That helps reviewers
understand whether the change is intentional/correct.

Stefan
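
The fine-grained alternative raised in the review, a lock per job
instead of one global mutex, could look roughly like the sketch below.
This is an illustration only and not what the series implements; it
uses pthreads rather than QemuMutex, and LockedJob and the
locked_job_* functions are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical job type that carries its own lock. */
typedef struct LockedJob {
    pthread_mutex_t lock; /* protects the fields below */
    int ret;
    bool cancelled;
} LockedJob;

void locked_job_init(LockedJob *job)
{
    pthread_mutex_init(&job->lock, NULL);
    job->ret = 0;
    job->cancelled = false;
}

void locked_job_cancel(LockedJob *job)
{
    pthread_mutex_lock(&job->lock);
    job->cancelled = true;
    pthread_mutex_unlock(&job->lock);
}

bool locked_job_is_cancelled(LockedJob *job)
{
    bool cancelled;

    pthread_mutex_lock(&job->lock);
    cancelled = job->cancelled;
    pthread_mutex_unlock(&job->lock);
    return cancelled;
}

void locked_job_destroy(LockedJob *job)
{
    pthread_mutex_destroy(&job->lock);
}
```

The scope of each lock is then explicit, since it covers exactly the
fields of one job. The cost is that a structure shared between jobs,
such as the global jobs list, still needs protection of its own, and
any code path that holds more than one lock needs a fixed lock order.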
Patch

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 72c7d0f69d..ba2f9b2660 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -305,6 +305,7 @@  void job_txn_add_job(JobTxn *txn, Job *job);
 
 /** Returns the @ret field of a given Job. */
 int job_get_ret(Job *job);
+int job_get_ret_locked(Job *job);
 
 /** Returns the AioContext of a given Job. */
 AioContext *job_get_aiocontext(Job *job);
@@ -336,6 +337,24 @@  bool job_is_force_cancel(Job *job);
 /** Returns the statis of a given Job. */
 JobStatus job_get_status(Job *job);
 
+/**
+ * job_lock:
+ *
+ * Take the mutex protecting the list of jobs and their status.
+ * Most functions called by the monitor need to call job_lock
+ * and job_unlock manually.  On the other hand, functions called
+ * by the block jobs themselves and by the block layer will take the
+ * lock for you.
+ */
+void job_lock(void);
+
+/**
+ * job_unlock:
+ *
+ * Release the mutex protecting the list of jobs and their status.
+ */
+void job_unlock(void);
+
 /**
  * Create a new long-running job and return it.
  *
@@ -424,6 +443,7 @@  void job_start(Job *job);
  * Continue the specified job by entering the coroutine.
  */
 void job_enter(Job *job);
+void job_enter_locked(Job *job);
 
 /**
  * @job: The job that is ready to pause.
@@ -462,12 +482,15 @@  bool job_is_internal(Job *job);
 
 /** Returns whether the job is scheduled for cancellation. */
 bool job_is_cancelled(Job *job);
+bool job_is_cancelled_locked(Job *job);
 
 /** Returns whether the job is in a completed state. */
 bool job_is_completed(Job *job);
+bool job_is_completed_locked(Job *job);
 
 /** Returns whether the job is ready to be completed. */
 bool job_is_ready(Job *job);
+bool job_is_ready_locked(Job *job);
 
 /**
  * Request @job to pause at the next pause point. Must be paired with
diff --git a/job.c b/job.c
index 872bbebb01..96fb8e9730 100644
--- a/job.c
+++ b/job.c
@@ -32,6 +32,10 @@ 
 #include "trace/trace-root.h"
 #include "qapi/qapi-events-job.h"
 
+/* job_mutex protects the jobs list, but also the job operations. */
+static QemuMutex job_mutex;
+
+/* Protected by job_mutex */
 static QLIST_HEAD(, Job) jobs = QLIST_HEAD_INITIALIZER(jobs);
 
 /* Job State Transition Table */
@@ -64,27 +68,22 @@  bool JobVerbTable[JOB_VERB__MAX][JOB_STATUS__MAX] = {
 /* Transactional group of jobs */
 struct JobTxn {
 
-    /* Is this txn being cancelled? */
+    /* Is this txn being cancelled? Atomic.*/
     bool aborting;
 
-    /* List of jobs */
+    /* List of jobs. Protected by job_mutex. */
     QLIST_HEAD(, Job) jobs;
 
-    /* Reference count */
+    /* Reference count. Atomic. */
     int refcnt;
 };
 
-/* Right now, this mutex is only needed to synchronize accesses to job->busy
- * and job->sleep_timer, such as concurrent calls to job_do_yield and
- * job_enter. */
-static QemuMutex job_mutex;
-
-static void job_lock(void)
+void job_lock(void)
 {
     qemu_mutex_lock(&job_mutex);
 }
 
-static void job_unlock(void)
+void job_unlock(void)
 {
     qemu_mutex_unlock(&job_mutex);
 }
@@ -109,11 +108,22 @@  bool job_is_busy(Job *job)
     return qatomic_read(&job->busy);
 }
 
-int job_get_ret(Job *job)
+/* Called with job_mutex held. */
+int job_get_ret_locked(Job *job)
 {
     return job->ret;
 }
 
+/* Called with job_mutex *not* held. */
+int job_get_ret(Job *job)
+{
+    int ret;
+    job_lock();
+    ret = job_get_ret_locked(job);
+    job_unlock();
+    return ret;
+}
+
 Error *job_get_err(Job *job)
 {
     return job->err;
@@ -255,12 +265,24 @@  const char *job_type_str(const Job *job)
     return JobType_str(job_type(job));
 }
 
-bool job_is_cancelled(Job *job)
+/* Called with job_mutex held. */
+bool job_is_cancelled_locked(Job *job)
 {
     return job->cancelled;
 }
 
-bool job_is_ready(Job *job)
+/* Called with job_mutex *not* held. */
+bool job_is_cancelled(Job *job)
+{
+    bool ret;
+    job_lock();
+    ret = job_is_cancelled_locked(job);
+    job_unlock();
+    return ret;
+}
+
+/* Called with job_mutex held. */
+bool job_is_ready_locked(Job *job)
 {
     switch (job->status) {
     case JOB_STATUS_UNDEFINED:
@@ -282,7 +304,18 @@  bool job_is_ready(Job *job)
     return false;
 }
 
-bool job_is_completed(Job *job)
+/* Called with job_mutex *not* held. */
+bool job_is_ready(Job *job)
+{
+    bool ret;
+    job_lock();
+    ret = job_is_ready_locked(job);
+    job_unlock();
+    return ret;
+}
+
+/* Called with job_mutex held. */
+bool job_is_completed_locked(Job *job)
 {
     switch (job->status) {
     case JOB_STATUS_UNDEFINED:
@@ -304,6 +337,17 @@  bool job_is_completed(Job *job)
     return false;
 }
 
+/* Called with job_mutex *not* held. */
+bool job_is_completed(Job *job)
+{
+    bool ret;
+    job_lock();
+    ret = job_is_completed_locked(job);
+    job_unlock();
+    return ret;
+}
+
+/* Does not need job_mutex. Value is never modified */
 static bool job_started(Job *job)
 {
     return job->co;
@@ -503,11 +547,20 @@  void job_enter_cond(Job *job, bool(*fn)(Job *job))
     aio_co_enter(job->aio_context, job->co);
 }
 
-void job_enter(Job *job)
+/* Called with job_mutex held. */
+void job_enter_locked(Job *job)
 {
     job_enter_cond(job, NULL);
 }
 
+/* Called with job_mutex *not* held. */
+void job_enter(Job *job)
+{
+    job_lock();
+    job_enter_locked(job);
+    job_unlock();
+}
+
 /* Yield, and schedule a timer to reenter the coroutine after @ns nanoseconds.
  * Reentering the job coroutine with job_enter() before the timer has expired
  * is allowed and cancels the timer.
@@ -684,12 +737,14 @@  void job_dismiss(Job **jobptr, Error **errp)
     *jobptr = NULL;
 }
 
+/* Called with job_mutex held. */
 void job_early_fail(Job *job)
 {
     assert(job->status == JOB_STATUS_CREATED);
     job_do_dismiss(job);
 }
 
+/* Called with job_mutex held. */
 static void job_conclude(Job *job)
 {
     job_state_transition(job, JOB_STATUS_CONCLUDED);