diff mbox series

[RFC,v2,03/14] job.h: define locked functions

Message ID 20211104145334.1346363-4-eesposit@redhat.com (mailing list archive)
State New, archived
Headers show
Series job: replace AioContext lock with job_mutex | expand

Commit Message

Emanuele Giuseppe Esposito Nov. 4, 2021, 2:53 p.m. UTC
These functions assume that the job lock is held by the
caller, to avoid TOC/TOU conditions.

Introduce also additional helpers that define _locked
functions (useful when the job_mutex is globally applied).

Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 include/qemu/job.h | 66 ++++++++++++++++++++++++++++++++++++++++++----
 job.c              | 18 +++++++++++--
 2 files changed, 77 insertions(+), 7 deletions(-)

Comments

Stefan Hajnoczi Dec. 16, 2021, 4:48 p.m. UTC | #1
On Thu, Nov 04, 2021 at 10:53:23AM -0400, Emanuele Giuseppe Esposito wrote:
>  /** Returns whether the job is ready to be completed. */
>  bool job_is_ready(Job *job);
>  
> +/** Same as job_is_ready(), but assumes job_lock is held. */
> +bool job_is_ready_locked(Job *job);

What I see here is that some functions assume job_lock is held but don't
have _locked in their name (job_ref()), some assume job_lock is held and
have _locked in their name (job_is_ready_locked()), and some assume
job_lock is not held (job_is_ready()).

That means when _locked is not in the name I don't know whether this
function requires job_lock or will deadlock if called under job_lock.

Two ways to it obvious:

1. Always have _locked in the name if the function requires job_lock.
   Functions without _locked must not be called under job_lock.

2. Don't change the name but use the type system instead:

   /*
    * Define a unique type so the compiler warns us. It's just a pointer
    * so it can be efficiently passed by value.
    */
   typedef struct { Job *job; } LockedJob;

   LockedJob job_lock(Job *job);
   Job *job_unlock(LockedJob job);

   Now the compiler catches mistakes:

   bool job_is_completed(LockedJob job);
   bool job_is_ready(Job *job);

   Job *j;
   LockedJob l;
   job_is_completed(j) -> compiler error
   job_is_completed(l) -> ok
   job_is_ready(j) -> ok
   job_is_ready(l) -> compiler error

   This approach assumes per-Job locks but a similar API is possible
   with a global job_mutex too. There just needs to be a function to
   turn Job * into LockedJob and LockedJob back into Job*.

   This is slightly exotic. It's not an approach I've seen used in C, so
   it's not idiomatic and people might find it unfamiliar.

These are just ideas. If you want to keep it the way it is, that's okay
too (although a little confusing).

> diff --git a/job.c b/job.c
> index 0e4dacf028..e393c1222f 100644
> --- a/job.c
> +++ b/job.c
> @@ -242,7 +242,8 @@ bool job_cancel_requested(Job *job)
>      return job->cancelled;
>  }
>  
> -bool job_is_ready(Job *job)
> +/* Called with job_mutex held. */

This information should go with the doc comments (and it's already there
in job.h!). There is no rule on where to put doc comments but in this
case you already added them to job.h, so they are not needed here in
job.c. Leaving them could confuse other people into adding doc comments
into job.c instead of job.h.
Vladimir Sementsov-Ogievskiy Dec. 16, 2021, 5:11 p.m. UTC | #2
16.12.2021 19:48, Stefan Hajnoczi wrote:
> On Thu, Nov 04, 2021 at 10:53:23AM -0400, Emanuele Giuseppe Esposito wrote:
>>   /** Returns whether the job is ready to be completed. */
>>   bool job_is_ready(Job *job);
>>   
>> +/** Same as job_is_ready(), but assumes job_lock is held. */
>> +bool job_is_ready_locked(Job *job);
> 
> What I see here is that some functions assume job_lock is held but don't
> have _locked in their name (job_ref()), some assume job_lock is held and
> have _locked in their name (job_is_ready_locked()), and some assume
> job_lock is not held (job_is_ready()).
> 
> That means when _locked is not in the name I don't know whether this
> function requires job_lock or will deadlock if called under job_lock.
> 
> Two ways to it obvious:
> 
> 1. Always have _locked in the name if the function requires job_lock.
>     Functions without _locked must not be called under job_lock.
> 
> 2. Don't change the name but use the type system instead:
> 
>     /*
>      * Define a unique type so the compiler warns us. It's just a pointer
>      * so it can be efficiently passed by value.
>      */
>     typedef struct { Job *job; } LockedJob;
> 
>     LockedJob job_lock(Job *job);
>     Job *job_unlock(LockedJob job);
> 
>     Now the compiler catches mistakes:
> 
>     bool job_is_completed(LockedJob job);
>     bool job_is_ready(Job *job);
> 
>     Job *j;
>     LockedJob l;
>     job_is_completed(j) -> compiler error
>     job_is_completed(l) -> ok
>     job_is_ready(j) -> ok
>     job_is_ready(l) -> compiler error
> 
>     This approach assumes per-Job locks but a similar API is possible
>     with a global job_mutex too. There just needs to be a function to
>     turn Job * into LockedJob and LockedJob back into Job*.
> 
>     This is slightly exotic. It's not an approach I've seen used in C, so
>     it's not idiomatic and people might find it unfamiliar.

Oh yes. If we need something, I'd prefer function renaming.

> 
> These are just ideas. If you want to keep it the way it is, that's okay
> too (although a little confusing).
> 
>> diff --git a/job.c b/job.c
>> index 0e4dacf028..e393c1222f 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -242,7 +242,8 @@ bool job_cancel_requested(Job *job)
>>       return job->cancelled;
>>   }
>>   
>> -bool job_is_ready(Job *job)
>> +/* Called with job_mutex held. */
> 
> This information should go with the doc comments (and it's already there
> in job.h!). There is no rule on where to put doc comments but in this
> case you already added them to job.h, so they are not needed here in
> job.c. Leaving them could confuse other people into adding doc comments
> into job.c instead of job.h.
>
Emanuele Giuseppe Esposito Dec. 20, 2021, 10:15 a.m. UTC | #3
On 16/12/2021 18:11, Vladimir Sementsov-Ogievskiy wrote:
> 16.12.2021 19:48, Stefan Hajnoczi wrote:
>> On Thu, Nov 04, 2021 at 10:53:23AM -0400, Emanuele Giuseppe Esposito 
>> wrote:
>>>   /** Returns whether the job is ready to be completed. */
>>>   bool job_is_ready(Job *job);
>>> +/** Same as job_is_ready(), but assumes job_lock is held. */
>>> +bool job_is_ready_locked(Job *job);
>>
>> What I see here is that some functions assume job_lock is held but don't
>> have _locked in their name (job_ref()), some assume job_lock is held and
>> have _locked in their name (job_is_ready_locked()), and some assume
>> job_lock is not held (job_is_ready()).
>>
>> That means when _locked is not in the name I don't know whether this
>> function requires job_lock or will deadlock if called under job_lock.
>>
>> Two ways to it obvious:
>>
>> 1. Always have _locked in the name if the function requires job_lock.
>>     Functions without _locked must not be called under job_lock.
>>
>> 2. Don't change the name but use the type system instead:
>>
>>     /*
>>      * Define a unique type so the compiler warns us. It's just a pointer
>>      * so it can be efficiently passed by value.
>>      */
>>     typedef struct { Job *job; } LockedJob;
>>
>>     LockedJob job_lock(Job *job);
>>     Job *job_unlock(LockedJob job);
>>
>>     Now the compiler catches mistakes:
>>
>>     bool job_is_completed(LockedJob job);
>>     bool job_is_ready(Job *job);
>>
>>     Job *j;
>>     LockedJob l;
>>     job_is_completed(j) -> compiler error
>>     job_is_completed(l) -> ok
>>     job_is_ready(j) -> ok
>>     job_is_ready(l) -> compiler error
>>
>>     This approach assumes per-Job locks but a similar API is possible
>>     with a global job_mutex too. There just needs to be a function to
>>     turn Job * into LockedJob and LockedJob back into Job*.
>>
>>     This is slightly exotic. It's not an approach I've seen used in C, so
>>     it's not idiomatic and people might find it unfamiliar.
> 
> Oh yes. If we need something, I'd prefer function renaming.

Ok, I will go with option 1.

> 
>>
>> These are just ideas. If you want to keep it the way it is, that's okay
>> too (although a little confusing).
>>
>>> diff --git a/job.c b/job.c
>>> index 0e4dacf028..e393c1222f 100644
>>> --- a/job.c
>>> +++ b/job.c
>>> @@ -242,7 +242,8 @@ bool job_cancel_requested(Job *job)
>>>       return job->cancelled;
>>>   }
>>> -bool job_is_ready(Job *job)
>>> +/* Called with job_mutex held. */
>>
>> This information should go with the doc comments (and it's already there
>> in job.h!). There is no rule on where to put doc comments but in this
>> case you already added them to job.h, so they are not needed here in
>> job.c. Leaving them could confuse other people into adding doc comments
>> into job.c instead of job.h.
>>

Yes, in general I will leave the comment for static functions in job.c 
and make sure the public ones are only documented in job.h.

Thank you,
Emanuele
diff mbox series

Patch

diff --git a/include/qemu/job.h b/include/qemu/job.h
index f7036ac6b3..c7a6bcad1b 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -355,6 +355,8 @@  void job_txn_unref(JobTxn *txn);
  * the reference that is automatically grabbed here.
  *
  * If @txn is NULL, the function does nothing.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_txn_add_job(JobTxn *txn, Job *job);
 
@@ -377,12 +379,16 @@  void *job_create(const char *job_id, const JobDriver *driver, JobTxn *txn,
 /**
  * Add a reference to Job refcnt, it will be decreased with job_unref, and then
  * be freed if it comes to be the last reference.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_ref(Job *job);
 
 /**
  * Release a reference that was previously acquired with job_ref() or
  * job_create(). If it's the last reference to the object, it will be freed.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_unref(Job *job);
 
@@ -429,6 +435,8 @@  void job_event_completed(Job *job);
  * Conditionally enter the job coroutine if the job is ready to run, not
  * already busy and fn() returns true. fn() is called while under the job_lock
  * critical section.
+ *
+ * Called between job_lock and job_unlock, but it releases the lock temporarly.
  */
 void job_enter_cond(Job *job, bool(*fn)(Job *job));
 
@@ -490,34 +498,52 @@  bool job_is_cancelled(Job *job);
  */
 bool job_cancel_requested(Job *job);
 
-/** Returns whether the job is in a completed state. */
+/**
+ * Returns whether the job is in a completed state.
+ * Called between job_lock and job_unlock.
+ */
 bool job_is_completed(Job *job);
 
 /** Returns whether the job is ready to be completed. */
 bool job_is_ready(Job *job);
 
+/** Same as job_is_ready(), but assumes job_lock is held. */
+bool job_is_ready_locked(Job *job);
+
 /**
  * Request @job to pause at the next pause point. Must be paired with
  * job_resume(). If the job is supposed to be resumed by user action, call
  * job_user_pause() instead.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_pause(Job *job);
 
-/** Resumes a @job paused with job_pause. */
+/**
+ * Resumes a @job paused with job_pause.
+ * Called between job_lock and job_unlock.
+ */
 void job_resume(Job *job);
 
 /**
  * Asynchronously pause the specified @job.
  * Do not allow a resume until a matching call to job_user_resume.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_user_pause(Job *job, Error **errp);
 
-/** Returns true if the job is user-paused. */
+/**
+ * Returns true if the job is user-paused.
+ * Called between job_lock and job_unlock.
+ */
 bool job_user_paused(Job *job);
 
 /**
  * Resume the specified @job.
  * Must be paired with a preceding job_user_pause.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_user_resume(Job *job, Error **errp);
 
@@ -526,6 +552,8 @@  void job_user_resume(Job *job, Error **errp);
  * first one if @job is %NULL.
  *
  * Returns the requested job, or %NULL if there are no more jobs left.
+ *
+ * Called between job_lock and job_unlock.
  */
 Job *job_next(Job *job);
 
@@ -533,6 +561,8 @@  Job *job_next(Job *job);
  * Get the job identified by @id (which must not be %NULL).
  *
  * Returns the requested job, or %NULL if it doesn't exist.
+ *
+ * Called between job_lock and job_unlock.
  */
 Job *job_get(const char *id);
 
@@ -540,27 +570,39 @@  Job *job_get(const char *id);
  * Check whether the verb @verb can be applied to @job in its current state.
  * Returns 0 if the verb can be applied; otherwise errp is set and -EPERM
  * returned.
+ *
+ * Called between job_lock and job_unlock.
  */
 int job_apply_verb(Job *job, JobVerb verb, Error **errp);
 
 /** The @job could not be started, free it. */
 void job_early_fail(Job *job);
 
+/** Same as job_early_fail(), but assumes job_lock is held. */
+void job_early_fail_locked(Job *job);
+
 /** Moves the @job from RUNNING to READY */
 void job_transition_to_ready(Job *job);
 
-/** Asynchronously complete the specified @job. */
+/**
+ * Asynchronously complete the specified @job.
+ * Called between job_lock and job_unlock, but it releases the lock temporarly.
+ */
 void job_complete(Job *job, Error **errp);
 
 /**
  * Asynchronously cancel the specified @job. If @force is true, the job should
  * be cancelled immediately without waiting for a consistent state.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_cancel(Job *job, bool force);
 
 /**
  * Cancels the specified job like job_cancel(), but may refuse to do so if the
  * operation isn't meaningful in the current state of the job.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_user_cancel(Job *job, bool force, Error **errp);
 
@@ -577,7 +619,13 @@  void job_user_cancel(Job *job, bool force, Error **errp);
  */
 int job_cancel_sync(Job *job, bool force);
 
-/** Synchronously force-cancels all jobs using job_cancel_sync(). */
+/**
+ * Synchronously force-cancels all jobs using job_cancel_sync().
+ *
+ * Called with job_lock *not* held, unlike most other APIs consumed
+ * by the monitor! This is primarly to avoid adding unnecessary lock-unlock
+ * patterns in the caller.
+ */
 void job_cancel_sync_all(void);
 
 /**
@@ -593,6 +641,8 @@  void job_cancel_sync_all(void);
  * Returns the return value from the job.
  *
  * Callers must hold the AioContext lock of job->aio_context.
+ *
+ * Called between job_lock and job_unlock.
  */
 int job_complete_sync(Job *job, Error **errp);
 
@@ -603,12 +653,16 @@  int job_complete_sync(Job *job, Error **errp);
  * FIXME: Make the below statement universally true:
  * For jobs that support the manual workflow mode, all graph changes that occur
  * as a result will occur after this command and before a successful reply.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_finalize(Job *job, Error **errp);
 
 /**
  * Remove the concluded @job from the query list and resets the passed pointer
  * to %NULL. Returns an error if the job is not actually concluded.
+ *
+ * Called between job_lock and job_unlock.
  */
 void job_dismiss(Job **job, Error **errp);
 
@@ -620,6 +674,8 @@  void job_dismiss(Job **job, Error **errp);
  * cancelled before completing, and -errno in other error cases.
  *
  * Callers must hold the AioContext lock of job->aio_context.
+ *
+ * Called between job_lock and job_unlock.
  */
 int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp);
 
diff --git a/job.c b/job.c
index 0e4dacf028..e393c1222f 100644
--- a/job.c
+++ b/job.c
@@ -242,7 +242,8 @@  bool job_cancel_requested(Job *job)
     return job->cancelled;
 }
 
-bool job_is_ready(Job *job)
+/* Called with job_mutex held. */
+bool job_is_ready_locked(Job *job)
 {
     switch (job->status) {
     case JOB_STATUS_UNDEFINED:
@@ -264,6 +265,13 @@  bool job_is_ready(Job *job)
     return false;
 }
 
+/* Called with job_mutex lock *not* held */
+bool job_is_ready(Job *job)
+{
+    JOB_LOCK_GUARD();
+    return job_is_ready_locked(job);
+}
+
 bool job_is_completed(Job *job)
 {
     switch (job->status) {
@@ -659,12 +667,18 @@  void job_dismiss(Job **jobptr, Error **errp)
     *jobptr = NULL;
 }
 
-void job_early_fail(Job *job)
+void job_early_fail_locked(Job *job)
 {
     assert(job->status == JOB_STATUS_CREATED);
     job_do_dismiss(job);
 }
 
+void job_early_fail(Job *job)
+{
+    JOB_LOCK_GUARD();
+    job_early_fail_locked(job);
+}
+
 static void job_conclude(Job *job)
 {
     job_state_transition(job, JOB_STATUS_CONCLUDED);