Message ID | 20160526163549.3276-4-a.rigo@virtualopensystems.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Alvise Rigo <a.rigo@virtualopensystems.com> writes: > Introduce a new function that allows the calling VCPU to add a work item > to another VCPU (aka target VCPU). This new function differs from > async_run_on_cpu() since it makes the calling VCPU waiting for the target > VCPU to finish the work item. The mechanism makes use of the halt_cond > to wait and in case process pending work items. Isn't this exactly what would happen if you use run_on_cpu(). That will stall the current vCPU and busy wait until the work item is completed. > > Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com> > --- > cpus.c | 44 ++++++++++++++++++++++++++++++++++++++++++-- > include/qom/cpu.h | 31 +++++++++++++++++++++++++++++++ > 2 files changed, 73 insertions(+), 2 deletions(-) > > diff --git a/cpus.c b/cpus.c > index b9ec903..7bc96e2 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -89,7 +89,7 @@ static bool cpu_thread_is_idle(CPUState *cpu) > if (cpu->stop || cpu->queued_work_first) { > return false; > } > - if (cpu_is_stopped(cpu)) { > + if (cpu_is_stopped(cpu) || async_waiting_for_work(cpu)) { > return true; > } > if (!cpu->halted || cpu_has_work(cpu) || > @@ -1012,6 +1012,7 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) > wi->func = func; > wi->data = data; > wi->free = true; > + wi->wcpu = NULL; > > qemu_mutex_lock(&cpu->work_mutex); > if (cpu->queued_work_first == NULL) { > @@ -1027,6 +1028,40 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) > qemu_cpu_kick(cpu); > } > > +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, > + void *data) > +{ > + struct qemu_work_item *wwi; > + > + assert(wcpu != cpu); > + > + wwi = g_malloc0(sizeof(struct qemu_work_item)); > + wwi->func = func; > + wwi->data = data; > + wwi->free = true; > + wwi->wcpu = wcpu; > + > + /* Increase the number of pending work items */ > + atomic_inc(&wcpu->pending_work_items); > + > + qemu_mutex_lock(&cpu->work_mutex); > + /* Add the 
waiting work items at the beginning to free as soon as possible > + * the waiting CPU. */ > + if (cpu->queued_work_first == NULL) { > + cpu->queued_work_last = wwi; > + } else { > + wwi->next = cpu->queued_work_first; > + } > + cpu->queued_work_first = wwi; > + wwi->done = false; > + qemu_mutex_unlock(&cpu->work_mutex); > + > + qemu_cpu_kick(cpu); > + > + /* In order to wait, @wcpu has to exit the CPU loop */ > + cpu_exit(wcpu); > +} > + > /* > * Safe work interface > * > @@ -1120,6 +1155,10 @@ static void flush_queued_work(CPUState *cpu) > qemu_mutex_unlock(&cpu->work_mutex); > wi->func(cpu, wi->data); > qemu_mutex_lock(&cpu->work_mutex); > + if (wi->wcpu != NULL) { > + atomic_dec(&wi->wcpu->pending_work_items); > + qemu_cond_broadcast(wi->wcpu->halt_cond); > + } > if (wi->free) { > g_free(wi); > } else { > @@ -1406,7 +1445,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) > while (1) { > bool sleep = false; > > - if (cpu_can_run(cpu) && !async_safe_work_pending()) { > + if (cpu_can_run(cpu) && !async_safe_work_pending() > + && !async_waiting_for_work(cpu)) { > int r; > > atomic_inc(&tcg_scheduled_cpus); > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index 019f06d..7be82ed 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -259,6 +259,8 @@ struct qemu_work_item { > void *data; > int done; > bool free; > + /* CPU waiting for this work item to finish. If NULL, no CPU is waiting. */ > + CPUState *wcpu; > }; > > /** > @@ -303,6 +305,7 @@ struct qemu_work_item { > * @kvm_fd: vCPU file descriptor for KVM. > * @work_mutex: Lock to prevent multiple access to queued_work_*. > * @queued_work_first: First asynchronous work pending. > + * @pending_work_items: Work items for which the CPU needs to wait completion. > * > * State of one CPU core or thread. 
> */ > @@ -337,6 +340,7 @@ struct CPUState { > > QemuMutex work_mutex; > struct qemu_work_item *queued_work_first, *queued_work_last; > + int pending_work_items; > > CPUAddressSpace *cpu_ases; > int num_ases; > @@ -398,6 +402,9 @@ struct CPUState { > * by a stcond (see softmmu_template.h). */ > bool excl_succeeded; > > + /* True if some CPU requested a TLB flush for this CPU. */ > + bool pending_tlb_flush; > + > /* Note that this is accessed at the start of every TB via a negative > offset from AREG0. Leave this field at the end so as to make the > (absolute value) offset as small as possible. This reduces code > @@ -680,6 +687,19 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); > void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); > > /** > + * async_wait_run_on_cpu: > + * @cpu: The vCPU to run on. > + * @wpu: The vCPU submitting the work. > + * @func: The function to be executed. > + * @data: Data to pass to the function. > + * > + * Schedules the function @func for execution on the vCPU @cpu asynchronously. > + * The vCPU @wcpu will wait for @cpu to finish the job. > + */ > +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, > + void *data); > + > +/** > * async_safe_run_on_cpu: > * @cpu: The vCPU to run on. > * @func: The function to be executed. > @@ -699,6 +719,17 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); > bool async_safe_work_pending(void); > > /** > + * async_waiting_for_work: > + * > + * Check whether there are work items for which @cpu is waiting completion. > + * Returns: @true if work items are pending for completion, @false otherwise. > + */ > +static inline bool async_waiting_for_work(CPUState *cpu) > +{ > + return atomic_mb_read(&cpu->pending_work_items) != 0; > +} > + > +/** > * qemu_get_cpu: > * @index: The CPUState@cpu_index value of the CPU to obtain. > * -- Alex Bennée
Using run_on_cpu() we might deadlock QEMU if other vCPUs are waiting for the current vCPU. We need to exit from the vCPU loop in order to avoid this. Regards, alvise On Wed, Jun 8, 2016 at 3:54 PM, Alex Bennée <alex.bennee@linaro.org> wrote: > > Alvise Rigo <a.rigo@virtualopensystems.com> writes: > >> Introduce a new function that allows the calling VCPU to add a work item >> to another VCPU (aka target VCPU). This new function differs from >> async_run_on_cpu() since it makes the calling VCPU waiting for the target >> VCPU to finish the work item. The mechanism makes use of the halt_cond >> to wait and in case process pending work items. > > Isn't this exactly what would happen if you use run_on_cpu(). That will > stall the current vCPU and busy wait until the work item is completed. > >> >> Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com> >> --- >> cpus.c | 44 ++++++++++++++++++++++++++++++++++++++++++-- >> include/qom/cpu.h | 31 +++++++++++++++++++++++++++++++ >> 2 files changed, 73 insertions(+), 2 deletions(-) >> >> diff --git a/cpus.c b/cpus.c >> index b9ec903..7bc96e2 100644 >> --- a/cpus.c >> +++ b/cpus.c >> @@ -89,7 +89,7 @@ static bool cpu_thread_is_idle(CPUState *cpu) >> if (cpu->stop || cpu->queued_work_first) { >> return false; >> } >> - if (cpu_is_stopped(cpu)) { >> + if (cpu_is_stopped(cpu) || async_waiting_for_work(cpu)) { >> return true; >> } >> if (!cpu->halted || cpu_has_work(cpu) || >> @@ -1012,6 +1012,7 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) >> wi->func = func; >> wi->data = data; >> wi->free = true; >> + wi->wcpu = NULL; >> >> qemu_mutex_lock(&cpu->work_mutex); >> if (cpu->queued_work_first == NULL) { >> @@ -1027,6 +1028,40 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) >> qemu_cpu_kick(cpu); >> } >> >> +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, >> + void *data) >> +{ >> + struct qemu_work_item *wwi; >> + >> + assert(wcpu != cpu); >> + 
>> + wwi = g_malloc0(sizeof(struct qemu_work_item)); >> + wwi->func = func; >> + wwi->data = data; >> + wwi->free = true; >> + wwi->wcpu = wcpu; >> + >> + /* Increase the number of pending work items */ >> + atomic_inc(&wcpu->pending_work_items); >> + >> + qemu_mutex_lock(&cpu->work_mutex); >> + /* Add the waiting work items at the beginning to free as soon as possible >> + * the waiting CPU. */ >> + if (cpu->queued_work_first == NULL) { >> + cpu->queued_work_last = wwi; >> + } else { >> + wwi->next = cpu->queued_work_first; >> + } >> + cpu->queued_work_first = wwi; >> + wwi->done = false; >> + qemu_mutex_unlock(&cpu->work_mutex); >> + >> + qemu_cpu_kick(cpu); >> + >> + /* In order to wait, @wcpu has to exit the CPU loop */ >> + cpu_exit(wcpu); >> +} >> + >> /* >> * Safe work interface >> * >> @@ -1120,6 +1155,10 @@ static void flush_queued_work(CPUState *cpu) >> qemu_mutex_unlock(&cpu->work_mutex); >> wi->func(cpu, wi->data); >> qemu_mutex_lock(&cpu->work_mutex); >> + if (wi->wcpu != NULL) { >> + atomic_dec(&wi->wcpu->pending_work_items); >> + qemu_cond_broadcast(wi->wcpu->halt_cond); >> + } >> if (wi->free) { >> g_free(wi); >> } else { >> @@ -1406,7 +1445,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) >> while (1) { >> bool sleep = false; >> >> - if (cpu_can_run(cpu) && !async_safe_work_pending()) { >> + if (cpu_can_run(cpu) && !async_safe_work_pending() >> + && !async_waiting_for_work(cpu)) { >> int r; >> >> atomic_inc(&tcg_scheduled_cpus); >> diff --git a/include/qom/cpu.h b/include/qom/cpu.h >> index 019f06d..7be82ed 100644 >> --- a/include/qom/cpu.h >> +++ b/include/qom/cpu.h >> @@ -259,6 +259,8 @@ struct qemu_work_item { >> void *data; >> int done; >> bool free; >> + /* CPU waiting for this work item to finish. If NULL, no CPU is waiting. */ >> + CPUState *wcpu; >> }; >> >> /** >> @@ -303,6 +305,7 @@ struct qemu_work_item { >> * @kvm_fd: vCPU file descriptor for KVM. >> * @work_mutex: Lock to prevent multiple access to queued_work_*. 
>> * @queued_work_first: First asynchronous work pending. >> + * @pending_work_items: Work items for which the CPU needs to wait completion. >> * >> * State of one CPU core or thread. >> */ >> @@ -337,6 +340,7 @@ struct CPUState { >> >> QemuMutex work_mutex; >> struct qemu_work_item *queued_work_first, *queued_work_last; >> + int pending_work_items; >> >> CPUAddressSpace *cpu_ases; >> int num_ases; >> @@ -398,6 +402,9 @@ struct CPUState { >> * by a stcond (see softmmu_template.h). */ >> bool excl_succeeded; >> >> + /* True if some CPU requested a TLB flush for this CPU. */ >> + bool pending_tlb_flush; >> + >> /* Note that this is accessed at the start of every TB via a negative >> offset from AREG0. Leave this field at the end so as to make the >> (absolute value) offset as small as possible. This reduces code >> @@ -680,6 +687,19 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >> void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >> >> /** >> + * async_wait_run_on_cpu: >> + * @cpu: The vCPU to run on. >> + * @wpu: The vCPU submitting the work. >> + * @func: The function to be executed. >> + * @data: Data to pass to the function. >> + * >> + * Schedules the function @func for execution on the vCPU @cpu asynchronously. >> + * The vCPU @wcpu will wait for @cpu to finish the job. >> + */ >> +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, >> + void *data); >> + >> +/** >> * async_safe_run_on_cpu: >> * @cpu: The vCPU to run on. >> * @func: The function to be executed. >> @@ -699,6 +719,17 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); >> bool async_safe_work_pending(void); >> >> /** >> + * async_waiting_for_work: >> + * >> + * Check whether there are work items for which @cpu is waiting completion. >> + * Returns: @true if work items are pending for completion, @false otherwise. 
>> + */ >> +static inline bool async_waiting_for_work(CPUState *cpu) >> +{ >> + return atomic_mb_read(&cpu->pending_work_items) != 0; >> +} >> + >> +/** >> * qemu_get_cpu: >> * @index: The CPUState@cpu_index value of the CPU to obtain. >> * > > > -- > Alex Bennée
On 08/06/16 17:10, alvise rigo wrote: > Using run_on_cpu() we might deadlock QEMU if other vCPUs are waiting > for the current vCPU. We need to exit from the vCPU loop in order to > avoid this. I see, we could deadlock indeed. Alternatively, we may want to fix run_on_cpu() to avoid waiting for completion by itself when called from the CPU loop. Kind regards, Sergey
Sergey Fedorov <serge.fdrv@gmail.com> writes: > On 08/06/16 17:10, alvise rigo wrote: >> Using run_on_cpu() we might deadlock QEMU if other vCPUs are waiting >> for the current vCPU. We need to exit from the vCPU loop in order to >> avoid this. > > I see, we could deadlock indeed. Alternatively, we may want fix > run_on_cpu() to avoid waiting for completion by itself when called from > CPU loop. async_safe_run_on_cpu can't deadlock as all vCPUs are suspended (or waiting) for the work to complete. The tasks are run in strict order so if you queued async tasks for other vCPUs first you could ensure everything is in the state you want it when you finally service the calling vCPU. > > Kind regards, > Sergey -- Alex Bennée
I think that async_safe_run_on_cpu() does a different thing: it queries a job to the target vCPU and wants all the other to "observe" the submitted task. However, we will have the certainty that only the target vCPU observed the task, the other might still be running in the guest code. alvise On Wed, Jun 8, 2016 at 5:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote: > > Sergey Fedorov <serge.fdrv@gmail.com> writes: > >> On 08/06/16 17:10, alvise rigo wrote: >>> Using run_on_cpu() we might deadlock QEMU if other vCPUs are waiting >>> for the current vCPU. We need to exit from the vCPU loop in order to >>> avoid this. >> >> I see, we could deadlock indeed. Alternatively, we may want fix >> run_on_cpu() to avoid waiting for completion by itself when called from >> CPU loop. > > async_safe_run_on_cpu can't deadlock as all vCPUs are suspended (or > waiting) for the work to complete. The tasks are run in strict order so > if you queued async tasks for other vCPUs first you could ensure > everything is in the state you want it when you finally service the > calling vCPU. > >> >> Kind regards, >> Sergey > > > -- > Alex Bennée
alvise rigo <a.rigo@virtualopensystems.com> writes: > I think that async_safe_run_on_cpu() does a different thing: it > queries a job to the target vCPU and wants all the other to "observe" > the submitted task. However, we will have the certainty that only the > target vCPU observed the task, the other might still be running in the > guest code. For the code to have run every will have come out of the run loop and synced up at that point. No safe work is run with guest code executing. > > alvise > > On Wed, Jun 8, 2016 at 5:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote: >> >> Sergey Fedorov <serge.fdrv@gmail.com> writes: >> >>> On 08/06/16 17:10, alvise rigo wrote: >>>> Using run_on_cpu() we might deadlock QEMU if other vCPUs are waiting >>>> for the current vCPU. We need to exit from the vCPU loop in order to >>>> avoid this. >>> >>> I see, we could deadlock indeed. Alternatively, we may want fix >>> run_on_cpu() to avoid waiting for completion by itself when called from >>> CPU loop. >> >> async_safe_run_on_cpu can't deadlock as all vCPUs are suspended (or >> waiting) for the work to complete. The tasks are run in strict order so >> if you queued async tasks for other vCPUs first you could ensure >> everything is in the state you want it when you finally service the >> calling vCPU. >> >>> >>> Kind regards, >>> Sergey >> >> >> -- >> Alex Bennée -- Alex Bennée
diff --git a/cpus.c b/cpus.c index b9ec903..7bc96e2 100644 --- a/cpus.c +++ b/cpus.c @@ -89,7 +89,7 @@ static bool cpu_thread_is_idle(CPUState *cpu) if (cpu->stop || cpu->queued_work_first) { return false; } - if (cpu_is_stopped(cpu)) { + if (cpu_is_stopped(cpu) || async_waiting_for_work(cpu)) { return true; } if (!cpu->halted || cpu_has_work(cpu) || @@ -1012,6 +1012,7 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) wi->func = func; wi->data = data; wi->free = true; + wi->wcpu = NULL; qemu_mutex_lock(&cpu->work_mutex); if (cpu->queued_work_first == NULL) { @@ -1027,6 +1028,40 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data) qemu_cpu_kick(cpu); } +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, + void *data) +{ + struct qemu_work_item *wwi; + + assert(wcpu != cpu); + + wwi = g_malloc0(sizeof(struct qemu_work_item)); + wwi->func = func; + wwi->data = data; + wwi->free = true; + wwi->wcpu = wcpu; + + /* Increase the number of pending work items */ + atomic_inc(&wcpu->pending_work_items); + + qemu_mutex_lock(&cpu->work_mutex); + /* Add the waiting work items at the beginning to free as soon as possible + * the waiting CPU. 
*/ + if (cpu->queued_work_first == NULL) { + cpu->queued_work_last = wwi; + } else { + wwi->next = cpu->queued_work_first; + } + cpu->queued_work_first = wwi; + wwi->done = false; + qemu_mutex_unlock(&cpu->work_mutex); + + qemu_cpu_kick(cpu); + + /* In order to wait, @wcpu has to exit the CPU loop */ + cpu_exit(wcpu); +} + /* * Safe work interface * @@ -1120,6 +1155,10 @@ static void flush_queued_work(CPUState *cpu) qemu_mutex_unlock(&cpu->work_mutex); wi->func(cpu, wi->data); qemu_mutex_lock(&cpu->work_mutex); + if (wi->wcpu != NULL) { + atomic_dec(&wi->wcpu->pending_work_items); + qemu_cond_broadcast(wi->wcpu->halt_cond); + } if (wi->free) { g_free(wi); } else { @@ -1406,7 +1445,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) while (1) { bool sleep = false; - if (cpu_can_run(cpu) && !async_safe_work_pending()) { + if (cpu_can_run(cpu) && !async_safe_work_pending() + && !async_waiting_for_work(cpu)) { int r; atomic_inc(&tcg_scheduled_cpus); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 019f06d..7be82ed 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -259,6 +259,8 @@ struct qemu_work_item { void *data; int done; bool free; + /* CPU waiting for this work item to finish. If NULL, no CPU is waiting. */ + CPUState *wcpu; }; /** @@ -303,6 +305,7 @@ struct qemu_work_item { * @kvm_fd: vCPU file descriptor for KVM. * @work_mutex: Lock to prevent multiple access to queued_work_*. * @queued_work_first: First asynchronous work pending. + * @pending_work_items: Work items for which the CPU needs to wait completion. * * State of one CPU core or thread. */ @@ -337,6 +340,7 @@ struct CPUState { QemuMutex work_mutex; struct qemu_work_item *queued_work_first, *queued_work_last; + int pending_work_items; CPUAddressSpace *cpu_ases; int num_ases; @@ -398,6 +402,9 @@ struct CPUState { * by a stcond (see softmmu_template.h). */ bool excl_succeeded; + /* True if some CPU requested a TLB flush for this CPU. 
*/ + bool pending_tlb_flush; + /* Note that this is accessed at the start of every TB via a negative offset from AREG0. Leave this field at the end so as to make the (absolute value) offset as small as possible. This reduces code @@ -680,6 +687,19 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); /** + * async_wait_run_on_cpu: + * @cpu: The vCPU to run on. + * @wpu: The vCPU submitting the work. + * @func: The function to be executed. + * @data: Data to pass to the function. + * + * Schedules the function @func for execution on the vCPU @cpu asynchronously. + * The vCPU @wcpu will wait for @cpu to finish the job. + */ +void async_wait_run_on_cpu(CPUState *cpu, CPUState *wcpu, run_on_cpu_func func, + void *data); + +/** * async_safe_run_on_cpu: * @cpu: The vCPU to run on. * @func: The function to be executed. @@ -699,6 +719,17 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data); bool async_safe_work_pending(void); /** + * async_waiting_for_work: + * + * Check whether there are work items for which @cpu is waiting completion. + * Returns: @true if work items are pending for completion, @false otherwise. + */ +static inline bool async_waiting_for_work(CPUState *cpu) +{ + return atomic_mb_read(&cpu->pending_work_items) != 0; +} + +/** * qemu_get_cpu: * @index: The CPUState@cpu_index value of the CPU to obtain. *
Introduce a new function that allows the calling VCPU to add a work item to another VCPU (aka target VCPU). This new function differs from async_run_on_cpu() since it makes the calling VCPU wait for the target VCPU to finish the work item. The mechanism makes use of the halt_cond to wait and, if needed, to process pending work items. Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com> --- cpus.c | 44 ++++++++++++++++++++++++++++++++++++++++++-- include/qom/cpu.h | 31 +++++++++++++++++++++++++++++++ 2 files changed, 73 insertions(+), 2 deletions(-)