Message ID: 20190130004811.27372-73-cota@braap.org (mailing list archive)
State:      New, archived
Series:     per-CPU locks
Emilio G. Cota <cota@braap.org> writes:

> Some async jobs do not need the BQL.
>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  include/qom/cpu.h | 14 ++++++++++++++
>  cpus-common.c     | 39 ++++++++++++++++++++++++++++++++++-----
>  2 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 30ed2fae0b..bb0729f969 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -884,9 +884,23 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data);
>   * @data: Data to pass to the function.
>   *
>   * Schedules the function @func for execution on the vCPU @cpu asynchronously.
> + * See also: async_run_on_cpu_no_bql()
>   */
>  void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data);
>
> +/**
> + * async_run_on_cpu_no_bql:
> + * @cpu: The vCPU to run on.
> + * @func: The function to be executed.
> + * @data: Data to pass to the function.
> + *
> + * Schedules the function @func for execution on the vCPU @cpu asynchronously.
> + * This function is run outside the BQL.
> + * See also: async_run_on_cpu()
> + */
> +void async_run_on_cpu_no_bql(CPUState *cpu, run_on_cpu_func func,
> +                             run_on_cpu_data data);
> +

So we now have a locking/scheduling hierarchy that goes:

 - run_on_cpu - synchronously wait until the target cpu has done the thing
 - async_run_on_cpu - schedule work on the cpu at some point (soon);
   resources protected by the BQL
 - async_run_on_cpu_no_bql - as above, but only protected by the cpu lock
 - async_safe_run_on_cpu - as above, but locking (probably) not required as
   everything else is asleep

So the BQL is only really needed to manipulate data that is shared across
multiple vCPUs, like device emulation or other state shared across multiple
vCPUs. For all "just do it over there" cases we should be able to stick to
cpu locks.

It would be nice if we could expand the documentation in
multi-thread-tcg.txt to cover this in long form for people trying to work
out the best thing to use. Anyway:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>  /**
>   * async_safe_run_on_cpu:
>   * @cpu: The vCPU to run on.
> diff --git a/cpus-common.c b/cpus-common.c
> index 1241024b2c..5832a8bf37 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -109,6 +109,7 @@ struct qemu_work_item {
>      run_on_cpu_func func;
>      run_on_cpu_data data;
>      bool free, exclusive, done;
> +    bool bql;
>  };
>
>  /* Called with the CPU's lock held */
> @@ -155,6 +156,7 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
>      wi.done = false;
>      wi.free = false;
>      wi.exclusive = false;
> +    wi.bql = true;
>
>      cpu_mutex_lock(cpu);
>      queue_work_on_cpu_locked(cpu, &wi);
> @@ -179,6 +181,21 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
>      wi->func = func;
>      wi->data = data;
>      wi->free = true;
> +    wi->bql = true;
> +
> +    queue_work_on_cpu(cpu, wi);
> +}
> +
> +void async_run_on_cpu_no_bql(CPUState *cpu, run_on_cpu_func func,
> +                             run_on_cpu_data data)
> +{
> +    struct qemu_work_item *wi;
> +
> +    wi = g_malloc0(sizeof(struct qemu_work_item));
> +    wi->func = func;
> +    wi->data = data;
> +    wi->free = true;
> +    /* wi->bql initialized to false */
>
>      queue_work_on_cpu(cpu, wi);
>  }
> @@ -323,6 +340,7 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func,
>      wi->data = data;
>      wi->free = true;
>      wi->exclusive = true;
> +    /* wi->bql initialized to false */
>
>      queue_work_on_cpu(cpu, wi);
>  }
> @@ -347,6 +365,7 @@ static void process_queued_cpu_work_locked(CPUState *cpu)
>               * BQL, so it goes to sleep; start_exclusive() is sleeping too, so
>               * neither CPU can proceed.
>               */
> +            g_assert(!wi->bql);
>              if (has_bql) {
>                  qemu_mutex_unlock_iothread();
>              }
> @@ -357,12 +376,22 @@ static void process_queued_cpu_work_locked(CPUState *cpu)
>                  qemu_mutex_lock_iothread();
>              }
>          } else {
> -            if (has_bql) {
> -                wi->func(cpu, wi->data);
> +            if (wi->bql) {
> +                if (has_bql) {
> +                    wi->func(cpu, wi->data);
> +                } else {
> +                    qemu_mutex_lock_iothread();
> +                    wi->func(cpu, wi->data);
> +                    qemu_mutex_unlock_iothread();
> +                }
>              } else {
> -                qemu_mutex_lock_iothread();
> -                wi->func(cpu, wi->data);
> -                qemu_mutex_unlock_iothread();
> +                if (has_bql) {
> +                    qemu_mutex_unlock_iothread();
> +                    wi->func(cpu, wi->data);
> +                    qemu_mutex_lock_iothread();
> +                } else {
> +                    wi->func(cpu, wi->data);
> +                }
>              }
>          }
>          cpu_mutex_lock(cpu);

--
Alex Bennée
On Fri, Feb 08, 2019 at 14:58:40 +0000, Alex Bennée wrote:
>
> Emilio G. Cota <cota@braap.org> writes:
>
> > Some async jobs do not need the BQL.
> >
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > Signed-off-by: Emilio G. Cota <cota@braap.org>
(snip)
> So we now have a locking/scheduling hierarchy that goes:
>
> - run_on_cpu - synchronously wait until target cpu has done the thing
> - async_run_on_cpu - schedule work on cpu at some point (soon),
>   resources protected by BQL
> - async_run_on_cpu_no_bql - as above but only protected by cpu_lock

This one doesn't hold any locks when calling the passed function, though.
The CPU lock is just to serialise the queueing/dequeueing.

> - async_safe_run_on_cpu - as above but locking (probably) not required
>   as everything else is asleep
>
> So the BQL is only really needed to manipulate data that is shared
> across multiple vCPUs like device emulation or other state shared across
> multiple vCPUs. For all "just do it over there" cases we should be able
> to stick to cpu locks.
>
> It would be nice if we could expand the documentation in
> multi-thread-tcg.txt to cover this in long form for people trying to
> work out the best thing to use.

I think the documentation in the functions is probably enough -- they
point to each other, so figuring out what to use should be pretty easy.
Besides, these functions aren't MTTCG-only, they're QEMU-wide, so perhaps
their documentation should go in a different file?

Since this can be done later, I'll post a v7 with the other changes you
suggested.

> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Thanks!

		Emilio
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 30ed2fae0b..bb0729f969 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -884,9 +884,23 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data);
  * @data: Data to pass to the function.
  *
  * Schedules the function @func for execution on the vCPU @cpu asynchronously.
+ * See also: async_run_on_cpu_no_bql()
  */
 void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data);
 
+/**
+ * async_run_on_cpu_no_bql:
+ * @cpu: The vCPU to run on.
+ * @func: The function to be executed.
+ * @data: Data to pass to the function.
+ *
+ * Schedules the function @func for execution on the vCPU @cpu asynchronously.
+ * This function is run outside the BQL.
+ * See also: async_run_on_cpu()
+ */
+void async_run_on_cpu_no_bql(CPUState *cpu, run_on_cpu_func func,
+                             run_on_cpu_data data);
+
 /**
  * async_safe_run_on_cpu:
  * @cpu: The vCPU to run on.
diff --git a/cpus-common.c b/cpus-common.c
index 1241024b2c..5832a8bf37 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -109,6 +109,7 @@ struct qemu_work_item {
     run_on_cpu_func func;
     run_on_cpu_data data;
     bool free, exclusive, done;
+    bool bql;
 };
 
 /* Called with the CPU's lock held */
@@ -155,6 +156,7 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
     wi.done = false;
     wi.free = false;
     wi.exclusive = false;
+    wi.bql = true;
 
     cpu_mutex_lock(cpu);
     queue_work_on_cpu_locked(cpu, &wi);
@@ -179,6 +181,21 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
     wi->func = func;
     wi->data = data;
     wi->free = true;
+    wi->bql = true;
+
+    queue_work_on_cpu(cpu, wi);
+}
+
+void async_run_on_cpu_no_bql(CPUState *cpu, run_on_cpu_func func,
+                             run_on_cpu_data data)
+{
+    struct qemu_work_item *wi;
+
+    wi = g_malloc0(sizeof(struct qemu_work_item));
+    wi->func = func;
+    wi->data = data;
+    wi->free = true;
+    /* wi->bql initialized to false */
 
     queue_work_on_cpu(cpu, wi);
 }
@@ -323,6 +340,7 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func,
     wi->data = data;
     wi->free = true;
     wi->exclusive = true;
+    /* wi->bql initialized to false */
 
     queue_work_on_cpu(cpu, wi);
 }
@@ -347,6 +365,7 @@ static void process_queued_cpu_work_locked(CPUState *cpu)
              * BQL, so it goes to sleep; start_exclusive() is sleeping too, so
              * neither CPU can proceed.
              */
+            g_assert(!wi->bql);
             if (has_bql) {
                 qemu_mutex_unlock_iothread();
             }
@@ -357,12 +376,22 @@ static void process_queued_cpu_work_locked(CPUState *cpu)
                 qemu_mutex_lock_iothread();
             }
         } else {
-            if (has_bql) {
-                wi->func(cpu, wi->data);
+            if (wi->bql) {
+                if (has_bql) {
+                    wi->func(cpu, wi->data);
+                } else {
+                    qemu_mutex_lock_iothread();
+                    wi->func(cpu, wi->data);
+                    qemu_mutex_unlock_iothread();
+                }
             } else {
-                qemu_mutex_lock_iothread();
-                wi->func(cpu, wi->data);
-                qemu_mutex_unlock_iothread();
+                if (has_bql) {
+                    qemu_mutex_unlock_iothread();
+                    wi->func(cpu, wi->data);
+                    qemu_mutex_lock_iothread();
+                } else {
+                    wi->func(cpu, wi->data);
+                }
             }
         }
         cpu_mutex_lock(cpu);