From patchwork Wed Jul 13 21:00:21 2016
X-Patchwork-Submitter: sergey.fedorov@linaro.org
X-Patchwork-Id: 9228483
From: Sergey Fedorov
To: qemu-devel@nongnu.org
Date: Thu, 14 Jul 2016 00:00:21 +0300
Message-Id: <1468443622-17368-12-git-send-email-sergey.fedorov@linaro.org>
In-Reply-To: <1468443622-17368-1-git-send-email-sergey.fedorov@linaro.org>
References: <1468443622-17368-1-git-send-email-sergey.fedorov@linaro.org>
Subject: [Qemu-devel] [PATCH v3 11/12] cpu-exec-common: Introduce async_safe_run_on_cpu()
Cc: MTTCG Devel, Peter Maydell, Riku Voipio, Sergey Fedorov, patches@linaro.org, Peter Crosthwaite, Alvise Rigo, "Emilio G. Cota", Paolo Bonzini, serge.fdrv@gmail.com, Richard Henderson, Alex Bennée, KONRAD Frédéric

From: Sergey Fedorov

This patch is based on ideas found in the work of KONRAD Frederic [1],
Alex Bennée [2], and Alvise Rigo [3].

This mechanism allows an operation to be performed safely in a quiescent
state. Quiescent state means: (1) no vCPU is running and (2) the BQL in
system-mode emulation, or 'exclusive_lock' in user-mode emulation, is
held while the operation is performed. This functionality is required,
for example, to safely flush the translation buffer in multi-threaded
user-mode emulation.

The existing CPU work queue is used to schedule such safe operations. A
new 'safe' flag is added to struct qemu_work_item to designate the
special requirements of safe work.
An operation in a quiescent state can be scheduled with the
async_safe_run_on_cpu() function, which is the same as
async_run_on_cpu() except that it marks the queued work item with the
'safe' flag set to true. When this flag is set, queue_work_on_cpu()
atomically increments the 'safe_work_pending' global counter and kicks
all CPUs instead of just the target CPU, as would be done for normal CPU
work. This forces the other CPUs to exit their execution loops and wait
in wait_safe_cpu_work() for the safe work to finish. When a CPU drains
its work queue and encounters a work item marked as safe, it first waits
for the other CPUs to exit their execution loops, then calls the work
item's function, and finally decrements the 'safe_work_pending' counter,
signalling the other CPUs so that they continue execution as soon as all
pending safe work items have been processed.

The 'tcg_pending_threads' counter, protected by 'exclusive_lock' in
user-mode or by 'qemu_global_mutex' in system-mode emulation, is used to
determine whether any CPU is still running and to wait for it to exit
the execution loop. Fairness across all the CPU work queues is ensured
by draining all pending safe work items before any CPU can run.
[1] http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg01128.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02531.html
[3] http://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04792.html

Signed-off-by: Sergey Fedorov
Signed-off-by: Sergey Fedorov
---
Changes in v3:
 - bsd-user supported
Changes in v2:
 - some conditional variables moved to cpu-exec-common.c
 - documentation comment for new public API added
---
 bsd-user/main.c         |  3 ++-
 cpu-exec-common.c       | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 cpus.c                  | 20 ++++++++++++++++++++
 include/exec/exec-all.h | 14 ++++++++++++++
 include/qom/cpu.h       | 14 ++++++++++++++
 linux-user/main.c       | 13 +++++++------
 6 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index f738dd64d691..5433bca0fca6 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -66,9 +66,10 @@ int cpu_get_pic_interrupt(CPUX86State *env)
 void qemu_init_cpu_loop(void)
 {
     /* We need to do this becuase process_queued_cpu_work() calls
-     * qemu_cond_broadcast() on it
+     * qemu_cond_broadcast() on them
      */
     qemu_cond_init(&qemu_work_cond);
+    qemu_cond_init(&qemu_safe_work_cond);
 }
 
 QemuMutex *qemu_get_cpu_work_mutex(void)
diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index a233f0124559..6f278d6d3b70 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -25,6 +25,7 @@
 
 bool exit_request;
 CPUState *tcg_current_cpu;
+int tcg_pending_threads;
 
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
@@ -79,6 +80,17 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
 }
 
 QemuCond qemu_work_cond;
+QemuCond qemu_safe_work_cond;
+QemuCond qemu_exclusive_cond;
+
+static int safe_work_pending;
+
+void wait_safe_cpu_work(void)
+{
+    while (atomic_mb_read(&safe_work_pending) > 0) {
+        qemu_cond_wait(&qemu_safe_work_cond, qemu_get_cpu_work_mutex());
+    }
+}
 
 static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
 {
@@ -91,9 +103,18 @@ static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
     cpu->queued_work_last = wi;
     wi->next = NULL;
     wi->done = false;
+    if (wi->safe) {
+        atomic_inc(&safe_work_pending);
+    }
     qemu_mutex_unlock(&cpu->work_mutex);
 
-    qemu_cpu_kick(cpu);
+    if (!wi->safe) {
+        qemu_cpu_kick(cpu);
+    } else {
+        CPU_FOREACH(cpu) {
+            qemu_cpu_kick(cpu);
+        }
+    }
 }
 
 void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
@@ -108,6 +129,7 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
     wi.func = func;
     wi.data = data;
     wi.free = false;
+    wi.safe = false;
 
     queue_work_on_cpu(cpu, &wi);
     while (!atomic_mb_read(&wi.done)) {
@@ -131,6 +153,20 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
     wi->func = func;
     wi->data = data;
     wi->free = true;
+    wi->safe = false;
+
+    queue_work_on_cpu(cpu, wi);
+}
+
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
+{
+    struct qemu_work_item *wi;
+
+    wi = g_malloc0(sizeof(struct qemu_work_item));
+    wi->func = func;
+    wi->data = data;
+    wi->free = true;
+    wi->safe = true;
 
     queue_work_on_cpu(cpu, wi);
 }
@@ -150,9 +186,20 @@ void process_queued_cpu_work(CPUState *cpu)
         if (!cpu->queued_work_first) {
             cpu->queued_work_last = NULL;
         }
+        if (wi->safe) {
+            while (tcg_pending_threads) {
+                qemu_cond_wait(&qemu_exclusive_cond,
+                               qemu_get_cpu_work_mutex());
+            }
+        }
         qemu_mutex_unlock(&cpu->work_mutex);
         wi->func(cpu, wi->data);
         qemu_mutex_lock(&cpu->work_mutex);
+        if (wi->safe) {
+            if (!atomic_dec_fetch(&safe_work_pending)) {
+                qemu_cond_broadcast(&qemu_safe_work_cond);
+            }
+        }
         if (wi->free) {
             g_free(wi);
         } else {
diff --git a/cpus.c b/cpus.c
index 282d7e399902..b7122043f650 100644
--- a/cpus.c
+++ b/cpus.c
@@ -903,6 +903,8 @@ void qemu_init_cpu_loop(void)
     qemu_cond_init(&qemu_cpu_cond);
     qemu_cond_init(&qemu_pause_cond);
     qemu_cond_init(&qemu_work_cond);
+    qemu_cond_init(&qemu_safe_work_cond);
+    qemu_cond_init(&qemu_exclusive_cond);
     qemu_cond_init(&qemu_io_proceeded_cond);
     qemu_mutex_init(&qemu_global_mutex);
@@ -926,6 +928,20 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
 {
 }
 
+/* called with qemu_global_mutex held */
+static inline void tcg_cpu_exec_start(CPUState *cpu)
+{
+    tcg_pending_threads++;
+}
+
+/* called with qemu_global_mutex held */
+static inline void tcg_cpu_exec_end(CPUState *cpu)
+{
+    if (!--tcg_pending_threads) {
+        qemu_cond_broadcast(&qemu_exclusive_cond);
+    }
+}
+
 static void qemu_wait_io_event_common(CPUState *cpu)
 {
     if (cpu->stop) {
@@ -950,6 +966,8 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
     CPU_FOREACH(cpu) {
         qemu_wait_io_event_common(cpu);
     }
+
+    wait_safe_cpu_work();
 }
 
 static void qemu_kvm_wait_io_event(CPUState *cpu)
@@ -1485,7 +1503,9 @@ static void tcg_exec_all(void)
                           (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
 
         if (cpu_can_run(cpu)) {
+            tcg_cpu_exec_start(cpu);
             r = tcg_cpu_exec(cpu);
+            tcg_cpu_exec_end(cpu);
             if (r == EXCP_DEBUG) {
                 cpu_handle_guest_debug(cpu);
                 break;
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 9fc22fc1c4e0..0f51e4ec0cb1 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -405,12 +405,22 @@ extern int singlestep;
 
 /* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
 extern CPUState *tcg_current_cpu;
+extern int tcg_pending_threads;
 extern bool exit_request;
 
 /**
  * qemu_work_cond - condition to wait for CPU work items completion
  */
 extern QemuCond qemu_work_cond;
+/**
+ * qemu_safe_work_cond - condition to wait for safe CPU work items completion
+ */
+extern QemuCond qemu_safe_work_cond;
+/**
+ * qemu_exclusive_cond - condition to wait for all TCG threads to be out of
+ * guest code execution loop
+ */
+extern QemuCond qemu_exclusive_cond;
 
 /**
  * qemu_get_cpu_work_mutex() - get the mutex which protects CPU work execution
@@ -423,5 +433,9 @@ QemuMutex *qemu_get_cpu_work_mutex(void);
  * @cpu: The CPU which work queue to process.
  */
 void process_queued_cpu_work(CPUState *cpu);
+/**
+ * wait_safe_cpu_work() - wait until all safe CPU work items have been processed
+ */
+void wait_safe_cpu_work(void);
 
 #endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index c19a673f9f68..ab67bf2ba19f 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -238,6 +238,7 @@ struct qemu_work_item {
     void *data;
     int done;
     bool free;
+    bool safe;
 };
 
 /**
@@ -632,6 +633,19 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 
 /**
+ * async_safe_run_on_cpu:
+ * @cpu: The vCPU to run on.
+ * @func: The function to be executed.
+ * @data: Data to pass to the function.
+ *
+ * Schedules the function @func for execution on the vCPU @cpu asynchronously
+ * and in quiescent state. Quiescent state means: (1) all other vCPUs are
+ * halted and (2) #qemu_global_mutex (a.k.a. BQL) in system-mode or
+ * #exclusive_lock in user-mode emulation is held while @func is executing.
+ */
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
+
+/**
  * qemu_get_cpu:
  * @index: The CPUState@cpu_index value of the CPU to obtain.
  *
diff --git a/linux-user/main.c b/linux-user/main.c
index fce61d5a35fc..d0ff5f9976e5 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -110,18 +110,17 @@ int cpu_get_pic_interrupt(CPUX86State *env)
    which requires quite a lot of per host/target work. */
 static QemuMutex cpu_list_mutex;
 static QemuMutex exclusive_lock;
-static QemuCond exclusive_cond;
 static QemuCond exclusive_resume;
 static bool exclusive_pending;
-static int tcg_pending_threads;
 
 void qemu_init_cpu_loop(void)
 {
     qemu_mutex_init(&cpu_list_mutex);
     qemu_mutex_init(&exclusive_lock);
-    qemu_cond_init(&exclusive_cond);
     qemu_cond_init(&exclusive_resume);
     qemu_cond_init(&qemu_work_cond);
+    qemu_cond_init(&qemu_safe_work_cond);
+    qemu_cond_init(&qemu_exclusive_cond);
 }
 
 /* Make sure everything is in a consistent state for calling fork().
  */
@@ -148,9 +147,10 @@ void fork_end(int child)
         exclusive_pending = false;
         qemu_mutex_init(&exclusive_lock);
         qemu_mutex_init(&cpu_list_mutex);
-        qemu_cond_init(&exclusive_cond);
         qemu_cond_init(&exclusive_resume);
         qemu_cond_init(&qemu_work_cond);
+        qemu_cond_init(&qemu_safe_work_cond);
+        qemu_cond_init(&qemu_exclusive_cond);
         qemu_mutex_init(&tcg_ctx.tb_ctx.tb_lock);
         gdbserver_fork(thread_cpu);
     } else {
@@ -190,7 +190,7 @@ static inline void start_exclusive(void)
         }
     }
     while (tcg_pending_threads) {
-        qemu_cond_wait(&exclusive_cond, &exclusive_lock);
+        qemu_cond_wait(&qemu_exclusive_cond, &exclusive_lock);
     }
 }
 
@@ -219,10 +219,11 @@ static inline void cpu_exec_end(CPUState *cpu)
     cpu->running = false;
     tcg_pending_threads--;
     if (!tcg_pending_threads) {
-        qemu_cond_signal(&exclusive_cond);
+        qemu_cond_broadcast(&qemu_exclusive_cond);
     }
     exclusive_idle();
     process_queued_cpu_work(cpu);
+    wait_safe_cpu_work();
     qemu_mutex_unlock(&exclusive_lock);
 }