From patchwork Fri Oct 15 16:12:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Greg Kurz
X-Patchwork-Id: 12562427
From: Greg Kurz
To: qemu-devel@nongnu.org
Subject: [PATCH 1/2] rcu: Introduce force_rcu notifier
Date: Fri, 15 Oct 2021 18:12:17 +0200
Message-Id: <20211015161218.1231920-2-groug@kaod.org>
In-Reply-To: <20211015161218.1231920-1-groug@kaod.org>
References: <20211015161218.1231920-1-groug@kaod.org>
Cc: Eduardo Habkost, Richard Henderson, Greg Kurz, qemu-stable@nongnu.org, Paolo Bonzini

The drain_call_rcu() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .

This can be avoided by allowing drain_call_rcu() to enforce an RCU grace
period.
Since each reader might need to do specific actions to end a read-side
critical section, do it with notifiers.

Prepare ground for this by adding a NotifierList and use it in
wait_for_readers() if drain_call_rcu() is in progress. Readers can
now optionally specify a Notifier to be called in this case at
thread registration time. The current rcu_register_thread() API is
preserved for readers that don't need this. The notifier is removed
automatically when the thread unregisters.

This is largely based on a draft from Paolo Bonzini.

Suggested-by: Paolo Bonzini
Signed-off-by: Greg Kurz
---
 include/qemu/rcu.h | 21 ++++++++++++++++++++-
 util/rcu.c         | 23 +++++++++++++++++++++--
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index 515d327cf11c..498e4e5e3479 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -27,6 +27,7 @@
 #include "qemu/thread.h"
 #include "qemu/queue.h"
 #include "qemu/atomic.h"
+#include "qemu/notify.h"
 #include "qemu/sys_membarrier.h"
 
 #ifdef __cplusplus
@@ -66,6 +67,13 @@ struct rcu_reader_data {
 
     /* Data used for registry, protected by rcu_registry_lock */
     QLIST_ENTRY(rcu_reader_data) node;
+
+    /*
+     * Notifier used to force an RCU grace period. Accessed under
+     * rcu_registry_lock. Note that the notifier is called _outside_
+     * the thread!
+     */
+    Notifier *force_rcu;
 };
 
 extern __thread struct rcu_reader_data rcu_reader;
@@ -114,8 +122,19 @@ extern void synchronize_rcu(void);
 
 /*
  * Reader thread registration.
+ *
+ * The caller can specify an optional notifier if it wants RCU
+ * to enforce grace periods. This is needed by drain_call_rcu().
+ * Note that the notifier is executed in the context of the RCU
+ * thread.
  */
-extern void rcu_register_thread(void);
+extern void rcu_register_thread_with_force_rcu(Notifier *n);
+
+static inline void rcu_register_thread(void)
+{
+    rcu_register_thread_with_force_rcu(NULL);
+}
+
 extern void rcu_unregister_thread(void);
 
 /*
diff --git a/util/rcu.c b/util/rcu.c
index 13ac0f75cb2a..da3506917fa8 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -46,9 +46,17 @@
 unsigned long rcu_gp_ctr = RCU_GP_LOCKED;
 
 QemuEvent rcu_gp_event;
+static int in_drain_call_rcu;
 static QemuMutex rcu_registry_lock;
 static QemuMutex rcu_sync_lock;
 
+/*
+ * NotifierList used to force an RCU grace period. Accessed under
+ * rcu_registry_lock.
+ */
+static NotifierList force_rcu_notifiers =
+    NOTIFIER_LIST_INITIALIZER(force_rcu_notifiers);
+
 /*
  * Check whether a quiescent state was crossed between the beginning of
  * update_counter_and_wait and now.
@@ -107,6 +115,8 @@ static void wait_for_readers(void)
                  * get some extra futex wakeups.
                  */
                 qatomic_set(&index->waiting, false);
+            } else if (qatomic_read(&in_drain_call_rcu)) {
+                notifier_list_notify(&force_rcu_notifiers, NULL);
             }
         }
 
@@ -293,7 +303,6 @@ void call_rcu1(struct rcu_head *node, void (*func)(struct rcu_head *node))
     qemu_event_set(&rcu_call_ready_event);
 }
 
-
 struct rcu_drain {
     struct rcu_head rcu;
     QemuEvent drain_complete_event;
@@ -339,8 +348,10 @@ void drain_call_rcu(void)
      * assumed.
      */
 
+    qatomic_inc(&in_drain_call_rcu);
     call_rcu1(&rcu_drain.rcu, drain_rcu_callback);
     qemu_event_wait(&rcu_drain.drain_complete_event);
+    qatomic_dec(&in_drain_call_rcu);
 
     if (locked) {
         qemu_mutex_lock_iothread();
@@ -348,10 +359,14 @@ void drain_call_rcu(void)
 
 }
 
-void rcu_register_thread(void)
+void rcu_register_thread_with_force_rcu(Notifier *n)
 {
     assert(rcu_reader.ctr == 0);
     qemu_mutex_lock(&rcu_registry_lock);
+    if (n) {
+        rcu_reader.force_rcu = n;
+        notifier_list_add(&force_rcu_notifiers, rcu_reader.force_rcu);
+    }
     QLIST_INSERT_HEAD(&registry, &rcu_reader, node);
     qemu_mutex_unlock(&rcu_registry_lock);
 }
@@ -360,6 +375,10 @@ void rcu_unregister_thread(void)
 {
     qemu_mutex_lock(&rcu_registry_lock);
     QLIST_REMOVE(&rcu_reader, node);
+    if (rcu_reader.force_rcu) {
+        notifier_remove(rcu_reader.force_rcu);
+        rcu_reader.force_rcu = NULL;
+    }
     qemu_mutex_unlock(&rcu_registry_lock);
 }
 

From patchwork Fri Oct 15 16:12:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Greg Kurz
X-Patchwork-Id: 12562425
From: Greg Kurz
To: qemu-devel@nongnu.org
Subject: [PATCH 2/2] accel/tcg: Register a force_rcu notifier
Date: Fri, 15 Oct 2021 18:12:18 +0200
Message-Id: <20211015161218.1231920-3-groug@kaod.org>
In-Reply-To: <20211015161218.1231920-1-groug@kaod.org>
References: <20211015161218.1231920-1-groug@kaod.org>
Cc: Eduardo Habkost, Richard Henderson, Greg Kurz, qemu-stable@nongnu.org, Paolo Bonzini

A TCG vCPU doing a busy loop systematically hangs the QEMU monitor
if the user passes 'device_add' without argument. This is because
drain_call_rcu(), which is called from qmp_device_add(), cannot return
if readers don't exit read-side critical sections. That is typically
what busy-looping TCG vCPUs do, both in MTTCG and RR modes:

int cpu_exec(CPUState *cpu)
{
[...]
    rcu_read_lock();
[...]
    while (!cpu_handle_exception(cpu, &ret)) {
        // Busy loop keeps vCPU here
    }
[...]
    rcu_read_unlock();

    return ret;
}

Have all vCPUs register a force_rcu notifier that will kick them out
of the loop using async_run_on_cpu(). The notifier implementation is
shared by MTTCG and RR since both are affected.

Suggested-by: Paolo Bonzini
Fixes: 7bed89958bfb ("device_core: use drain_call_rcu in in qmp_device_add")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/650
Signed-off-by: Greg Kurz
---
 accel/tcg/tcg-accel-ops-mttcg.c |  3 ++-
 accel/tcg/tcg-accel-ops-rr.c    |  3 ++-
 accel/tcg/tcg-accel-ops.c       | 11 +++++++++++
 accel/tcg/tcg-accel-ops.h       |  2 ++
 include/hw/core/cpu.h           |  2 ++
 5 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 847d2079d21f..114c10cb22e2 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -48,7 +48,8 @@ static void *mttcg_cpu_thread_fn(void *arg)
     assert(tcg_enabled());
     g_assert(!icount_enabled());
 
-    rcu_register_thread();
+    cpu->force_rcu.notify = tcg_cpus_force_rcu;
+    rcu_register_thread_with_force_rcu(&cpu->force_rcu);
     tcg_register_thread();
 
     qemu_mutex_lock_iothread();
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index a5fd26190e20..90cc02221faa 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -146,7 +146,8 @@ static void *rr_cpu_thread_fn(void *arg)
     CPUState *cpu = arg;
 
     assert(tcg_enabled());
-    rcu_register_thread();
+    cpu->force_rcu.notify = tcg_cpus_force_rcu;
+    rcu_register_thread_with_force_rcu(&cpu->force_rcu);
     tcg_register_thread();
 
     qemu_mutex_lock_iothread();
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 1a8e8390bd60..58a88a34adaf 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -91,6 +91,17 @@ void tcg_handle_interrupt(CPUState *cpu, int mask)
     }
 }
 
+static void do_nothing(CPUState *cpu, run_on_cpu_data d)
+{
+}
+
+void tcg_cpus_force_rcu(Notifier *notify, void *data)
+{
+    CPUState *cpu = container_of(notify, CPUState, force_rcu);
+
+    async_run_on_cpu(cpu, do_nothing, RUN_ON_CPU_NULL);
+}
+
 static void tcg_accel_ops_init(AccelOpsClass *ops)
 {
     if (qemu_tcg_mttcg_enabled()) {
diff --git a/accel/tcg/tcg-accel-ops.h b/accel/tcg/tcg-accel-ops.h
index 6a5fcef88980..8742041c8aea 100644
--- a/accel/tcg/tcg-accel-ops.h
+++ b/accel/tcg/tcg-accel-ops.h
@@ -18,5 +18,7 @@ void tcg_cpus_destroy(CPUState *cpu);
 int tcg_cpus_exec(CPUState *cpu);
 void tcg_handle_interrupt(CPUState *cpu, int mask);
 void tcg_cpu_init_cflags(CPUState *cpu, bool parallel);
+/* Common force_rcu notifier for MTTCG and RR */
+void tcg_cpus_force_rcu(Notifier *notify, void *data);
 
 #endif /* TCG_CPUS_H */
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b7d5bc1200cd..b0047f4b3a97 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -418,6 +418,8 @@ struct CPUState {
 
     /* track IOMMUs whose translations we've cached in the TCG TLB */
     GArray *iommu_notifiers;
+
+    Notifier force_rcu;
 };
 
 typedef QTAILQ_HEAD(CPUTailQ, CPUState) CPUTailQ;
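
The registration pattern used by this series (an optional per-reader
notifier embedded in a larger per-vCPU structure and recovered with
container_of()) can be tried outside the QEMU tree with the rough,
single-threaded sketch below. It is not the QEMU code. The Notifier and
NotifierList types here are simplified stand-ins, and `struct reader`,
`struct vcpu`, `reader_register()`, `reader_unregister()`,
`force_rcu_notify_all()` and `vcpu_force_rcu()` are hypothetical names
chosen for illustration. It has no locking, whereas the patch accesses
the list under rcu_registry_lock.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's Notifier/NotifierList (assumption:
 * a singly linked list is enough to illustrate the idea). */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *n, void *data);
    Notifier *next;
};

typedef struct {
    Notifier *head;
} NotifierList;

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static NotifierList force_rcu_notifiers;

/* Hypothetical per-reader state; the patch keeps the equivalent field
 * in the __thread variable rcu_reader. */
struct reader {
    Notifier *force_rcu;
};

/* Register a reader; n may be NULL for readers that need no kick. */
void reader_register(struct reader *r, Notifier *n)
{
    assert(r->force_rcu == NULL);   /* not registered twice */
    if (n) {
        r->force_rcu = n;
        n->next = force_rcu_notifiers.head;
        force_rcu_notifiers.head = n;
    }
}

/* Unregister a reader and drop its notifier, if it has one. */
void reader_unregister(struct reader *r)
{
    Notifier **p;

    if (!r->force_rcu) {
        return;
    }
    for (p = &force_rcu_notifiers.head; *p; p = &(*p)->next) {
        if (*p == r->force_rcu) {
            *p = (*p)->next;
            break;
        }
    }
    r->force_rcu = NULL;
}

/* What the waiting side would do while a drain is in progress. */
void force_rcu_notify_all(void)
{
    Notifier *n;

    for (n = force_rcu_notifiers.head; n; n = n->next) {
        n->notify(n, NULL);
    }
}

/* Hypothetical stand-in for CPUState with an embedded notifier. */
struct vcpu {
    int kicks;
    Notifier force_rcu;
};

/* Shared callback: recover the vCPU from the embedded notifier. */
void vcpu_force_rcu(Notifier *n, void *data)
{
    struct vcpu *cpu = container_of(n, struct vcpu, force_rcu);

    (void)data;
    cpu->kicks++;   /* the patch queues async_run_on_cpu() work here */
}
```

To exercise it, set `force_rcu.notify = vcpu_force_rcu` on each
`struct vcpu`, register one reader per vCPU, and call
`force_rcu_notify_all()`: each registered vCPU's `kicks` counter goes up
by one, and unregistered readers are no longer notified.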