From patchwork Tue Jul 23 09:20:55 2019
X-Patchwork-Id: 11054171
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, George Dunlap, Dario Faggioli
Date: Tue, 23 Jul 2019 11:20:55 +0200
Message-Id: <20190723092056.15045-2-jgross@suse.com>
In-Reply-To: <20190723092056.15045-1-jgross@suse.com>
References: <20190723092056.15045-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH 1/2] xen/sched: fix locking in restore_vcpu_affinity()

Commit 0763cd2687897b55e7 ("xen/sched: don't disable scheduler on cpus
during suspend") removed a lock from restore_vcpu_affinity() which needs
to stay: cpumask_scratch_cpu() must be protected by the scheduler lock.
restore_vcpu_affinity() is called from thaw_domains(), so with multiple
domains in the system another domain might already be running and the
scheduler might already be using cpumask_scratch_cpu().
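The resulting locking discipline, condensed from the diff below into a short
sketch (not the complete function; the cpupool fallback handling between the
two scratch-mask uses is omitted), is that every access to
cpumask_scratch_cpu(cpu) now happens under the per-vCPU scheduler lock:

    lock = vcpu_schedule_lock_irq(v);

    /* cpumask_scratch_cpu(cpu) is only safe to use while the lock is held. */
    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
                cpupool_domain_cpumask(d));
    v->processor = cpumask_any(cpumask_scratch_cpu(cpu));

    spin_unlock_irq(lock);

    /* v->processor might have changed, so reacquire the lock. */
    lock = vcpu_schedule_lock_irq(v);
    v->processor = sched_pick_cpu(vcpu_scheduler(v), v);
    spin_unlock_irq(lock);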
Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
 xen/common/schedule.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 25f6ab388d..89bc259ae4 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -708,6 +708,8 @@ void restore_vcpu_affinity(struct domain *d)
          * set v->processor of each of their vCPUs to something that will
          * make sense for the scheduler of the cpupool in which they are in.
          */
+        lock = vcpu_schedule_lock_irq(v);
+
         cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
                     cpupool_domain_cpumask(d));
         if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
@@ -731,6 +733,9 @@ void restore_vcpu_affinity(struct domain *d)
 
         v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
 
+        spin_unlock_irq(lock);
+
+        /* v->processor might have changed, so reacquire the lock. */
         lock = vcpu_schedule_lock_irq(v);
         v->processor = sched_pick_cpu(vcpu_scheduler(v), v);
         spin_unlock_irq(lock);

From patchwork Tue Jul 23 09:20:56 2019
X-Patchwork-Id: 11054173
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Date: Tue, 23 Jul 2019 11:20:56 +0200
Message-Id: <20190723092056.15045-3-jgross@suse.com>
In-Reply-To: <20190723092056.15045-1-jgross@suse.com>
References: <20190723092056.15045-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH 2/2] xen: merge temporary vcpu pinning scenarios
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
    George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
    Jan Beulich, Dario Faggioli, Roger Pau Monné

Today there are three scenarios which pin vcpus temporarily to a single
physical cpu:

- NMI/MCE injection into PV domains
- wait_event() handling
- vcpu_pin_override() handling

Each of those cases is handled independently today, using its own
temporary cpumask to save the old affinity settings.

The three cases can be combined, as the latter two cases will only pin
a vcpu to the physical cpu it is already running on, while
vcpu_pin_override() is allowed to fail.

So merge the three temporary pinning scenarios by using only one cpumask
and a per-vcpu bitmask specifying which of the three scenarios is
currently active (they are allowed to nest; an illustrative sketch of the
nesting semantics follows the diff below).

Note that we don't need to call domain_update_node_affinity() as we are
only pinning for a brief period of time.

Signed-off-by: Juergen Gross
---
 xen/arch/x86/pv/traps.c | 20 +-------------------
 xen/arch/x86/traps.c    |  8 ++------
 xen/common/domain.c     |  4 +---
 xen/common/schedule.c   | 35 +++++++++++++++++++++++------------
 xen/common/wait.c       | 26 ++++++++------------------
 xen/include/xen/sched.h |  8 +++++---
 6 files changed, 40 insertions(+), 61 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 1740784ff2..37dac300ba 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -151,25 +151,7 @@ static void nmi_mce_softirq(void)
 
     BUG_ON(st->vcpu == NULL);
 
-    /*
-     * Set the tmp value unconditionally, so that the check in the iret
-     * hypercall works.
-     */
-    cpumask_copy(st->vcpu->cpu_hard_affinity_tmp,
-                 st->vcpu->cpu_hard_affinity);
-
-    if ( (cpu != st->processor) ||
-         (st->processor != st->vcpu->processor) )
-    {
-
-        /*
-         * We are on a different physical cpu. Make sure to wakeup the vcpu on
-         * the specified processor.
-         */
-        vcpu_set_hard_affinity(st->vcpu, cpumask_of(st->processor));
-
-        /* Affinity is restored in the iret hypercall. */
-    }
+    vcpu_set_tmp_affinity(st->vcpu, st->processor, VCPU_AFFINITY_NMI);
 
     /*
      * Only used to defer wakeup of domain/vcpu to a safe (non-NMI/MCE)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 25b4b47e5e..22bb790858 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1601,12 +1601,8 @@ void async_exception_cleanup(struct vcpu *curr)
         return;
 
     /* Restore affinity. */
-    if ( !cpumask_empty(curr->cpu_hard_affinity_tmp) &&
-         !cpumask_equal(curr->cpu_hard_affinity_tmp, curr->cpu_hard_affinity) )
-    {
-        vcpu_set_hard_affinity(curr, curr->cpu_hard_affinity_tmp);
-        cpumask_clear(curr->cpu_hard_affinity_tmp);
-    }
+    if ( curr->affinity_broken & VCPU_AFFINITY_NMI )
+        vcpu_set_tmp_affinity(curr, -1, VCPU_AFFINITY_NMI);
 
     if ( !(curr->async_exception_mask & (curr->async_exception_mask - 1)) )
         trap = __scanbit(curr->async_exception_mask, VCPU_TRAP_NONE);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 55aa759b75..e8e850796e 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -133,7 +133,6 @@ static void vcpu_info_reset(struct vcpu *v)
 static void vcpu_destroy(struct vcpu *v)
 {
     free_cpumask_var(v->cpu_hard_affinity);
-    free_cpumask_var(v->cpu_hard_affinity_tmp);
     free_cpumask_var(v->cpu_hard_affinity_saved);
     free_cpumask_var(v->cpu_soft_affinity);
 
@@ -161,7 +160,6 @@ struct vcpu *vcpu_create(
     grant_table_init_vcpu(v);
 
     if ( !zalloc_cpumask_var(&v->cpu_hard_affinity) ||
-         !zalloc_cpumask_var(&v->cpu_hard_affinity_tmp) ||
          !zalloc_cpumask_var(&v->cpu_hard_affinity_saved) ||
          !zalloc_cpumask_var(&v->cpu_soft_affinity) )
         goto fail;
@@ -1269,7 +1267,7 @@ int vcpu_reset(struct vcpu *v)
     v->async_exception_mask = 0;
     memset(v->async_exception_state, 0, sizeof(v->async_exception_state));
 #endif
-    cpumask_clear(v->cpu_hard_affinity_tmp);
+    v->affinity_broken = 0;
     clear_bit(_VPF_blocked, &v->pause_flags);
     clear_bit(_VPF_in_reset, &v->pause_flags);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 89bc259ae4..d4de74f9c8 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1106,47 +1106,58 @@ void watchdog_domain_destroy(struct domain *d)
         kill_timer(&d->watchdog_timer[i]);
 }
 
-int vcpu_pin_override(struct vcpu *v, int cpu)
+int vcpu_set_tmp_affinity(struct vcpu *v, int cpu, uint8_t reason)
 {
     spinlock_t *lock;
     int ret = -EINVAL;
+    bool migrate;
 
     lock = vcpu_schedule_lock_irq(v);
 
     if ( cpu < 0 )
     {
-        if ( v->affinity_broken )
+        if ( v->affinity_broken & reason )
         {
-            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
-            v->affinity_broken = 0;
             ret = 0;
+            v->affinity_broken &= ~reason;
         }
+        if ( !ret && !v->affinity_broken )
+            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
     }
     else if ( cpu < nr_cpu_ids )
     {
-        if ( v->affinity_broken )
+        if ( (v->affinity_broken & reason) ||
+             (v->affinity_broken && v->processor != cpu) )
             ret = -EBUSY;
         else if ( cpumask_test_cpu(cpu, VCPU2ONLINE(v)) )
         {
-            cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
-            v->affinity_broken = 1;
-            sched_set_affinity(v, cpumask_of(cpu), NULL);
+            if ( !v->affinity_broken )
+            {
+                cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
+                sched_set_affinity(v, cpumask_of(cpu), NULL);
+            }
+            v->affinity_broken |= reason;
             ret = 0;
         }
     }
 
-    if ( ret == 0 )
+    migrate = !ret && !cpumask_test_cpu(v->processor, v->cpu_hard_affinity);
+    if ( migrate )
         vcpu_migrate_start(v);
 
     vcpu_schedule_unlock_irq(lock, v);
 
-    domain_update_node_affinity(v->domain);
-
-    vcpu_migrate_finish(v);
+    if ( migrate )
+        vcpu_migrate_finish(v);
 
     return ret;
 }
 
+int vcpu_pin_override(struct vcpu *v, int cpu)
+{
+    return vcpu_set_tmp_affinity(v, cpu, VCPU_AFFINITY_OVERRIDE);
+}
+
 typedef long ret_t;
 
 #endif /* !COMPAT */
diff --git a/xen/common/wait.c b/xen/common/wait.c
index 4f830a14e8..9f9ad033b3 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -34,8 +34,6 @@ struct waitqueue_vcpu {
      */
     void *esp;
     char *stack;
-    cpumask_t saved_affinity;
-    unsigned int wakeup_cpu;
 #endif
 };
 
@@ -131,9 +129,7 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
     ASSERT(wqv->esp == 0);
 
     /* Save current VCPU affinity; force wakeup on *this* CPU only. */
-    wqv->wakeup_cpu = smp_processor_id();
-    cpumask_copy(&wqv->saved_affinity, curr->cpu_hard_affinity);
-    if ( vcpu_set_hard_affinity(curr, cpumask_of(wqv->wakeup_cpu)) )
+    if ( vcpu_set_tmp_affinity(curr, smp_processor_id(), VCPU_AFFINITY_WAIT) )
     {
         gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
         domain_crash(current->domain);
@@ -182,30 +178,24 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
 static void __finish_wait(struct waitqueue_vcpu *wqv)
 {
     wqv->esp = NULL;
-    (void)vcpu_set_hard_affinity(current, &wqv->saved_affinity);
+    vcpu_set_tmp_affinity(current, -1, VCPU_AFFINITY_WAIT);
 }
 
 void check_wakeup_from_wait(void)
 {
-    struct waitqueue_vcpu *wqv = current->waitqueue_vcpu;
+    struct vcpu *curr = current;
+    struct waitqueue_vcpu *wqv = curr->waitqueue_vcpu;
 
     ASSERT(list_empty(&wqv->list));
 
     if ( likely(wqv->esp == NULL) )
         return;
 
-    /* Check if we woke up on the wrong CPU. */
-    if ( unlikely(smp_processor_id() != wqv->wakeup_cpu) )
+    /* Check if we are still pinned. */
+    if ( unlikely(!(curr->affinity_broken & VCPU_AFFINITY_WAIT)) )
     {
-        /* Re-set VCPU affinity and re-enter the scheduler. */
-        struct vcpu *curr = current;
-        cpumask_copy(&wqv->saved_affinity, curr->cpu_hard_affinity);
-        if ( vcpu_set_hard_affinity(curr, cpumask_of(wqv->wakeup_cpu)) )
-        {
-            gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
-            domain_crash(current->domain);
-        }
-        wait(); /* takes us back into the scheduler */
+        gdprintk(XENLOG_ERR, "vcpu affinity lost\n");
+        domain_crash(current->domain);
     }
 
     /*
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b40c8fd138..721c429454 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -200,7 +200,10 @@ struct vcpu
     /* VCPU is paused following shutdown request (d->is_shutting_down)? */
     bool             paused_for_shutdown;
     /* VCPU need affinity restored */
-    bool             affinity_broken;
+    uint8_t          affinity_broken;
+#define VCPU_AFFINITY_OVERRIDE    0x01
+#define VCPU_AFFINITY_NMI         0x02
+#define VCPU_AFFINITY_WAIT        0x04
 
     /* A hypercall has been preempted. */
     bool             hcall_preempted;
@@ -246,8 +249,6 @@ struct vcpu
 
     /* Bitmask of CPUs on which this VCPU may run. */
     cpumask_var_t    cpu_hard_affinity;
     /* Used to change affinity temporarily. */
-    cpumask_var_t    cpu_hard_affinity_tmp;
-    /* Used to restore affinity across S3. */
     cpumask_var_t    cpu_hard_affinity_saved;
     /* Bitmask of CPUs on which this VCPU prefers to run. */
@@ -875,6 +876,7 @@ int cpu_disable_scheduler(unsigned int cpu);
 /* We need it in dom0_setup_vcpu */
 void sched_set_affinity(struct vcpu *v, const cpumask_t *hard,
                         const cpumask_t *soft);
+int vcpu_set_tmp_affinity(struct vcpu *v, int cpu, uint8_t reason);
 int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
 int vcpu_set_soft_affinity(struct vcpu *v, const cpumask_t *affinity);
 void restore_vcpu_affinity(struct domain *d);
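To make the nesting semantics of the new affinity_broken reason bits easier
to follow, here is a small standalone model (illustrative only: struct
toy_vcpu, set_tmp_pin() and clear_tmp_pin() are hypothetical stand-ins, not
Xen code, and hard affinity is reduced to a single CPU number). The hard
affinity is saved only when the first reason bit is set, restored only when
the last one is cleared, and pinning to a different CPU while another reason
is active fails with -EBUSY, mirroring the vcpu_set_tmp_affinity() logic in
the diff above:

/* Standalone model of the affinity_broken reason bitmask (not Xen code). */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define VCPU_AFFINITY_OVERRIDE 0x01
#define VCPU_AFFINITY_NMI      0x02
#define VCPU_AFFINITY_WAIT     0x04

struct toy_vcpu {
    int processor;          /* CPU the vCPU currently runs on */
    int hard_affinity;      /* simplified: a single allowed CPU, -1 = any */
    int saved_affinity;     /* saved once, when the first reason appears */
    uint8_t affinity_broken;
};

/* Pin v to cpu for the given reason; mirrors the cpu >= 0 path. */
static int set_tmp_pin(struct toy_vcpu *v, int cpu, uint8_t reason)
{
    if ( (v->affinity_broken & reason) ||
         (v->affinity_broken && v->processor != cpu) )
        return -EBUSY;  /* reason re-used, or already pinned elsewhere */

    if ( !v->affinity_broken )
        v->saved_affinity = v->hard_affinity;   /* save only once */
    v->hard_affinity = cpu;
    v->affinity_broken |= reason;
    return 0;
}

/* Drop one reason; mirrors the cpu < 0 path. */
static void clear_tmp_pin(struct toy_vcpu *v, uint8_t reason)
{
    if ( !(v->affinity_broken & reason) )
        return;
    v->affinity_broken &= ~reason;
    if ( !v->affinity_broken )                  /* last reason gone */
        v->hard_affinity = v->saved_affinity;   /* restore only now */
}

int main(void)
{
    struct toy_vcpu v = { .processor = 3, .hard_affinity = -1 };

    set_tmp_pin(&v, 3, VCPU_AFFINITY_WAIT);     /* wait_event() style */
    set_tmp_pin(&v, 3, VCPU_AFFINITY_NMI);      /* nested, same CPU: ok */
    printf("pin to CPU 5: %d\n",                /* -EBUSY */
           set_tmp_pin(&v, 5, VCPU_AFFINITY_OVERRIDE));
    clear_tmp_pin(&v, VCPU_AFFINITY_NMI);       /* still pinned (WAIT) */
    clear_tmp_pin(&v, VCPU_AFFINITY_WAIT);      /* affinity restored */
    printf("restored affinity: %d\n", v.hard_affinity);
    return 0;
}

In the real patch, vcpu_set_tmp_affinity() additionally triggers
vcpu_migrate_start()/vcpu_migrate_finish() when the resulting hard affinity
no longer contains v->processor.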