From patchwork Mon Apr 1 10:09:05 2019
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 10879591
From: Andrew Cooper
To: Xen-devel
Date: Mon, 1 Apr 2019 11:09:05 +0100
Message-ID: <1554113345-15068-1-git-send-email-andrew.cooper3@citrix.com>
Subject: [Xen-devel] [PATCH] xen/sched: Remove d->is_pinned
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, George Dunlap,
    Andrew Cooper, Dario Faggioli, Julien Grall, Jan Beulich,
    Roger Pau Monné

The is_pinned field is rather odd.  It can only be activated with the
"dom0_vcpus_pin" command line option, and causes dom0 (or the late hwdom) to
have its vcpus identity pinned to pcpus.

Having dom0_vcpus_pin active disallows the use of vcpu_set_hard_affinity().
However, when a pcpu is hot unplugged, or moved between cpupools, the
affinity is broken and reverts to cpumask_all.  This can result in vcpus
which are no longer pinned, and cannot be adjusted.

A related bit of functionality is the is_pinned_vcpu() predicate.  This is
only used by x86 code, and permits the use of VCPUOP_get_physid and
writeable access to some extra MSRs.  The implementation however returns
true for is_pinned (which will include unpinned vcpus from the above
scenario), *or* if the hard affinity mask only has a single bit set (which
is redundant with the intended effect of is_pinned, but also includes other
domains).

Rework the behaviour of "dom0_vcpus_pin" so that it is only an initial
pinning configuration, and permit full adjustment.  This allows the user to
reconfigure dom0 after the fact, or to fix up from the fallout of cpu hot
unplug and cpupool manipulation.

An unprivileged domain has no business using VCPUOP_get_physid, and
shouldn't be able to just because it happens to be pinned by admin choice.
All uses of is_pinned_vcpu() should be restricted to the hardware domain, so
rename it to is_hwdom_pinned_vcpu() to avoid future misuse.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli
---
CC: Jan Beulich
CC: Wei Liu
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Julien Grall
CC: George Dunlap
CC: Dario Faggioli
CC: Juergen Gross

Other oddities which I've noticed, and which should probably be fixed in due
course:

1) Permitting real writes to MSR_IA32_UCODE_REV is pointless.  The only
   legitimate value to occur here is 0, and that is already properly
   accounted for in the read handler.

2) dom0_setup_vcpu() results in 2 or 3 calls to sched_set_affinity(), which
   is 1 or 2 too many.

3) cpumask_weight() is a surprisingly expensive function, and we use it all
   over the place.  It will almost certainly benefit from the use of popcnt.
---
 xen/arch/x86/dom0_build.c      |  2 +-
 xen/arch/x86/domain.c          |  2 +-
 xen/arch/x86/pv/emul-priv-op.c |  9 ++++-----
 xen/common/domain.c            |  1 -
 xen/common/schedule.c          |  5 +----
 xen/include/xen/sched.h        | 10 ++++++----
 6 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 6ebe367..73f5407 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -213,7 +213,7 @@ struct vcpu *__init dom0_setup_vcpu(struct domain *d,
     }
     else
     {
-        if ( !d->is_pinned && !dom0_affinity_relaxed )
+        if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
             sched_set_affinity(v, &dom0_cpus, NULL);
         sched_set_affinity(v, NULL, &dom0_cpus);
     }
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8d579e2..20b86fd 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1229,7 +1229,7 @@ arch_do_vcpu_op(
         struct vcpu_get_physid cpu_id;
 
         rc = -EINVAL;
-        if ( !is_pinned_vcpu(v) )
+        if ( !is_hwdom_pinned_vcpu(v) )
             break;
 
         cpu_id.phys_id =
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 3746e2a..84ce67c 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -1017,7 +1017,7 @@ static int write_msr(unsigned int reg, uint64_t val,
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
              boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
             break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+        if ( !is_hwdom_pinned_vcpu(curr) )
             return X86EMUL_OKAY;
         if ( (rdmsr_safe(MSR_AMD64_NB_CFG, temp) != 0) ||
              ((val ^ temp) & ~(1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
@@ -1030,7 +1030,7 @@ static int write_msr(unsigned int reg, uint64_t val,
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
              boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
             break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+        if ( !is_hwdom_pinned_vcpu(curr) )
             return X86EMUL_OKAY;
         if ( rdmsr_safe(MSR_FAM10H_MMIO_CONF_BASE, temp) != 0 )
             break;
@@ -1050,7 +1050,7 @@ static int write_msr(unsigned int reg, uint64_t val,
     case MSR_IA32_UCODE_REV:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
             break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+        if ( !is_hwdom_pinned_vcpu(curr) )
            return X86EMUL_OKAY;
        if ( rdmsr_safe(reg, temp) )
            break;
@@ -1086,8 +1086,7 @@ static int write_msr(unsigned int reg, uint64_t val,
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
             break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) ||
-             wrmsr_safe(reg, val) == 0 )
+        if ( !is_hwdom_pinned_vcpu(curr) || wrmsr_safe(reg, val) == 0 )
             return X86EMUL_OKAY;
         break;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3b18f11..a1f8bb4 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -342,7 +342,6 @@ struct domain *domain_create(domid_t domid,
             if ( hardware_domid < 0 || hardware_domid >= DOMID_FIRST_RESERVED )
                 panic("The value of hardware_dom must be a valid domain ID\n");
 
-            d->is_pinned = opt_dom0_vcpus_pin;
             d->disable_migrate = 1;
             old_hwdom = hardware_domain;
             hardware_domain = d;
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 60755a6..76d6078 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -276,7 +276,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
      * Initialize affinity settings. The idler, and potentially
      * domain-0 VCPUs, are pinned onto their respective physical CPUs.
      */
-    if ( is_idle_domain(d) || d->is_pinned )
+    if ( is_idle_domain(d) || (is_hardware_domain(d) && opt_dom0_vcpus_pin) )
         sched_set_affinity(v, cpumask_of(processor), &cpumask_all);
     else
         sched_set_affinity(v, &cpumask_all, &cpumask_all);
@@ -958,9 +958,6 @@ int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity)
     cpumask_t online_affinity;
     cpumask_t *online;
 
-    if ( v->domain->is_pinned )
-        return -EINVAL;
-
     online = VCPU2ONLINE(v);
     cpumask_and(&online_affinity, affinity, online);
     if ( cpumask_empty(&online_affinity) )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index edee52d..6d23b6d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -381,8 +381,6 @@ struct domain
     bool             is_console;
     /* Is this a xenstore domain (not dom0)? */
     bool             is_xenstore;
-    /* Domain's VCPUs are pinned 1:1 to physical CPUs? */
-    bool             is_pinned;
     /* Non-migratable and non-restoreable? */
     bool             disable_migrate;
     /* Is this guest being debugged by dom0? */
@@ -961,8 +959,12 @@ static inline bool is_hvm_vcpu(const struct vcpu *v)
     return is_hvm_domain(v->domain);
 }
 
-#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
-                           cpumask_weight((v)->cpu_hard_affinity) == 1)
+static inline bool is_hwdom_pinned_vcpu(const struct vcpu *v)
+{
+    return (is_hardware_domain(v->domain) &&
+            cpumask_weight(v->cpu_hard_affinity) == 1);
+}
+
 #ifdef CONFIG_HAS_PASSTHROUGH
 #define has_iommu_pt(d) (dom_iommu(d)->status != IOMMU_STATUS_disabled)
 #define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)