From patchwork Fri Nov 15 10:57:39 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: George Dunlap
X-Patchwork-Id: 11246049
From: George Dunlap
Date: Fri, 15 Nov 2019 10:57:39 +0000
Message-ID: <20191115105739.20333-1-george.dunlap@citrix.com>
X-Mailer: git-send-email 2.24.0
MIME-Version: 1.0
Subject: [Xen-devel] [PATCH RFC] x86: Add hack to disable "Fake HT" mode
List-Id: Xen developer discussion
Cc: Steven Haigh, Andrew Cooper, George Dunlap, Andreas Kinzler,
 Jan Beulich, Anthony Perard, Ian Jackson

Changeset ca2eee92df44 ("x86, hvm: Expose host core/HT topology to HVM
guests") attempted to "fake up" a topology which would induce guest
operating systems to not treat vcpus as sibling hyperthreads.  This
involved (among other things) actually reporting hyperthreading as
available, but giving vcpus every other APIC ID.  The resulting CPU
featureset is invalid, but most operating systems on most hardware
managed to cope with it.

Unfortunately, Windows running on modern AMD hardware -- including
Ryzen 3xxx series processors, and reportedly EPYC "Rome" CPUs -- gets
confused by the resulting contradictory feature bits and crashes
during installation.  (Linux guests have so far continued to cope.)

A "proper" fix is complicated, and it's too late either to make one
for 4.13 or to backport one to supported branches.  As a short-term
fix, implement an option to disable this "Fake HT" mode.  The resulting
reported topology will not be canonical, but in experiments it
continues to work with Windows guests.

However, disabling this "Fake HT" mode has not been widely tested, and
will almost certainly break migration if applied inconsistently.
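As a rough illustration of the "Fake HT" scheme described above, here is a standalone sketch.  It is not the libxc code: struct toy_policy and both function names are made up for this example, and only the HTT bit, leaf 1 EBX[23:16] and the APIC ID numbering are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two policy fields used below; this is
 * NOT the real CPUID policy structure from Xen.
 */
struct toy_policy {
    bool    htt;   /* Leaf 1 EDX HTT bit: "hyperthreading available" */
    uint8_t lppp;  /* Leaf 1 EBX[23:16]: max logical processors per package */
};

/* "Fake HT" hands out every other APIC ID: 0, 2, 4, ... */
static uint32_t fake_ht_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* Advertise HT and double lppp to match, unless doubling would overflow. */
static void fake_ht_adjust(struct toy_policy *p)
{
    p->htt = true;

    if ( !(p->lppp & 0x80) )
        p->lppp *= 2;
}
```

Starting from lppp = 4, fake_ht_adjust() reports HTT with lppp = 8, and vcpu 3 gets APIC ID 6.  Since every vcpu gets an even APIC ID, a guest which treats the low APIC ID bit as the thread ID should never see two vcpus as siblings of the same core.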

To minimize impact while allowing administrators to disable "Fake HT"
only on guests which are known not to work with it enabled (i.e.,
Windows guests) on affected hardware, add an environment variable,
XEN_LIBXC_DISABLE_FAKEHT, which can be set to disable the "Fake HT"
mode on such hardware.

Reported-by: Steven Haigh
Reported-by: Andreas Kinzler
Signed-off-by: George Dunlap
---
This has been compile-tested only; I'm posting it early to get
feedback on the approach.

TODO: Prevent such guests from being migrated

Open questions:

- Is this the right place to put the `getenv` check?

- Is there any way we can make migration work, at least in some cases?

- Can we check for known-problematic models, and at least report a
  more useful error?

CC: Andrew Cooper
CC: Jan Beulich
CC: Ian Jackson
CC: Anthony Perard
---
 tools/libxc/xc_cpuid_x86.c | 75 +++++++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 29 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 312c481f1e..70c85e1467 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -579,52 +579,69 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
     }
     else
     {
-        /*
-         * Topology for HVM guests is entirely controlled by Xen. For now, we
-         * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
-         */
-        p->basic.htt = true;
+        p->basic.htt = false;
         p->extd.cmp_legacy = false;
 
-        /*
-         * Leaf 1 EBX[23:16] is Maximum Logical Processors Per Package.
-         * Update to reflect vLAPIC_ID = vCPU_ID * 2, but make sure to avoid
-         * overflow.
-         */
-        if ( !(p->basic.lppp & 0x80) )
-            p->basic.lppp *= 2;
-
         switch ( p->x86_vendor )
         {
         case X86_VENDOR_INTEL:
             for ( i = 0; (p->cache.subleaf[i].type &&
                           i < ARRAY_SIZE(p->cache.raw)); ++i )
             {
-                p->cache.subleaf[i].cores_per_package =
-                    (p->cache.subleaf[i].cores_per_package << 1) | 1;
+                p->cache.subleaf[i].cores_per_package = 0;
                 p->cache.subleaf[i].threads_per_cache = 0;
             }
             break;
+        }
 
-        case X86_VENDOR_AMD:
-        case X86_VENDOR_HYGON:
+        if ( !getenv("XEN_LIBXC_DISABLE_FAKEHT") ) {
             /*
-             * Leaf 0x80000008 ECX[15:12] is ApicIdCoreSize.
-             * Leaf 0x80000008 ECX[7:0] is NumberOfCores (minus one).
-             * Update to reflect vLAPIC_ID = vCPU_ID * 2. But avoid
-             * - overflow,
-             * - going out of sync with leaf 1 EBX[23:16],
-             * - incrementing ApicIdCoreSize when it's zero (which changes the
-             *   meaning of bits 7:0).
+             * Topology for HVM guests is entirely controlled by Xen. For now, we
+             * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
              */
-            if ( p->extd.nc < 0x7f )
+            p->basic.htt = true;
+
+            /*
+             * Leaf 1 EBX[23:16] is Maximum Logical Processors Per Package.
+             * Update to reflect vLAPIC_ID = vCPU_ID * 2, but make sure to avoid
+             * overflow.
+             */
+            if ( !(p->basic.lppp & 0x80) )
+                p->basic.lppp *= 2;
+
+            switch ( p->x86_vendor )
             {
-                if ( p->extd.apic_id_size != 0 && p->extd.apic_id_size != 0xf )
-                    p->extd.apic_id_size++;
+            case X86_VENDOR_INTEL:
+                for ( i = 0; (p->cache.subleaf[i].type &&
+                              i < ARRAY_SIZE(p->cache.raw)); ++i )
+                {
+                    p->cache.subleaf[i].cores_per_package =
+                        (p->cache.subleaf[i].cores_per_package << 1) | 1;
+                    p->cache.subleaf[i].threads_per_cache = 0;
+                }
+                break;
+
+            case X86_VENDOR_AMD:
+            case X86_VENDOR_HYGON:
+                /*
+                 * Leaf 0x80000008 ECX[15:12] is ApicIdCoreSize.
+                 * Leaf 0x80000008 ECX[7:0] is NumberOfCores (minus one).
+                 * Update to reflect vLAPIC_ID = vCPU_ID * 2. But avoid
+                 * - overflow,
+                 * - going out of sync with leaf 1 EBX[23:16],
+                 * - incrementing ApicIdCoreSize when it's zero (which changes the
+                 *   meaning of bits 7:0).
+                 */
+                if ( p->extd.nc < 0x7f )
+                {
+                    if ( p->extd.apic_id_size != 0 && p->extd.apic_id_size != 0xf )
+                        p->extd.apic_id_size++;
+
+                    p->extd.nc = (p->extd.nc << 1) | 1;
+                }
+                break;
 
-                p->extd.nc = (p->extd.nc << 1) | 1;
             }
-            break;
         }
 
         /*