From patchwork Sat Feb 1 00:50:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 13956008 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BD391C2C8 for ; Sat, 1 Feb 2025 00:50:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738371054; cv=none; b=VQqpv2oJ+3RxSCfzwZ/VCQe7uzDSzewXAv76ousVYzTAOSbF2BcET6Gbnj39OnJPNrRb91eM0UvCUV/Zh/gDtzUYYvL7OWcwkTmtxHEHQVANfU0lLC3lzPwvzMruEhx89lTmsvQ3WTlusYJB+Fhkkr+ieJ/RT/C+E4I6IefMOqs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738371054; c=relaxed/simple; bh=8r1IMSgN9S/6dfm6fHWdqJHUa5NHhwmDUjPOyM4evr0=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=uittHUJUX+LMeP2hRfWo9qtBuMCQuutyuSOPJ3ihhUMkHZE8vsrzSRjVfTCupHNr5L5NROueQEEHR7eB45vkYMTY907zsRBRUNzdkkpe68wq2x4+p2Dhn5+0CkHbwRWNXjabVJGZ9HQE64RkLZjeM01+mXQ6vV7Mpx/Xd0Sea4g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=zcyRsBr0; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="zcyRsBr0" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-2ef114d8346so5005389a91.0 for ; Fri, 31 Jan 2025 16:50:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738371052; x=1738975852; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=k2K+3ES3yFwMq3HJwtK6RVZhiYj3gk5lp/ZXG2ZzfAM=; b=zcyRsBr0SNRq3ekT4ZrY8GDjKr4FK9hIS7uItjluocvQkw/DaRTe7iLAoZizPg4hw6 2iTQ2+Md8c4R4SoNNCWzfU0HaeND2ltA1WCprES4sQmfTR/WKU0w4XUFRJz3tlFyXmmN VwgZdhWzlv9t8ysKH/9fdZULTFwHaaiZ3HMgWfb7vQkj+g3YhD07jSP024ve3L89fOlZ hV5IoC2sgm9S2No0oPsciLkx79tO29hs9xX9a3ay6GgPNhnw2jlQGiqNt8Kc5RqWvHyJ 1aHIhZA6vXSDnUIUHtg3vsB4qHafzfV4QlXUYIfQdClRhyzoIp0bvHtm37VvSbzXdCgR apXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738371052; x=1738975852; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=k2K+3ES3yFwMq3HJwtK6RVZhiYj3gk5lp/ZXG2ZzfAM=; b=SsBF18M8VD38hV8L13TALMdFmn0vWRZ/TAMaCvNQPvvIEi+9urHft+e3FUzU+aSNTO ps4KYJKR0Td6YNhNnsKKC7BygZ5SBKdVIO8afGp1Sedfns7PKZDSXc9xt2EF7SIuKRCd HeXCgtbMaNTeUs7A46ceHUEzwXFVUC7HudEKZj36MVO5kBj1QxLiSoPKzMAjMsZ4mSJI zPFHX3LBGhXUF13ib3yIHVyOIv4NPfHw9TQ0g/Ukg8opFgRHGm4KYzZ/S0f6aQYTZApv YBaXSkRloZrCqJOMxg47avrxwfWdE+LB0GuaGdEM3b2Xxfbzm0dD6TvRG5LmKjp/OM/l X5Cg== X-Forwarded-Encrypted: i=1; AJvYcCVP7px/s/nuayDv8LYpTr3+lqmz4rWj7yEa9ck1v2VN/yrgG6yHf31mtgctDHpI13YKMaY=@vger.kernel.org X-Gm-Message-State: AOJu0YwaaZLoHiSnLrONVdBruDVNzML0qAkVoT1klBwcKcld0neJUbmD cgjt2bYVW++oXCf5CVT3QvbGwHLQH8H8eoBa5q92DJL6z9mZMNQJG8XhBJDQkdHGDVLOVDrjkj5 8CA== X-Google-Smtp-Source: AGHT+IG2FdkNAipz+7YfRFK37esVegmX9yUIpKPkI5IQki8FVTV9TcslwFm+Z3SMc3yQI05wjkmkJT6td5Y= X-Received: from pjbov11.prod.google.com ([2002:a17:90b:258b:b0:2ef:7352:9e97]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3a0c:b0:2ee:c9b6:4c42 with SMTP id 98e67ed59e1d1-2f83abff391mr21670588a91.16.1738371052098; Fri, 31 Jan 2025 16:50:52 -0800 (PST) Reply-To: Sean Christopherson Date: Fri, 31 Jan 2025 16:50:47 -0800 In-Reply-To: <20250201005048.657470-1-seanjc@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250201005048.657470-1-seanjc@google.com> X-Mailer: git-send-email 2.48.1.362.g079036d154-goog Message-ID: <20250201005048.657470-2-seanjc@google.com> Subject: [PATCH 1/2] x86/mtrr: Return success vs. "failure" from guest_force_mtrr_state() From: Sean Christopherson To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Sean Christopherson , Dionna Glaze , Peter Gonda , " =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= " , Kirill Shutemov , Vitaly Kuznetsov , "H . Peter Anvin" , Binbin Wu , Tom Lendacky When *potentially* forcing MTRRs to a single memory type, return whether or not MTRRs were indeed overridden so that the caller can take additional action when necessary. E.g. KVM-as-a-guest will use the information to also force the PAT memtype for legacy devices to be WB. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson --- arch/x86/include/asm/mtrr.h | 5 +++-- arch/x86/kernel/cpu/mtrr/generic.c | 11 +++++++---- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index c69e269937c5..598753189f19 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -58,7 +58,7 @@ struct mtrr_state_type { */ # ifdef CONFIG_MTRR void mtrr_bp_init(void); -void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, +bool guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, mtrr_type def_type); extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); @@ -75,10 +75,11 @@ void mtrr_disable(void); void mtrr_enable(void); void mtrr_generic_set_state(void); # else -static inline void guest_force_mtrr_state(struct mtrr_var_range *var, +static inline bool guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, mtrr_type def_type) { + return false; } static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 2fdfda2b60e4..4fd704907dbc 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -435,19 +435,21 @@ void __init mtrr_copy_map(void) * @var: MTRR variable range array to use * @num_var: length of the @var array * @def_type: default caching type + * + * Returns %true if MTRRs were overridden, %false if they were not. */ -void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, +bool guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, mtrr_type def_type) { unsigned int i; /* Only allowed to be called once before mtrr_bp_init(). */ if (WARN_ON_ONCE(mtrr_state_set)) - return; + return false; /* Only allowed when running virtualized. */ if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) - return; + return false; /* * Only allowed for special virtualization cases: @@ -460,7 +462,7 @@ void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, !hv_is_isolation_supported() && !cpu_feature_enabled(X86_FEATURE_XENPV) && !cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) - return; + return false; /* Disable MTRR in order to disable MTRR modifications. */ setup_clear_cpu_cap(X86_FEATURE_MTRR); @@ -480,6 +482,7 @@ void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, mtrr_state.enabled |= MTRR_STATE_MTRR_ENABLED; mtrr_state_set = 1; + return true; } static u8 type_merge(u8 type, u8 new_type, u8 *uniform) From patchwork Sat Feb 1 00:50:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 13956009 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 99D701F61C for ; Sat, 1 Feb 2025 00:50:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738371057; cv=none; b=XvrrZk+RQDbN9xhcsWPb7ADEDkNNi46M3rsg0tk6/w1OoXTk5KU33A1wgIknTta6thJ5+VfBatXO9sHOl48KU9O9AR4VilshzbO1b6Ze9udYStOIeD2m9xHvgjMoYboDFAaEIUP/j6Be0IRxXRKc0orlE/vc+lrb5OjS1U3gmqQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738371057; c=relaxed/simple; bh=Q0dVAZtcWC3kGSriq20XNsob1ps+6rdAl4KBthV0lsE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=JcN96ENHlIK9wdwIHgLhH5ezeks42ZwqA9kbySAtFEtD5I1SkJMQCzU72i0lfA/Y1fqzc7KCVatG9Buod47kykepRBdelN9byJDxrJnbfOc/P8L+YVhH26n3bePYloDieKpoqw+pzbRaB/bPWWWxuavsgJfQx78XCE+UVJSGd9I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=zwRd8JYi; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="zwRd8JYi" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-2ef9e38b0cfso4930400a91.0 for ; Fri, 31 Jan 2025 16:50:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738371054; x=1738975854; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:from:to:cc:subject:date :message-id:reply-to; bh=hoM9urIHNLcJgS4Dc2h2QR/XgwI2ycQ6Fgv8TpOxhpY=; b=zwRd8JYi18vRFX1iXQGiCjxldWd0x9tqHkJwq12s0oXEz+AILEj5mePMp7y1kXuomY TwUyfsGxptGe77MO4jSEsjaaUvaIB/YryLyXVK+Ety1N7cCfozWDdPPptwrPF7uy/WQl vaW7xYyklq5sZxDmhRFLXf3FEvFT5fRWfX7zcB3QH4ijkttLLUmG0w59Iy4XkFTs7m5k +YYDaQjTp+GpQwM6EJ9kWFzyA4HyKwGGQQGFByqn1vENYh0xhaEcrc860eDXr3WUvRkj 8qTzrxfFv8VRxY1jJtpcjUXSnoC4AARhnedNFARMPHoaDQ5WSjX00t3+bQzTwUk7pl5H wvOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738371054; x=1738975854; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=hoM9urIHNLcJgS4Dc2h2QR/XgwI2ycQ6Fgv8TpOxhpY=; b=ELyIOlBDQxRZXaIZnzDNLojMYLZuvUsUdSF4AifbAoQTt1A1zl4bKnb+Jw68zrFx2S GIli76hNB6yYqV1+rNe9E6Q+fYDWDEIDHyixRNQ/IHgXMH3HcZH3KGKTY7w0lT9x3GQL 2d/c/9lp2NVrhALP6U3eWQXuKHQyLMUGLRlzYqVXs3/gUSyJr/torT+L4szmHp6ls5Ks ts98OankdbAl606cRpDSGyuiZowSIpnP5apWF6ojFaUpmr+pTkPOy0Jb49bWQe3VbAqj X6JSJ1KL09DRjDKf8nj+0mT5PLvkCCW+zYIJ2BGEsDFvCFHn6iQF3oTJS4eutR+hd184 I4yg== X-Forwarded-Encrypted: i=1; AJvYcCUEL2kGgqWUQrr6RrgrejzsfD26uapukzSkGoTM0ZFsoRBFtwVc4Y1vH7hhMjCHJiqKSbk=@vger.kernel.org X-Gm-Message-State: AOJu0YwAMlyjfrK4gWdZu2f9EFT9cd5xY88e/dbVNKcR1uwt+TsrpmfE YmiPnLkl3Wfj9ezGTfEkpFhuADn6cm5PAFmKMBWRiJCp/7KowxzpCGitIvlfODXLYLNqvbyxd3C jyQ== X-Google-Smtp-Source: AGHT+IGI421GnSBBqUYjOcLFGbdaN53zonktkZ1lI4ZcZUJ4Es2rBGDtWBpKi7KRzlsLCpJZRLs34Ghhh2c= X-Received: from pjboh14.prod.google.com ([2002:a17:90b:3a4e:b0:2f5:63a:4513]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:51c6:b0:2ea:4c4f:bd20 with SMTP id 98e67ed59e1d1-2f83ac86e65mr16933408a91.32.1738371053817; Fri, 31 Jan 2025 16:50:53 -0800 (PST) Reply-To: Sean Christopherson Date: Fri, 31 Jan 2025 16:50:48 -0800 In-Reply-To: <20250201005048.657470-1-seanjc@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250201005048.657470-1-seanjc@google.com> X-Mailer: git-send-email 2.48.1.362.g079036d154-goog Message-ID: <20250201005048.657470-3-seanjc@google.com> Subject: [PATCH 2/2] x86/kvm: Override low memory above TOLUD to WB when MTRRs are forced WB From: Sean Christopherson To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Sean Christopherson , Dionna Glaze , Peter Gonda , " =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= " , Kirill Shutemov , Vitaly Kuznetsov , "H . Peter Anvin" , Binbin Wu , Tom Lendacky When running as an SNP or TDX guest under KVM, treat the legacy PCI hole, i.e. memory between Top of Lower Usable DRAM and 4GiB, as an untracked PAT range to workaround issues with mapping legacy devices when MTRRs are forced to WB. In most KVM-based setups, legacy devices such as the HPET and TPM are enumerated via ACPI. For unknown reasons, ACPI auto-maps such devices as WB, whereas the dedicated device drivers map memory as WC or UC. In normal setups, the entire mess "works" as firmware configures the PCI hole (and other device memory) to be UC in the MTRRs. As a result, the ACPI mappings end up UC, which is compatible with the drivers' requested WC/UC-. With WB MTRRs, the ACPI mappings get their requested WB. If acpi_init() runs before the corresponding device driver is probed, ACPI's WB mapping will "win", and result in the driver's ioremap() failing because the existing WB mapping isn't compatible with the requested WC/UC-. E.g. when a TPM is emulated by the hypervisor (ignoring the security implications of relying on what is allegedly an untrusted entity to store measurements), the TPM driver will request UC and fail: [ 1.730459] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0 [ 1.732780] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12 Note, the '0x2' and '0x0' values refer to "enum page_cache_mode", not x86's memtypes (which frustratingly are an almost pure inversion; 2 == WB, 0 == UC). The above trace is from a Google-VMM based VM, but the same behavior happens with a QEMU based VM. E.g. tracing mapping requests for HPET under QEMU yields: Mapping HPET, req_type = 0 WARNING: CPU: 5 PID: 1 at arch/x86/mm/pat/memtype.c:528 memtype_reserve+0x22f/0x3f0 Call Trace: __ioremap_caller.constprop.0+0xd6/0x330 acpi_os_map_iomem+0x195/0x1b0 acpi_ex_system_memory_space_handler+0x11c/0x2f0 acpi_ev_address_space_dispatch+0x168/0x3b0 acpi_ex_access_region+0xd7/0x280 acpi_ex_field_datum_io+0x73/0x210 acpi_ex_extract_from_field+0x267/0x2a0 acpi_ex_read_data_from_field+0x8e/0x220 acpi_ex_resolve_node_to_value+0xe2/0x2b0 acpi_ds_evaluate_name_path+0xa9/0x100 acpi_ds_exec_end_op+0x21f/0x4c0 acpi_ps_parse_loop+0xf4/0x670 acpi_ps_parse_aml+0x17b/0x3d0 acpi_ps_execute_method+0x137/0x260 acpi_ns_evaluate+0x1f0/0x2d0 acpi_evaluate_object+0x13d/0x2e0 acpi_evaluate_integer+0x50/0xe0 acpi_bus_get_status+0x7b/0x140 acpi_add_single_object+0x3f8/0x750 acpi_bus_check_add+0xb2/0x340 acpi_ns_walk_namespace+0xfe/0x200 acpi_walk_namespace+0xbb/0xe0 acpi_bus_scan+0x1b5/0x1d0 acpi_scan_init+0xd5/0x290 acpi_init+0x1fc/0x520 do_one_initcall+0x41/0x1d0 kernel_init_freeable+0x164/0x260 kernel_init+0x16/0x1a0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 ---[ end trace 0000000000000000 ]--- The only reason this doesn't cause problems for HPET is because HPET gets special treatment via x86_init.timers.timer_init(), and so gets a chance to create its UC- mapping before acpi_init() clobbers things. Disabling the early call to hpet_time_init() yields the same behavior for HPET: [ 0.318264] ioremap error for 0xfed00000-0xfed01000, requested 0x2, got 0x0 Hack around the mess by forcing such mappings to WB, as the memory type is irrevelant. Even in a theoretical setup where such devices are passed through by the host, i.e. point at real MMIO memory, it is KVM's (as the hypervisor) responsibility to force the memory to be WC/UC, e.g. via EPT memtype under TDX or real hardware MTRRs under SNP. Not doing so cannot work, and the hypervisor is highly motivated to do the right thing as letting the guest access hardware MMIO with WB would likely result in a variety of fatal #MCs. Limit the hack to the legacy PCI hole on the off chance that there are use cases that want to map virtual devices with WC/UC. E.g. in theory, it would be possible to expose hardware GPU buffers to an SNP or TDX guest. Extending the hack, e.g. if there are use cases for memory above 4GiB that are affected by ACPI, is far easier than debugging memory corruption if a driver requests WC/UC and silently gets WB. Double down on forcing everything to WB, e.g. instead of fixing the CR0.CD issue and reverting to a "normal" model, as OVMF has also been taught to ignore MTRRs when running as a TDX guest: 3a3b12cbda ("UefiCpuPkg/MtrrLib: MtrrLibIsMtrrSupported always return FALSE in TD-Guest") 071d2cfab8 ("OvmfPkg/Sec: Skip setup MTRR early in TD-Guest") And running with firmware that doesn't program MTRRs would likely put the kernel back into the conundrum of ACPI mapping devices WB, with drivers wanting WC/UC-. Fixes: 8e690b817e38 ("x86/kvm: Override default caching mode for SEV-SNP and TDX") Cc: stable@vger.kernel.org Cc: Dionna Glaze Cc: Peter Gonda Cc: Jürgen Groß Cc: Kirill Shutemov Cc: Vitaly Kuznetsov Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: H. Peter Anvin Cc: Binbin Wu Cc: Tom Lendacky Signed-off-by: Sean Christopherson --- arch/x86/kernel/kvm.c | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 7a422a6c5983..7ae294fe99c3 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -931,6 +931,23 @@ static void kvm_sev_hc_page_enc_status(unsigned long pfn, int npages, bool enc) KVM_MAP_GPA_RANGE_ENC_STAT(enc) | KVM_MAP_GPA_RANGE_PAGE_SZ_4K); } +static u64 kvm_tolud __ro_after_init; + +static bool kvm_is_forced_wb_range(u64 start, u64 end) +{ + /* + * In addition to the standard ISA override, force all low memory above + * TOLUD to WB so that legacy devices are mapped with WB when running + * as an SNP or TDX guest. The memtype itself is completely irrevelant + * as the devices are emulated, the override^Whack is needed purely to + * avoid failures due to ACPI mapping device memory as WB in advance of + * device drivers requesting WC or UC. In a system with MTRRs, ACPI's + * mappings get forced to UC via MTRRs (programmed sanely by firmware). + */ + return is_ISA_range(start, end) || + (start >= kvm_tolud && end <= SZ_4G); +} + static void __init kvm_init_platform(void) { if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && @@ -982,8 +999,18 @@ static void __init kvm_init_platform(void) kvmclock_init(); x86_platform.apic_post_init = kvm_apic_init; - /* Set WB as the default cache mode for SEV-SNP and TDX */ - guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK); + /* + * Set WB as the default cache mode for SEV-SNP and TDX. MTRRs may be + * enumerated as supported, but neither the TDX-Module (Secure EPT) nor + * KVM (normal EPT for TDX, virtual MTRRs for NPT) actually virtualizes + * MTRR memory types. If MTRRs are forced to writeback, register KVM's + * range-based WB override to handle cases where device drivers try to + * map an emulated device's memory as WC, and fail because it's all WB. + */ + if (guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK)) { + kvm_tolud = (e820__end_of_low_ram_pfn() << PAGE_SHIFT); + x86_platform.is_untracked_pat_range = kvm_is_forced_wb_range; + } } #if defined(CONFIG_AMD_MEM_ENCRYPT)