From patchwork Wed Feb 23 05:22:10 2022
X-Patchwork-Submitter: Junaid Shahid
X-Patchwork-Id: 12756394
Date: Tue, 22 Feb 2022 21:22:10 -0800
Message-Id: <20220223052223.1202152-35-junaids@google.com>
In-Reply-To: <20220223052223.1202152-1-junaids@google.com>
References: <20220223052223.1202152-1-junaids@google.com>
Subject: [RFC PATCH 34/47] kvm: asi: Unmap guest memory from ASI address
 space when using nested virt
From: Junaid Shahid
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, jmattson@google.com,
 pjt@google.com, oweisse@google.com, alexandre.chartre@oracle.com,
 rppt@linux.ibm.com, dave.hansen@linux.intel.com, peterz@infradead.org,
 tglx@linutronix.de, luto@kernel.org, linux-mm@kvack.org

L1 guest memory as a whole cannot be considered non-sensitive while an L2
is running. Even if L1 is using its own mitigations, L2 VM exits could, in
theory, bring some sensitive L1 memory into the cache without L1 getting a
chance to flush it.

For simplicity, we just unmap the entire L1 memory from the ASI restricted
address space when nested virtualization is turned on, though this is
overridden if the treat_all_userspace_as_nonsensitive flag is enabled. In
the future, we could potentially map those portions of L1 memory which are
known to contain only non-sensitive data, which would reduce ASI overhead
during nested virtualization.

Note that unmapping the guest memory still leaves a slight hole, because
L2 could also potentially access copies of L1 VCPU registers stored in L0
kernel structures. In the future, this could be mitigated by having a
separate ASI address space for each VCPU and treating the associated
structures as locally non-sensitive only within that VCPU's ASI address
space.

Signed-off-by: Junaid Shahid
---
 arch/x86/include/asm/kvm_host.h |  6 ++++++
 arch/x86/kvm/mmu/mmu.c          | 10 ++++++++++
 arch/x86/kvm/vmx/nested.c       | 22 ++++++++++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e63a2f244d7b..8ba88bbcf895 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1200,6 +1200,12 @@ struct kvm_arch {
 	 */
 	struct list_head tdp_mmu_pages;
 
+	/*
+	 * Number of VCPUs that have enabled nested virtualization.
+	 * Currently only maintained when ASI is enabled.
+	 */
+	int nested_virt_enabled_count;
+
 	/*
 	 * Protects accesses to the following fields when the MMU lock
 	 * is held in read mode:
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 485c0ba3ce8b..5785a0d02558 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -94,6 +94,7 @@ module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644);
 #ifdef CONFIG_ADDRESS_SPACE_ISOLATION
 bool __ro_after_init treat_all_userspace_as_nonsensitive;
 module_param(treat_all_userspace_as_nonsensitive, bool, 0444);
+EXPORT_SYMBOL_GPL(treat_all_userspace_as_nonsensitive);
 #endif
 
 /*
@@ -2769,6 +2770,15 @@ static void asi_map_gfn_range(struct kvm_vcpu *vcpu,
 	int err;
 	size_t hva = __gfn_to_hva_memslot(slot, gfn);
 
+	/*
+	 * For now, we just don't map any guest memory when using nested
+	 * virtualization. In the future, we could potentially map some
+	 * portions of guest memory which are known to contain only memory
+	 * which would be considered non-sensitive.
+	 */
+	if (vcpu->kvm->arch.nested_virt_enabled_count)
+		return;
+
 	err = asi_map_user(vcpu->kvm->asi, (void *)hva, PAGE_SIZE * npages,
 			   &vcpu->arch.asi_pgtbl_pool, slot->userspace_addr,
 			   slot->userspace_addr + slot->npages * PAGE_SIZE);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9c941535f78c..0a0092e4102d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -318,6 +318,14 @@ static void free_nested(struct kvm_vcpu *vcpu)
 	nested_release_evmcs(vcpu);
 
 	free_loaded_vmcs(&vmx->nested.vmcs02);
+
+	if (cpu_feature_enabled(X86_FEATURE_ASI) &&
+	    !treat_all_userspace_as_nonsensitive) {
+		write_lock(&vcpu->kvm->mmu_lock);
+		WARN_ON(vcpu->kvm->arch.nested_virt_enabled_count <= 0);
+		vcpu->kvm->arch.nested_virt_enabled_count--;
+		write_unlock(&vcpu->kvm->mmu_lock);
+	}
 }
 
 /*
@@ -4876,6 +4884,20 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
 		pt_update_intercept_for_msr(vcpu);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_ASI) &&
+	    !treat_all_userspace_as_nonsensitive) {
+		/*
+		 * We do the increment under the MMU lock in order to prevent
+		 * it from happening concurrently with asi_map_gfn_range().
+		 */
+		write_lock(&vcpu->kvm->mmu_lock);
+		WARN_ON(vcpu->kvm->arch.nested_virt_enabled_count < 0);
+		vcpu->kvm->arch.nested_virt_enabled_count++;
+		write_unlock(&vcpu->kvm->mmu_lock);
+
+		asi_unmap_user(vcpu->kvm->asi, 0, TASK_SIZE_MAX);
+	}
+
 	return 0;
 
 out_shadow_vmcs:
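
As a side note on the locking scheme above: the counter update in
enter_vmx_operation() is taken under kvm->mmu_lock for write purely to
serialize it against the count check in asi_map_gfn_range(), which runs
with the MMU lock held. Below is a minimal, self-contained user-space
sketch of that pattern, not kernel code: a pthread rwlock stands in for
kvm->mmu_lock, and the function names (map_gfn_range, enter_nested_virt)
are illustrative only.

/*
 * Illustrative user-space model only: a pthread rwlock stands in for
 * kvm->mmu_lock, and the two functions model asi_map_gfn_range() and
 * enter_vmx_operation() from this patch.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t mmu_lock = PTHREAD_RWLOCK_INITIALIZER;
static int nested_virt_enabled_count;

/* Models asi_map_gfn_range(): skip the ASI mapping once nested virt is on. */
static void map_gfn_range(void)
{
	pthread_rwlock_rdlock(&mmu_lock);
	if (nested_virt_enabled_count)
		printf("nested virt enabled, leaving guest memory unmapped\n");
	else
		printf("mapping guest range into the ASI address space\n");
	pthread_rwlock_unlock(&mmu_lock);
}

/* Models enter_vmx_operation(): raise the count under the write lock, then unmap. */
static void enter_nested_virt(void)
{
	pthread_rwlock_wrlock(&mmu_lock);
	nested_virt_enabled_count++;
	pthread_rwlock_unlock(&mmu_lock);

	/* From here on, map_gfn_range() will see the count and bail out. */
	printf("unmapping all guest memory from the ASI address space\n");
}

int main(void)
{
	map_gfn_range();	/* maps, since the count is still zero */
	enter_nested_virt();
	map_gfn_range();	/* now skipped */
	return 0;
}

In the real patch, any mapping that races with enabling nested
virtualization either completes before the count is raised (and is then
torn down by the subsequent asi_unmap_user() call) or observes the
non-zero count and returns without mapping anything.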