From patchwork Tue Oct 19 11:01:51 2021
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12569539
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini
Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 1/4] KVM: X86: Fix tlb flush for tdp in kvm_invalidate_pcid()
Date: Tue, 19 Oct 2021 19:01:51 +0800
Message-Id: <20211019110154.4091-2-jiangshanlai@gmail.com>
In-Reply-To: <20211019110154.4091-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

When TDP is enabled, KVM doesn't know whether any TLB entries for a
specific PCID are cached in the CPU, so it is better to flush the whole
guest TLB when invalidating any single PCID context.
The case is rare or even impossible, since KVM doesn't intercept CR3
writes or INVPCID instructions when TDP is enabled. The fix is just for
the sake of robustness, in case emulation can reach here or the
interception policy is changed.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/x86.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c59b63c56af9..06169ed08db0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1073,6 +1073,16 @@ static void kvm_invalidate_pcid(struct kvm_vcpu *vcpu, unsigned long pcid)
 	unsigned long roots_to_free = 0;
 	int i;
 
+	/*
+	 * It is very unlikely to reach here when tdp_enabled. But if it is
+	 * the case, KVM doesn't know whether any TLB entries for @pcid are
+	 * cached in the CPU, so just flush the whole guest TLB instead.
+	 */
+	if (unlikely(tdp_enabled)) {
+		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
+		return;
+	}
+
 	/*
 	 * If neither the current CR3 nor any of the prev_roots use the given
 	 * PCID, then nothing needs to be done here because a resync will
From patchwork Tue Oct 19 11:01:52 2021
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12569541
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini
Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 2/4] KVM: X86: Cache CR3 in prev_roots when PCID is disabled
Date: Tue, 19 Oct 2021 19:01:52 +0800
Message-Id: <20211019110154.4091-3-jiangshanlai@gmail.com>
In-Reply-To: <20211019110154.4091-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Commit 21823fbda5522 ("KVM: x86: Invalidate all PGDs for the current
PCID on MOV CR3 w/ flush") invalidates all PGDs for the specified PCID.
When PCID is disabled, that includes all PGDs in prev_roots, so the
commit made prev_roots totally unused in this case.

Not using prev_roots fixes a problem which existed before the said
commit when CR4.PCIDE is changed 0 -> 1:

  (CR4.PCIDE=0, CR3=cr3_a, the page for the guest kernel is global,
   cr3_b is cached in prev_roots)

  modify the user part of cr3_b
    the shadow root of cr3_b is unsync in kvm
  INVPCID single context
    the guest expects the TLB is clean for PCID=0
  change CR4.PCIDE 0 -> 1
  switch to cr3_b with PCID=0, NOFLUSH=1
    no sync in kvm, cr3_b is still unsync in kvm
  return to the user part (of cr3_b)
    the user accesses a wrong page

It is a very unlikely case, but it shows that keeping prev_roots inside
the virtualized guest TLB is not safe in this case, and the said commit
did fix the problem.

But the said commit also disabled caching CR3 in prev_roots when PCID
is disabled, and not all CPUs have PCID; in particular, PCID support on
AMD CPUs is relatively recent.
To restore the original optimization, we have to enable caching CR3
without re-introducing the problem. In short, the said commit just
ensures that prev_roots are not part of the virtualized TLB.

So this change caches CR3 in prev_roots again, and keeps prev_roots out
of the virtualized TLB by always flushing the virtualized TLB when CR3
is switched from prev_roots to current (which is already the current
behavior) and by freeing prev_roots when CR4.PCIDE is changed 0 -> 1.

In summary:
  PCID enabled:  the vTLB includes root_hpa, prev_roots and the
                 hardware TLB.
  PCID disabled: the vTLB includes root_hpa and the hardware TLB, but
                 no prev_roots.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/x86.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 06169ed08db0..13df3ca88e09 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1022,10 +1022,29 @@ EXPORT_SYMBOL_GPL(kvm_is_valid_cr4);
 void kvm_post_set_cr4(struct kvm_vcpu *vcpu, unsigned long old_cr4,
 		      unsigned long cr4)
 {
+	/*
+	 * If any role bit is changed, the MMU needs to be reset.
+	 *
+	 * If CR4.PCIDE is changed 0 -> 1, there is no need to flush the
+	 * guest TLB per the SDM, but the virtualized TLB doesn't include
+	 * prev_roots when CR4.PCIDE is 0, so the prev_roots have to be
+	 * freed to avoid them being reused later without an explicit flush.
+	 * If CR4.PCIDE is changed 1 -> 0, the guest TLB needs to be
+	 * flushed, and KVM_REQ_MMU_RELOAD fits both cases. Although
+	 * KVM_REQ_MMU_RELOAD is slow, changing CR4.PCIDE is a rare case.
+	 *
+	 * If CR4.PGE is changed, only the guest TLB needs to be flushed.
+	 *
+	 * Note: resetting the MMU covers KVM_REQ_MMU_RELOAD, and
+	 * KVM_REQ_MMU_RELOAD covers KVM_REQ_TLB_FLUSH_GUEST, so "else if"
+	 * is used here and the checks for the later cases are skipped if
+	 * the check for a preceding case matches.
+	 */
 	if ((cr4 ^ old_cr4) & KVM_MMU_CR4_ROLE_BITS)
 		kvm_mmu_reset_context(vcpu);
-	else if (((cr4 ^ old_cr4) & X86_CR4_PGE) ||
-		 (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
+	else if ((cr4 ^ old_cr4) & X86_CR4_PCIDE)
+		kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+	else if ((cr4 ^ old_cr4) & X86_CR4_PGE)
 		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_post_set_cr4);
@@ -1093,6 +1112,15 @@ static void kvm_invalidate_pcid(struct kvm_vcpu *vcpu, unsigned long pcid)
 		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
 	}
 
+	/*
+	 * If PCID is disabled, there is no need to free prev_roots even if
+	 * the PCIDs for them are also 0. The prev_roots are just not
+	 * included in the "clean" virtualized TLB and a resync will happen
+	 * anyway before switching to any other CR3.
+	 */
+	if (!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
+		return;
+
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 		if (kvm_get_pcid(vcpu, mmu->prev_roots[i].pgd) == pcid)
 			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
From patchwork Tue Oct 19 11:01:53 2021
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12569543
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini
Cc: Lai Jiangshan, Junaid Shahid, Sean Christopherson, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 3/4] KVM: X86: Use smp_rmb() to pair with smp_wmb() in mmu_try_to_unsync_pages()
Date: Tue, 19 Oct 2021 19:01:53 +0800
Message-Id: <20211019110154.4091-4-jiangshanlai@gmail.com>
In-Reply-To: <20211019110154.4091-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Commit 578e1c4db2213 ("kvm: x86: Avoid taking MMU lock in
kvm_mmu_sync_roots if no sync is needed") added an smp_wmb() in
mmu_try_to_unsync_pages(), but a corresponding smp_load_acquire() can't
be used on the load of SPTE.W, because that load is performed by the
CPU's own page-table walk rather than by explicit code. This patch uses
smp_rmb() instead.

This patch changes nothing functionally; effectively it only improves
the comments, since smp_rmb() is a NOP on x86 and even a compiler
barrier() is not required because the load of SPTE.W occurs before the
VMEXIT.

Cc: Junaid Shahid
Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 47 +++++++++++++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c6ddb042b281..900c7a157c99 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2665,8 +2665,9 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 	 *			(sp->unsync = true)
 	 *
 	 * The write barrier below ensures that 1.1 happens before 1.2 and thus
-	 * the situation in 2.4 does not arise. The implicit barrier in 2.2
-	 * pairs with this write barrier.
+	 * the situation in 2.4 does not arise. The implicit read barrier
+	 * between 2.1's load of SPTE.W and 2.3 (as in is_unsync_root()) pairs
+	 * with this write barrier.
 	 */
 	smp_wmb();
 
@@ -3629,6 +3630,35 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static bool is_unsync_root(hpa_t root)
+{
+	struct kvm_mmu_page *sp;
+
+	/*
+	 * Even if another CPU was marking the SP as unsync-ed simultaneously,
+	 * any guest page table changes are not guaranteed to be visible anyway
+	 * until this VCPU issues a TLB flush strictly after those changes are
+	 * made. We only need to ensure that the other CPU sets these flags
+	 * before any actual changes to the page tables are made. The comments
+	 * in mmu_try_to_unsync_pages() describe what could go wrong if this
+	 * requirement isn't satisfied.
+	 *
+	 * To pair with the smp_wmb() in mmu_try_to_unsync_pages() between the
+	 * write to sp->unsync[_children] and the write to SPTE.W, a read
+	 * barrier is needed after the CPU reads SPTE.W (or the read itself is
+	 * an acquire operation) while doing the page table walk and before the
+	 * checks of sp->unsync[_children] here. The CPU has already provided
+	 * the needed semantics, but a NOP smp_rmb() here can provide symmetric
+	 * pairing and richer information.
+	 */
+	smp_rmb();
+	sp = to_shadow_page(root);
+	if (sp->unsync || sp->unsync_children)
+		return true;
+
+	return false;
+}
+
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 {
 	int i;
@@ -3646,18 +3676,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 		hpa_t root = vcpu->arch.mmu->root_hpa;
 
 		sp = to_shadow_page(root);
-		/*
-		 * Even if another CPU was marking the SP as unsync-ed
-		 * simultaneously, any guest page table changes are not
-		 * guaranteed to be visible anyway until this VCPU issues a TLB
-		 * flush strictly after those changes are made. We only need to
-		 * ensure that the other CPU sets these flags before any actual
-		 * changes to the page tables are made. The comments in
-		 * mmu_try_to_unsync_pages() describe what could go wrong if
-		 * this requirement isn't satisfied.
-		 */
-		if (!smp_load_acquire(&sp->unsync) &&
-		    !smp_load_acquire(&sp->unsync_children))
+		if (!is_unsync_root(root))
 			return;
 
 		write_lock(&vcpu->kvm->mmu_lock);
From patchwork Tue Oct 19 11:01:54 2021
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12569545
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini
Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 4/4] KVM: X86: Don't unload MMU in kvm_vcpu_flush_tlb_guest()
Date: Tue, 19 Oct 2021 19:01:54 +0800
Message-Id: <20211019110154.4091-5-jiangshanlai@gmail.com>
In-Reply-To: <20211019110154.4091-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

kvm_mmu_unload() destroys all the PGD caches. Use the lighter
kvm_mmu_sync_roots() and kvm_mmu_sync_prev_roots() instead.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu.h     |  1 +
 arch/x86/kvm/mmu/mmu.c | 16 ++++++++++++++++
 arch/x86/kvm/x86.c     | 11 +++++------
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 1ae70efedcf4..8e9dd63b68a9 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -79,6 +79,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
 void kvm_mmu_unload(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
+void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
 
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 900c7a157c99..fb45eeb8dd22 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3634,6 +3634,9 @@ static bool is_unsync_root(hpa_t root)
 {
 	struct kvm_mmu_page *sp;
 
+	if (!VALID_PAGE(root))
+		return false;
+
 	/*
 	 * Even if another CPU was marking the SP as unsync-ed simultaneously,
 	 * any guest page table changes are not guaranteed to be visible anyway
@@ -3706,6 +3709,19 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	write_unlock(&vcpu->kvm->mmu_lock);
 }
 
+void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu)
+{
+	unsigned long roots_to_free = 0;
+	int i;
+
+	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
+		if (is_unsync_root(vcpu->arch.mmu->prev_roots[i].hpa))
+			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
+
+	/* sync prev_roots by simply freeing them */
+	kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, roots_to_free);
+}
+
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gpa_t vaddr, u32 access,
 				  struct x86_exception *exception)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 13df3ca88e09..1771cd4bb449 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3251,15 +3251,14 @@ static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
 	++vcpu->stat.tlb_flush;
 
 	if (!tdp_enabled) {
-		/* 
+		/*
 		 * A TLB flush on behalf of the guest is equivalent to
 		 * INVPCID(all), toggling CR4.PGE, etc., which requires
-		 * a forced sync of the shadow page tables. Unload the
-		 * entire MMU here and the subsequent load will sync the
-		 * shadow page tables, and also flush the TLB.
+		 * a forced sync of the shadow page tables. Ensure all the
+		 * roots are synced and the guest TLB in hardware is clean.
 		 */
-		kvm_mmu_unload(vcpu);
-		return;
+		kvm_mmu_sync_roots(vcpu);
+		kvm_mmu_sync_prev_roots(vcpu);
 	}
 
 	static_call(kvm_x86_tlb_flush_guest)(vcpu);