From patchwork Tue Jan 12 18:10:18 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014293
Date: Tue, 12 Jan 2021 10:10:18 -0800
Message-Id: <20210112181041.356734-2-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 01/24] locking/rwlocks: Add contention detection for rwlocks
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

rwlocks do not currently have any facility to detect contention like
spinlocks do. In order to allow users of rwlocks to better manage latency,
add contention detection for queued rwlocks.

CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/asm-generic/qrwlock.h | 24 ++++++++++++++++++------
 include/linux/rwlock.h        |  7 +++++++
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 84ce841ce735..0020d3b820a7 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -14,6 +14,7 @@
 #include
 #include
+#include

 /*
  * Writer states & reader shift and bias.
@@ -116,15 +117,26 @@ static inline void queued_write_unlock(struct qrwlock *lock)
         smp_store_release(&lock->wlocked, 0);
 }

+/**
+ * queued_rwlock_is_contended - check if the lock is contended
+ * @lock : Pointer to queue rwlock structure
+ * Return: 1 if lock contended, 0 otherwise
+ */
+static inline int queued_rwlock_is_contended(struct qrwlock *lock)
+{
+        return arch_spin_is_locked(&lock->wait_lock);
+}
+
 /*
  * Remapping rwlock architecture specific functions to the corresponding
  * queue rwlock functions.
  */
-#define arch_read_lock(l)               queued_read_lock(l)
-#define arch_write_lock(l)              queued_write_lock(l)
-#define arch_read_trylock(l)            queued_read_trylock(l)
-#define arch_write_trylock(l)           queued_write_trylock(l)
-#define arch_read_unlock(l)             queued_read_unlock(l)
-#define arch_write_unlock(l)            queued_write_unlock(l)
+#define arch_read_lock(l)               queued_read_lock(l)
+#define arch_write_lock(l)              queued_write_lock(l)
+#define arch_read_trylock(l)            queued_read_trylock(l)
+#define arch_write_trylock(l)           queued_write_trylock(l)
+#define arch_read_unlock(l)             queued_read_unlock(l)
+#define arch_write_unlock(l)            queued_write_unlock(l)
+#define arch_rwlock_is_contended(l)     queued_rwlock_is_contended(l)

 #endif /* __ASM_GENERIC_QRWLOCK_H */
diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index 3dcd617e65ae..7ce9a51ae5c0 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -128,4 +128,11 @@ do {                                                           \
         1 : ({ local_irq_restore(flags); 0; }); \
 })

+#ifdef arch_rwlock_is_contended
+#define rwlock_is_contended(lock) \
+        arch_rwlock_is_contended(&(lock)->raw_lock)
+#else
+#define rwlock_is_contended(lock)       ((void)(lock), 0)
+#endif /* arch_rwlock_is_contended */
+
 #endif /* __LINUX_RWLOCK_H */
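As a usage illustration (not part of the patch; the lock, loop, and function
name below are hypothetical), a write-lock holder could poll the new
rwlock_is_contended() macro and briefly drop the lock when a waiter is queued:

#include <linux/spinlock.h>
#include <linux/sched.h>

static DEFINE_RWLOCK(example_lock);     /* hypothetical lock */

static void example_long_write_section(void)
{
        int i;

        write_lock(&example_lock);
        for (i = 0; i < 100000; i++) {
                /* ... update one element under example_lock ... */

                /* Another task is queued on the lock: let it in briefly. */
                if (rwlock_is_contended(&example_lock)) {
                        write_unlock(&example_lock);
                        cond_resched();
                        write_lock(&example_lock);
                }
        }
        write_unlock(&example_lock);
}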
From patchwork Tue Jan 12 18:10:19 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014297
Date: Tue, 12 Jan 2021 10:10:19 -0800
Message-Id: <20210112181041.356734-3-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 02/24] sched: Add needbreak for rwlocks
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

Contention awareness while holding a spin lock is essential for reducing
latency when long running kernel operations can hold that lock.
Add the same contention detection interface for read/write spin locks.

CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/linux/sched.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..5d1378e5a040 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1912,6 +1912,23 @@ static inline int spin_needbreak(spinlock_t *lock)
 #endif
 }

+/*
+ * Check if a rwlock is contended.
+ * Returns non-zero if there is another task waiting on the rwlock.
+ * Returns zero if the lock is not contended or the system / underlying
+ * rwlock implementation does not support contention detection.
+ * Technically does not depend on CONFIG_PREEMPTION, but a general need
+ * for low latency.
+ */
+static inline int rwlock_needbreak(rwlock_t *lock)
+{
+#ifdef CONFIG_PREEMPTION
+        return rwlock_is_contended(lock);
+#else
+        return 0;
+#endif
+}
+
 static __always_inline bool need_resched(void)
 {
         return unlikely(tif_need_resched());
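A hypothetical read-side loop (not from the series; the lock and per-entry
work are placeholders) showing how a caller could open-code a break on
rwlock_needbreak():

#include <linux/spinlock.h>
#include <linux/sched.h>

static DEFINE_RWLOCK(example_lock);     /* hypothetical lock */

static void example_read_side_scan(void)
{
        int i;

        read_lock(&example_lock);
        for (i = 0; i < 100000; i++) {
                /* ... examine one entry under example_lock ... */

                /* A writer (or reader) is queued: back off briefly. */
                if (rwlock_needbreak(&example_lock)) {
                        read_unlock(&example_lock);
                        cond_resched();
                        read_lock(&example_lock);
                }
        }
        read_unlock(&example_lock);
}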
From patchwork Tue Jan 12 18:10:20 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014373
Date: Tue, 12 Jan 2021 10:10:20 -0800
Message-Id: <20210112181041.356734-4-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 03/24] sched: Add cond_resched_rwlock
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

Safely rescheduling while holding a spin lock is essential for keeping
long running kernel operations running smoothly. Add the facility to
cond_resched rwlocks.

CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/linux/sched.h | 12 ++++++++++++
 kernel/sched/core.c   | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5d1378e5a040..3052d16da3cf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1883,12 +1883,24 @@ static inline int _cond_resched(void) { return 0; }
 })

 extern int __cond_resched_lock(spinlock_t *lock);
+extern int __cond_resched_rwlock_read(rwlock_t *lock);
+extern int __cond_resched_rwlock_write(rwlock_t *lock);

 #define cond_resched_lock(lock) ({                              \
         ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
         __cond_resched_lock(lock);                              \
 })

+#define cond_resched_rwlock_read(lock) ({                       \
+        __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
+        __cond_resched_rwlock_read(lock);                       \
+})
+
+#define cond_resched_rwlock_write(lock) ({                      \
+        __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
+        __cond_resched_rwlock_write(lock);                      \
+})
+
 static inline void cond_resched_rcu(void)
 {
 #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 15d2562118d1..ade357642279 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6695,6 +6695,46 @@ int __cond_resched_lock(spinlock_t *lock)
 }
 EXPORT_SYMBOL(__cond_resched_lock);

+int __cond_resched_rwlock_read(rwlock_t *lock)
+{
+        int resched = should_resched(PREEMPT_LOCK_OFFSET);
+        int ret = 0;
+
+        lockdep_assert_held_read(lock);
+
+        if (rwlock_needbreak(lock) || resched) {
+                read_unlock(lock);
+                if (resched)
+                        preempt_schedule_common();
+                else
+                        cpu_relax();
+                ret = 1;
read_lock(lock); + } + return ret; +} +EXPORT_SYMBOL(__cond_resched_rwlock_read); + +int __cond_resched_rwlock_write(rwlock_t *lock) +{ + int resched = should_resched(PREEMPT_LOCK_OFFSET); + int ret = 0; + + lockdep_assert_held_write(lock); + + if (rwlock_needbreak(lock) || resched) { + write_unlock(lock); + if (resched) + preempt_schedule_common(); + else + cpu_relax(); + ret = 1; + write_lock(lock); + } + return ret; +} +EXPORT_SYMBOL(__cond_resched_rwlock_write); + /** * yield - yield the current processor to other threads. * From patchwork Tue Jan 12 18:10:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40843C4332B for ; Tue, 12 Jan 2021 18:14:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0DFEB2311F for ; Tue, 12 Jan 2021 18:14:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2405204AbhALSMK (ORCPT ); Tue, 12 Jan 2021 13:12:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404289AbhALSMJ (ORCPT ); Tue, 12 Jan 2021 13:12:09 -0500 Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7468FC0617A6 for ; Tue, 12 Jan 2021 10:10:53 -0800 (PST) Received: by mail-qk1-x749.google.com with SMTP id a17so2127067qko.11 for ; Tue, 12 Jan 2021 10:10:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=KchsQSFTsj0RB0+VLV4VScOtPrFxG2k97/Op1w+ZZ0U=; b=l1f5hheFIXm+V4MLFEphzsvDHhHrcd4ErXJh3tlJI4+rRSb4RZFZ3+FNc7BFnqcwMO tHZUReGsZZ3Bzfqg0N3LMG8xueN0DeLYCCyp3vcqooSJsw1wziaMGDersvHHPO3VNpAQ D5/jIvqSSExtWRLkaiwkGC05Wx1EuUUx0FONeZhpsFaIIBG8jwiWEdmKu0wPJ9wr5G94 U9QUjJWW/uHam2+woFRCFskvLCmMGgekRgEQ/rjBK7r70r15dE9Foe+dhpynIkrdZ60v pbBGxPRCZLqGTWdYjXjX4xm3MEGr4MCgWVKw8NM0wX1sC5oYkYvIi7AjTbNLaisABRO9 ITyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=KchsQSFTsj0RB0+VLV4VScOtPrFxG2k97/Op1w+ZZ0U=; b=kCjHmVMhIOOGKI4Kr+yejsuOVR+fdPRuZh0h4OwDin6v6Tug091PuxmGtpdh2rwxeK fcOiF00f0/mC9JulN0vsbsNOu6nYnH0HiUN+5xezaEgV8z+R01LK58Au6e7hK4hYzMpT LbXiOHv++Ws2zu9Fvx6ntOUAXWckcBFiMyQIW9LHEo9SuU/YNceeXJpVGLJBsRNDytBp 79zOcdBGo1sspH+Cv1m3zjTo431T06UC+IZVeFU02Vs6sL+AqnF2p5+/JXsq0mN4Pw1d /w16A0x3GkTk1G1/nCcC9bcCVPM45nBSXcuFwUxSIA3KkyFYZaXTEj9Q7b9hoYQ+rosU RJGw== X-Gm-Message-State: AOAM531rgwyEZNsF4+FrUdl6jpvlL8fgt3tmQdoJ2VMkcazLytDU0+cC QfoxvECjtQdvT5w2KZmRSOUV7tZ9C/KW X-Google-Smtp-Source: ABdhPJyjvLb3B5vkSKmnfKmTxMmDuEIUDlyIlfudy9eydRvb3x4Cx2NTQtHUDZzlTMknTjnAUCHKWnDFyPMt Sender: "bgardon 
via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a0c:f00e:: with SMTP id z14mr635871qvk.25.1610475052661; Tue, 12 Jan 2021 10:10:52 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:21 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-5-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 04/24] kvm: x86/mmu: change TDP MMU yield function returns to match cond_resched From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently the TDP MMU yield / cond_resched functions either return nothing or return true if the TLBs were not flushed. These are confusing semantics, especially when making control flow decisions in calling functions. To clean things up, change both functions to have the same return value semantics as cond_resched: true if the thread yielded, false if it did not. If the function yielded in the _flush_ version, then the TLBs will have been flushed. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 38 +++++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 2ef8615f9dba..b2784514ca2d 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -413,8 +413,15 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, _mmu->shadow_root_level, _start, _end) /* - * Flush the TLB if the process should drop kvm->mmu_lock. - * Return whether the caller still needs to flush the tlb. + * Flush the TLB and yield if the MMU lock is contended or this thread needs to + * return control to the scheduler. + * + * If this function yields, it will also reset the tdp_iter's walk over the + * paging structure and the calling function should allow the iterator to + * continue its traversal from the paging structure root. + * + * Return true if this function yielded, the TLBs were flushed, and the + * iterator's traversal was reset. Return false if a yield was not needed. */ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { @@ -422,18 +429,30 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it kvm_flush_remote_tlbs(kvm); cond_resched_lock(&kvm->mmu_lock); tdp_iter_refresh_walk(iter); - return false; - } else { return true; - } + } else + return false; } -static void tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) +/* + * Yield if the MMU lock is contended or this thread needs to return control + * to the scheduler. + * + * If this function yields, it will also reset the tdp_iter's walk over the + * paging structure and the calling function should allow the iterator to + * continue its traversal from the paging structure root. + * + * Return true if this function yielded and the iterator's traversal was reset. + * Return false if a yield was not needed. 
+ */
+static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
 {
         if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
                 cond_resched_lock(&kvm->mmu_lock);
                 tdp_iter_refresh_walk(iter);
-        }
+                return true;
+        } else
+                return false;
 }

 /*
@@ -470,7 +489,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                 tdp_mmu_set_spte(kvm, &iter, 0);

                 if (can_yield)
-                        flush_needed = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+                        flush_needed = !tdp_mmu_iter_flush_cond_resched(kvm,
+                                                                        &iter);
                 else
                         flush_needed = true;
         }
@@ -1072,7 +1092,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,

                 tdp_mmu_set_spte(kvm, &iter, 0);

-                spte_set = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+                spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
         }

         if (spte_set)
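To see why the cond_resched-style return value helps callers, here is a
condensed sketch of the caller pattern after this patch. It is based on
zap_gfn_range() above but is not the actual function: the can_yield flag and
the SPTE-presence checks are omitted, and it assumes the tdp_mmu.c context
(its iterator macros and helpers).

static bool example_zap_range(struct kvm *kvm, struct kvm_mmu_page *root,
                              gfn_t start, gfn_t end)
{
        struct tdp_iter iter;
        bool flush_needed = false;

        tdp_root_for_each_pte(iter, root, start, end) {
                tdp_mmu_set_spte(kvm, &iter, 0);

                /*
                 * True means the helper yielded and already flushed the
                 * TLBs; false means the caller still owes a flush.
                 */
                flush_needed = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
        }
        return flush_needed;
}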
From patchwork Tue Jan 12 18:10:22 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014371
Date: Tue, 12 Jan 2021 10:10:22 -0800
Message-Id: <20210112181041.356734-6-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 05/24] kvm: x86/mmu: Fix yielding in TDP MMU
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

There are two problems with the way the TDP MMU yields in long-running
functions:

1.) Under certain conditions, a function may not yield reliably or
    frequently enough.
2.) In some functions the TDP iterator risks not making forward progress
    if two threads livelock yielding to one another.

Case 1 is possible if, for example, a paging structure is very large but
has few, if any, writable entries. wrprot_gfn_range could traverse many
entries before finding a writable entry and yielding.

Case 2 is possible if two threads are trying to execute wrprot_gfn_range.
Each could write protect an entry and then yield. This would reset the
tdp_iter's walk over the paging structure and the loop would end up
repeating the same entry over and over, preventing either thread from
making forward progress.

Fix these issues by moving the yield to the beginning of the loop, before
other checks, and by only yielding if the loop has made forward progress
since the last yield.

Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 83 +++++++++++++++++++++++++++++++-------
 1 file changed, 69 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b2784514ca2d..1987da0da66e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -470,9 +470,23 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                           gfn_t start, gfn_t end, bool can_yield)
 {
         struct tdp_iter iter;
+        gfn_t last_goal_gfn = start;
         bool flush_needed = false;

         tdp_root_for_each_pte(iter, root, start, end) {
+                /* Ensure forward progress has been made before yielding. */
+                if (can_yield && iter.goal_gfn != last_goal_gfn &&
+                    tdp_mmu_iter_flush_cond_resched(kvm, &iter)) {
+                        last_goal_gfn = iter.goal_gfn;
+                        flush_needed = false;
+                        /*
+                         * Yielding caused the paging structure walk to be
+                         * reset so skip to the next iteration to continue the
+                         * walk from the root.
+                         */
+                        continue;
+                }
+
                 if (!is_shadow_present_pte(iter.old_spte))
                         continue;

@@ -487,12 +501,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                         continue;

                 tdp_mmu_set_spte(kvm, &iter, 0);
-
-                if (can_yield)
-                        flush_needed = !tdp_mmu_iter_flush_cond_resched(kvm,
-                                                                        &iter);
-                else
-                        flush_needed = true;
+                flush_needed = true;
         }
         return flush_needed;
 }
@@ -850,12 +859,25 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
         struct tdp_iter iter;
         u64 new_spte;
+        gfn_t last_goal_gfn = start;
         bool spte_set = false;

         BUG_ON(min_level > KVM_MAX_HUGEPAGE_LEVEL);

         for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
                                    min_level, start, end) {
+                /* Ensure forward progress has been made before yielding. */
+                if (iter.goal_gfn != last_goal_gfn &&
+                    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+                        last_goal_gfn = iter.goal_gfn;
+                        /*
+                         * Yielding caused the paging structure walk to be
+                         * reset so skip to the next iteration to continue the
+                         * walk from the root.
+                         */
+                        continue;
+                }
+
                 if (!is_shadow_present_pte(iter.old_spte) ||
                     !is_last_spte(iter.old_spte, iter.level))
                         continue;
@@ -864,8 +886,6 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                 tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
                 spte_set = true;
-
-                tdp_mmu_iter_cond_resched(kvm, &iter);
         }
         return spte_set;
 }
@@ -906,9 +926,22 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
         struct tdp_iter iter;
         u64 new_spte;
+        gfn_t last_goal_gfn = start;
         bool spte_set = false;

         tdp_root_for_each_leaf_pte(iter, root, start, end) {
+                /* Ensure forward progress has been made before yielding. */
+                if (iter.goal_gfn != last_goal_gfn &&
+                    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+                        last_goal_gfn = iter.goal_gfn;
+                        /*
+                         * Yielding caused the paging structure walk to be
+                         * reset so skip to the next iteration to continue the
+                         * walk from the root.
+                         */
+                        continue;
+                }
+
                 if (spte_ad_need_write_protect(iter.old_spte)) {
                         if (is_writable_pte(iter.old_spte))
                                 new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
@@ -923,8 +956,6 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                 tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
                 spte_set = true;
-
-                tdp_mmu_iter_cond_resched(kvm, &iter);
         }
         return spte_set;
 }
@@ -1029,9 +1060,22 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
         struct tdp_iter iter;
         u64 new_spte;
+        gfn_t last_goal_gfn = start;
         bool spte_set = false;

         tdp_root_for_each_pte(iter, root, start, end) {
+                /* Ensure forward progress has been made before yielding. */
+                if (iter.goal_gfn != last_goal_gfn &&
+                    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+                        last_goal_gfn = iter.goal_gfn;
+                        /*
+                         * Yielding caused the paging structure walk to be
+                         * reset so skip to the next iteration to continue the
+                         * walk from the root.
+                         */
+                        continue;
+                }
+
                 if (!is_shadow_present_pte(iter.old_spte))
                         continue;
@@ -1039,8 +1083,6 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                 tdp_mmu_set_spte(kvm, &iter, new_spte);
                 spte_set = true;
-
-                tdp_mmu_iter_cond_resched(kvm, &iter);
         }

         return spte_set;
@@ -1078,9 +1120,23 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 {
         struct tdp_iter iter;
         kvm_pfn_t pfn;
+        gfn_t last_goal_gfn = start;
         bool spte_set = false;

         tdp_root_for_each_pte(iter, root, start, end) {
+                /* Ensure forward progress has been made before yielding. */
+                if (iter.goal_gfn != last_goal_gfn &&
+                    tdp_mmu_iter_flush_cond_resched(kvm, &iter)) {
+                        last_goal_gfn = iter.goal_gfn;
+                        spte_set = false;
+                        /*
+                         * Yielding caused the paging structure walk to be
+                         * reset so skip to the next iteration to continue the
+                         * walk from the root.
+                         */
+                        continue;
+                }
+
                 if (!is_shadow_present_pte(iter.old_spte) ||
                     is_last_spte(iter.old_spte, iter.level))
                         continue;
@@ -1091,8 +1147,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
                         continue;

                 tdp_mmu_set_spte(kvm, &iter, 0);
-
-                spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+                spte_set = true;
         }

         if (spte_set)
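The forward-progress rule, distilled into a stand-alone sketch (not series
code; the per-entry processing is elided and the tdp_mmu.c context is
assumed): yield only once the walk has advanced past the point where it last
resumed, so two threads that keep resetting each other's walks cannot
livelock.

static void example_walk(struct kvm *kvm, struct kvm_mmu_page *root,
                         gfn_t start, gfn_t end)
{
        struct tdp_iter iter;
        gfn_t last_goal_gfn = start;

        tdp_root_for_each_pte(iter, root, start, end) {
                /* A yield restarts the walk from the root. */
                if (iter.goal_gfn != last_goal_gfn &&
                    tdp_mmu_iter_cond_resched(kvm, &iter)) {
                        last_goal_gfn = iter.goal_gfn;
                        continue;
                }

                /* ... process iter.old_spte ... */
        }
}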
From patchwork Tue Jan 12 18:10:23 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014335
Date: Tue, 12 Jan 2021 10:10:23 -0800
Message-Id: <20210112181041.356734-7-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 06/24] kvm: x86/mmu: Skip no-op changes in TDP MMU functions
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

Skip setting SPTEs if no change is expected.

Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1987da0da66e..2650fa9fe066 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -882,6 +882,9 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                     !is_last_spte(iter.old_spte, iter.level))
                         continue;

+                if (!(iter.old_spte & PT_WRITABLE_MASK))
+                        continue;
+
                 new_spte = iter.old_spte & ~PT_WRITABLE_MASK;

                 tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
@@ -1079,6 +1082,9 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
                 if (!is_shadow_present_pte(iter.old_spte))
                         continue;

+                if (iter.old_spte & shadow_dirty_mask)
+                        continue;
+
                 new_spte = iter.old_spte | shadow_dirty_mask;

                 tdp_mmu_set_spte(kvm, &iter, new_spte);
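The same check, pulled out into a hypothetical helper for emphasis (the
function name is illustrative and the logic mirrors the wrprot_gfn_range()
hunk above; the tdp_mmu.c context is assumed): skipping the write also skips
all the bookkeeping a SPTE update would otherwise trigger.

static bool example_clear_writable(struct kvm *kvm, struct tdp_iter *iter)
{
        u64 new_spte;

        /* Already write-protected: the update would be a no-op. */
        if (!(iter->old_spte & PT_WRITABLE_MASK))
                return false;

        new_spte = iter->old_spte & ~PT_WRITABLE_MASK;
        tdp_mmu_set_spte_no_dirty_log(kvm, iter, new_spte);
        return true;
}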
From patchwork Tue Jan 12 18:10:24 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014299
Date: Tue, 12 Jan 2021 10:10:24 -0800
Message-Id: <20210112181041.356734-8-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 07/24] kvm: x86/mmu: Add comment on __tdp_mmu_set_spte
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

__tdp_mmu_set_spte is a central function in the TDP MMU that already
accepts several arguments and will take more in future commits. To offset
this complexity, add a comment to the function describing each of the
arguments.

No functional change intended.

Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 2650fa9fe066..b033da8243fc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -357,6 +357,22 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
                                       new_spte, level);
 }

+/*
+ * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
+ * @kvm: kvm instance
+ * @iter: a tdp_iter instance currently on the SPTE that should be set
+ * @new_spte: The value the SPTE should be set to
+ * @record_acc_track: Notify the MM subsystem of changes to the accessed state
+ *                    of the page. Should be set unless handling an MMU
+ *                    notifier for access tracking. Leaving record_acc_track
+ *                    unset in that case prevents page accesses from being
+ *                    double counted.
+ * @record_dirty_log: Record the page as dirty in the dirty bitmap if
+ *                    appropriate for the change being made. Should be set
+ *                    unless performing certain dirty logging operations.
+ *                    Leaving record_dirty_log unset in that case prevents page
+ *                    writes from being double counted.
+ */
 static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
                                       u64 new_spte, bool record_acc_track,
                                       bool record_dirty_log)
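For orientation, here is a sketch of how callers can fix the two documented
flags through thin wrappers. The wrapper names and bodies below are
illustrative only, not the file's actual code; only __tdp_mmu_set_spte() and
its parameter order come from the patch above.

static inline void example_set_spte(struct kvm *kvm, struct tdp_iter *iter,
                                    u64 new_spte)
{
        /* Common case: record both accessed state and dirty logging. */
        __tdp_mmu_set_spte(kvm, iter, new_spte, true, true);
}

static inline void example_set_spte_no_dirty_log(struct kvm *kvm,
                                                 struct tdp_iter *iter,
                                                 u64 new_spte)
{
        /* Dirty-logging operations: skip the dirty-bitmap bookkeeping. */
        __tdp_mmu_set_spte(kvm, iter, new_spte, true, false);
}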
From patchwork Tue Jan 12 18:10:25 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014363
Date: Tue, 12 Jan 2021 10:10:25 -0800
Message-Id: <20210112181041.356734-9-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 08/24] kvm: x86/mmu: Add lockdep when setting a TDP MMU SPTE
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

Add a lockdep assertion to __tdp_mmu_set_spte to ensure that SPTEs are
only modified under the MMU lock. The assertion will be updated in future
commits to reflect and validate changes to the TDP MMU's synchronization
strategy.

No functional change intended.

Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b033da8243fc..411938e97a00 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -381,6 +381,8 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
         struct kvm_mmu_page *root = sptep_to_sp(root_pt);
         int as_id = kvm_mmu_page_as_id(root);

+        lockdep_assert_held(&kvm->mmu_lock);
+
         WRITE_ONCE(*iter->sptep, new_spte);

         __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
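A generic illustration of the lockdep_assert_held() pattern outside KVM (the
structure and function are hypothetical):

#include <linux/lockdep.h>
#include <linux/spinlock.h>

struct example_table {                  /* hypothetical structure */
        spinlock_t lock;
        unsigned long entries[64];
};

static void example_table_set(struct example_table *t, int i, unsigned long v)
{
        /* With lockdep enabled, warns if the caller does not hold t->lock. */
        lockdep_assert_held(&t->lock);
        t->entries[i] = v;
}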
From patchwork Tue Jan 12 18:10:26 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12014301
Date: Tue, 12 Jan 2021 10:10:26 -0800
Message-Id: <20210112181041.356734-10-bgardon@google.com>
In-Reply-To: <20210112181041.356734-1-bgardon@google.com>
Subject: [PATCH 09/24] kvm: x86/mmu: Don't redundantly clear TDP MMU pt memory
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

The KVM MMU caches already guarantee that shadow page table memory will
be zeroed, so there is no reason to re-zero the page in the TDP MMU page
fault handler.

No functional change intended.
Reviewed-by: Peter Feiner
Signed-off-by: Ben Gardon
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu/tdp_mmu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 411938e97a00..55df596696c7 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -665,7 +665,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
                 sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
                 list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages);
                 child_pt = sp->spt;
-                clear_page(child_pt);
                 new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask);
ABdhPJx3MNHD6vCESyQOIuWic2KKIVM1PZUriW2eZxFsECsrdbPU78U/TGfB9KSfHN/sdSYr9T2liY9g9h79 Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a17:902:c215:b029:da:b079:b9a3 with SMTP id 21-20020a170902c215b02900dab079b9a3mr292723pll.67.1610475063057; Tue, 12 Jan 2021 10:11:03 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:27 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-11-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 10/24] kvm: x86/mmu: Factor out handle disconnected pt From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Factor out the code to handle a disconnected subtree of the TDP paging structure from the code to handle the change to an individual SPTE. Future commits will build on this to allow asynchronous page freeing. No functional change intended. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 75 +++++++++++++++++++++++--------------- 1 file changed, 46 insertions(+), 29 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 55df596696c7..e8f35cd46b4c 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -234,6 +234,49 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, } } +/** + * handle_disconnected_tdp_mmu_page - handle a pt removed from the TDP structure + * + * @kvm: kvm instance + * @pt: the page removed from the paging structure + * + * Given a page table that has been removed from the TDP paging structure, + * iterates through the page table to clear SPTEs and free child page tables. 
+ */ +static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) +{ + struct kvm_mmu_page *sp; + gfn_t gfn; + int level; + u64 old_child_spte; + int i; + + sp = sptep_to_sp(pt); + gfn = sp->gfn; + level = sp->role.level; + + trace_kvm_mmu_prepare_zap_page(sp); + + list_del(&sp->link); + + if (sp->lpage_disallowed) + unaccount_huge_nx_page(kvm, sp); + + for (i = 0; i < PT64_ENT_PER_PAGE; i++) { + old_child_spte = READ_ONCE(*(pt + i)); + WRITE_ONCE(*(pt + i), 0); + handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), + gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), + old_child_spte, 0, level - 1); + } + + kvm_flush_remote_tlbs_with_address(kvm, gfn, + KVM_PAGES_PER_HPAGE(level)); + + free_page((unsigned long)pt); + kmem_cache_free(mmu_page_header_cache, sp); +} + /** * handle_changed_spte - handle bookkeeping associated with an SPTE change * @kvm: kvm instance @@ -254,10 +297,6 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, bool was_leaf = was_present && is_last_spte(old_spte, level); bool is_leaf = is_present && is_last_spte(new_spte, level); bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte); - u64 *pt; - struct kvm_mmu_page *sp; - u64 old_child_spte; - int i; WARN_ON(level > PT64_ROOT_MAX_LEVEL); WARN_ON(level < PG_LEVEL_4K); @@ -321,31 +360,9 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, * Recursively handle child PTs if the change removed a subtree from * the paging structure. */ - if (was_present && !was_leaf && (pfn_changed || !is_present)) { - pt = spte_to_child_pt(old_spte, level); - sp = sptep_to_sp(pt); - - trace_kvm_mmu_prepare_zap_page(sp); - - list_del(&sp->link); - - if (sp->lpage_disallowed) - unaccount_huge_nx_page(kvm, sp); - - for (i = 0; i < PT64_ENT_PER_PAGE; i++) { - old_child_spte = READ_ONCE(*(pt + i)); - WRITE_ONCE(*(pt + i), 0); - handle_changed_spte(kvm, as_id, - gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), - old_child_spte, 0, level - 1); - } - - kvm_flush_remote_tlbs_with_address(kvm, gfn, - KVM_PAGES_PER_HPAGE(level)); - - free_page((unsigned long)pt); - kmem_cache_free(mmu_page_header_cache, sp); - } + if (was_present && !was_leaf && (pfn_changed || !is_present)) + handle_disconnected_tdp_mmu_page(kvm, + spte_to_child_pt(old_spte, level)); } static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, From patchwork Tue Jan 12 18:10:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFB09C433E6 for ; Tue, 12 Jan 2021 18:14:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A28F32311F for ; Tue, 12 Jan 2021 18:14:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406121AbhALSMY (ORCPT ); Tue, 12 Jan 2021 13:12:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56814 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2405980AbhALSMX (ORCPT ); Tue, 12 Jan 2021 13:12:23 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80410C0617BF for ; Tue, 12 Jan 2021 10:11:05 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id o6so2149302pgg.8 for ; Tue, 12 Jan 2021 10:11:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=I5IsJbZuDYpFXFlSvTXHYskF+HsPl8YK9J43vEHOECY=; b=VwoMdWCCamn7/LtEg6RBTX9kpX7d/BuhrxswvYO+plPNfvLxvCdj4KPIAao+O4NmZw 1rKgoW1SN3im8LUW8D00utBmeB22a/IjCJUsnuXeQsQV9ILuWzAHu2FzJg3C1u3OVUJi 8uk4NDD7B97v+1HE8o6mK6k7Q8qxq1Rx1xqN47NviyQBfQS3uV1EoApFoWnCuw5J00G/ UXdNPK9vyqbfIga9pFlfir7gowsNSg1hmVCoBBvpSQ5QvdPoKNQbYAlKJZ0kNQ3nAXqs 7phOBNpnYycGrtPDdrXUKWCp1Ie6JT3yyfVnPWh2ihGMA8EPdtV4Z/not+b9BKhh3xjO liSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=I5IsJbZuDYpFXFlSvTXHYskF+HsPl8YK9J43vEHOECY=; b=U5xw2c15ngd6sC3pJa3ChWWEORHEFO0slMpMQVzszf+mGfIxnkPauOdErqXlpX3ULS m1OEMWRltXkxDOUfe5S3SuO6MjTQjvCVjPGdpBVGRnupuN+APCopA/P12y8fLoZdiAnh 8ZuQEJBP1VqL989k7VWJvqnsomTWxZgb6FC2476Nj8scouj/t9581I4ca+YWMwJeoDhh 2RSMxPtagNNBAhEoyISLnHqL0MZd2X1Dt4yNlWOHZhOVj0EeXqCuEzXOnKe699lJ1RV5 zSR+f3rWXnjFPUg4FReuFYvHBa8/VltgMSMHofpNbQ0+wGV6OqStH9JoHsRKouSmWbDs W6Bw== X-Gm-Message-State: AOAM530hAIS1Q/jbfXcxPtEO9d85lgUEmKmDPLYFG+mzevIF5j8+N5sy NTtl/oNFdBGC+jaw7/lxZbiPerGVWv5b X-Google-Smtp-Source: ABdhPJxbA3iapiR/f9u0K/8sUL1gu4aKEfntqg1SY+Pngz3C29pgj1likg4eP52VuPDdA1tup5WUnbUJi1kM Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a17:90a:ff0d:: with SMTP id ce13mr326617pjb.109.1610475064980; Tue, 12 Jan 2021 10:11:04 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:28 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-12-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 11/24] kvm: x86/mmu: Put TDP MMU PT walks in RCU read-critical section From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to enable concurrent modifications to the paging structures in the TDP MMU, threads must be able to safely remove pages of page table memory while other threads are traversing the same memory. To ensure threads do not access PT memory after it is freed, protect PT memory with RCU. 
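The scheme introduced here, read-side critical sections around page table walks in this patch plus deferred freeing after a grace period later in the series, is standard RCU usage. A minimal sketch with hypothetical names (node, global_node and node_free_rcu are illustrative only, not KVM code):

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct node {
	int val;
	struct rcu_head rcu;
};

static struct node __rcu *global_node;

/* Reader: anything dereferenced inside the read-side section stays valid. */
static int reader(void)
{
	struct node *n;
	int val = -1;

	rcu_read_lock();
	n = rcu_dereference(global_node);
	if (n)
		val = n->val;
	rcu_read_unlock();	/* after this point the node may be freed */

	return val;
}

static void node_free_rcu(struct rcu_head *head)
{
	kfree(container_of(head, struct node, rcu));
}

/* Updater: unlink first, free only after a grace period has elapsed. */
static void updater(void)
{
	struct node *old = rcu_dereference_protected(global_node, 1);

	rcu_assign_pointer(global_node, NULL);
	if (old)
		call_rcu(&old->rcu, node_free_rcu);
}

In the TDP MMU, the read side is each page table walk (hence the rcu_read_lock()/rcu_read_unlock() pairs added below) and the deferred free is what a later patch wires up with call_rcu(), so a walker that races with a zap can finish traversing a page table before its memory is returned to the allocator.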
Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 53 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index e8f35cd46b4c..662907d374b3 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -458,11 +458,14 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, * Return true if this function yielded, the TLBs were flushed, and the * iterator's traversal was reset. Return false if a yield was not needed. */ -static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) +static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, + struct tdp_iter *iter) { if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { kvm_flush_remote_tlbs(kvm); + rcu_read_unlock(); cond_resched_lock(&kvm->mmu_lock); + rcu_read_lock(); tdp_iter_refresh_walk(iter); return true; } else @@ -483,7 +486,9 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + rcu_read_unlock(); cond_resched_lock(&kvm->mmu_lock); + rcu_read_lock(); tdp_iter_refresh_walk(iter); return true; } else @@ -508,6 +513,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t last_goal_gfn = start; bool flush_needed = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { /* Ensure forward progress has been made before yielding. */ if (can_yield && iter.goal_gfn != last_goal_gfn && @@ -538,6 +545,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, 0); flush_needed = true; } + + rcu_read_unlock(); return flush_needed; } @@ -650,6 +659,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, huge_page_disallowed, &req_level); trace_kvm_mmu_spte_requested(gpa, level, pfn); + + rcu_read_lock(); + tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { if (nx_huge_page_workaround_enabled) disallowed_hugepage_adjust(iter.old_spte, gfn, @@ -693,11 +705,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, } } - if (WARN_ON(iter.level != level)) + if (WARN_ON(iter.level != level)) { + rcu_read_unlock(); return RET_PF_RETRY; + } ret = tdp_mmu_map_handle_target_level(vcpu, write, map_writable, &iter, pfn, prefault); + rcu_read_unlock(); return ret; } @@ -768,6 +783,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, int young = 0; u64 new_spte = 0; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, start, end) { /* * If we have a non-accessed entry we don't need to change the @@ -799,6 +816,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, trace_kvm_age_page(iter.gfn, iter.level, slot, young); } + rcu_read_unlock(); + return young; } @@ -844,6 +863,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot, u64 new_spte; int need_flush = 0; + rcu_read_lock(); + WARN_ON(pte_huge(*ptep)); new_pfn = pte_pfn(*ptep); @@ -872,6 +893,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot, if (need_flush) kvm_flush_remote_tlbs_with_address(kvm, gfn, 1); + rcu_read_unlock(); + return 0; } @@ -896,6 +919,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t last_goal_gfn = start; bool spte_set = false; + rcu_read_lock(); + BUG_ON(min_level > 
KVM_MAX_HUGEPAGE_LEVEL); for_each_tdp_pte_min_level(iter, root->spt, root->role.level, @@ -924,6 +949,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; } + + rcu_read_unlock(); return spte_set; } @@ -966,6 +993,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t last_goal_gfn = start; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, start, end) { /* Ensure forward progress has been made before yielding. */ if (iter.goal_gfn != last_goal_gfn && @@ -994,6 +1023,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; } + + rcu_read_unlock(); return spte_set; } @@ -1035,6 +1066,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, struct tdp_iter iter; u64 new_spte; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, gfn + __ffs(mask), gfn + BITS_PER_LONG) { if (!mask) @@ -1060,6 +1093,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, mask &= ~(1UL << (iter.gfn - gfn)); } + + rcu_read_unlock(); } /* @@ -1100,6 +1135,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t last_goal_gfn = start; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { /* Ensure forward progress has been made before yielding. */ if (iter.goal_gfn != last_goal_gfn && @@ -1125,6 +1162,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, spte_set = true; } + rcu_read_unlock(); return spte_set; } @@ -1163,6 +1201,8 @@ static void zap_collapsible_spte_range(struct kvm *kvm, gfn_t last_goal_gfn = start; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { /* Ensure forward progress has been made before yielding. 
*/ if (iter.goal_gfn != last_goal_gfn && @@ -1190,6 +1230,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, spte_set = true; } + rcu_read_unlock(); if (spte_set) kvm_flush_remote_tlbs(kvm); } @@ -1226,6 +1267,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, u64 new_spte; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, gfn, gfn + 1) { if (!is_writable_pte(iter.old_spte)) break; @@ -1237,6 +1280,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, spte_set = true; } + rcu_read_unlock(); + return spte_set; } @@ -1277,10 +1322,14 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, *root_level = vcpu->arch.mmu->shadow_root_level; + rcu_read_lock(); + tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { leaf = iter.level; sptes[leaf] = iter.old_spte; } + rcu_read_unlock(); + return leaf; } From patchwork Tue Jan 12 18:10:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09626C433E9 for ; Tue, 12 Jan 2021 18:12:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C4F4B2311D for ; Tue, 12 Jan 2021 18:12:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406129AbhALSMY (ORCPT ); Tue, 12 Jan 2021 13:12:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406109AbhALSMX (ORCPT ); Tue, 12 Jan 2021 13:12:23 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE067C061382 for ; Tue, 12 Jan 2021 10:11:07 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id k7so3252468ybm.13 for ; Tue, 12 Jan 2021 10:11:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=dO4IqxqIqKBoqVoUqd2f3S75b5ltVbr4dkjOrW+PVUI=; b=e5/8YvUL2mRX71HltI4Oq0TkfCluL9Io4FlT3gEHnD5j8uAUpd3DG4yaWnIsvX12gP 2IrpmG+LwrpAC7swQktZeRnZFVmRXj2WnqLZfng8u2VXO+PkZcTI5FwSlYu3ypwqn0fh 8sHSCYYnQ8Cfya4lgl9hwlRI7qRtFuFUxKNLOceabKf9D6zmTnak/L4RmfVdqVgumHBP R0mOegLGCUoyuK8NL3749Q4sYOKkMkOdiRkFm7O74UAJiOIivXyP8a7DBBv2f7Z7oSDD C9w0yDuANFFlp5g55Qk4ZDWMGw6Idki1mTnQLmFVUQ5PThobhi66rXZsEJ9eipmFnF57 Opxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=dO4IqxqIqKBoqVoUqd2f3S75b5ltVbr4dkjOrW+PVUI=; b=ThiG7U4FmGYsE11K8jvQ4x2yC29DWtS9vHevLaQd671brHST6I8EhuVN1OcYGbkw1z r+zncnigXBGJHRbm+eSGRDSX3bTfsqWPUt7j5IPev9ZgVHN15yJZV2v4tFBGOYRW2wot Vu1Sxt8KDBgSVLRk2v2vyA5Sf1T3nZqHIRJ1SDDzTyFr3dWObgbeIuR5JGSW0GqETlKg 
MivlDt/iPi7no+MZQJAoj69vT9sA1rvWnrZ4Y6a7FVJdZqadkMoraKPI8xFlrwtjwTWG WmGdzlnKaWtNCBog33VCVFCvsqAZ/i8UXEkAfFO2kSKZDWWck/dCHBBkwjljKX/4Om+0 GrzA== X-Gm-Message-State: AOAM5310TC1ABtqPvM1RId6T3vVX9vWP1kcqLnVK3NpcWJHwngbwMHQW qwI9ML/67aQsYAE0ghh1USc3pyH6flAW X-Google-Smtp-Source: ABdhPJyl2Qzej4S94jfATT+lkwvUU2Z5QsbfCIAXjwg1DUK8GrqZaajn6dNog3SgwEww2CH9SuE0eLHB7T2E Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a25:190b:: with SMTP id 11mr965985ybz.236.1610475066803; Tue, 12 Jan 2021 10:11:06 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:29 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-13-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 12/24] kvm: x86/kvm: RCU dereference tdp mmu page table links From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to protect TDP MMU PT memory with RCU, ensure that page table links are properly rcu_derefenced. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_iter.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 87b7e16911db..82855613ffa0 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -49,6 +49,8 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, */ u64 *spte_to_child_pt(u64 spte, int level) { + u64 *child_pt; + /* * There's no child entry if this entry isn't present or is a * last-level entry. 
@@ -56,7 +58,9 @@ u64 *spte_to_child_pt(u64 spte, int level) if (!is_shadow_present_pte(spte) || is_last_spte(spte, level)) return NULL; - return __va(spte_to_pfn(spte) << PAGE_SHIFT); + child_pt = __va(spte_to_pfn(spte) << PAGE_SHIFT); + + return rcu_dereference(child_pt); } /* From patchwork Tue Jan 12 18:10:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F534C43381 for ; Tue, 12 Jan 2021 18:12:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E52012311F for ; Tue, 12 Jan 2021 18:12:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406156AbhALSMZ (ORCPT ); Tue, 12 Jan 2021 13:12:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406131AbhALSMY (ORCPT ); Tue, 12 Jan 2021 13:12:24 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B962C061384 for ; Tue, 12 Jan 2021 10:11:09 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id s17so2136483pgv.14 for ; Tue, 12 Jan 2021 10:11:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=HlxHQG6STtx0QDM9VJz9RspfS5viveHc/2QI5Fjkqgw=; b=d4mGH/OB1ceSgbJO5NKkqNsIB4zZmkowAlLR9wa74B703TN95w9BrVrCdbPNO4Yryr WvnNjjMDNKXC0/eGEcJnHWiRgcT3W99lFEMPOcInPWFcON0SSEI82n9OJtJjJItq+RUs +I0M7NHcozxn4iSlhgT42nbqxfQE0rnSH27MW8S/vn992usCRq9VK2V+sSvvCnw3FO8K GN+4CyhivXqt8Q0zHVEnKB5PekVmUcG20I7pKZwNZo6c9OfBp1N6a1DebJsr7arkg7d9 siXIUVI5NIWysNkgLEjOJBHAK3zQ85W38Fvg/XCdZsD/Gmw/H0l0nSVKIpvxaxbCnTzx mKHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=HlxHQG6STtx0QDM9VJz9RspfS5viveHc/2QI5Fjkqgw=; b=bSWynPck91WpB6DzJhB0/ALv2Erc/lp13SBaHMU0/H7kyFxA8v63z0gCvmSTB5Qjq0 eRUjB2ASWA1iWRa0fGWjpm76KCinoPXKEC8nDPqPIkcDp8dILSDC8OO02HhPjJBjkswO WeVcyK2VO9WDaz7Y/eVLz7N8iOQJifvucd9G206BifxArolRWLc14wKuCoJPc0DTN0f/ vPtQADREfjO2x5Y3lMFRqafykkwb6OzeFC9ZYkVLesLa9blZQNsKZ63xdaek24l/GE4c /yFxeAb2lUYyX/UMjLVa7hSlZ3kElZ2wd1S9bprmI+Ij88oXJYjI8MJwUOyzEejSfpT/ /4xg== X-Gm-Message-State: AOAM5337Jhovh06c+GzDy4iWp1+aYLi6xtWWScJI+tVsa2m+MLU/Iiau p8/Z6KdHMDQNT2xWr9AVaha3z8V0J5cV X-Google-Smtp-Source: ABdhPJzg28iu7t/1wOfSzZBQdSgjd/0s49dwgkvIcEHYSTQhnxllRD1Mf3g4mwnNhaeW7LkyIluVzEezkLjm Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a17:902:34f:b029:dc:3032:e47d with SMTP id 73-20020a170902034fb02900dc3032e47dmr181607pld.15.1610475068774; Tue, 12 Jan 2021 
10:11:08 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:30 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-14-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 13/24] kvm: x86/mmu: Only free tdp_mmu pages after a grace period From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org By waiting until an RCU grace period has elapsed to free TDP MMU PT memory, the system can ensure that no kernel threads access the memory after it has been freed. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu_internal.h | 3 +++ arch/x86/kvm/mmu/tdp_mmu.c | 31 +++++++++++++++++++++++++++++-- 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index bfc6389edc28..7f599cc64178 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -57,6 +57,9 @@ struct kvm_mmu_page { atomic_t write_flooding_count; bool tdp_mmu_page; + + /* Used for freeing the page asyncronously if it is a TDP MMU page. */ + struct rcu_head rcu_head; }; extern struct kmem_cache *mmu_page_header_cache; diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 662907d374b3..dc5b4bf34ca2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -42,6 +42,12 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) return; WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots)); + + /* + * Ensure that all the outstanding RCU callbacks to free shadow pages + * can run before the VM is torn down. + */ + rcu_barrier(); } static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root) @@ -196,6 +202,28 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) return __pa(root->spt); } +static void tdp_mmu_free_sp(struct kvm_mmu_page *sp) +{ + free_page((unsigned long)sp->spt); + kmem_cache_free(mmu_page_header_cache, sp); +} + +/* + * This is called through call_rcu in order to free TDP page table memory + * safely with respect to other kernel threads that may be operating on + * the memory. + * By only accessing TDP MMU page table memory in an RCU read critical + * section, and freeing it after a grace period, lockless access to that + * memory won't use it after it is freed. 
+ */ +static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head) +{ + struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page, + rcu_head); + + tdp_mmu_free_sp(sp); +} + static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, u64 old_spte, u64 new_spte, int level); @@ -273,8 +301,7 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) kvm_flush_remote_tlbs_with_address(kvm, gfn, KVM_PAGES_PER_HPAGE(level)); - free_page((unsigned long)pt); - kmem_cache_free(mmu_page_header_cache, sp); + call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback); } /** From patchwork Tue Jan 12 18:10:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-23.5 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92771C433DB for ; Tue, 12 Jan 2021 18:14:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4DB4C22DFA for ; Tue, 12 Jan 2021 18:14:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2405435AbhALSOb (ORCPT ); Tue, 12 Jan 2021 13:14:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406109AbhALSMY (ORCPT ); Tue, 12 Jan 2021 13:12:24 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00298C061386 for ; Tue, 12 Jan 2021 10:11:10 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id hg11so2039585pjb.2 for ; Tue, 12 Jan 2021 10:11:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=LgySdp/viTtAKHo75ouLisNlmtdMi+1tlQE1KRf+Jq4=; b=QTLYlhbfBvSUHnGRceHvgJsqjpQ9pkIiNIbG72H1icms1k+Y1ZQZrB+4TZbNkcQWX0 aZui+zyFZenRerdYpbrS909IS/1jB8vK04IFzG706Q1zHsGjNqhwvg4MA3X89SgfyngI fryd1Pl+7BDoh4fAS+H/z9HsrdnRqlB697+gF3+QVW/GTnt7qyx14MbdPGZ2IKn1ubE9 PZDsNLA2cFKUZkR/+k6NkkSJhBIWSHghq+1YYq2RVjKJI59ymfxIzTtMMcIGduXxEtIg iEeGKPr3OKt5wZLtd++93vx3qs3FiLnNc8p+kcUSSClEgieBQxDmJyfTIlRgZpsrzHjg N0Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=LgySdp/viTtAKHo75ouLisNlmtdMi+1tlQE1KRf+Jq4=; b=Y58iSa6sY4F+wqViDuImPY1E6IhIR4HatR8nvq9TUdIz1y4n3c7aWKnmcv4A82eTSz 4M2mMSIc5IOiKkUNYjUCLdEAX3FPWbyHFGH9oHdFw4DzQAkGCLhQZCDVW5Jb1IG6010q stS5LaBmkeVlH9P780ylIBIPcM3n3P1yGz4MJl1zlcRfS/7iHer7fmdurT9WtHiXnnvY ATZN+1MVG2PLougBckFRnZB9XQzZKrE6TlZPzD1ihPHLnv8FVcrmq9UU6juHKi8oooyD xXb2+cOfvc93HSNDM0Qy1l0lktTvTDCTBiDRP9OKmrDMQKbydDs+n0ZlGgfB25cnGCYs wqcA== X-Gm-Message-State: AOAM533KvVP/LPupSM4fytNg222wE/Yz9pbe53H5fhTYvc8aLwfFDCPk 4Ol7rBnb0ilbI3jyLhx6SpGfEvHPv4yW X-Google-Smtp-Source: 
ABdhPJzI5cQkD7HnsDIExEpsPued8FVcuF8LikUPCjLBUmwsTUTavYpHAwQZc0N5LYB5M+kfE1u2RRRisIm1 Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a62:19cb:0:b029:19e:75c2:61ec with SMTP id 194-20020a6219cb0000b029019e75c261ecmr190157pfz.19.1610475070498; Tue, 12 Jan 2021 10:11:10 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:31 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-15-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 14/24] kvm: mmu: Wrap mmu_lock lock / unlock in a function From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Wrap locking and unlocking the mmu_lock in a function. This will facilitate future logging and stat collection for the lock and more immediately support a refactoring to move the lock into the struct kvm_arch(s) so that x86 can change the spinlock to a rwlock without affecting the performance of other archs. No functional change intended. Signed-off-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/arm64/kvm/mmu.c | 36 ++++++------- arch/mips/kvm/mips.c | 8 +-- arch/mips/kvm/mmu.c | 14 ++--- arch/powerpc/kvm/book3s_64_mmu_host.c | 4 +- arch/powerpc/kvm/book3s_64_mmu_hv.c | 12 ++--- arch/powerpc/kvm/book3s_64_mmu_radix.c | 22 ++++---- arch/powerpc/kvm/book3s_hv.c | 8 +-- arch/powerpc/kvm/book3s_hv_nested.c | 52 +++++++++--------- arch/powerpc/kvm/book3s_mmu_hpte.c | 10 ++-- arch/powerpc/kvm/e500_mmu_host.c | 4 +- arch/x86/kvm/mmu/mmu.c | 74 +++++++++++++------------- arch/x86/kvm/mmu/page_track.c | 8 +-- arch/x86/kvm/mmu/paging_tmpl.h | 8 +-- arch/x86/kvm/mmu/tdp_mmu.c | 6 +-- arch/x86/kvm/x86.c | 4 +- drivers/gpu/drm/i915/gvt/kvmgt.c | 12 ++--- include/linux/kvm_host.h | 3 ++ virt/kvm/dirty_ring.c | 4 +- virt/kvm/kvm_main.c | 42 +++++++++------ 19 files changed, 172 insertions(+), 159 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..402b1642c944 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -164,13 +164,13 @@ static void stage2_flush_vm(struct kvm *kvm) int idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); slots = kvm_memslots(kvm); kvm_for_each_memslot(memslot, slots) stage2_flush_memslot(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); } @@ -456,13 +456,13 @@ void stage2_unmap_vm(struct kvm *kvm) idx = srcu_read_lock(&kvm->srcu); mmap_read_lock(current->mm); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); slots = kvm_memslots(kvm); kvm_for_each_memslot(memslot, slots) stage2_unmap_memslot(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); mmap_read_unlock(current->mm); srcu_read_unlock(&kvm->srcu, idx); } @@ -472,14 +472,14 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu) struct kvm *kvm = mmu->kvm; struct kvm_pgtable *pgt = NULL; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); pgt = mmu->pgt; if (pgt) { mmu->pgd_phys = 0; mmu->pgt = NULL; free_percpu(mmu->last_vcpu_ran); } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (pgt) { kvm_pgtable_stage2_destroy(pgt); 
@@ -516,10 +516,10 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, if (ret) break; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot, &cache); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (ret) break; @@ -567,9 +567,9 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot) start = memslot->base_gfn << PAGE_SHIFT; end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); stage2_wp_range(&kvm->arch.mmu, start, end); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvm_flush_remote_tlbs(kvm); } @@ -867,7 +867,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); pgt = vcpu->arch.hw_mmu->pgt; if (mmu_notifier_retry(kvm, mmu_seq)) goto out_unlock; @@ -912,7 +912,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, } out_unlock: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvm_set_pfn_accessed(pfn); kvm_release_pfn_clean(pfn); return ret; @@ -927,10 +927,10 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa) trace_kvm_access_fault(fault_ipa); - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); mmu = vcpu->arch.hw_mmu; kpte = kvm_pgtable_stage2_mkyoung(mmu->pgt, fault_ipa); - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); pte = __pte(kpte); if (pte_valid(pte)) @@ -1365,12 +1365,12 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, if (change == KVM_MR_FLAGS_ONLY) goto out; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (ret) unmap_stage2_range(&kvm->arch.mmu, mem->guest_phys_addr, mem->memory_size); else if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB)) stage2_flush_memslot(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); out: mmap_read_unlock(current->mm); return ret; @@ -1395,9 +1395,9 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, gpa_t gpa = slot->base_gfn << PAGE_SHIFT; phys_addr_t size = slot->npages << PAGE_SHIFT; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); unmap_stage2_range(&kvm->arch.mmu, gpa, size); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } /* diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 3d6a7f5827b1..4e393d93c1aa 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -217,13 +217,13 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, * need to ensure that it can no longer be accessed by any guest VCPUs. 
*/ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Flush slot from GPA */ kvm_mips_flush_gpa_pt(kvm, slot->base_gfn, slot->base_gfn + slot->npages - 1); /* Let implementation do the rest */ kvm_mips_callbacks->flush_shadow_memslot(kvm, slot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } int kvm_arch_prepare_memory_region(struct kvm *kvm, @@ -258,14 +258,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, if (change == KVM_MR_FLAGS_ONLY && (!(old->flags & KVM_MEM_LOG_DIRTY_PAGES) && new->flags & KVM_MEM_LOG_DIRTY_PAGES)) { - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Write protect GPA page table entries */ needs_flush = kvm_mips_mkclean_gpa_pt(kvm, new->base_gfn, new->base_gfn + new->npages - 1); /* Let implementation do the rest */ if (needs_flush) kvm_mips_callbacks->flush_shadow_memslot(kvm, new); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } } diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c index 3dabeda82458..449663152b3c 100644 --- a/arch/mips/kvm/mmu.c +++ b/arch/mips/kvm/mmu.c @@ -593,7 +593,7 @@ static int _kvm_mips_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, bool pfn_valid = false; int ret = 0; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Fast path - just check GPA page table for an existing entry */ ptep = kvm_mips_pte_for_gpa(kvm, NULL, gpa); @@ -628,7 +628,7 @@ static int _kvm_mips_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, *out_buddy = *ptep_buddy(ptep); out: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (pfn_valid) kvm_set_pfn_accessed(pfn); return ret; @@ -710,7 +710,7 @@ static int kvm_mips_map_page(struct kvm_vcpu *vcpu, unsigned long gpa, goto out; } - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Check if an invalidation has taken place since we got pfn */ if (mmu_notifier_retry(kvm, mmu_seq)) { /* @@ -718,7 +718,7 @@ static int kvm_mips_map_page(struct kvm_vcpu *vcpu, unsigned long gpa, * also synchronously if a COW is triggered by * gfn_to_pfn_prot(). 
*/ - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvm_release_pfn_clean(pfn); goto retry; } @@ -748,7 +748,7 @@ static int kvm_mips_map_page(struct kvm_vcpu *vcpu, unsigned long gpa, if (out_buddy) *out_buddy = *ptep_buddy(ptep); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvm_release_pfn_clean(pfn); kvm_set_pfn_accessed(pfn); out: @@ -1041,12 +1041,12 @@ int kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu, /* And its GVA buddy's GPA page table entry if it also exists */ pte_gpa[!idx] = pfn_pte(0, __pgprot(0)); if (tlb_lo[!idx] & ENTRYLO_V) { - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ptep_buddy = kvm_mips_pte_for_gpa(kvm, NULL, mips3_tlbpfn_to_paddr(tlb_lo[!idx])); if (ptep_buddy) pte_gpa[!idx] = *ptep_buddy; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } /* Get the GVA page table entry pair */ diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c index e452158a18d7..4039a90c250c 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_host.c +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c @@ -148,7 +148,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, cpte = kvmppc_mmu_hpte_cache_next(vcpu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (!cpte || mmu_notifier_retry(kvm, mmu_seq)) { r = -EAGAIN; goto out_unlock; @@ -200,7 +200,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, } out_unlock: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvm_release_pfn_clean(pfn); if (cpte) kvmppc_mmu_hpte_cache_free(cpte); diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 38ea396a23d6..b1300a18efa7 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -605,12 +605,12 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu, * Read the PTE from the process' radix tree and use that * so we get the shift and attribute bits. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ptep = find_kvm_host_pte(kvm, mmu_seq, hva, &shift); pte = __pte(0); if (ptep) pte = READ_ONCE(*ptep); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); /* * If the PTE disappeared temporarily due to a THP * collapse, just return and let the guest try again. @@ -739,14 +739,14 @@ void kvmppc_rmap_reset(struct kvm *kvm) slots = kvm_memslots(kvm); kvm_for_each_memslot(memslot, slots) { /* Mutual exclusion with kvm_unmap_hva_range etc. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* * This assumes it is acceptable to lose reference and * change bits across a reset. 
*/ memset(memslot->arch.rmap, 0, memslot->npages * sizeof(*memslot->arch.rmap)); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } srcu_read_unlock(&kvm->srcu, srcu_idx); } @@ -1405,14 +1405,14 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize) resize_hpt_debug(resize, "resize_hpt_pivot()\n"); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); asm volatile("ptesync" : : : "memory"); hpt_tmp = kvm->arch.hpt; kvmppc_set_hpt(kvm, &resize->hpt); resize->hpt = hpt_tmp; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); synchronize_srcu_expedited(&kvm->srcu); diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index bb35490400e9..b628980c871b 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -613,7 +613,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte, new_ptep = kvmppc_pte_alloc(); /* Check if we might have been invalidated; let the guest retry if so */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ret = -EAGAIN; if (mmu_notifier_retry(kvm, mmu_seq)) goto out_unlock; @@ -749,7 +749,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte, ret = 0; out_unlock: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (new_pud) pud_free(kvm->mm, new_pud); if (new_pmd) @@ -837,12 +837,12 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu, * Read the PTE from the process' radix tree and use that * so we get the shift and attribute bits. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ptep = find_kvm_host_pte(kvm, mmu_seq, hva, &shift); pte = __pte(0); if (ptep) pte = READ_ONCE(*ptep); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); /* * If the PTE disappeared temporarily due to a THP * collapse, just return and let the guest try again. @@ -972,11 +972,11 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu, /* Failed to set the reference/change bits */ if (dsisr & DSISR_SET_RC) { - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (kvmppc_hv_handle_set_rc(kvm, false, writing, gpa, kvm->arch.lpid)) dsisr &= ~DSISR_SET_RC; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (!(dsisr & (DSISR_BAD_FAULT_64S | DSISR_NOHPTE | DSISR_PROTFAULT | DSISR_SET_RC))) @@ -1082,7 +1082,7 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, pte = READ_ONCE(*ptep); if (pte_present(pte) && pte_dirty(pte)) { - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* * Recheck the pte again */ @@ -1094,7 +1094,7 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, * walk. */ if (!pte_present(*ptep) || !pte_dirty(*ptep)) { - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return 0; } } @@ -1109,7 +1109,7 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, kvmhv_update_nest_rmap_rc_list(kvm, rmapp, _PAGE_DIRTY, 0, old & PTE_RPN_MASK, 1UL << shift); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } return ret; } @@ -1154,7 +1154,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm, return; gpa = memslot->base_gfn << PAGE_SHIFT; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (n = memslot->npages; n; --n) { ptep = find_kvm_secondary_pte(kvm, gpa, &shift); if (ptep && pte_present(*ptep)) @@ -1167,7 +1167,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm, * fault that read the memslot earlier from writing a PTE. 
*/ kvm->mmu_notifier_seq++; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } static void add_rmmu_ap_encoding(struct kvm_ppc_rmmu_info *info, diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 6f612d240392..ec08abd532f1 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -4753,9 +4753,9 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm) kvmppc_rmap_reset(kvm); kvm->arch.process_table = 0; /* Mutual exclusion with kvm_unmap_hva_range etc. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvm->arch.radix = 0; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvmppc_free_radix(kvm); kvmppc_update_lpcr(kvm, LPCR_VPM1, LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR); @@ -4775,9 +4775,9 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm) return err; kvmppc_rmap_reset(kvm); /* Mutual exclusion with kvm_unmap_hva_range etc. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvm->arch.radix = 1; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvmppc_free_hpt(&kvm->arch.hpt); kvmppc_update_lpcr(kvm, LPCR_UPRT | LPCR_GTSE | LPCR_HR, LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR); diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c index 33b58549a9aa..18890dca9476 100644 --- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -628,7 +628,7 @@ static void kvmhv_remove_nested(struct kvm_nested_guest *gp) int lpid = gp->l1_lpid; long ref; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (gp == kvm->arch.nested_guests[lpid]) { kvm->arch.nested_guests[lpid] = NULL; if (lpid == kvm->arch.max_nested_lpid) { @@ -639,7 +639,7 @@ static void kvmhv_remove_nested(struct kvm_nested_guest *gp) --gp->refcnt; } ref = gp->refcnt; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (ref == 0) kvmhv_release_nested(gp); } @@ -658,7 +658,7 @@ void kvmhv_release_all_nested(struct kvm *kvm) struct kvm_memory_slot *memslot; int srcu_idx; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i <= kvm->arch.max_nested_lpid; i++) { gp = kvm->arch.nested_guests[i]; if (!gp) @@ -670,7 +670,7 @@ void kvmhv_release_all_nested(struct kvm *kvm) } } kvm->arch.max_nested_lpid = -1; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); while ((gp = freelist) != NULL) { freelist = gp->next; kvmhv_release_nested(gp); @@ -687,9 +687,9 @@ static void kvmhv_flush_nested(struct kvm_nested_guest *gp) { struct kvm *kvm = gp->l1_host; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvmppc_free_pgtable_radix(kvm, gp->shadow_pgtable, gp->shadow_lpid); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvmhv_flush_lpid(gp->shadow_lpid); kvmhv_update_ptbl_cache(gp); if (gp->l1_gr_to_hr == 0) @@ -705,11 +705,11 @@ struct kvm_nested_guest *kvmhv_get_nested(struct kvm *kvm, int l1_lpid, l1_lpid >= (1ul << ((kvm->arch.l1_ptcr & PRTS_MASK) + 12 - 4))) return NULL; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); gp = kvm->arch.nested_guests[l1_lpid]; if (gp) ++gp->refcnt; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (gp || !create) return gp; @@ -717,7 +717,7 @@ struct kvm_nested_guest *kvmhv_get_nested(struct kvm *kvm, int l1_lpid, newgp = kvmhv_alloc_nested(kvm, l1_lpid); if (!newgp) return NULL; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (kvm->arch.nested_guests[l1_lpid]) { /* someone else beat us to it */ gp = kvm->arch.nested_guests[l1_lpid]; @@ -730,7 +730,7 @@ struct kvm_nested_guest *kvmhv_get_nested(struct kvm *kvm, int l1_lpid, kvm->arch.max_nested_lpid = l1_lpid; } 
++gp->refcnt; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (newgp) kvmhv_release_nested(newgp); @@ -743,9 +743,9 @@ void kvmhv_put_nested(struct kvm_nested_guest *gp) struct kvm *kvm = gp->l1_host; long ref; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ref = --gp->refcnt; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (ref == 0) kvmhv_release_nested(gp); } @@ -940,7 +940,7 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu, pte_t *ptep; int shift; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ptep = find_kvm_nested_guest_pte(kvm, gp->l1_lpid, gpa, &shift); if (!shift) shift = PAGE_SHIFT; @@ -948,7 +948,7 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu, kvmppc_unmap_pte(kvm, ptep, gpa, shift, NULL, gp->shadow_lpid); ret = true; } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (shift_ret) *shift_ret = shift; @@ -1035,11 +1035,11 @@ static void kvmhv_emulate_tlbie_lpid(struct kvm_vcpu *vcpu, switch (ric) { case 0: /* Invalidate TLB */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvmppc_free_pgtable_radix(kvm, gp->shadow_pgtable, gp->shadow_lpid); kvmhv_flush_lpid(gp->shadow_lpid); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); break; case 1: /* @@ -1063,16 +1063,16 @@ static void kvmhv_emulate_tlbie_all_lpid(struct kvm_vcpu *vcpu, int ric) struct kvm_nested_guest *gp; int i; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i <= kvm->arch.max_nested_lpid; i++) { gp = kvm->arch.nested_guests[i]; if (gp) { - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); kvmhv_emulate_tlbie_lpid(vcpu, gp, ric); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); } } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } static int kvmhv_emulate_priv_tlbie(struct kvm_vcpu *vcpu, unsigned int instr, @@ -1230,7 +1230,7 @@ static long kvmhv_handle_nested_set_rc(struct kvm_vcpu *vcpu, if (pgflags & ~gpte.rc) return RESUME_HOST; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Set the rc bit in the pte of our (L0) pgtable for the L1 guest */ ret = kvmppc_hv_handle_set_rc(kvm, false, writing, gpte.raddr, kvm->arch.lpid); @@ -1248,7 +1248,7 @@ static long kvmhv_handle_nested_set_rc(struct kvm_vcpu *vcpu, ret = 0; out_unlock: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return ret; } @@ -1380,13 +1380,13 @@ static long int __kvmhv_nested_page_fault(struct kvm_vcpu *vcpu, /* See if can find translation in our partition scoped tables for L1 */ pte = __pte(0); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); pte_p = find_kvm_secondary_pte(kvm, gpa, &shift); if (!shift) shift = PAGE_SHIFT; if (pte_p) pte = *pte_p; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (!pte_present(pte) || (writing && !(pte_val(pte) & _PAGE_WRITE))) { /* No suitable pte found -> try to insert a mapping */ @@ -1461,13 +1461,13 @@ int kvmhv_nested_next_lpid(struct kvm *kvm, int lpid) { int ret = -1; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); while (++lpid <= kvm->arch.max_nested_lpid) { if (kvm->arch.nested_guests[lpid]) { ret = lpid; break; } } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return ret; } diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c b/arch/powerpc/kvm/book3s_mmu_hpte.c index ce79ac33e8d3..ec1b5a6dfee1 100644 --- a/arch/powerpc/kvm/book3s_mmu_hpte.c +++ b/arch/powerpc/kvm/book3s_mmu_hpte.c @@ -60,7 +60,7 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte) trace_kvm_book3s_mmu_map(pte); - spin_lock(&vcpu3s->mmu_lock); + kvm_mmu_lock(vcpu3s); /* Add to ePTE list */ 
index = kvmppc_mmu_hash_pte(pte->pte.eaddr); @@ -89,7 +89,7 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte) vcpu3s->hpte_cache_count++; - spin_unlock(&vcpu3s->mmu_lock); + kvm_mmu_unlock(vcpu3s); } static void free_pte_rcu(struct rcu_head *head) @@ -107,11 +107,11 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte) /* Different for 32 and 64 bit */ kvmppc_mmu_invalidate_pte(vcpu, pte); - spin_lock(&vcpu3s->mmu_lock); + kvm_mmu_lock(vcpu3s); /* pte already invalidated in between? */ if (hlist_unhashed(&pte->list_pte)) { - spin_unlock(&vcpu3s->mmu_lock); + kvm_mmu_unlock(vcpu3s); return; } @@ -124,7 +124,7 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte) #endif vcpu3s->hpte_cache_count--; - spin_unlock(&vcpu3s->mmu_lock); + kvm_mmu_unlock(vcpu3s); call_rcu(&pte->rcu_head, free_pte_rcu); } diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index ed0c9c43d0cf..633ae418ba0e 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -459,7 +459,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1); } - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (mmu_notifier_retry(kvm, mmu_seq)) { ret = -EAGAIN; goto out; @@ -499,7 +499,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, kvmppc_mmu_flush_icache(pfn); out: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); /* Drop refcount on page, so that mmu notifiers can clear it */ kvm_release_pfn_clean(pfn); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6d16481aa29d..5a4577830606 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2470,7 +2470,7 @@ static int make_mmu_pages_available(struct kvm_vcpu *vcpu) */ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long goal_nr_mmu_pages) { - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages) { kvm_mmu_zap_oldest_mmu_pages(kvm, kvm->arch.n_used_mmu_pages - @@ -2481,7 +2481,7 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long goal_nr_mmu_pages) kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) @@ -2492,7 +2492,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) pgprintk("%s: looking for gfn %llx\n", __func__, gfn); r = 0; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for_each_gfn_indirect_valid_sp(kvm, sp, gfn) { pgprintk("%s: gfn %llx role %x\n", __func__, gfn, sp->role.word); @@ -2500,7 +2500,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return r; } @@ -3192,7 +3192,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, return; } - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) if (roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) @@ -3215,7 +3215,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } EXPORT_SYMBOL_GPL(kvm_mmu_free_roots); @@ -3236,16 +3236,16 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, { struct kvm_mmu_page *sp; - spin_lock(&vcpu->kvm->mmu_lock); + 
kvm_mmu_lock(vcpu->kvm); if (make_mmu_pages_available(vcpu)) { - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); return INVALID_PAGE; } sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL); ++sp->root_count; - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); return __pa(sp->spt); } @@ -3416,17 +3416,17 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) !smp_load_acquire(&sp->unsync_children)) return; - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC); mmu_sync_children(vcpu, sp); kvm_mmu_audit(vcpu, AUDIT_POST_SYNC); - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); return; } - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC); for (i = 0; i < 4; ++i) { @@ -3440,7 +3440,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) } kvm_mmu_audit(vcpu, AUDIT_POST_SYNC); - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); } EXPORT_SYMBOL_GPL(kvm_mmu_sync_roots); @@ -3724,7 +3724,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, return r; r = RET_PF_RETRY; - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; r = make_mmu_pages_available(vcpu); @@ -3739,7 +3739,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, prefault, is_tdp); out_unlock: - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); kvm_release_pfn_clean(pfn); return r; } @@ -4999,7 +4999,7 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, */ mmu_topup_memory_caches(vcpu, true); - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes); @@ -5035,7 +5035,7 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, } kvm_mmu_flush_or_zap(vcpu, &invalid_list, remote_flush, local_flush); kvm_mmu_audit(vcpu, AUDIT_POST_PTE_WRITE); - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); } int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva) @@ -5423,7 +5423,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) { lockdep_assert_held(&kvm->slots_lock); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); trace_kvm_mmu_zap_all_fast(kvm); /* @@ -5450,7 +5450,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_all(kvm); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) @@ -5492,7 +5492,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) int i; bool flush; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); kvm_for_each_memslot(memslot, slots) { @@ -5516,7 +5516,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) kvm_flush_remote_tlbs(kvm); } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } static bool slot_rmap_write_protect(struct kvm *kvm, @@ -5531,12 +5531,12 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); flush = slot_handle_level(kvm, memslot, slot_rmap_write_protect, start_level, KVM_MAX_HUGEPAGE_LEVEL, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_4K); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); /* * We can flush all the TLBs out of the mmu lock without TLB @@ 
-5596,13 +5596,13 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *memslot) { /* FIXME: const-ify all uses of struct kvm_memory_slot. */ - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot, kvm_mmu_zap_collapsible_spte, true); if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm, @@ -5625,11 +5625,11 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_clear_dirty_slot(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); /* * It's also safe to flush TLBs out of mmu lock here as currently this @@ -5647,12 +5647,12 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); flush = slot_handle_large_level(kvm, memslot, slot_rmap_write_protect, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_2M); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); @@ -5664,11 +5664,11 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_slot_set_dirty(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); @@ -5681,7 +5681,7 @@ void kvm_mmu_zap_all(struct kvm *kvm) LIST_HEAD(invalid_list); int ign; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); restart: list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) { if (WARN_ON(sp->role.invalid)) @@ -5697,7 +5697,7 @@ void kvm_mmu_zap_all(struct kvm *kvm) if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_all(kvm); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) @@ -5757,7 +5757,7 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) continue; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (kvm_has_zapped_obsolete_pages(kvm)) { kvm_mmu_commit_zap_page(kvm, @@ -5768,7 +5768,7 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan); unlock: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); /* @@ -5988,7 +5988,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) ulong to_zap; rcu_idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); ratio = READ_ONCE(nx_huge_pages_recovery_ratio); to_zap = ratio ? 
DIV_ROUND_UP(kvm->stat.nx_lpage_splits, ratio) : 0; @@ -6020,7 +6020,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, rcu_idx); } diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 8443a675715b..7ae4567c58bf 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -184,9 +184,9 @@ kvm_page_track_register_notifier(struct kvm *kvm, head = &kvm->arch.track_notifier_head; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); hlist_add_head_rcu(&n->node, &head->track_notifier_list); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } EXPORT_SYMBOL_GPL(kvm_page_track_register_notifier); @@ -202,9 +202,9 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, head = &kvm->arch.track_notifier_head; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); hlist_del_rcu(&n->node); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); synchronize_srcu(&head->track_srcu); } EXPORT_SYMBOL_GPL(kvm_page_track_unregister_notifier); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 50e268eb8e1a..a7a29bf6c683 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -868,7 +868,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, } r = RET_PF_RETRY; - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; @@ -881,7 +881,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT); out_unlock: - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); kvm_release_pfn_clean(pfn); return r; } @@ -919,7 +919,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa) return; } - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); for_each_shadow_entry_using_root(vcpu, root_hpa, gva, iterator) { level = iterator.level; sptep = iterator.sptep; @@ -954,7 +954,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa) if (!is_shadow_present_pte(*sptep) || !sp->unsync_children) break; } - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); } /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GPA. */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index dc5b4bf34ca2..90807f2d928f 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -170,13 +170,13 @@ static struct kvm_mmu_page *get_tdp_mmu_vcpu_root(struct kvm_vcpu *vcpu) role = page_role_for_level(vcpu, vcpu->arch.mmu->shadow_root_level); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* Check for an existing root before allocating a new one. 
*/ for_each_tdp_mmu_root(kvm, root) { if (root->role.word == role.word) { kvm_mmu_get_root(kvm, root); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return root; } } @@ -186,7 +186,7 @@ static struct kvm_mmu_page *get_tdp_mmu_vcpu_root(struct kvm_vcpu *vcpu) list_add(&root->link, &kvm->arch.tdp_mmu_roots); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); return root; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9a8969a6dd06..302042af87ee 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7088,9 +7088,9 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, if (vcpu->arch.mmu->direct_map) { unsigned int indirect_shadow_pages; - spin_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock(vcpu->kvm); indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages; - spin_unlock(&vcpu->kvm->mmu_lock); + kvm_mmu_unlock(vcpu->kvm); if (indirect_shadow_pages) kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa)); diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 60f1a386dd06..069e189961ff 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1703,7 +1703,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 gfn) return -EINVAL; } - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (kvmgt_gfn_is_write_protected(info, gfn)) goto out; @@ -1712,7 +1712,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 gfn) kvmgt_protect_table_add(info, gfn); out: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return 0; } @@ -1737,7 +1737,7 @@ static int kvmgt_page_track_remove(unsigned long handle, u64 gfn) return -EINVAL; } - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); if (!kvmgt_gfn_is_write_protected(info, gfn)) goto out; @@ -1746,7 +1746,7 @@ static int kvmgt_page_track_remove(unsigned long handle, u64 gfn) kvmgt_protect_table_del(info, gfn); out: - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return 0; } @@ -1772,7 +1772,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm, struct kvmgt_guest_info *info = container_of(node, struct kvmgt_guest_info, track_node); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i < slot->npages; i++) { gfn = slot->base_gfn + i; if (kvmgt_gfn_is_write_protected(info, gfn)) { @@ -1781,7 +1781,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm, kvmgt_protect_table_del(info, gfn); } } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } static bool __kvmgt_vgpu_exist(struct intel_vgpu *vgpu, struct kvm *kvm) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f3b1013fb22c..433d14fdae30 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1495,4 +1495,7 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) /* Max number of entries allowed for each kvm dirty ring */ #define KVM_DIRTY_RING_MAX_ENTRIES 65536 +void kvm_mmu_lock(struct kvm *kvm); +void kvm_mmu_unlock(struct kvm *kvm); + #endif diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 9d01299563ee..e1c1538f59a6 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -60,9 +60,9 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask) if (!memslot || (offset + __fls(mask)) >= memslot->npages) return; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } 
int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index fa9e3614d30e..32f97ed1188d 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -432,6 +432,16 @@ void kvm_vcpu_destroy(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_vcpu_destroy); +void kvm_mmu_lock(struct kvm *kvm) +{ + spin_lock(&kvm->mmu_lock); +} + +void kvm_mmu_unlock(struct kvm *kvm) +{ + spin_unlock(&kvm->mmu_lock); +} + #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn) { @@ -459,13 +469,13 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, int idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); kvm->mmu_notifier_seq++; if (kvm_set_spte_hva(kvm, address, pte)) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); } @@ -476,7 +486,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, int need_tlb_flush = 0, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* * The count increase must become visible at unlock time as no * spte can be established without taking the mmu_lock and @@ -489,7 +499,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, if (need_tlb_flush || kvm->tlbs_dirty) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return 0; @@ -500,7 +510,7 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, { struct kvm *kvm = mmu_notifier_to_kvm(mn); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* * This sequence increase will notify the kvm page fault that * the page that is going to be mapped in the spte could have @@ -514,7 +524,7 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, * in conjunction with the smp_rmb in mmu_notifier_retry(). */ kvm->mmu_notifier_count--; - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); BUG_ON(kvm->mmu_notifier_count < 0); } @@ -528,13 +538,13 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); young = kvm_age_hva(kvm, start, end); if (young) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -549,7 +559,7 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); /* * Even though we do not flush TLB, this will still adversely * affect performance on pre-Haswell Intel EPT, where there is @@ -564,7 +574,7 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, * more sophisticated heuristic later. 
*/ young = kvm_age_hva(kvm, start, end); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -578,9 +588,9 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); young = kvm_test_age_hva(kvm, address); - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -1524,7 +1534,7 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot); memset(dirty_bitmap_buffer, 0, n); - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (i = 0; i < n / sizeof(long); i++) { unsigned long mask; gfn_t offset; @@ -1540,7 +1550,7 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask); } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); } if (flush) @@ -1635,7 +1645,7 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n)) return -EFAULT; - spin_lock(&kvm->mmu_lock); + kvm_mmu_lock(kvm); for (offset = log->first_page, i = offset / BITS_PER_LONG, n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--; i++, offset += BITS_PER_LONG) { @@ -1658,7 +1668,7 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, offset, mask); } } - spin_unlock(&kvm->mmu_lock); + kvm_mmu_unlock(kvm); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); From patchwork Tue Jan 12 18:10:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 162AAC432C3 for ; Tue, 12 Jan 2021 18:13:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DCFD822DFA for ; Tue, 12 Jan 2021 18:13:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406431AbhALSNO (ORCPT ); Tue, 12 Jan 2021 13:13:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406317AbhALSMt (ORCPT ); Tue, 12 Jan 2021 13:12:49 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DCD8FC061388 for ; Tue, 12 Jan 2021 10:11:12 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id 15so1840062pfu.6 for ; Tue, 12 Jan 2021 10:11:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=bnKGrgv+MMsFcrufi3H/brnDaJ6XjCIFU4yRf/m3uBE=; b=WYSs8lBa9V+nf1FbyXI1n6MSt3USDkeANRkGXZzTaMpDOA1u92XqZ6pwc1BVregNEg 
Date: Tue, 12 Jan 2021 10:10:32 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-16-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 15/24] kvm: mmu: Wrap mmu_lock cond_resched and needbreak From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Wrap the MMU lock cond_resched and needbreak operations in functions. This will support a refactoring to move the lock into the struct kvm_arch(s) so that x86 can change the spinlock to an rwlock without affecting the performance of other archs. No functional change intended.
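As a rough illustration of the intended usage, here is a minimal sketch of a long-running loop yielding mmu_lock through these wrappers. It is not code from this series: example_zap_gfn_range() and its elided loop body are hypothetical, while kvm_mmu_lock(), kvm_mmu_unlock(), kvm_mmu_lock_needbreak() and kvm_mmu_lock_cond_resched() are the helpers the series introduces.

/*
 * Illustrative sketch only, not part of this series: a long-running
 * walk that periodically yields mmu_lock via the wrapper functions.
 */
static void example_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end)
{
	gfn_t gfn;

	kvm_mmu_lock(kvm);
	for (gfn = start; gfn < end; gfn++) {
		/* ... per-gfn teardown work would go here ... */

		/*
		 * Yield mmu_lock when a reschedule is due or another CPU
		 * is contending for the lock, mirroring the pattern in
		 * mmu.c and tdp_mmu.c below.
		 */
		if (need_resched() || kvm_mmu_lock_needbreak(kvm))
			kvm_mmu_lock_cond_resched(kvm);
	}
	kvm_mmu_unlock(kvm);
}

With the current spinlock this is equivalent to the existing cond_resched_lock() pattern; once mmu_lock becomes an rwlock on x86 later in the series, only the wrapper bodies need to change, not the callers.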
Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/arm64/kvm/mmu.c | 2 +- arch/x86/kvm/mmu/mmu.c | 16 ++++++++-------- arch/x86/kvm/mmu/tdp_mmu.c | 8 ++++---- include/linux/kvm_host.h | 2 ++ virt/kvm/kvm_main.c | 10 ++++++++++ 5 files changed, 25 insertions(+), 13 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 402b1642c944..57ef1ec23b56 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -58,7 +58,7 @@ static int stage2_apply_range(struct kvm *kvm, phys_addr_t addr, break; if (resched && next != end) - cond_resched_lock(&kvm->mmu_lock); + kvm_mmu_lock_cond_resched(kvm); } while (addr = next, addr != end); return ret; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 5a4577830606..659ed0a2875f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2016,9 +2016,9 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu, flush |= kvm_sync_page(vcpu, sp, &invalid_list); mmu_pages_clear_parents(&parents); } - if (need_resched() || spin_needbreak(&vcpu->kvm->mmu_lock)) { + if (need_resched() || kvm_mmu_lock_needbreak(vcpu->kvm)) { kvm_mmu_flush_or_zap(vcpu, &invalid_list, false, flush); - cond_resched_lock(&vcpu->kvm->mmu_lock); + kvm_mmu_lock_cond_resched(vcpu->kvm); flush = false; } } @@ -5233,14 +5233,14 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, if (iterator.rmap) flush |= fn(kvm, iterator.rmap); - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || kvm_mmu_lock_needbreak(kvm)) { if (flush && lock_flush_tlb) { kvm_flush_remote_tlbs_with_address(kvm, start_gfn, iterator.gfn - start_gfn + 1); flush = false; } - cond_resched_lock(&kvm->mmu_lock); + kvm_mmu_lock_cond_resched(kvm); } } @@ -5390,7 +5390,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) * be in active use by the guest. 
*/ if (batch >= BATCH_ZAP_PAGES && - cond_resched_lock(&kvm->mmu_lock)) { + kvm_mmu_lock_cond_resched(kvm)) { batch = 0; goto restart; } @@ -5688,7 +5688,7 @@ void kvm_mmu_zap_all(struct kvm *kvm) continue; if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign)) goto restart; - if (cond_resched_lock(&kvm->mmu_lock)) + if (kvm_mmu_lock_cond_resched(kvm)) goto restart; } @@ -6013,9 +6013,9 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) WARN_ON_ONCE(sp->lpage_disallowed); } - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || kvm_mmu_lock_needbreak(kvm)) { kvm_mmu_commit_zap_page(kvm, &invalid_list); - cond_resched_lock(&kvm->mmu_lock); + kvm_mmu_lock_cond_resched(kvm); } } kvm_mmu_commit_zap_page(kvm, &invalid_list); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 90807f2d928f..fb911ca428b2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -488,10 +488,10 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || kvm_mmu_lock_needbreak(kvm)) { kvm_flush_remote_tlbs(kvm); rcu_read_unlock(); - cond_resched_lock(&kvm->mmu_lock); + kvm_mmu_lock_cond_resched(kvm); rcu_read_lock(); tdp_iter_refresh_walk(iter); return true; @@ -512,9 +512,9 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, */ static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || kvm_mmu_lock_needbreak(kvm)) { rcu_read_unlock(); - cond_resched_lock(&kvm->mmu_lock); + kvm_mmu_lock_cond_resched(kvm); rcu_read_lock(); tdp_iter_refresh_walk(iter); return true; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 433d14fdae30..6e2773fc406c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1497,5 +1497,7 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) void kvm_mmu_lock(struct kvm *kvm); void kvm_mmu_unlock(struct kvm *kvm); +int kvm_mmu_lock_needbreak(struct kvm *kvm); +int kvm_mmu_lock_cond_resched(struct kvm *kvm); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 32f97ed1188d..b4c49a7e0556 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -442,6 +442,16 @@ void kvm_mmu_unlock(struct kvm *kvm) spin_unlock(&kvm->mmu_lock); } +int kvm_mmu_lock_needbreak(struct kvm *kvm) +{ + return spin_needbreak(&kvm->mmu_lock); +} + +int kvm_mmu_lock_cond_resched(struct kvm *kvm) +{ + return cond_resched_lock(&kvm->mmu_lock); +} + #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn) { From patchwork Tue Jan 12 18:10:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id BBF2BC433E0 for ; Tue, 12 Jan 2021 18:14:47 +0000 (UTC) Date: Tue, 12 Jan 2021 10:10:33 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-17-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 16/24] kvm: mmu: Wrap mmu_lock assertions From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Wrap assertions and warnings checking the MMU lock state in a function which uses lockdep_assert_held. While the existing checks use a few different functions to check the lock state, they are all better off using lockdep_assert_held.
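For illustration, a minimal sketch of the resulting pattern; example_update_spte() is a hypothetical caller, while kvm_mmu_lock_assert_held() is the wrapper added by this patch.

/*
 * Sketch only, not code from this series: a function that must run
 * under mmu_lock documents and enforces that requirement with the
 * lockdep-based wrapper instead of an open-coded spin_is_locked() or
 * assert_spin_locked() check.
 */
static void example_update_spte(struct kvm *kvm, u64 *sptep, u64 new_spte)
{
	/* Warns (when lockdep is enabled) if the caller does not hold mmu_lock. */
	kvm_mmu_lock_assert_held(kvm);

	WRITE_ONCE(*sptep, new_spte);
}

Compared with the VM_WARN(!spin_is_locked(...)) checks this replaces, the lockdep annotation also verifies that the current task is the one holding the lock, not merely that someone holds it.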
This will support a refactoring to move the mmu_lock to struct kvm_arch so that it can be replaced with an rwlock for x86. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/arm64/kvm/mmu.c | 2 +- arch/powerpc/include/asm/kvm_book3s_64.h | 7 +++---- arch/powerpc/kvm/book3s_hv_nested.c | 3 +-- arch/x86/kvm/mmu/mmu_internal.h | 4 ++-- arch/x86/kvm/mmu/tdp_mmu.c | 8 ++++---- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 5 +++++ 7 files changed, 17 insertions(+), 13 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 57ef1ec23b56..8b54eb58bf47 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -130,7 +130,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 struct kvm *kvm = mmu->kvm; phys_addr_t end = start + size; - assert_spin_locked(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); WARN_ON(size & ~PAGE_MASK); WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap, may_block)); diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h index 9bb9bb370b53..db2e437cd97c 100644 --- a/arch/powerpc/include/asm/kvm_book3s_64.h +++ b/arch/powerpc/include/asm/kvm_book3s_64.h @@ -650,8 +650,8 @@ static inline pte_t *find_kvm_secondary_pte(struct kvm *kvm, unsigned long ea, { pte_t *pte; - VM_WARN(!spin_is_locked(&kvm->mmu_lock), - "%s called with kvm mmu_lock not held \n", __func__); + kvm_mmu_lock_assert_held(kvm); + pte = __find_linux_pte(kvm->arch.pgtable, ea, NULL, hshift); return pte; @@ -662,8 +662,7 @@ static inline pte_t *find_kvm_host_pte(struct kvm *kvm, unsigned long mmu_seq, { pte_t *pte; - VM_WARN(!spin_is_locked(&kvm->mmu_lock), - "%s called with kvm mmu_lock not held \n", __func__); + kvm_mmu_lock_assert_held(kvm); if (mmu_notifier_retry(kvm, mmu_seq)) return NULL; diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c index 18890dca9476..6d5987d1eee7 100644 --- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -767,8 +767,7 @@ pte_t *find_kvm_nested_guest_pte(struct kvm *kvm, unsigned long lpid, if (!gp) return NULL; - VM_WARN(!spin_is_locked(&kvm->mmu_lock), - "%s called with kvm mmu_lock not held \n", __func__); + kvm_mmu_lock_assert_held(kvm); pte = __find_linux_pte(gp->shadow_pgtable, ea, NULL, hshift); return pte; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 7f599cc64178..cc8268cf28d2 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -101,14 +101,14 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm, static inline void kvm_mmu_get_root(struct kvm *kvm, struct kvm_mmu_page *sp) { BUG_ON(!sp->root_count); - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); ++sp->root_count; } static inline bool kvm_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *sp) { - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); --sp->root_count; return !sp->root_count; diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index fb911ca428b2..1d7c01300495 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -117,7 +117,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) { gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT); - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); WARN_ON(root->root_count); WARN_ON(!root->tdp_mmu_page); @@ -425,7 +425,7 @@ static inline void __tdp_mmu_set_spte(struct kvm 
*kvm, struct tdp_iter *iter, struct kvm_mmu_page *root = sptep_to_sp(root_pt); int as_id = kvm_mmu_page_as_id(root); - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); WRITE_ONCE(*iter->sptep, new_spte); @@ -1139,7 +1139,7 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root; int root_as_id; - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) @@ -1324,7 +1324,7 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, int root_as_id; bool spte_set = false; - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held(kvm); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 6e2773fc406c..022e3522788f 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1499,5 +1499,6 @@ void kvm_mmu_lock(struct kvm *kvm); void kvm_mmu_unlock(struct kvm *kvm); int kvm_mmu_lock_needbreak(struct kvm *kvm); int kvm_mmu_lock_cond_resched(struct kvm *kvm); +void kvm_mmu_lock_assert_held(struct kvm *kvm); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b4c49a7e0556..c504f876176b 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -452,6 +452,11 @@ int kvm_mmu_lock_cond_resched(struct kvm *kvm) return cond_resched_lock(&kvm->mmu_lock); } +void kvm_mmu_lock_assert_held(struct kvm *kvm) +{ + lockdep_assert_held(&kvm->mmu_lock); +} + #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn) { From patchwork Tue Jan 12 18:10:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98196C43331 for ; Tue, 12 Jan 2021 18:12:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 70BD62311F for ; Tue, 12 Jan 2021 18:12:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406177AbhALSM3 (ORCPT ); Tue, 12 Jan 2021 13:12:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56826 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406150AbhALSMZ (ORCPT ); Tue, 12 Jan 2021 13:12:25 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8647C06138C for ; Tue, 12 Jan 2021 10:11:16 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id g9so2265133ybe.7 for ; Tue, 12 Jan 2021 10:11:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; 
bh=piJt2JS5pTSM24ESUHevNSUit88Vm39hvm09WUBv428=; b=BAu9qJ6KcGwjE9OCzK9jx2XZ8YHUslIxtm1xHhRaYEQGv3wWeUkUxkJYwV0pKmembi XauVCsY6SBfEtjA1vMrB0IMjb/w2G77s1SmfBB8UBrDul993PffvF3rumJoq3jhbBN3a fZ+IxbCv15YLc9vlG7qdY2Zrf6KtD7PsuBcpTvMIPL465kF4wMg5OkxGxFsjGGiQzHnQ TaQ0qg5I3yDB73VFG0zcm49zCDCyVfm854YJbGSpmTmZG3iEpwG/d/VGgUH9cKzHEJwF phoj45QAdBdK3TbDkyOv3QliSL6rAFxFqj1Dpr/eZmxgnrK4P9ZWSUmz7/M4YMKetOBt ZHSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=piJt2JS5pTSM24ESUHevNSUit88Vm39hvm09WUBv428=; b=GMgp6Dd7OTasc0QFOydPOvpby9k0A4jVaIUQrC++fjsRpzLMOAyYH1bpYWNTfgw3r1 Wm+Re2AtiJa2RP/Op3ZaNFzugDepvftSSG4CJ7bplA2Xjq5jcMJrF7jXWEPwAU4+7T9T YHMfT07jwq87VWNKDSYK5+vQ1Fei7x/a3L7oSXkPqn+0bjkymPp4Dmd53c6yPshYuv3G pS/t9PakU/5y9P5N/auF5u5UJWZLZ07Zs/YWRJGMsh1wmpCR90PZelwlvDmEo8tMReL4 WwBWDacMjpneeTRj/yv0FnirNzSlDaagCsdiSofgd9zkintcmwrxqIQoJLENlW2dcZZz 3L3A== X-Gm-Message-State: AOAM531UV+KWlyeCW7JeB6nMpWhVaFO/3lHJmQjQNQrRQkhRGxvRa2q5 lnLBWGPp8a178D684lp0oC87Ff48/TbZ X-Google-Smtp-Source: ABdhPJyLY977WjUPnJun9HkYOepTWmhCl5F5E+iAFfU57qbvcTW9YEOoOEszp67X+5GBIkF0P1v2ICMSlIch Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a5b:810:: with SMTP id x16mr1027014ybp.86.1610475075974; Tue, 12 Jan 2021 10:11:15 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:34 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-18-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 17/24] kvm: mmu: Move mmu_lock to struct kvm_arch From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the mmu_lock to struct kvm_arch so that it can be replaced with a rwlock on x86 without affecting the performance of other archs. No functional change intended. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- Documentation/virt/kvm/locking.rst | 2 +- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kvm/arm.c | 2 ++ arch/mips/include/asm/kvm_host.h | 2 ++ arch/mips/kvm/mips.c | 2 ++ arch/mips/kvm/mmu.c | 6 +++--- arch/powerpc/include/asm/kvm_host.h | 2 ++ arch/powerpc/kvm/book3s_64_mmu_radix.c | 10 +++++----- arch/powerpc/kvm/book3s_64_vio_hv.c | 4 ++-- arch/powerpc/kvm/book3s_hv_nested.c | 4 ++-- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 14 +++++++------- arch/powerpc/kvm/e500_mmu_host.c | 2 +- arch/powerpc/kvm/powerpc.c | 2 ++ arch/s390/include/asm/kvm_host.h | 2 ++ arch/s390/kvm/kvm-s390.c | 2 ++ arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/mmu/mmu.c | 2 +- arch/x86/kvm/x86.c | 2 ++ include/linux/kvm_host.h | 1 - virt/kvm/kvm_main.c | 11 +++++------ 20 files changed, 47 insertions(+), 29 deletions(-) diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index b21a34c34a21..06c006c73c4b 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -212,7 +212,7 @@ which time it will be set using the Dirty tracking mechanism described above. 
- tsc offset in vmcb :Comment: 'raw' because updating the tsc offsets must not be preempted. -:Name: kvm->mmu_lock +:Name: kvm_arch::mmu_lock :Type: spinlock_t :Arch: any :Protects: -shadow page/shadow tlb entry diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 8fcfab0c2567..6fd4d64eb202 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -102,6 +102,8 @@ struct kvm_arch_memory_slot { }; struct kvm_arch { + spinlock_t mmu_lock; + struct kvm_s2_mmu mmu; /* VTCR_EL2 value for this VM */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 04c44853b103..90f4fcd84bb5 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -130,6 +130,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int ret; + spin_lock_init(&kvm->arch.mmu_lock); + ret = kvm_arm_setup_stage2(kvm, type); if (ret) return ret; diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 24f3d0f9996b..eb3caeffaf91 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -216,6 +216,8 @@ struct loongson_kvm_ipi { #endif struct kvm_arch { + spinlock_t mmu_lock; + /* Guest physical mm */ struct mm_struct gpa_mm; /* Mask of CPUs needing GPA ASID flush */ diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 4e393d93c1aa..7b8d65d8c863 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) return -EINVAL; }; + spin_lock_init(&kvm->arch.mmu_lock); + /* Allocate page table to map GPA -> RPA */ kvm->arch.gpa_mm.pgd = kvm_pgd_alloc(); if (!kvm->arch.gpa_mm.pgd) diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c index 449663152b3c..68fcda1e48f9 100644 --- a/arch/mips/kvm/mmu.c +++ b/arch/mips/kvm/mmu.c @@ -263,7 +263,7 @@ static bool kvm_mips_flush_gpa_pgd(pgd_t *pgd, unsigned long start_gpa, * * Flushes a range of GPA mappings from the GPA page tables. * - * The caller must hold the @kvm->mmu_lock spinlock. + * The caller must hold the @kvm->arch.mmu_lock spinlock. * * Returns: Whether its safe to remove the top level page directory because * all lower levels have been removed. @@ -388,7 +388,7 @@ BUILD_PTE_RANGE_OP(mkclean, pte_mkclean) * Make a range of GPA mappings clean so that guest writes will fault and * trigger dirty page logging. * - * The caller must hold the @kvm->mmu_lock spinlock. + * The caller must hold the @kvm->arch.mmu_lock spinlock. * * Returns: Whether any GPA mappings were modified, which would require * derived mappings (GVA page tables & TLB enties) to be @@ -410,7 +410,7 @@ int kvm_mips_mkclean_gpa_pt(struct kvm *kvm, gfn_t start_gfn, gfn_t end_gfn) * slot to be write protected * * Walks bits set in mask write protects the associated pte's. Caller must - * acquire @kvm->mmu_lock. + * acquire @kvm->arch.mmu_lock. 
*/ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index d67a470e95a3..7bb8e5847fb4 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -282,6 +282,8 @@ struct kvm_resize_hpt; #define KVMPPC_SECURE_INIT_ABORT 0x4 /* H_SVM_INIT_ABORT issued */ struct kvm_arch { + spinlock_t mmu_lock; + unsigned int lpid; unsigned int smt_mode; /* # vcpus per virtual core */ unsigned int emul_smt_mode; /* emualted SMT mode, on P9 */ diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index b628980c871b..522d19723512 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -388,7 +388,7 @@ static void kvmppc_pmd_free(pmd_t *pmdp) kmem_cache_free(kvm_pmd_cache, pmdp); } -/* Called with kvm->mmu_lock held */ +/* Called with kvm->arch.mmu_lock held */ void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte, unsigned long gpa, unsigned int shift, const struct kvm_memory_slot *memslot, @@ -992,7 +992,7 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu, return ret; } -/* Called with kvm->mmu_lock held */ +/* Called with kvm->arch.mmu_lock held */ int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long gfn) { @@ -1012,7 +1012,7 @@ int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, return 0; } -/* Called with kvm->mmu_lock held */ +/* Called with kvm->arch.mmu_lock held */ int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long gfn) { @@ -1040,7 +1040,7 @@ int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, return ref; } -/* Called with kvm->mmu_lock held */ +/* Called with kvm->arch.mmu_lock held */ int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long gfn) { @@ -1073,7 +1073,7 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, return ret; /* - * For performance reasons we don't hold kvm->mmu_lock while walking the + * For performance reasons we don't hold kvm->arch.mmu_lock while walking the * partition scoped table. */ ptep = find_kvm_secondary_pte_unlocked(kvm, gpa, &shift); diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c index 083a4e037718..adffa111ebe9 100644 --- a/arch/powerpc/kvm/book3s_64_vio_hv.c +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c @@ -545,7 +545,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu, if (kvmppc_rm_tce_to_ua(vcpu->kvm, tce_list, &ua)) return H_TOO_HARD; - arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_lock(&kvm->arch.mmu_lock.rlock.raw_lock); if (kvmppc_rm_ua_to_hpa(vcpu, mmu_seq, ua, &tces)) { ret = H_TOO_HARD; goto unlock_exit; @@ -590,7 +590,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu, unlock_exit: if (!prereg) - arch_spin_unlock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_unlock(&kvm->arch.mmu_lock.rlock.raw_lock); return ret; } diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c index 6d5987d1eee7..fe0a4e3fef1b 100644 --- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -611,7 +611,7 @@ static void kvmhv_release_nested(struct kvm_nested_guest *gp) /* * No vcpu is using this struct and no call to * kvmhv_get_nested can find this struct, - * so we don't need to hold kvm->mmu_lock. + * so we don't need to hold kvm->arch.mmu_lock. 
*/ kvmppc_free_pgtable_radix(kvm, gp->shadow_pgtable, gp->shadow_lpid); @@ -892,7 +892,7 @@ static void kvmhv_remove_nest_rmap_list(struct kvm *kvm, unsigned long *rmapp, } } -/* called with kvm->mmu_lock held */ +/* called with kvm->arch.mmu_lock held */ void kvmhv_remove_nest_rmap_range(struct kvm *kvm, const struct kvm_memory_slot *memslot, unsigned long gpa, unsigned long hpa, diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 88da2764c1bb..897baf210a2d 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -249,7 +249,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, /* Translate to host virtual address */ hva = __gfn_to_hva_memslot(memslot, gfn); - arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_lock(&kvm->arch.mmu_lock.rlock.raw_lock); ptep = find_kvm_host_pte(kvm, mmu_seq, hva, &hpage_shift); if (ptep) { pte_t pte; @@ -264,7 +264,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, * to <= host page size, if host is using hugepage */ if (host_pte_size < psize) { - arch_spin_unlock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_unlock(&kvm->arch.mmu_lock.rlock.raw_lock); return H_PARAMETER; } pte = kvmppc_read_update_linux_pte(ptep, writing); @@ -278,7 +278,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pa |= gpa & ~PAGE_MASK; } } - arch_spin_unlock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_unlock(&kvm->arch.mmu_lock.rlock.raw_lock); ptel &= HPTE_R_KEY | HPTE_R_PP0 | (psize-1); ptel |= pa; @@ -933,7 +933,7 @@ static long kvmppc_do_h_page_init_zero(struct kvm_vcpu *vcpu, mmu_seq = kvm->mmu_notifier_seq; smp_rmb(); - arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_lock(&kvm->arch.mmu_lock.rlock.raw_lock); ret = kvmppc_get_hpa(vcpu, mmu_seq, dest, 1, &pa, &memslot); if (ret != H_SUCCESS) @@ -945,7 +945,7 @@ static long kvmppc_do_h_page_init_zero(struct kvm_vcpu *vcpu, kvmppc_update_dirty_map(memslot, dest >> PAGE_SHIFT, PAGE_SIZE); out_unlock: - arch_spin_unlock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_unlock(&kvm->arch.mmu_lock.rlock.raw_lock); return ret; } @@ -961,7 +961,7 @@ static long kvmppc_do_h_page_init_copy(struct kvm_vcpu *vcpu, mmu_seq = kvm->mmu_notifier_seq; smp_rmb(); - arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_lock(&kvm->arch.mmu_lock.rlock.raw_lock); ret = kvmppc_get_hpa(vcpu, mmu_seq, dest, 1, &dest_pa, &dest_memslot); if (ret != H_SUCCESS) goto out_unlock; @@ -976,7 +976,7 @@ static long kvmppc_do_h_page_init_copy(struct kvm_vcpu *vcpu, kvmppc_update_dirty_map(dest_memslot, dest >> PAGE_SHIFT, PAGE_SIZE); out_unlock: - arch_spin_unlock(&kvm->mmu_lock.rlock.raw_lock); + arch_spin_unlock(&kvm->arch.mmu_lock.rlock.raw_lock); return ret; } diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 633ae418ba0e..fef60e614aaf 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -470,7 +470,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, /* * We are just looking at the wimg bits, so we don't * care much about the trans splitting bit. - * We are holding kvm->mmu_lock so a notifier invalidate + * We are holding kvm->arch.mmu_lock so a notifier invalidate * can't run hence pfn won't change. 
*/ local_irq_save(flags); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index cf52d26f49cd..11e35ba0272e 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -452,6 +452,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) } else goto err_out; + spin_lock_init(&kvm->arch.mmu_lock); + if (kvm_ops->owner && !try_module_get(kvm_ops->owner)) return -ENOENT; diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index 74f9a036bab2..1299deef70b5 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -926,6 +926,8 @@ struct kvm_s390_pv { }; struct kvm_arch{ + spinlock_t mmu_lock; + void *sca; int use_esca; rwlock_t sca_lock; diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index dbafd057ca6a..20c6ae7bc25b 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2642,6 +2642,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) goto out_err; #endif + spin_lock_init(&kvm->arch.mmu_lock); + rc = s390_enable_sie(); if (rc) goto out_err; diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3d6616f6f6ef..3087de84fad3 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -902,6 +902,8 @@ enum kvm_irqchip_mode { #define APICV_INHIBIT_REASON_X2APIC 5 struct kvm_arch { + spinlock_t mmu_lock; + unsigned long n_used_mmu_pages; unsigned long n_requested_mmu_pages; unsigned long n_max_mmu_pages; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 659ed0a2875f..ba296ad051c3 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5747,7 +5747,7 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) if (!nr_to_scan--) break; /* - * n_used_mmu_pages is accessed without holding kvm->mmu_lock + * n_used_mmu_pages is accessed without holding kvm->arch.mmu_lock * here. We may skip a VM instance errorneosly, but we do not * want to shrink a VM that only started to populate its MMU * anyway. 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 302042af87ee..a6cc34e8ccad 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10366,6 +10366,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) if (type) return -EINVAL; + spin_lock_init(&kvm->arch.mmu_lock); + INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 022e3522788f..97e301b8cafd 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -451,7 +451,6 @@ struct kvm_memslots { }; struct kvm { - spinlock_t mmu_lock; struct mutex slots_lock; struct mm_struct *mm; /* userspace tied to this vm */ struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM]; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index c504f876176b..d168bd4517d4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -434,27 +434,27 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_destroy); void kvm_mmu_lock(struct kvm *kvm) { - spin_lock(&kvm->mmu_lock); + spin_lock(&kvm->arch.mmu_lock); } void kvm_mmu_unlock(struct kvm *kvm) { - spin_unlock(&kvm->mmu_lock); + spin_unlock(&kvm->arch.mmu_lock); } int kvm_mmu_lock_needbreak(struct kvm *kvm) { - return spin_needbreak(&kvm->mmu_lock); + return spin_needbreak(&kvm->arch.mmu_lock); } int kvm_mmu_lock_cond_resched(struct kvm *kvm) { - return cond_resched_lock(&kvm->mmu_lock); + return cond_resched_lock(&kvm->arch.mmu_lock); } void kvm_mmu_lock_assert_held(struct kvm *kvm) { - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held(&kvm->arch.mmu_lock); } #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) @@ -770,7 +770,6 @@ static struct kvm *kvm_create_vm(unsigned long type) if (!kvm) return ERR_PTR(-ENOMEM); - spin_lock_init(&kvm->mmu_lock); mmgrab(current->mm); kvm->mm = current->mm; kvm_eventfd_init(kvm); From patchwork Tue Jan 12 18:10:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F0B7C43332 for ; Tue, 12 Jan 2021 18:14:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6D8ED2311D for ; Tue, 12 Jan 2021 18:14:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406197AbhALSMe (ORCPT ); Tue, 12 Jan 2021 13:12:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406172AbhALSM1 (ORCPT ); Tue, 12 Jan 2021 13:12:27 -0500 Received: from mail-qv1-xf49.google.com (mail-qv1-xf49.google.com [IPv6:2607:f8b0:4864:20::f49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8FB95C06138E for ; Tue, 12 Jan 2021 10:11:18 -0800 (PST) Received: by mail-qv1-xf49.google.com with SMTP id v1so2075659qvb.2 for ; Tue, 12 Jan 2021 10:11:18 
-0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=SGgrDMzxLeqfoVOYem4hd98mp4oJIORSvgUJGRQH1ZE=; b=f9+TNTq/KFjwV9/e7e8MypA8x6fCvXJxiQQW61/gG/fSzMORRDQrhxkbG+wQt9dAu2 cj8zpvqAlonJwLe6325XkP7u7NiJvsNviz/apA2S6o+5NZvomA714EWu2Z6bEHzWA2V7 vX/ZKT8culqfmftKZrymcsM0EphooQ+Lt2hsDRVyXDCCcbkubWneI//R6ew0qHyrzUbt B2CDL/G6ULMWI7yIiun7KhPivJFzVTmSg2C08cKN40jT8BiC2z3XGm/B10OV8rk0y4xD 10M36WG5JOD+Yzv7iBOZgAa8J6UzOrFyqYkuTQzTzH5937p7tEliVwrs8GsnnrFzLdBH Fldg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=SGgrDMzxLeqfoVOYem4hd98mp4oJIORSvgUJGRQH1ZE=; b=uGozSG5TSh/ZLwzTVJNevaXCZP5gki1gOyO1eq78IPl/8+F5Br2uWkJmA3YdASzJxW 8blqpiCtRPPs9fiEjE8aJGMHTQTa51Iq2cD0Gg9CNQxd7tiCMcLjSDdW6Ss1XiFLWJhO MgTFNXYDV9f16M6hHUYGme4aqc1duttrVZsRI7Bqx34yYEgVEjnKEHhCAKyBfm3KLiMj zIP7HfJlM8GYgjqa9HhL6lgb5U3kh16ts2yuoPnRvfSBe2L0GcsT+R1kJJv2IVMLt4YO M2STF+aJyrc5anEFOvv4+T2UV/Vli87MM9YVPCsiijjqOGv8AQ76GtAePb2I2DPoPX6n CisA== X-Gm-Message-State: AOAM532XvOKsRuF4/nKVooReoAiWX/fdc8X4Nc0Cdct6IrZzvKQ/oCRy XPMGUMykGdbSeKBZVjue4WhHDJYBIM4H X-Google-Smtp-Source: ABdhPJxTmjafxpsc+KviENVMk1UIaGSaTliWop17fvKgk5AxX9foEZKaMyDrKpGvsf/ASTe+SFh+S2qzxOJs Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a0c:c688:: with SMTP id d8mr261688qvj.8.1610475077748; Tue, 12 Jan 2021 10:11:17 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:35 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-19-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 18/24] kvm: x86/mmu: Use an rwlock for the x86 TDP MMU From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a read / write lock to be used in place of the MMU spinlock when the TDP MMU is enabled. The rwlock will enable the TDP MMU to handle page faults in parallel in a future commit. In cases where the TDP MMU is not in use, no operation would be acquiring the lock in read mode, so a regular spin lock is still used as locking and unlocking a spin lock is slightly faster. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/include/asm/kvm_host.h | 8 ++- arch/x86/kvm/mmu/mmu.c | 89 +++++++++++++++++++++++++++++++++ arch/x86/kvm/mmu/mmu_internal.h | 9 ++++ arch/x86/kvm/mmu/tdp_mmu.c | 10 ++-- arch/x86/kvm/x86.c | 2 - virt/kvm/kvm_main.c | 10 ++-- 6 files changed, 115 insertions(+), 13 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3087de84fad3..92d5340842c8 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -902,7 +902,13 @@ enum kvm_irqchip_mode { #define APICV_INHIBIT_REASON_X2APIC 5 struct kvm_arch { - spinlock_t mmu_lock; + union { + /* Used if the TDP MMU is enabled. */ + rwlock_t mmu_rwlock; + + /* Used if the TDP MMU is not enabled. 
*/ + spinlock_t mmu_lock; + }; unsigned long n_used_mmu_pages; unsigned long n_requested_mmu_pages; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index ba296ad051c3..280d7cd6f94b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5471,6 +5471,11 @@ void kvm_mmu_init_vm(struct kvm *kvm) kvm_mmu_init_tdp_mmu(kvm); + if (kvm->arch.tdp_mmu_enabled) + rwlock_init(&kvm->arch.mmu_rwlock); + else + spin_lock_init(&kvm->arch.mmu_lock); + node->track_write = kvm_mmu_pte_write; node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); @@ -6074,3 +6079,87 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm) if (kvm->arch.nx_lpage_recovery_thread) kthread_stop(kvm->arch.nx_lpage_recovery_thread); } + +void kvm_mmu_lock_shared(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + read_lock(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_unlock_shared(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + read_unlock(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_lock_exclusive(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + write_lock(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_unlock_exclusive(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + write_unlock(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_lock(struct kvm *kvm) +{ + if (kvm->arch.tdp_mmu_enabled) + kvm_mmu_lock_exclusive(kvm); + else + spin_lock(&kvm->arch.mmu_lock); +} +EXPORT_SYMBOL_GPL(kvm_mmu_lock); + +void kvm_mmu_unlock(struct kvm *kvm) +{ + if (kvm->arch.tdp_mmu_enabled) + kvm_mmu_unlock_exclusive(kvm); + else + spin_unlock(&kvm->arch.mmu_lock); +} +EXPORT_SYMBOL_GPL(kvm_mmu_unlock); + +int kvm_mmu_lock_needbreak(struct kvm *kvm) +{ + if (kvm->arch.tdp_mmu_enabled) + return rwlock_needbreak(&kvm->arch.mmu_rwlock); + else + return spin_needbreak(&kvm->arch.mmu_lock); +} + +int kvm_mmu_lock_cond_resched_exclusive(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + return cond_resched_rwlock_write(&kvm->arch.mmu_rwlock); +} + +int kvm_mmu_lock_cond_resched(struct kvm *kvm) +{ + if (kvm->arch.tdp_mmu_enabled) + return kvm_mmu_lock_cond_resched_exclusive(kvm); + else + return cond_resched_lock(&kvm->arch.mmu_lock); +} + +void kvm_mmu_lock_assert_held_shared(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + lockdep_assert_held_read(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_lock_assert_held_exclusive(struct kvm *kvm) +{ + WARN_ON(!kvm->arch.tdp_mmu_enabled); + lockdep_assert_held_write(&kvm->arch.mmu_rwlock); +} + +void kvm_mmu_lock_assert_held(struct kvm *kvm) +{ + if (kvm->arch.tdp_mmu_enabled) + lockdep_assert_held(&kvm->arch.mmu_rwlock); + else + lockdep_assert_held(&kvm->arch.mmu_lock); +} diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index cc8268cf28d2..53a789b8a820 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -149,4 +149,13 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); +void kvm_mmu_lock_shared(struct kvm *kvm); +void kvm_mmu_unlock_shared(struct kvm *kvm); +void kvm_mmu_lock_exclusive(struct kvm *kvm); +void kvm_mmu_unlock_exclusive(struct kvm *kvm); +int kvm_mmu_lock_cond_resched_exclusive(struct kvm *kvm); +void kvm_mmu_lock_assert_held_shared(struct kvm *kvm); +void kvm_mmu_lock_assert_held_exclusive(struct kvm *kvm); +void kvm_mmu_lock_assert_held(struct kvm *kvm); + #endif /* 
__KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 1d7c01300495..8b61bdb391a0 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -59,7 +59,7 @@ static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root) static inline bool tdp_mmu_next_root_valid(struct kvm *kvm, struct kvm_mmu_page *root) { - lockdep_assert_held(&kvm->mmu_lock); + kvm_mmu_lock_assert_held_exclusive(kvm); if (list_entry_is_head(root, &kvm->arch.tdp_mmu_roots, link)) return false; @@ -117,7 +117,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) { gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT); - kvm_mmu_lock_assert_held(kvm); + kvm_mmu_lock_assert_held_exclusive(kvm); WARN_ON(root->root_count); WARN_ON(!root->tdp_mmu_page); @@ -425,7 +425,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, struct kvm_mmu_page *root = sptep_to_sp(root_pt); int as_id = kvm_mmu_page_as_id(root); - kvm_mmu_lock_assert_held(kvm); + kvm_mmu_lock_assert_held_exclusive(kvm); WRITE_ONCE(*iter->sptep, new_spte); @@ -1139,7 +1139,7 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root; int root_as_id; - kvm_mmu_lock_assert_held(kvm); + kvm_mmu_lock_assert_held_exclusive(kvm); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) @@ -1324,7 +1324,7 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, int root_as_id; bool spte_set = false; - kvm_mmu_lock_assert_held(kvm); + kvm_mmu_lock_assert_held_exclusive(kvm); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a6cc34e8ccad..302042af87ee 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10366,8 +10366,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) if (type) return -EINVAL; - spin_lock_init(&kvm->arch.mmu_lock); - INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d168bd4517d4..dcbdb3beb084 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -432,27 +432,27 @@ void kvm_vcpu_destroy(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_vcpu_destroy); -void kvm_mmu_lock(struct kvm *kvm) +__weak void kvm_mmu_lock(struct kvm *kvm) { spin_lock(&kvm->arch.mmu_lock); } -void kvm_mmu_unlock(struct kvm *kvm) +__weak void kvm_mmu_unlock(struct kvm *kvm) { spin_unlock(&kvm->arch.mmu_lock); } -int kvm_mmu_lock_needbreak(struct kvm *kvm) +__weak int kvm_mmu_lock_needbreak(struct kvm *kvm) { return spin_needbreak(&kvm->arch.mmu_lock); } -int kvm_mmu_lock_cond_resched(struct kvm *kvm) +__weak int kvm_mmu_lock_cond_resched(struct kvm *kvm) { return cond_resched_lock(&kvm->arch.mmu_lock); } -void kvm_mmu_lock_assert_held(struct kvm *kvm) +__weak void kvm_mmu_lock_assert_held(struct kvm *kvm) { lockdep_assert_held(&kvm->arch.mmu_lock); } From patchwork Tue Jan 12 18:10:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, 
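The locking wrappers introduced above reduce to a single decision: when the TDP MMU is enabled the MMU lock is an rwlock that can be taken shared, otherwise it remains an ordinary spinlock and only exclusive acquisition exists. A minimal user-space sketch of that dispatch, using POSIX locks as stand-ins (struct mmu_lock, mmu_lock(), mmu_lock_shared() and mmu_unlock() are invented names for illustration, not KVM code):

#include <pthread.h>
#include <stdbool.h>

/* Illustrative stand-in for the per-VM MMU lock state. */
struct mmu_lock {
        bool tdp_mmu_enabled;
        pthread_rwlock_t rwlock;        /* used when the TDP MMU is enabled */
        pthread_mutex_t spinlock;       /* fallback path, mirrors the spinlock */
};

static void mmu_lock_init(struct mmu_lock *l, bool tdp_mmu_enabled)
{
        l->tdp_mmu_enabled = tdp_mmu_enabled;
        if (tdp_mmu_enabled)
                pthread_rwlock_init(&l->rwlock, NULL);
        else
                pthread_mutex_init(&l->spinlock, NULL);
}

/* Exclusive acquisition: write-lock the rwlock or take the plain lock. */
static void mmu_lock(struct mmu_lock *l)
{
        if (l->tdp_mmu_enabled)
                pthread_rwlock_wrlock(&l->rwlock);
        else
                pthread_mutex_lock(&l->spinlock);
}

/* Shared acquisition: only meaningful when the rwlock is in use. */
static void mmu_lock_shared(struct mmu_lock *l)
{
        pthread_rwlock_rdlock(&l->rwlock);
}

static void mmu_unlock(struct mmu_lock *l)
{
        if (l->tdp_mmu_enabled)
                pthread_rwlock_unlock(&l->rwlock);
        else
                pthread_mutex_unlock(&l->spinlock);
}

Paths that can tolerate concurrent walkers take the shared variant; everything else keeps the exclusive path, which is what the kvm_mmu_lock_*() helpers above express.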
Date: Tue, 12 Jan 2021 10:10:36 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-20-bgardon@google.com> Subject: [PATCH 19/24] kvm: x86/mmu: Protect tdp_mmu_pages with a lock From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon List-ID: X-Mailing-List: kvm@vger.kernel.org Add a lock to protect the data structures that track the page table memory used by the TDP MMU.
In order to handle multiple TDP MMU operations in parallel, pages of PT memory must be added and removed without the exclusive protection of the MMU lock. A new lock to protect the list(s) of in-use pages will cause some serialization, but only on non-leaf page table entries, so the lock is not expected to be very contended. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/include/asm/kvm_host.h | 15 ++++++++ arch/x86/kvm/mmu/tdp_mmu.c | 67 +++++++++++++++++++++++++++++---- 2 files changed, 74 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 92d5340842c8..f8dccb27c722 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1034,6 +1034,21 @@ struct kvm_arch { * tdp_mmu_page set and a root_count of 0. */ struct list_head tdp_mmu_pages; + + /* + * Protects accesses to the following fields when the MMU lock is + * not held exclusively: + * - tdp_mmu_pages (above) + * - the link field of struct kvm_mmu_pages used by the TDP MMU + * when they are part of tdp_mmu_pages (but not when they are part + * of the tdp_mmu_free_list or tdp_mmu_disconnected_list) + * - lpage_disallowed_mmu_pages + * - the lpage_disallowed_link field of struct kvm_mmu_pages used + * by the TDP MMU + * May be acquired under the MMU lock in read mode or non-overlapping + * with the MMU lock. + */ + spinlock_t tdp_mmu_pages_lock; }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 8b61bdb391a0..264594947c3b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -33,6 +33,7 @@ void kvm_mmu_init_tdp_mmu(struct kvm *kvm) kvm->arch.tdp_mmu_enabled = true; INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots); + spin_lock_init(&kvm->arch.tdp_mmu_pages_lock); INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages); } @@ -262,6 +263,58 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, } } +/** + * tdp_mmu_link_page - Add a new page to the list of pages used by the TDP MMU + * + * @kvm: kvm instance + * @sp: the new page + * @atomic: This operation is not running under the exclusive use of the MMU + * lock and the operation must be atomic with respect to other threads + * that might be adding or removing pages. + * @account_nx: This page replaces a NX large page and should be marked for + * eventual reclaim. + */ +static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp, + bool atomic, bool account_nx) +{ + if (atomic) + spin_lock(&kvm->arch.tdp_mmu_pages_lock); + else + kvm_mmu_lock_assert_held_exclusive(kvm); + + list_add(&sp->link, &kvm->arch.tdp_mmu_pages); + if (account_nx) + account_huge_nx_page(kvm, sp); + + if (atomic) + spin_unlock(&kvm->arch.tdp_mmu_pages_lock); +} + +/** + * tdp_mmu_unlink_page - Remove page from the list of pages used by the TDP MMU + * + * @kvm: kvm instance + * @sp: the page to be removed + * @atomic: This operation is not running under the exclusive use of the MMU + * lock and the operation must be atomic with respect to other threads + * that might be adding or removing pages.
+ */ +static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp, + bool atomic) +{ + if (atomic) + spin_lock(&kvm->arch.tdp_mmu_pages_lock); + else + kvm_mmu_lock_assert_held_exclusive(kvm); + + list_del(&sp->link); + if (sp->lpage_disallowed) + unaccount_huge_nx_page(kvm, sp); + + if (atomic) + spin_unlock(&kvm->arch.tdp_mmu_pages_lock); +} + /** * handle_disconnected_tdp_mmu_page - handle a pt removed from the TDP structure * @@ -285,10 +338,7 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) trace_kvm_mmu_prepare_zap_page(sp); - list_del(&sp->link); - - if (sp->lpage_disallowed) - unaccount_huge_nx_page(kvm, sp); + tdp_mmu_unlink_page(kvm, sp, atomic); for (i = 0; i < PT64_ENT_PER_PAGE; i++) { old_child_spte = READ_ONCE(*(pt + i)); @@ -719,15 +769,16 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, if (!is_shadow_present_pte(iter.old_spte)) { sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level); - list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages); child_pt = sp->spt; + + tdp_mmu_link_page(vcpu->kvm, sp, false, + huge_page_disallowed && + req_level >= iter.level); + new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask); trace_kvm_mmu_get_page(sp, true); - if (huge_page_disallowed && req_level >= iter.level) - account_huge_nx_page(vcpu->kvm, sp); - tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte); } } From patchwork Tue Jan 12 18:10:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40311C4332B for ; Tue, 12 Jan 2021 18:14:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EFF362311F for ; Tue, 12 Jan 2021 18:14:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406602AbhALSOG (ORCPT ); Tue, 12 Jan 2021 13:14:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56814 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406231AbhALSMi (ORCPT ); Tue, 12 Jan 2021 13:12:38 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3C61C061344 for ; Tue, 12 Jan 2021 10:11:21 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 33so2158562pgv.0 for ; Tue, 12 Jan 2021 10:11:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=weYuAzHiKmuI5kx/0dPz0XMIF8zLAlYQHLKS7oXdpNI=; b=LyiTsCYxm1PQhsdCDlJG0ixdReyEi3wVVCEj5VPqU5LFhcbiaVSyogHEmZi+JAzqAV hqAI50dZ9VxGt8CbLZaz6CX9o9l60nqxHPp7bhDpjcd6qS3cvsz+IsiVvkYqGRXODRky YoSFIPgh0IIZPipsc/Ml5qe1FjsOV9espjt4ZPq2VuIMOz+HtCj9fj/cOZ+iWeOifC8m xk7Vq/U9kNG45GLKs6AebAu44lYkqpylwVfR/IXyC7pzimJcCwYqt6CKdwAn8owaOPqU 
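The rule encoded by tdp_mmu_link_page()/tdp_mmu_unlink_page() above is that the list of TDP MMU pages may be modified either with the MMU lock held exclusively, or with the MMU lock held shared plus the dedicated tdp_mmu_pages_lock. A rough user-space analogue with pthreads (the list, link_page() and the atomic flag are illustrative stand-ins under that assumption, not the KVM data structures):

#include <pthread.h>
#include <stdbool.h>

struct sp {
        struct sp *next;
};

/* Illustrative analogues of kvm->arch.tdp_mmu_pages and tdp_mmu_pages_lock. */
static struct sp *tdp_mmu_pages;
static pthread_mutex_t tdp_mmu_pages_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * atomic == true means the caller holds the outer MMU lock only shared, so
 * the inner lock must serialize list updates; atomic == false means the
 * caller already holds the outer lock exclusively and may touch the list
 * directly.
 */
static void link_page(struct sp *page, bool atomic)
{
        if (atomic)
                pthread_mutex_lock(&tdp_mmu_pages_lock);

        page->next = tdp_mmu_pages;
        tdp_mmu_pages = page;

        if (atomic)
                pthread_mutex_unlock(&tdp_mmu_pages_lock);
}

Unlinking mirrors the same pattern with a removal in place of the insertion, so only non-leaf bookkeeping is serialized on the inner lock.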
Date: Tue, 12 Jan 2021 10:10:37 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-21-bgardon@google.com> Subject: [PATCH 20/24] kvm: x86/mmu: Add atomic option for setting SPTEs From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon List-ID: X-Mailing-List: kvm@vger.kernel.org In order to allow multiple TDP MMU operations to proceed in parallel, there must be an option to modify SPTEs atomically so that changes are not lost. Add that option to __tdp_mmu_set_spte and __handle_changed_spte. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 67 ++++++++++++++++++++++++++++++++------ 1 file changed, 57 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 264594947c3b..1380ed313476 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -7,6 +7,7 @@ #include "tdp_mmu.h" #include "spte.h" +#include #include #ifdef CONFIG_X86_64 @@ -226,7 +227,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head) } static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level); + u64 old_spte, u64 new_spte, int level, + bool atomic); static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp) { @@ -320,15 +322,19 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp, * * @kvm: kvm instance * @pt: the page removed from the paging structure + * @atomic: Use atomic operations to clear the SPTEs in any disconnected + * pages of memory. * * Given a page table that has been removed from the TDP paging structure, * iterates through the page table to clear SPTEs and free child page tables.
*/ -static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) +static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt, + bool atomic) { struct kvm_mmu_page *sp; gfn_t gfn; int level; + u64 *sptep; u64 old_child_spte; int i; @@ -341,11 +347,17 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) tdp_mmu_unlink_page(kvm, sp, atomic); for (i = 0; i < PT64_ENT_PER_PAGE; i++) { - old_child_spte = READ_ONCE(*(pt + i)); - WRITE_ONCE(*(pt + i), 0); + sptep = pt + i; + + if (atomic) { + old_child_spte = xchg(sptep, 0); + } else { + old_child_spte = READ_ONCE(*sptep); + WRITE_ONCE(*sptep, 0); + } handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), - old_child_spte, 0, level - 1); + old_child_spte, 0, level - 1, atomic); } kvm_flush_remote_tlbs_with_address(kvm, gfn, @@ -362,12 +374,15 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt) * @old_spte: The value of the SPTE before the change * @new_spte: The value of the SPTE after the change * @level: the level of the PT the SPTE is part of in the paging structure + * @atomic: Use atomic operations to clear the SPTEs in any disconnected + * pages of memory. * * Handle bookkeeping that might result from the modification of a SPTE. * This function must be called for all TDP SPTE modifications. */ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level) + u64 old_spte, u64 new_spte, int level, + bool atomic) { bool was_present = is_shadow_present_pte(old_spte); bool is_present = is_shadow_present_pte(new_spte); @@ -439,18 +454,50 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, */ if (was_present && !was_leaf && (pfn_changed || !is_present)) handle_disconnected_tdp_mmu_page(kvm, - spte_to_child_pt(old_spte, level)); + spte_to_child_pt(old_spte, level), atomic); } static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level) + u64 old_spte, u64 new_spte, int level, + bool atomic) { - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level); + __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, + atomic); handle_changed_spte_acc_track(old_spte, new_spte, level); handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte, level); } +/* + * tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the + * associated bookkeeping + * + * @kvm: kvm instance + * @iter: a tdp_iter instance currently on the SPTE that should be set + * @new_spte: The value the SPTE should be set to + * Returns: true if the SPTE was set, false if it was not. If false is returned, + * this function will have no side-effects. 
+ */ +static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter, + u64 new_spte) +{ + u64 *root_pt = tdp_iter_root_pt(iter); + struct kvm_mmu_page *root = sptep_to_sp(root_pt); + int as_id = kvm_mmu_page_as_id(root); + + kvm_mmu_lock_assert_held_shared(kvm); + + if (cmpxchg64(iter->sptep, iter->old_spte, new_spte) != iter->old_spte) + return false; + + handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, + iter->level, true); + + return true; +} + + /* * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping * @kvm: kvm instance @@ -480,7 +527,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, WRITE_ONCE(*iter->sptep, new_spte); __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, - iter->level); + iter->level, false); if (record_acc_track) handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); From patchwork Tue Jan 12 18:10:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B535DC433E6 for ; Tue, 12 Jan 2021 18:14:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7EB4123121 for ; Tue, 12 Jan 2021 18:14:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406582AbhALSN5 (ORCPT ); Tue, 12 Jan 2021 13:13:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406278AbhALSMj (ORCPT ); Tue, 12 Jan 2021 13:12:39 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 751DAC061347 for ; Tue, 12 Jan 2021 10:11:23 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id m7so2181438pjr.0 for ; Tue, 12 Jan 2021 10:11:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=UBoo2fCmFJGzXhXWYE7DtuvRH/Onu5tHDfHxpfcIjEQ=; b=aUsijLju1lhuBX60hyH4ICGlB8ruZdoEKm3mE8dJPzxs0NPqgwo3V/gWkLpNCDeTtU fvTTF1/hp9D/VdmMJUjWYz2GdfyWmFeW1Gt1QM6ddzXRvwhWyh/fXBjUVrhjakFYhK82 XHAwC/6nUFmTVRYQYL4XBafuA1ChZGtHx7GKDBF24tLUhNEGd2G6ZL9p6PL5QnGnQCRx TXkwnAcgOcPcYetGo7huCPBPeMobo1WXZ0nRyOvgy6Eb/TLd5b2VEktBkSfgvwjccuNQ oKFTaNvpY54jeNbqraPzTdp3ZNrEMXQLma8VAYQOZn4wOSZC2R42C9GS8Q281YeC7C5v 8gnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=UBoo2fCmFJGzXhXWYE7DtuvRH/Onu5tHDfHxpfcIjEQ=; b=LfL3K2o1B802J7LpN3nUddgai+khbOZ/m/mXrd9tKXi4yIHN+G6qmTVrfqa5ITIdSQ 6P+vsVFzBdbnZLunnu9qll7mgE6jlOBMUIe1nlBDMAc2EHnIm5wfPUoCbAfREZ1Glbd4 
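tdp_mmu_set_spte_atomic() above is a plain compare-and-exchange update: the new value is installed only if the entry still holds the value the iterator last read, and the caller retries the walk otherwise. The same pattern in self-contained C11 atomics (set_spte_atomic() and update_spte() are invented names, with a 64-bit atomic standing in for a real SPTE):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Install new_spte only if *sptep still contains old_spte, i.e. no other
 * thread changed the entry between the read and this write. Returns false
 * on a lost race so the caller can re-read and retry.
 */
static bool set_spte_atomic(_Atomic uint64_t *sptep, uint64_t old_spte,
                            uint64_t new_spte)
{
        return atomic_compare_exchange_strong(sptep, &old_spte, new_spte);
}

/* Typical caller: re-read the current value and try again after a failure. */
static void update_spte(_Atomic uint64_t *sptep, uint64_t new_spte)
{
        uint64_t old_spte = atomic_load(sptep);

        while (!set_spte_atomic(sptep, old_spte, new_spte))
                old_spte = atomic_load(sptep);
}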
Date: Tue, 12 Jan 2021 10:10:38 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-22-bgardon@google.com> Subject: [PATCH 21/24] kvm: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon List-ID: X-Mailing-List: kvm@vger.kernel.org To prepare for handling page faults in parallel, change the TDP MMU page fault handler to use atomic operations to set SPTEs so that changes are not lost if multiple threads attempt to modify the same SPTE. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 1380ed313476..7b12a87a4124 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -714,21 +714,18 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, int ret = 0; int make_spte_ret = 0; - if (unlikely(is_noslot_pfn(pfn))) { + if (unlikely(is_noslot_pfn(pfn))) new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL); - trace_mark_mmio_spte(iter->sptep, iter->gfn, new_spte); - } else { + else make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn, pfn, iter->old_spte, prefault, true, map_writable, !shadow_accessed_mask, &new_spte); - trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep); - } if (new_spte == iter->old_spte) ret = RET_PF_SPURIOUS; - else - tdp_mmu_set_spte(vcpu->kvm, iter, new_spte); + else if (!tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte)) + return RET_PF_RETRY; /* * If the page fault was caused by a write but the page is write @@ -742,8 +739,11 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, } /* If a MMIO SPTE is installed, the MMIO will need to be emulated.
*/ - if (unlikely(is_mmio_spte(new_spte))) + if (unlikely(is_mmio_spte(new_spte))) { + trace_mark_mmio_spte(iter->sptep, iter->gfn, new_spte); ret = RET_PF_EMULATE; + } else + trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep); trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep); if (!prefault) @@ -801,7 +801,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, */ if (is_shadow_present_pte(iter.old_spte) && is_large_pte(iter.old_spte)) { - tdp_mmu_set_spte(vcpu->kvm, &iter, 0); + if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0)) + break; kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn, KVM_PAGES_PER_HPAGE(iter.level)); @@ -818,19 +819,24 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level); child_pt = sp->spt; - tdp_mmu_link_page(vcpu->kvm, sp, false, - huge_page_disallowed && - req_level >= iter.level); - new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask); - trace_kvm_mmu_get_page(sp, true); - tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte); + if (tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, + new_spte)) { + tdp_mmu_link_page(vcpu->kvm, sp, true, + huge_page_disallowed && + req_level >= iter.level); + + trace_kvm_mmu_get_page(sp, true); + } else { + tdp_mmu_free_sp(sp); + break; + } } } - if (WARN_ON(iter.level != level)) { + if (iter.level != level) { rcu_read_unlock(); return RET_PF_RETRY; } From patchwork Tue Jan 12 18:10:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91D70C433E0 for ; Tue, 12 Jan 2021 18:14:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 618F32311F for ; Tue, 12 Jan 2021 18:14:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728469AbhALSMp (ORCPT ); Tue, 12 Jan 2021 13:12:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56826 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406306AbhALSMl (ORCPT ); Tue, 12 Jan 2021 13:12:41 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8EB62C06134A for ; Tue, 12 Jan 2021 10:11:25 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id a206so3311391ybg.0 for ; Tue, 12 Jan 2021 10:11:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=l+I+bLoy5utbOpCikl2L8mHdXXvFsr/pgAIn4+WfC5A=; b=FyNjxbQGAv0YdqOhXA4BwEs4JKTkEI1PzhKb4+kP6NRE66OJcfyF82O1fFyGtyddtE d+q+x+Z/vncrFRscx7imQbNPcGuIK1UJpHX8hcR+m1UbroJaFgSKIcsrtt0sTGyNWakK qYNJM+PBLdNCcgfu0zjAGPIfEgvzPV8yp3szhLGiM+VQlaArXNZnM4mdl6R2nB4xtWNT 
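The map path in the patch above follows an allocate-then-publish ordering: a child page table is allocated up front, a compare-and-exchange tries to publish it, and only a successful publish links and accounts the page, while a lost race frees it and lets the fault be retried. A hedged user-space sketch of that ordering (install_child_table() and struct child_table are made-up names; a pointer value stands in for a non-leaf SPTE):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct child_table {
        uint64_t entries[512];
};

/* Returns true if the freshly allocated table was installed at *sptep. */
static bool install_child_table(_Atomic uint64_t *sptep, uint64_t old_spte)
{
        struct child_table *child = calloc(1, sizeof(*child));
        uint64_t new_spte;

        if (!child)
                return false;

        /* Stand-in for building a non-leaf SPTE that points at the table. */
        new_spte = (uint64_t)(uintptr_t)child;

        if (atomic_compare_exchange_strong(sptep, &old_spte, new_spte))
                return true;    /* success: only now link/account the page */

        /* Lost the race: nothing was published, so undo the allocation. */
        free(child);
        return false;
}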
Date: Tue, 12 Jan 2021 10:10:39 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-23-bgardon@google.com> Subject: [PATCH 22/24] kvm: x86/mmu: Flush TLBs after zap in TDP MMU PF handler From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon List-ID: X-Mailing-List: kvm@vger.kernel.org When the TDP MMU is allowed to handle page faults in parallel there is the possibility of a race where an SPTE is cleared and then immediately replaced with a present SPTE pointing to a different PFN, before the TLBs can be flushed. This race would violate architectural specs. Ensure that the TLBs are flushed properly before other threads are allowed to install any present value for the SPTE. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/spte.h | 16 +++++++++- arch/x86/kvm/mmu/tdp_mmu.c | 62 ++++++++++++++++++++++++++++++++------ 2 files changed, 68 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 2b3a30bd38b0..ecd9bfbccef4 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -130,6 +130,20 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask; PT64_EPT_EXECUTABLE_MASK) #define SHADOW_ACC_TRACK_SAVED_BITS_SHIFT PT64_SECOND_AVAIL_BITS_SHIFT +/* + * If a thread running without exclusive control of the MMU lock must perform a + * multi-part operation on an SPTE, it can set the SPTE to FROZEN_SPTE as a + * non-present intermediate value. This will guarantee that other threads will + * not modify the spte. + * + * This constant works because it is considered non-present on both AMD and + * Intel CPUs and does not create a L1TF vulnerability because the pfn section + * is zeroed out. + * + * Only used by the TDP MMU.
+ */ +#define FROZEN_SPTE (1ull << 59) + /* * In some cases, we need to preserve the GFN of a non-present or reserved * SPTE when we usurp the upper five bits of the physical address space to @@ -187,7 +201,7 @@ static inline bool is_access_track_spte(u64 spte) static inline int is_shadow_present_pte(u64 pte) { - return (pte != 0) && !is_mmio_spte(pte); + return (pte != 0) && !is_mmio_spte(pte) && (pte != FROZEN_SPTE); } static inline int is_large_pte(u64 pte) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7b12a87a4124..5c9d053000ad 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -429,15 +429,19 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, */ if (!was_present && !is_present) { /* - * If this change does not involve a MMIO SPTE, it is - * unexpected. Log the change, though it should not impact the - * guest since both the former and current SPTEs are nonpresent. + * If this change does not involve a MMIO SPTE or FROZEN_SPTE, + * it is unexpected. Log the change, though it should not + * impact the guest since both the former and current SPTEs + * are nonpresent. */ - if (WARN_ON(!is_mmio_spte(old_spte) && !is_mmio_spte(new_spte))) + if (WARN_ON(!is_mmio_spte(old_spte) && + !is_mmio_spte(new_spte) && + new_spte != FROZEN_SPTE)) pr_err("Unexpected SPTE change! Nonpresent SPTEs\n" "should not be replaced with another,\n" "different nonpresent SPTE, unless one or both\n" - "are MMIO SPTEs.\n" + "are MMIO SPTEs, or the new SPTE is\n" + "FROZEN_SPTE.\n" "as_id: %d gfn: %llx old_spte: %llx new_spte: %llx level: %d", as_id, gfn, old_spte, new_spte, level); return; @@ -488,6 +492,13 @@ static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, kvm_mmu_lock_assert_held_shared(kvm); + /* + * Do not change FROZEN_SPTEs. Only the thread that froze the SPTE + * may modify it. + */ + if (iter->old_spte == FROZEN_SPTE) + return false; + if (cmpxchg64(iter->sptep, iter->old_spte, new_spte) != iter->old_spte) return false; @@ -497,6 +508,34 @@ static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, return true; } +static inline bool tdp_mmu_zap_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter) +{ + /* + * Freeze the SPTE by setting it to a special, + * non-present value. This will stop other threads from + * immediately installing a present entry in its place + * before the TLBs are flushed. + */ + if (!tdp_mmu_set_spte_atomic(kvm, iter, FROZEN_SPTE)) + return false; + + kvm_flush_remote_tlbs_with_address(kvm, iter->gfn, + KVM_PAGES_PER_HPAGE(iter->level)); + + /* + * No other thread can overwrite the frozen SPTE as they + * must either wait on the MMU lock or use + * tdp_mmu_set_spte_atomic which will not overwrite the + * special frozen SPTE value. No bookkeeping is needed + * here since the SPTE is going from non-present + * to non-present. + */ + WRITE_ONCE(*iter->sptep, 0); + + return true; +} + /* * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping @@ -524,6 +563,14 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, kvm_mmu_lock_assert_held_exclusive(kvm); + /* + * No thread should be using this function to set SPTEs to FROZEN_SPTE. + * If operating under the MMU lock in read mode, tdp_mmu_set_spte_atomic + * should be used. If operating under the MMU lock in write mode, the + * use of FROZEN_SPTE should not be necessary.
+ */ + WARN_ON(iter->old_spte == FROZEN_SPTE); + WRITE_ONCE(*iter->sptep, new_spte); __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, @@ -801,12 +848,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, */ if (is_shadow_present_pte(iter.old_spte) && is_large_pte(iter.old_spte)) { - if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0)) + if (!tdp_mmu_zap_spte_atomic(vcpu->kvm, &iter)) break; - kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn, - KVM_PAGES_PER_HPAGE(iter.level)); - /* * The iter must explicitly re-read the spte here * because the new value informs the !present From patchwork Tue Jan 12 18:10:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80C5EC4332E for ; Tue, 12 Jan 2021 18:13:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5693C2311F for ; Tue, 12 Jan 2021 18:13:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406458AbhALSNr (ORCPT ); Tue, 12 Jan 2021 13:13:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2392300AbhALSMp (ORCPT ); Tue, 12 Jan 2021 13:12:45 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2D6EC06134C for ; Tue, 12 Jan 2021 10:11:26 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id mz17so2165857pjb.5 for ; Tue, 12 Jan 2021 10:11:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=VFeIbs3EEJfmUmShc5b4NHNzxuz8HmBwtr3zdhFPFdU=; b=TeQ6xmCyxzPNvf9GuBnToRKMeXIiYDGUqc2kFUphQnu7K32zcBxTsE4DjcCEtVaQtF lxKWr9yPhKG2FGGHjgsCSIBPMJA4dyDdMVf5sTBzD7p1ahOTIpmQ7LhLCrK0PxWigKIF uu6Tkpi31fo2Lo/dEZk8IHrMQt6p+m5VUGPJuKifLnVGDxhCw6q2FUpEu/cYwxIMCVro amu9jXCRsdub9C6BjU94sgGJW4ljCn7uswFFhhAMg5o0RQ9ZzW9LI55j6YpC4DZ8E2jL 2aCsZmfSrHQnNJWOJOOKPBuipQNTsHqsBczTJGoDARaVkzc2xG6QLTyKncQbLo8baEz/ vuZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=VFeIbs3EEJfmUmShc5b4NHNzxuz8HmBwtr3zdhFPFdU=; b=Knq5wqRRnHIaXTkCfVUpdgRSFxlgrhWCFYA9OnIhHX498RyK4wOQpBE6Dz+hzaElb3 OTx1Ps/uVYVfMyr2LQuIUVjqbA3teczachctB9ltU4m0p3Ve3U6TABG9+c38yhOWjlf3 thiDCptzohlcjv+pHnNyXBrGtnAmnJhh80R2qpbQOAcYErFuQkcVcUicMxSXpCFm1faz X+Vo7Ajya9WkEewiGb2KQbVHbX4GlLvtoak2ksm0CrejVrgHIYlqtufiIoz05WYoGSnA 9nhn1f95cLqa2PjcN3/BHA1e8exji6kvHRPzYlIZ9CI+/609DxZAPS9zKNZdd5bJrnP/ CSCA== X-Gm-Message-State: AOAM532HrkArUn1YGkM2Ryff0QoYfd/9OzR5DFZZT8GKZ27XVW7D71E7 fQBSHLIjDRFzmXnucTIY85sMsm6YiO3O X-Google-Smtp-Source: 
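The zap helper above is a three-step protocol: park the entry on a reserved non-present value so no other thread can install a new translation, flush the TLBs, and only then clear the entry for real. The same shape in portable C11 atomics (FROZEN, zap_spte() and flush_tlbs() are illustrative stand-ins for FROZEN_SPTE, tdp_mmu_zap_spte_atomic() and the remote TLB flush):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FROZEN ((uint64_t)1 << 59)      /* sentinel: "in transition, hands off" */

static void flush_tlbs(void)
{
        /* Placeholder for the real remote TLB flush. */
}

static bool zap_spte(_Atomic uint64_t *sptep, uint64_t old_spte)
{
        /* Step 1: freeze. Fails if another thread changed the entry first. */
        if (!atomic_compare_exchange_strong(sptep, &old_spte, FROZEN))
                return false;

        /* Step 2: flush while the sentinel blocks new installations. */
        flush_tlbs();

        /* Step 3: only now is it safe to make the entry truly empty. */
        atomic_store(sptep, 0);
        return true;
}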
Date: Tue, 12 Jan 2021 10:10:40 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-24-bgardon@google.com> Subject: [PATCH 23/24] kvm: x86/mmu: Freeze SPTEs in disconnected pages From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon List-ID: X-Mailing-List: kvm@vger.kernel.org When clearing TDP MMU pages that have been disconnected from the paging structure root, set the SPTEs to a special non-present value which will not be overwritten by other threads. This is needed to prevent races in which a thread is clearing a disconnected page table, but another thread has already acquired a pointer to that memory and installs a mapping in an already cleared entry. This can lead to memory leaks and accounting errors. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 5c9d053000ad..45160ff84e91 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -333,13 +333,14 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt, { struct kvm_mmu_page *sp; gfn_t gfn; + gfn_t base_gfn; int level; u64 *sptep; u64 old_child_spte; int i; sp = sptep_to_sp(pt); - gfn = sp->gfn; + base_gfn = sp->gfn; level = sp->role.level; trace_kvm_mmu_prepare_zap_page(sp); @@ -348,16 +349,38 @@ static void handle_disconnected_tdp_mmu_page(struct kvm *kvm, u64 *pt, for (i = 0; i < PT64_ENT_PER_PAGE; i++) { sptep = pt + i; + gfn = base_gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)); if (atomic) { - old_child_spte = xchg(sptep, 0); + /* + * Set the SPTE to a nonpresent value that other + * threads will not overwrite. If the SPTE was already + * frozen then another thread handling a page fault + * could overwrite it, so keep exchanging until this + * thread is the one that sets the SPTE from a + * non-frozen value to frozen. + */ + for (;;) { + old_child_spte = xchg(sptep, FROZEN_SPTE); + if (old_child_spte != FROZEN_SPTE) + break; + cpu_relax(); + } } else { old_child_spte = READ_ONCE(*sptep); - WRITE_ONCE(*sptep, 0); + + /* + * Setting the SPTE to FROZEN_SPTE is not strictly + * necessary here as the MMU lock should stop other + * threads from concurrently modifying this SPTE. + * Using FROZEN_SPTE keeps the atomic and + * non-atomic cases consistent and simplifies the
+ */ + WRITE_ONCE(*sptep, FROZEN_SPTE); } - handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), - gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), - old_child_spte, 0, level - 1, atomic); + handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn, + old_child_spte, FROZEN_SPTE, level - 1, + atomic); } kvm_flush_remote_tlbs_with_address(kvm, gfn, From patchwork Tue Jan 12 18:10:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12014343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 619BBC4332D for ; Tue, 12 Jan 2021 18:13:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 34A7422DFA for ; Tue, 12 Jan 2021 18:13:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406397AbhALSNF (ORCPT ); Tue, 12 Jan 2021 13:13:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406341AbhALSMw (ORCPT ); Tue, 12 Jan 2021 13:12:52 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD5F7C061786 for ; Tue, 12 Jan 2021 10:11:28 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id m9so2036206pji.4 for ; Tue, 12 Jan 2021 10:11:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=g0L8hmPn9GuXc9nZO+Oou7JZXqfcIc+E8Ihxm6+j3DU=; b=UT0gYjhnNhJ+NBN73bHFodwS8foa8jvrx4VmOboLjqDbrgRqkha4K2C69MJnR6Sqf5 H2gUwLNolfgumTqWw5nf9oo7A6B6j8Te/VpjlEhGDHrr1uDFEa3WCBf6N1nwmpfd7vNU IeiQQ/gUvG3/xt/JtGO5PXOyNTf36L5gA2i+hrltGwouhR2tClE5m/2AsUqq0hP0qfqP GDuLNIMzumJgBhT3uYKCtOgGh1fuayZNIwIf8FCXghVM/V5+YFsuTrSGtZxim2DXSzWL uYiFzkvXeSm7xQpvCqfwLicmmy2eLViDEfKavNF9z1tR6S3efmaZwa3CHkh/qqEyrAZ+ hDug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=g0L8hmPn9GuXc9nZO+Oou7JZXqfcIc+E8Ihxm6+j3DU=; b=DMhRlzh43JnR3q2lw35fBEAb/D5wpupUrsdgX70s4yjfFk0pOARwl4hekwOCwLEiUE vmLFHABC4Fm8DZ8AR3pA1WHxdooFc27pXYMmhi06i283yST9ru04GrBejxsOTS6wD9wv Esv6s42gSaFsilTSRfnX186BlucgXXZ751FRQthQrL0XsPsG3HJOyNmsndiPsetjpT7k Lskgv+ZZQXZ5vB6tJYT+hcyjLVYgwkaNTD0L3/0IuSDjowzrwaZSJMI90LbzR+GOoFfb f/HBKnuuJSeIasuTjyY/zbtc0wKTY7kNgCBYXOXSZIMIz9MgdzJrOjJObPWeKRNOMlqC cxMQ== X-Gm-Message-State: AOAM530/VfBInBHoAcZQvCyf2yGIxfmsMKg5eZlrAxDeCjm4fYBR73ys e9qijd5RvaRmjQWWNZm6LULp8D+G6Coj X-Google-Smtp-Source: ABdhPJy+BhtXRABw6D2zuNKJQwmN0hyHQI3MsRgEDguv9uWdT8S5ve5nVaC82N7Wy6MFXkyveNHCcsSyooM+ Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:f693:9fff:fef4:a293]) (user=bgardon job=sendgmr) by 2002:a17:90b:ec2:: with SMTP id gz2mr328174pjb.143.1610475088336; 
Tue, 12 Jan 2021 10:11:28 -0800 (PST) Date: Tue, 12 Jan 2021 10:10:41 -0800 In-Reply-To: <20210112181041.356734-1-bgardon@google.com> Message-Id: <20210112181041.356734-25-bgardon@google.com> Mime-Version: 1.0 References: <20210112181041.356734-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog Subject: [PATCH 24/24] kvm: x86/mmu: Allow parallel page faults for the TDP MMU From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Make the last few changes necessary to enable the TDP MMU to handle page faults in parallel while holding the mmu_lock in read mode. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 280d7cd6f94b..fa111ceb67d4 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3724,7 +3724,12 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, return r; r = RET_PF_RETRY; - kvm_mmu_lock(vcpu->kvm); + + if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) + kvm_mmu_lock_shared(vcpu->kvm); + else + kvm_mmu_lock(vcpu->kvm); + if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; r = make_mmu_pages_available(vcpu); @@ -3739,7 +3744,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, prefault, is_tdp); out_unlock: - kvm_mmu_unlock(vcpu->kvm); + if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) + kvm_mmu_unlock_shared(vcpu->kvm); + else + kvm_mmu_unlock(vcpu->kvm); kvm_release_pfn_clean(pfn); return r; }
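Taken together, the series lets TDP MMU page faults run with the MMU lock held for read while every other MMU path keeps exclusive access. A condensed user-space picture of the final dispatch in direct_page_fault() (page_fault(), is_tdp_mmu_fault() and handle_fault() are illustrative stand-ins, not the KVM functions):

#include <pthread.h>
#include <stdbool.h>

static pthread_rwlock_t mmu_rwlock = PTHREAD_RWLOCK_INITIALIZER;

/* Stand-ins: whether this fault is on a TDP MMU root, and the real work. */
static bool is_tdp_mmu_fault(void) { return true; }
static void handle_fault(void) { }

static void page_fault(void)
{
        bool shared = is_tdp_mmu_fault();

        /* TDP MMU faults only need the lock shared; others still exclude. */
        if (shared)
                pthread_rwlock_rdlock(&mmu_rwlock);
        else
                pthread_rwlock_wrlock(&mmu_rwlock);

        handle_fault();

        pthread_rwlock_unlock(&mmu_rwlock);
}

Many vCPU threads can sit in the read-locked section at once, while operations that reshape the paging structures still take the lock for write and exclude them all.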