From patchwork Fri Jun 16 02:32:17 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281967
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 01/11] KVM: x86/mmu: helpers to return if KVM honors guest MTRRs
Date: Fri, 16 Jun 2023 10:32:17 +0800
Message-Id: <20230616023217.7081-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Add helpers to check whether KVM honors guest MTRRs.

The inner helper, __kvm_mmu_honors_guest_mtrrs(), is also provided to
outside callers so they can check whether guest MTRRs were honored
before non-coherent DMA stops.
Suggested-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mmu.h     |  7 +++++++
 arch/x86/kvm/mmu/mmu.c | 15 +++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 92d5a1924fc1..38bd449226f6 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -235,6 +235,13 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	return -(u32)fault & errcode;
 }
 
+bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);
+
+static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
+{
+	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
+}
+
 void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1e5db621241f..b4f89f015c37 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4516,6 +4516,21 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 }
 #endif
 
+bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma)
+{
+	/*
+	 * If the TDP is enabled, the host MTRRs are ignored by TDP
+	 * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA
+	 * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype
+	 * from the guest's MTRRs so that guest accesses to memory that is
+	 * DMA'd aren't cached against the guest's wishes.
+	 *
+	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
+	 * e.g. KVM will force UC memtype for host MMIO.
+	 */
+	return vm_has_noncoherent_dma && tdp_enabled && shadow_memtype_mask;
+}
+
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	/*
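For context, a minimal sketch of how the two helpers divide responsibilities:
the wrapper consults the VM's current non-coherent DMA state, while the inner
helper lets a caller supply that state explicitly, e.g. around a transition it
is about to make. The function below is hypothetical, illustrative only:

    /* Hypothetical caller, not part of the series. */
    static void example_before_noncoherent_dma_stops(struct kvm *kvm)
    {
        /*
         * Pass "true" explicitly: the question is whether guest MTRRs
         * WERE honored while non-coherent DMA was still attached, not
         * whether they are honored after the count drops to zero.
         */
        if (__kvm_mmu_honors_guest_mtrrs(kvm, true))
            kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
    }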
From patchwork Fri Jun 16 02:34:35 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281968
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 02/11] KVM: x86/mmu: Use KVM honors guest MTRRs helper in kvm_tdp_page_fault()
Date: Fri, 16 Jun 2023 10:34:35 +0800
Message-Id: <20230616023435.7142-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Let kvm_tdp_page_fault() use the helper kvm_mmu_honors_guest_mtrrs() to
decide whether it needs to consult guest MTRRs to check GFN range
consistency.

No functional change intended.

Suggested-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b4f89f015c37..7f52bbe013b3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4536,16 +4536,9 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	/*
 	 * If the guest's MTRRs may be used to compute the "real" memtype,
 	 * restrict the mapping level to ensure KVM uses a consistent memtype
-	 * across the entire mapping. If the host MTRRs are ignored by TDP
-	 * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA
-	 * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype
-	 * from the guest's MTRRs so that guest accesses to memory that is
-	 * DMA'd aren't cached against the guest's wishes.
-	 *
-	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
-	 * e.g. KVM will force UC memtype for host MMIO.
+	 * across the entire mapping.
 	 */
-	if (shadow_memtype_mask && kvm_arch_has_noncoherent_dma(vcpu->kvm)) {
+	if (kvm_mmu_honors_guest_mtrrs(vcpu->kvm)) {
 		for ( ; fault->max_level > PG_LEVEL_4K; --fault->max_level) {
 			int page_num = KVM_PAGES_PER_HPAGE(fault->max_level);
 			gfn_t base = gfn_round_for_level(fault->gfn,
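The level walk-down that the hunk keeps is easiest to see with concrete
numbers. A standalone sketch (illustrative; the macros mirror KVM's
definitions, where level 1 = 4K, 2 = 2M, 3 = 1G):

    #include <stdio.h>

    #define PG_LEVEL_4K 1
    #define KVM_PAGES_PER_HPAGE(level) (1ULL << (((level) - 1) * 9))

    /* Round a gfn down to the base of its huge page at the given level. */
    static unsigned long long gfn_round_for_level(unsigned long long gfn, int level)
    {
        return gfn & ~(KVM_PAGES_PER_HPAGE(level) - 1);
    }

    int main(void)
    {
        unsigned long long gfn = 0xfee10ULL;    /* arbitrary example gfn */

        for (int level = 3; level > PG_LEVEL_4K; level--)
            printf("level %d: base %#llx, %llu pages must share one memtype\n",
                   level, gfn_round_for_level(gfn, level),
                   KVM_PAGES_PER_HPAGE(level));
        return 0;
    }

If any level's range mixes guest MTRR memtypes, fault->max_level is lowered
until each final leaf mapping has a single, consistent memtype.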
From patchwork Fri Jun 16 02:35:24 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281970
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 03/11] KVM: x86/mmu: Use KVM honors guest MTRRs helper when CR0.CD toggles
Date: Fri, 16 Jun 2023 10:35:24 +0800
Message-Id: <20230616023524.7203-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Call the helper to check whether guest MTRRs are honored by the KVM MMU
before zapping, as the value of guest CR0.CD only affects KVM TDP memory
types when guest MTRRs are honored.
Suggested-by: Chao Gao
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e7186864542..6693daeb5686 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -942,7 +942,7 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
 		kvm_mmu_reset_context(vcpu);
 
 	if (((cr0 ^ old_cr0) & X86_CR0_CD) &&
-	    kvm_arch_has_noncoherent_dma(vcpu->kvm) &&
+	    kvm_mmu_honors_guest_mtrrs(vcpu->kvm) &&
 	    !kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
 		kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
 }
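The `(cr0 ^ old_cr0) & X86_CR0_CD` test above is the usual XOR
toggle-detection idiom: XOR keeps exactly the bits that changed, so the
masked result is non-zero precisely when CR0.CD flipped in either
direction. A tiny standalone illustration:

    #include <stdbool.h>

    #define X86_CR0_CD (1UL << 30)    /* architectural CR0.CD bit position */

    static bool cd_toggled(unsigned long old_cr0, unsigned long cr0)
    {
        return (cr0 ^ old_cr0) & X86_CR0_CD;    /* changed bits only */
    }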
Guest MTRRs only affect TDP memtypes when TDP honors guest MTRRs; there is
no point in doing the calculation and zapping otherwise.

Suggested-by: Chao Gao
Suggested-by: Sean Christopherson
Cc: Kai Huang
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mtrr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 3eb6e7f47e96..a67c28a56417 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -320,7 +320,7 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
 	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
 	gfn_t start, end;
 
-	if (!tdp_enabled || !kvm_arch_has_noncoherent_dma(vcpu->kvm))
+	if (!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
 		return;
 
 	if (!mtrr_is_enabled(mtrr_state) && msr != MSR_MTRRdefType)
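For orientation, the state that mtrr_is_enabled() and MSR_MTRRdefType gate
on lives in IA32_MTRR_DEF_TYPE; a standalone decode (illustrative, using
the architectural layout: bit 11 = enable, bit 10 = fixed-range enable,
bits 7:0 = default type):

    #include <stdbool.h>
    #include <stdint.h>

    #define IA32_MTRR_DEF_TYPE_E         (1ULL << 11)  /* MTRRs enabled */
    #define IA32_MTRR_DEF_TYPE_FE        (1ULL << 10)  /* fixed ranges enabled */
    #define IA32_MTRR_DEF_TYPE_TYPE_MASK 0xffULL       /* default memtype */

    static bool mtrr_enabled(uint64_t deftype)
    {
        return deftype & IA32_MTRR_DEF_TYPE_E;
    }

    static uint8_t mtrr_default_type(uint64_t deftype)
    {
        return deftype & IA32_MTRR_DEF_TYPE_TYPE_MASK;
    }

A write to MSR_MTRRdefType must be processed even while MTRRs are disabled,
since it may be the very write that enables them.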
From patchwork Fri Jun 16 02:37:09 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281972
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 05/11] KVM: x86/mmu: zap KVM TDP when noncoherent DMA assignment starts/stops
Date: Fri, 16 Jun 2023 10:37:09 +0800
Message-Id: <20230616023709.7318-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Zap KVM TDP when noncoherent DMA assignment starts (the noncoherent DMA
count transitions from 0 to 1) or stops (the count transitions from 1 to
0). Before the zap, test whether guest MTRRs are to be honored after the
assignment starts, or were honored before the assignment stops.

When there is no noncoherent DMA device, the EPT memory type is
((MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT).
When there are noncoherent DMA devices, the EPT memory type needs to
honor guest CR0.CD and MTRR settings. So, when the noncoherent DMA count
transitions between 0 and 1, EPT leaf entries need to be zapped to clear
stale memory types.

This issue might be hidden when the device is statically assigned, with
VFIO adding/removing MMIO regions of the noncoherent DMA devices several
times during guest boot, since the current KVM MMU calls
kvm_mmu_zap_all_fast() on memslot removal. But if the device is
hot-plugged, or if the guest has mmio_always_on for the device, the MMIO
regions may only be added once, and then there is no path that zaps the
EPT entries to clear the stale memory types.

Therefore, do the EPT zapping when noncoherent DMA assignment
starts/stops to ensure stale entries are cleaned away.
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/x86.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6693daeb5686..ac9548efa76f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13164,15 +13164,31 @@ bool noinstr kvm_arch_has_assigned_device(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
 
+static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
+{
+	/*
+	 * Non-coherent DMA assignment and de-assignment will affect
+	 * whether KVM honors guest MTRRs and cause changes in memtypes
+	 * in TDP.
+	 * So, specify the second parameter as true here to indicate
+	 * non-coherent DMAs are/were involved and TDP zap might be
+	 * necessary.
+	 */
+	if (__kvm_mmu_honors_guest_mtrrs(kvm, true))
+		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
+}
+
 void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
 {
-	atomic_inc(&kvm->arch.noncoherent_dma_count);
+	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1)
+		kvm_noncoherent_dma_assignment_start_or_stop(kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
 
 void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
 {
-	atomic_dec(&kvm->arch.noncoherent_dma_count);
+	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count))
+		kvm_noncoherent_dma_assignment_start_or_stop(kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
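The 0 <-> 1 transition test relies on atomic_inc_return() and
atomic_dec_return() reporting the post-operation value. A userspace
analogue with C11 atomics, illustrative only:

    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_int noncoherent_dma_count;

    static bool attach_is_first(void)    /* true on the 0 -> 1 transition */
    {
        return atomic_fetch_add(&noncoherent_dma_count, 1) + 1 == 1;
    }

    static bool detach_is_last(void)     /* true on the 1 -> 0 transition */
    {
        return atomic_fetch_sub(&noncoherent_dma_count, 1) - 1 == 0;
    }

Only those two transitions can change whether guest MTRRs are honored, so
only they trigger the zap.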
From patchwork Fri Jun 16 02:37:42 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281973
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 06/11] KVM: x86/mmu: move TDP zaps from guest MTRRs update to CR0.CD toggling
Date: Fri, 16 Jun 2023 10:37:42 +0800
Message-Id: <20230616023742.7379-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

If guest MTRRs are honored, always zap TDP when CR0.CD toggles, and don't
do it when guest MTRRs are updated under CR0.CD=1.

This is because CR0.CD=1 takes precedence over guest MTRRs in deciding
TDP memory types: TDP memtypes do not change when guest MTRRs are updated
under CR0.CD=1. Instead, always do the TDP zapping when CR0.CD toggles,
because even with the quirk KVM_X86_QUIRK_CD_NW_CLEARED, TDP memory types
may change after guest CR0.CD toggles.

Suggested-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mtrr.c | 3 +++
 arch/x86/kvm/x86.c  | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index a67c28a56417..3ce58734ad22 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -323,6 +323,9 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
 	if (!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
 		return;
 
+	if (kvm_is_cr0_bit_set(vcpu, X86_CR0_CD))
+		return;
+
 	if (!mtrr_is_enabled(mtrr_state) && msr != MSR_MTRRdefType)
 		return;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac9548efa76f..32cc8bfaa5f1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -942,8 +942,7 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
 		kvm_mmu_reset_context(vcpu);
 
 	if (((cr0 ^ old_cr0) & X86_CR0_CD) &&
-	    kvm_mmu_honors_guest_mtrrs(vcpu->kvm) &&
-	    !kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
+	    kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
 		kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
 }
 EXPORT_SYMBOL_GPL(kvm_post_set_cr0);
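Condensed, the zap policy after this patch (assuming guest MTRRs are
honored) can be restated as the following pseudologic -- a hypothetical
summary helper, not code from the series:

    static bool zap_needed(bool cr0_cd_toggled, bool mtrr_updated, bool cr0_cd)
    {
        if (cr0_cd_toggled)
            return true;    /* always zap; the quirk no longer matters */
        if (mtrr_updated && !cr0_cd)
            return true;    /* MTRRs decide memtypes only while CD=0 */
        return false;
    }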
From patchwork Fri Jun 16 02:38:15 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281974
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 07/11] KVM: VMX: drop IPAT in memtype when CD=1 for KVM_X86_QUIRK_CD_NW_CLEARED
Date: Fri, 16 Jun 2023 10:38:15 +0800
Message-Id: <20230616023815.7439-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

For KVM_X86_QUIRK_CD_NW_CLEARED, remove the "ignore PAT" (IPAT) bit from
the EPT memory type when the cache is disabled and non-coherent DMA is
present.

Before this patch, with the quirk KVM_X86_QUIRK_CD_NW_CLEARED, WB + IPAT
is returned as the EPT memory type when the guest cache is disabled.
Removing the IPAT bit allows the effective memory type to honor PAT
values as well, which can only make the effective memory type stronger
than WB, since WB is the weakest memtype. This change is acceptable
because it does not make the memory type weaker; without the quirk, the
original memtype for cache disabled is UC + IPAT.

Suggested-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/vmx/vmx.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0ecf4be2c6af..c1e93678cea4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7548,8 +7548,6 @@ static int vmx_vm_init(struct kvm *kvm)
 
 static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 {
-	u8 cache;
-
 	/* We wanted to honor guest CD/MTRR/PAT, but doing so could result in
 	 * memory aliases with conflicting memory types and sometimes MCEs.
 	 * We have to be careful as to what are honored and when.
@@ -7576,11 +7574,10 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 
 	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD)) {
 		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
-			cache = MTRR_TYPE_WRBACK;
+			return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
 		else
-			cache = MTRR_TYPE_UNCACHABLE;
-
-		return (cache << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
+			return (MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT) |
+			       VMX_EPT_IPAT_BIT;
 	}
 
 	return kvm_mtrr_get_guest_memory_type(vcpu, gfn) << VMX_EPT_MT_EPTE_SHIFT;
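The returned mask packs the memory type and the "ignore PAT" flag per the
EPT leaf layout (bits 5:3 = memtype, bit 6 = IPAT, hence
VMX_EPT_MT_EPTE_SHIFT = 3). A standalone illustration of the two CD=1
outcomes:

    #include <stdint.h>

    #define VMX_EPT_MT_EPTE_SHIFT 3             /* EPT leaf bits 5:3 */
    #define VMX_EPT_IPAT_BIT      (1ULL << 6)   /* "ignore PAT" flag */
    #define MTRR_TYPE_UNCACHABLE  0
    #define MTRR_TYPE_WRBACK      6

    /* With the quirk: WB, PAT still consulted (no IPAT). */
    static const uint64_t quirk_mask =
            MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;

    /* Without the quirk: UC, PAT force-ignored. */
    static const uint64_t no_quirk_mask =
            (MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;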
From patchwork Fri Jun 16 02:38:58 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281975
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 08/11] KVM: x86: move vmx code to get EPT memtype when CR0.CD=1 to x86 common code
Date: Fri, 16 Jun 2023 10:38:58 +0800
Message-Id: <20230616023858.7503-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Move the code in vmx.c that derives the cache-disabled memtype when
non-coherent DMA is present into x86 common code.

This is a preparation patch for the later implementation of fine-grained
gfn zap for CR0.CD toggles when guest MTRRs are honored.
No functional change intended.

Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mtrr.c    | 19 +++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c | 10 +++++-----
 arch/x86/kvm/x86.h     |  1 +
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 3ce58734ad22..b35dd0bc9cad 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -721,3 +721,22 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 
 	return type == mtrr_default_type(mtrr_state);
 }
+
+void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)
+{
+	/*
+	 * This routine is supposed to be called when guest MTRRs are honored.
+	 */
+	if (unlikely(!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))) {
+		*type = MTRR_TYPE_WRBACK;
+		*ipat = true;
+	} else if (unlikely(!kvm_check_has_quirk(vcpu->kvm,
+						 KVM_X86_QUIRK_CD_NW_CLEARED))) {
+		*type = MTRR_TYPE_UNCACHABLE;
+		*ipat = true;
+	} else {
+		*type = MTRR_TYPE_WRBACK;
+		*ipat = false;
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_mtrr_get_cd_memory_type);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c1e93678cea4..6414c5a6e892 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7573,11 +7573,11 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
 
 	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD)) {
-		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
-			return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
-		else
-			return (MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT) |
-			       VMX_EPT_IPAT_BIT;
+		bool ipat;
+		u8 cache;
+
+		kvm_mtrr_get_cd_memory_type(vcpu, &cache, &ipat);
+		return cache << VMX_EPT_MT_EPTE_SHIFT | (ipat ? VMX_EPT_IPAT_BIT : 0);
 	}
 
 	return kvm_mtrr_get_guest_memory_type(vcpu, gfn) << VMX_EPT_MT_EPTE_SHIFT;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 82e3dafc5453..9781b4b32d68 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -313,6 +313,7 @@ int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data);
 int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
 bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 					  int page_num);
+void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat);
 bool kvm_vector_hashing_enabled(void);
 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code);
 int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
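The three cases kvm_mtrr_get_cd_memory_type() distinguishes, restated as a
table (derived directly from the code above):

    guest MTRRs honored | CD_NW_CLEARED quirk | *type       | *ipat
    --------------------+---------------------+-------------+---------------------
    no                  | (any)               | WRBACK      | true
    yes                 | disabled            | UNCACHABLE  | true
    yes                 | enabled             | WRBACK      | false (PAT honored)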
From patchwork Fri Jun 16 02:39:45 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281976
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 09/11] KVM: x86/mmu: serialize vCPUs to zap gfn when guest MTRRs are honored
Date: Fri, 16 Jun 2023 10:39:45 +0800
Message-Id: <20230616023945.7570-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Serialize concurrent and repeated calls of kvm_zap_gfn_range() from every
vCPU for CR0.CD toggles and MTRR updates when guest MTRRs are honored.

During guest boot-up, if guest MTRRs are honored by TDP, TDP zaps are
triggered several times by each vCPU for CR0.CD toggles and MTRR updates.
This takes unexpectedly long because of contention on kvm->mmu_lock.

Therefore, introduce an mtrr_zap_list to remove duplicated zaps, and an
atomic mtrr_zapping to allow only one vCPU to do the real zap work at a
time.

Suggested-by: Sean Christopherson
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/include/asm/kvm_host.h |   4 +
 arch/x86/kvm/mtrr.c             | 141 +++++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.c              |   2 +-
 arch/x86/kvm/x86.h              |   1 +
 4 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 28bd38303d70..8da1517a1513 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1444,6 +1444,10 @@ struct kvm_arch {
 	 */
 #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1)
 	struct kvm_mmu_memory_cache split_desc_cache;
+
+	struct list_head mtrr_zap_list;
+	spinlock_t mtrr_zap_list_lock;
+	atomic_t mtrr_zapping;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index b35dd0bc9cad..688748e3a4d2 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -25,6 +25,8 @@
 #define IA32_MTRR_DEF_TYPE_FE		(1ULL << 10)
 #define IA32_MTRR_DEF_TYPE_TYPE_MASK	(0xff)
 
+static void kvm_mtrr_zap_gfn_range(struct kvm_vcpu *vcpu,
+				   gfn_t gfn_start, gfn_t gfn_end);
 static bool is_mtrr_base_msr(unsigned int msr)
 {
 	/* MTRR base MSRs use even numbers, masks use odd numbers. */
@@ -341,7 +343,7 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
 		var_mtrr_range(var_mtrr_msr_to_range(vcpu, msr), &start, &end);
 	}
 
-	kvm_zap_gfn_range(vcpu->kvm, gpa_to_gfn(start), gpa_to_gfn(end));
+	kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(start), gpa_to_gfn(end));
 }
 
 static bool var_mtrr_range_is_valid(struct kvm_mtrr_range *range)
@@ -437,6 +439,11 @@ int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 void kvm_vcpu_mtrr_init(struct kvm_vcpu *vcpu)
 {
 	INIT_LIST_HEAD(&vcpu->arch.mtrr_state.head);
+
+	if (vcpu->vcpu_id == 0) {
+		spin_lock_init(&vcpu->kvm->arch.mtrr_zap_list_lock);
+		INIT_LIST_HEAD(&vcpu->kvm->arch.mtrr_zap_list);
+	}
 }
 
 struct mtrr_iter {
@@ -740,3 +747,135 @@ void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)
 	}
 }
 EXPORT_SYMBOL_GPL(kvm_mtrr_get_cd_memory_type);
+
+struct mtrr_zap_range {
+	gfn_t start;
+	/* end is exclusive */
+	gfn_t end;
+	struct list_head node;
+};
+
+static void kvm_clear_mtrr_zap_list(struct kvm *kvm)
+{
+	struct list_head *head = &kvm->arch.mtrr_zap_list;
+	struct mtrr_zap_range *tmp, *n;
+
+	spin_lock(&kvm->arch.mtrr_zap_list_lock);
+	list_for_each_entry_safe(tmp, n, head, node) {
+		list_del(&tmp->node);
+		kfree(tmp);
+	}
+	spin_unlock(&kvm->arch.mtrr_zap_list_lock);
+}
+
+/*
+ * Add @range into kvm->arch.mtrr_zap_list and sort the list in
+ * "length" ascending + "start" descending order, so that
+ * ranges consuming more zap cycles can be dequeued later and their
+ * chances of being found duplicated are increased.
+ */
+static void kvm_add_mtrr_zap_list(struct kvm *kvm, struct mtrr_zap_range *range)
+{
+	struct list_head *head = &kvm->arch.mtrr_zap_list;
+	u64 len = range->end - range->start;
+	struct mtrr_zap_range *cur, *n;
+	bool added = false;
+
+	spin_lock(&kvm->arch.mtrr_zap_list_lock);
+
+	if (list_empty(head)) {
+		list_add(&range->node, head);
+		spin_unlock(&kvm->arch.mtrr_zap_list_lock);
+		return;
+	}
+
+	list_for_each_entry_safe(cur, n, head, node) {
+		u64 cur_len = cur->end - cur->start;
+
+		if (len < cur_len)
+			break;
+
+		if (len > cur_len)
+			continue;
+
+		if (range->start > cur->start)
+			break;
+
+		if (range->start < cur->start)
+			continue;
+
+		/* equal len & start, no need to add */
+		added = true;
+		kfree(range);
+		break;
+	}
+
+	if (!added)
+		list_add_tail(&range->node, &cur->node);
+
+	spin_unlock(&kvm->arch.mtrr_zap_list_lock);
+}
+
+static void kvm_zap_mtrr_zap_list(struct kvm *kvm)
+{
+	struct list_head *head = &kvm->arch.mtrr_zap_list;
+	struct mtrr_zap_range *cur = NULL;
+
+	spin_lock(&kvm->arch.mtrr_zap_list_lock);
+
+	while (!list_empty(head)) {
+		u64 start, end;
+
+		cur = list_first_entry(head, typeof(*cur), node);
+		start = cur->start;
+		end = cur->end;
+		list_del(&cur->node);
+		kfree(cur);
+		spin_unlock(&kvm->arch.mtrr_zap_list_lock);
+
+		kvm_zap_gfn_range(kvm, start, end);
+
+		spin_lock(&kvm->arch.mtrr_zap_list_lock);
+	}
+
+	spin_unlock(&kvm->arch.mtrr_zap_list_lock);
+}
+
+static void kvm_zap_or_wait_mtrr_zap_list(struct kvm *kvm)
+{
+	if (atomic_cmpxchg_acquire(&kvm->arch.mtrr_zapping, 0, 1) == 0) {
+		kvm_zap_mtrr_zap_list(kvm);
+		atomic_set_release(&kvm->arch.mtrr_zapping, 0);
+		return;
+	}
+
+	while (atomic_read(&kvm->arch.mtrr_zapping))
+		cpu_relax();
+}
+
+static void kvm_mtrr_zap_gfn_range(struct kvm_vcpu *vcpu,
+				   gfn_t gfn_start, gfn_t gfn_end)
+{
+	struct mtrr_zap_range *range;
+
+	range = kmalloc(sizeof(*range), GFP_KERNEL_ACCOUNT);
+	if (!range)
+		goto fail;
+
+	range->start = gfn_start;
+	range->end = gfn_end;
+
+	kvm_add_mtrr_zap_list(vcpu->kvm, range);
+
+	kvm_zap_or_wait_mtrr_zap_list(vcpu->kvm);
+	return;
+
+fail:
+	kvm_clear_mtrr_zap_list(vcpu->kvm);
+	kvm_zap_gfn_range(vcpu->kvm, gfn_start, gfn_end);
+}
+
+void kvm_zap_gfn_range_on_cd_toggle(struct kvm_vcpu *vcpu)
+{
+	return kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
+}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 32cc8bfaa5f1..74aac14a3c0b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -943,7 +943,7 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
 
 	if (((cr0 ^ old_cr0) & X86_CR0_CD) &&
 	    kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
-		kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
+		kvm_zap_gfn_range_on_cd_toggle(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_post_set_cr0);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 9781b4b32d68..be946aba2bf0 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -314,6 +314,7 @@ int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
 bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 					  int page_num);
 void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat);
+void kvm_zap_gfn_range_on_cd_toggle(struct kvm_vcpu *vcpu);
 bool kvm_vector_hashing_enabled(void);
 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code);
 int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
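kvm_zap_or_wait_mtrr_zap_list() is a "single flusher" pattern: the first
vCPU to claim the flag drains the whole shared list (including ranges
queued by others), and everyone else spins until the drain finishes. A
minimal userspace analogue with C11 atomics (illustrative; cpu_relax() and
the drain are stubbed):

    #include <stdatomic.h>

    static atomic_int zapping;    /* 0 = idle, 1 = someone is draining */

    static void zap_or_wait(void)
    {
        int idle = 0;

        if (atomic_compare_exchange_strong(&zapping, &idle, 1)) {
            /* drain_shared_zap_list(); */
            atomic_store_explicit(&zapping, 0, memory_order_release);
            return;
        }

        while (atomic_load(&zapping))
            ;    /* cpu_relax() in the kernel version */
    }

A waiter's queued range is guaranteed to be covered, because the range is
added to the list before the claim-or-wait step.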
From patchwork Fri Jun 16 02:41:34 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281977
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 10/11] KVM: x86/mmu: fine-grained gfn zap when guest MTRRs are honored
Date: Fri, 16 Jun 2023 10:41:34 +0800
Message-Id: <20230616024134.7649-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Find out guest MTRR ranges of non-default type and only zap those ranges
when guest MTRRs are enabled and the MTRR default type equals the memtype
of CR0.CD=1.

This allows fine-grained and faster gfn zaps because of precise and
shorter zap ranges, and increases the chances for concurrent vCPUs to
find existing ranges to zap in the zap list.

Incidentally, fix a typo in the original comment.

Suggested-by: Sean Christopherson
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mtrr.c | 108 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 688748e3a4d2..e2a097822a62 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -179,7 +179,7 @@ static struct fixed_mtrr_segment fixed_seg_table[] = {
 	{
 		.start = 0xc0000,
 		.end = 0x100000,
-		.range_shift = 12, /* 12K */
+		.range_shift = 12, /* 4K */
 		.range_start = 24,
 	}
 };
@@ -816,6 +816,67 @@ static void kvm_add_mtrr_zap_list(struct kvm *kvm, struct mtrr_zap_range *range)
 	spin_unlock(&kvm->arch.mtrr_zap_list_lock);
 }
 
+/*
+ * Fixed ranges are only 256 pages in total.
+ * After balancing between reducing the overhead of zapping multiple ranges
+ * and increasing the chances of finding duplicated ranges,
+ * just add the fixed MTRR ranges as a whole to the mtrr zap list
+ * if the memory type of any of them is not the specified type.
+ */
+static int prepare_zaplist_fixed_mtrr_of_non_type(struct kvm_vcpu *vcpu, u8 type)
+{
+	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
+	struct mtrr_zap_range *range;
+	int index, seg_end;
+	u8 mem_type;
+
+	for (index = 0; index < KVM_NR_FIXED_MTRR_REGION; index++) {
+		mem_type = mtrr_state->fixed_ranges[index];
+
+		if (mem_type == type)
+			continue;
+
+		range = kmalloc(sizeof(*range), GFP_KERNEL_ACCOUNT);
+		if (!range)
+			return -ENOMEM;
+
+		seg_end = ARRAY_SIZE(fixed_seg_table) - 1;
+		range->start = gpa_to_gfn(fixed_seg_table[0].start);
+		range->end = gpa_to_gfn(fixed_seg_table[seg_end].end);
+		kvm_add_mtrr_zap_list(vcpu->kvm, range);
+		break;
+	}
+	return 0;
+}
+
+/*
+ * Add variable MTRR ranges to the mtrr zap list
+ * if their memory type does not equal @type.
+ */
+static int prepare_zaplist_var_mtrr_of_non_type(struct kvm_vcpu *vcpu, u8 type)
+{
+	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
+	struct mtrr_zap_range *range;
+	struct kvm_mtrr_range *tmp;
+	u8 mem_type;
+
+	list_for_each_entry(tmp, &mtrr_state->head, node) {
+		mem_type = tmp->base & 0xff;
+		if (mem_type == type)
+			continue;
+
+		range = kmalloc(sizeof(*range), GFP_KERNEL_ACCOUNT);
+		if (!range)
+			return -ENOMEM;
+
+		var_mtrr_range(tmp, &range->start, &range->end);
+		range->start = gpa_to_gfn(range->start);
+		range->end = gpa_to_gfn(range->end);
+		kvm_add_mtrr_zap_list(vcpu->kvm, range);
+	}
+	return 0;
+}
+
 static void kvm_zap_mtrr_zap_list(struct kvm *kvm)
 {
 	struct list_head *head = &kvm->arch.mtrr_zap_list;
@@ -875,7 +936,50 @@ static void kvm_mtrr_zap_gfn_range(struct kvm_vcpu *vcpu,
 	kvm_zap_gfn_range(vcpu->kvm, gfn_start, gfn_end);
 }
 
+/*
+ * Zap GFN ranges when CR0.CD toggles between 0 and 1.
+ * With noncoherent DMA present,
+ * when CR0.CD=1, TDP memtype is WB or UC + IPAT;
+ * when CR0.CD=0, TDP memtype is determined by guest MTRRs.
+ * Therefore, if the cache-disabled memtype differs from the default
+ * memtype in guest MTRRs, everything is zapped;
+ * if the cache-disabled memtype equals the default memtype in guest
+ * MTRRs, only MTRR ranges of non-default memtype need to be zapped.
+ */
 void kvm_zap_gfn_range_on_cd_toggle(struct kvm_vcpu *vcpu)
 {
-	return kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
+	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
+	bool mtrr_enabled = mtrr_is_enabled(mtrr_state);
+	u8 default_type;
+	u8 cd_type;
+	bool ipat;
+
+	kvm_mtrr_get_cd_memory_type(vcpu, &cd_type, &ipat);
+
+	default_type = mtrr_enabled ? mtrr_default_type(mtrr_state) :
+		       mtrr_disabled_type(vcpu);
+
+	if (cd_type != default_type || ipat)
+		return kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
+
+	/*
+	 * If MTRRs are not enabled, the zap-all path above already ran when
+	 * the default type does not equal cd_type; and no zap is needed when
+	 * the default type equals cd_type.
+	 */
+	if (mtrr_enabled) {
+		if (prepare_zaplist_fixed_mtrr_of_non_type(vcpu, default_type))
+			goto fail;
+
+		if (prepare_zaplist_var_mtrr_of_non_type(vcpu, default_type))
+			goto fail;
+
+		kvm_zap_or_wait_mtrr_zap_list(vcpu->kvm);
+	}
+	return;
+fail:
+	kvm_clear_mtrr_zap_list(vcpu->kvm);
+	/* resort to zapping all on failure */
+	kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
+	return;
 }
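prepare_zaplist_var_mtrr_of_non_type() pulls the memtype out of the
variable-MTRR base MSR with `tmp->base & 0xff`; per the architectural
IA32_MTRR_PHYSBASEn layout, bits 7:0 hold the type and the physical base
starts at bit 12. A standalone decode (illustrative):

    #include <stdint.h>

    static uint8_t var_mtrr_mem_type(uint64_t physbase)
    {
        return physbase & 0xff;    /* bits 7:0: memory type */
    }

    static uint64_t var_mtrr_base_pfn(uint64_t physbase)
    {
        return physbase >> 12;     /* bits MAXPHYADDR-1:12: base frame */
    }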
From patchwork Fri Jun 16 02:42:13 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 13281978
From: Yan Zhao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, chao.gao@intel.com,
 kai.huang@intel.com, robert.hoo.linux@gmail.com, Yan Zhao
Subject: [PATCH v3 11/11] KVM: x86/mmu: split a single gfn zap range when guest MTRRs are honored
Date: Fri, 16 Jun 2023 10:42:13 +0800
Message-Id: <20230616024213.7706-1-yan.y.zhao@intel.com>
In-Reply-To: <20230616023101.7019-1-yan.y.zhao@intel.com>
References: <20230616023101.7019-1-yan.y.zhao@intel.com>

Split a single gfn zap range (specifically range [0, ~0UL)) into smaller
ranges according to the current memslot layout when guest MTRRs are
honored.

Though vCPUs have been serialized to perform kvm_zap_gfn_range() for MTRR
updates and CR0.CD toggles, the rescheduling cost caused by contention is
still huge when there are concurrent, page-fault-induced read locks of
kvm->mmu_lock.

Splitting a single huge zap range according to the actual memslot layout
can reduce unnecessary traversal and scheduling cost in the TDP MMU. It
can also increase the chances for larger ranges to find existing ranges
to zap in the zap list.

Signed-off-by: Yan Zhao
---
 arch/x86/kvm/mtrr.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index e2a097822a62..b83abd14ccb1 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -917,21 +917,40 @@ static void kvm_zap_or_wait_mtrr_zap_list(struct kvm *kvm)
 static void kvm_mtrr_zap_gfn_range(struct kvm_vcpu *vcpu,
 				   gfn_t gfn_start, gfn_t gfn_end)
 {
+	int idx = srcu_read_lock(&vcpu->kvm->srcu);
+	const struct kvm_memory_slot *memslot;
 	struct mtrr_zap_range *range;
+	struct kvm_memslot_iter iter;
+	struct kvm_memslots *slots;
+	gfn_t start, end;
+	int i;
+
+	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+		slots = __kvm_memslots(vcpu->kvm, i);
+		kvm_for_each_memslot_in_gfn_range(&iter, slots, gfn_start, gfn_end) {
+			memslot = iter.slot;
+			start = max(gfn_start, memslot->base_gfn);
+			end = min(gfn_end, memslot->base_gfn + memslot->npages);
+			if (WARN_ON_ONCE(start >= end))
+				continue;
 
-	range = kmalloc(sizeof(*range), GFP_KERNEL_ACCOUNT);
-	if (!range)
-		goto fail;
+			range = kmalloc(sizeof(*range), GFP_KERNEL_ACCOUNT);
+			if (!range)
+				goto fail;
 
-	range->start = gfn_start;
-	range->end = gfn_end;
+			range->start = start;
+			range->end = end;
 
-	kvm_add_mtrr_zap_list(vcpu->kvm, range);
+			kvm_add_mtrr_zap_list(vcpu->kvm, range);
+		}
+	}
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
 
 	kvm_zap_or_wait_mtrr_zap_list(vcpu->kvm);
 	return;
 
 fail:
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
 	kvm_clear_mtrr_zap_list(vcpu->kvm);
 	kvm_zap_gfn_range(vcpu->kvm, gfn_start, gfn_end);
 }
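The per-memslot clamping (`start = max(...)`, `end = min(...)`) is plain
interval intersection; a standalone illustration with hypothetical types:

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t gfn_t;

    /* Intersect [a_start, a_end) with [b_start, b_end); false if disjoint. */
    static bool gfn_range_intersect(gfn_t a_start, gfn_t a_end,
                                    gfn_t b_start, gfn_t b_end,
                                    gfn_t *start, gfn_t *end)
    {
        *start = a_start > b_start ? a_start : b_start;    /* max of starts */
        *end = a_end < b_end ? a_end : b_end;              /* min of ends */
        return *start < *end;
    }

The WARN_ON_ONCE(start >= end) in the patch guards the same disjoint case,
which should be impossible for a memslot returned by the gfn-range
iterator.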