[v2,08/27] KVM: Move the memslot update in-progress flag to bit 63

Message ID	20190205210137.1377-8-sean.j.christopherson@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Sean Christopherson <sean.j.christopherson@intel.com> To: Paolo Bonzini <pbonzini@redhat.com>, =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org, Xiao Guangrong <guangrong.xiao@gmail.com> Subject: [PATCH v2 08/27] KVM: Move the memslot update in-progress flag to bit 63 Date: Tue, 5 Feb 2019 13:01:18 -0800 Message-Id: <20190205210137.1377-8-sean.j.christopherson@intel.com> In-Reply-To: <20190205205443.1059-1-sean.j.christopherson@intel.com> References: <20190205205443.1059-1-sean.j.christopherson@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: kvm-owner@vger.kernel.org Precedence: bulk
Series	KVM: x86/mmu: Remove fast invalidate mechanism \| expand [v2,00/27] KVM: x86/mmu: Remove fast invalidate mechanism [v2,01/27] KVM: Call kvm_arch_memslots_updated() before updating memslots [v2,02/27] KVM: x86/mmu: Detect MMIO generation wrap in any address space [v2,03/27] KVM: x86/mmu: Do not cache MMIO accesses while memslots are in flux [v2,04/27] KVM: Explicitly define the "memslot update in-progress" bit [v2,05/27] KVM: x86: Use a u64 when passing the MMIO gen around [v2,06/27] KVM: x86: Refactor the MMIO SPTE generation handling [v2,07/27] KVM: Remove the hack to trigger memslot generation wraparound [v2,08/27] KVM: Move the memslot update in-progress flag to bit 63 [v2,09/27] KVM: x86/mmu: Move slot_level_*() helper functions up a few lines [v2,10/27] KVM: x86/mmu: Split remote_flush+zap case out of kvm_mmu_flush_or_zap() [v2,11/27] KVM: x86/mmu: Zap only the relevant pages when removing a memslot [v2,12/27] Revert "KVM: MMU: document fast invalidate all pages" [v2,13/27] Revert "KVM: MMU: drop kvm_mmu_zap_mmio_sptes" [v2,14/27] KVM: x86/mmu: Voluntarily reschedule as needed when zapping MMIO sptes [v2,15/27] KVM: x86/mmu: Remove is_obsolete() call [v2,16/27] Revert "KVM: MMU: reclaim the zapped-obsolete page first" [v2,17/27] Revert "KVM: MMU: collapse TLB flushes when zap all pages" [v2,18/27] Revert "KVM: MMU: zap pages in batch" [v2,19/27] Revert "KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages" [v2,20/27] Revert "KVM: MMU: show mmu_valid_gen in shadow page related tracepoints" [v2,21/27] Revert "KVM: x86: use the fast way to invalidate all pages" [v2,22/27] KVM: x86/mmu: skip over invalid root pages when zapping all sptes [v2,23/27] KVM: x86/mmu: Voluntarily reschedule as needed when zapping all sptes [v2,24/27] Revert "KVM: MMU: fast invalidate all pages" [v2,25/27] KVM: x86/mmu: Differentiate between nr zapped and list unstable [v2,26/27] KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children [v2,27/27] KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt index e507a9e0421e..367a952f50ab 100644 --- a/Documentation/virtual/kvm/mmu.txt +++ b/Documentation/virtual/kvm/mmu.txt @@ -452,13 +452,16 @@ stored into the MMIO spte. Thus, the MMIO spte might be created based on out-of-date information, but with an up-to-date generation number. To avoid this, the generation number is incremented again after synchronize_srcu -returns; thus, the low bit of kvm_memslots(kvm)->generation is only 1 during a +returns; thus, bit 63 of kvm_memslots(kvm)->generation set to 1 only during a memslot update, while some SRCU readers might be using the old copy. We do not want to use an MMIO sptes created with an odd generation number, and we can do -this without losing a bit in the MMIO spte. The low bit of the generation -is not stored in MMIO spte, and presumed zero when it is extracted out of the -spte. If KVM is unlucky and creates an MMIO spte while the low bit is 1, -the next access to the spte will always be a cache miss. +this without losing a bit in the MMIO spte. The "update in-progress" bit of the +generation is not stored in MMIO spte, and is so is implicitly zero when the +generation is extracted out of the spte. If KVM is unlucky and creates an MMIO +spte while an update is in-progress, the next access to the spte will always be +a cache miss. For example, a subsequent access during the update window will +miss due to the in-progress flag diverging, while an access after the update +window closes will have a higher generation number (as compared to the spte). Further reading diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index ef2931770440..9b4e6ab74cf3 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -333,18 +333,17 @@ static inline bool is_access_track_spte(u64 spte) * Due to limited space in PTEs, the MMIO generation is a 19 bit subset of * the memslots generation and is derived as follows: * - * Bits 1-9 of the memslot generation are propagated to spte bits 3-11 - * Bits 10-19 of the memslot generation are propagated to spte bits 52-61 + * Bits 0-8 of the MMIO generation are propagated to spte bits 3-11 + * Bits 9-18 of the MMIO generation are propagated to spte bits 52-61 * - * The MMIO generation starts at bit 1 of the memslots generation in order to - * skip over bit 0, the KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS flag. Including - * the flag would require stealing a bit from the "real" generation number and - * thus effectively halve the maximum number of MMIO generations that can be - * handled before encountering a wrap (which requires a full MMU zap). The - * flag is instead explicitly queried when checking for MMIO spte cache hits. + * The KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS flag is intentionally not included in + * the MMIO generation number, as doing so would require stealing a bit from + * the "real" generation number and thus effectively halve the maximum number + * of MMIO generations that can be handled before encountering a wrap (which + * requires a full MMU zap). The flag is instead explicitly queried when + * checking for MMIO spte cache hits. */ -#define MMIO_SPTE_GEN_MASK GENMASK_ULL(19, 1) -#define MMIO_SPTE_GEN_SHIFT 1 +#define MMIO_SPTE_GEN_MASK GENMASK_ULL(18, 0) #define MMIO_SPTE_GEN_LOW_START 3 #define MMIO_SPTE_GEN_LOW_END 11 @@ -361,8 +360,6 @@ static u64 generation_mmio_spte_mask(u64 gen) WARN_ON(gen & ~MMIO_SPTE_GEN_MASK); - gen >>= MMIO_SPTE_GEN_SHIFT; - mask = (gen << MMIO_SPTE_GEN_LOW_START) & MMIO_SPTE_GEN_LOW_MASK; mask |= (gen << MMIO_SPTE_GEN_HIGH_START) & MMIO_SPTE_GEN_HIGH_MASK; return mask; @@ -376,7 +373,7 @@ static u64 get_mmio_spte_generation(u64 spte) gen = (spte & MMIO_SPTE_GEN_LOW_MASK) >> MMIO_SPTE_GEN_LOW_START; gen |= (spte & MMIO_SPTE_GEN_HIGH_MASK) >> MMIO_SPTE_GEN_HIGH_START; - return gen << MMIO_SPTE_GEN_SHIFT; + return gen; } static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn, @@ -5902,14 +5899,10 @@ static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) { + WARN_ON(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS); + gen &= MMIO_SPTE_GEN_MASK; - /* - * Shift to adjust for the "update in-progress" flag, which isn't - * included in the MMIO generation number. - */ - gen >>= MMIO_SPTE_GEN_SHIFT; - /* * Generation numbers are incremented in multiples of the number of * address spaces in order to provide unique generations across all diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5e1cb74922b3..85c0c00d5159 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -49,7 +49,7 @@ #define KVM_MEMSLOT_INVALID (1UL << 16) /* - * Bit 0 of the memslot generation number is an "update in-progress flag", + * Bit 63 of the memslot generation number is an "update in-progress flag", * e.g. is temporarily set for the duration of install_new_memslots(). * This flag effectively creates a unique generation number that is used to * mark cached memslot data, e.g. MMIO accesses, as potentially being stale, @@ -67,7 +67,7 @@ * the actual generation number against accesses that were inserted into the * cache *before* the memslots were updated. */ -#define KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS BIT_ULL(0) +#define KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS BIT_ULL(63) /* Two fragments for cross MMIO pages. */ #define KVM_MAX_MMIO_FRAGMENTS 2 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index c350c349c54c..84c99204f155 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -661,7 +661,7 @@ static struct kvm *kvm_create_vm(unsigned long type) if (!slots) goto out_err_no_srcu; /* Generations must be different for each address space. */ - slots->generation = i * 2; + slots->generation = i; rcu_assign_pointer(kvm->memslots[i], slots); } @@ -894,10 +894,10 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm, * Generations must be unique even across address spaces. We do not need * a global counter for that, instead the generation space is evenly split * across address spaces. For example, with two address spaces, address - * space 0 will use generations 0, 4, 8, ... while address space 1 will - * use generations 2, 6, 10, 14, ... + * space 0 will use generations 0, 2, 4, ... while address space 1 will + * use generations 1, 3, 5, ... */ - gen += KVM_ADDRESS_SPACE_NUM * 2; + gen += KVM_ADDRESS_SPACE_NUM; kvm_arch_memslots_updated(kvm, gen);

[v2,08/27] KVM: Move the memslot update in-progress flag to bit 63

Commit Message

Patch