| Message ID | 20210813203504.2742757-4-dmatlack@google.com (mailing list archive) |
| --- | --- |
| State | New, archived |
| Series | Pass memslot around during page fault handling |
On 13/08/21 22:35, David Matlack wrote:
> -        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
> -                /*
> -                 * The gfn of direct spte is stable since it is
> -                 * calculated by sp->gfn.
> -                 */
> -                gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> -                kvm_vcpu_mark_page_dirty(vcpu, gfn);
> -        }
> +        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
> +                mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);

Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.

Paolo
On Tue, Aug 17, 2021 at 6:00 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 13/08/21 22:35, David Matlack wrote:
> > -        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
> > -                /*
> > -                 * The gfn of direct spte is stable since it is
> > -                 * calculated by sp->gfn.
> > -                 */
> > -                gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> > -                kvm_vcpu_mark_page_dirty(vcpu, gfn);
> > -        }
> > +        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
> > +                mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
>
> Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.

What do you mean? kvm_vcpu_mark_page_dirty ultimately just calls
mark_page_dirty_in_slot.

>
> Paolo
>
On 17/08/21 18:13, David Matlack wrote:
> On Tue, Aug 17, 2021 at 6:00 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 13/08/21 22:35, David Matlack wrote:
>>> -        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
>>> -                /*
>>> -                 * The gfn of direct spte is stable since it is
>>> -                 * calculated by sp->gfn.
>>> -                 */
>>> -                gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
>>> -                kvm_vcpu_mark_page_dirty(vcpu, gfn);
>>> -        }
>>> +        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
>>> +                mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
>>
>> Oops, this actually needs kvm_vcpu_mark_page_dirty to receive the slot.
>
> What do you mean? kvm_vcpu_mark_page_dirty ultimately just calls
> mark_page_dirty_in_slot.

Yeah, I was thinking of some very old version of the dirty page ring
buffer patches. What I wrote makes no sense. :)

Paolo
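For readers following along: the relationship David is pointing at is, roughly, that kvm_vcpu_mark_page_dirty() only resolves the memslot for the gfn and then defers to mark_page_dirty_in_slot(). A simplified sketch of that wrapper, based on virt/kvm/kvm_main.c around this time frame (details may differ between kernel versions):

        void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
        {
                /* Look up the memslot backing this gfn for this vCPU. */
                struct kvm_memory_slot *memslot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);

                /* All the dirty-bitmap bookkeeping happens here. */
                mark_page_dirty_in_slot(vcpu->kvm, memslot, gfn);
        }

So calling mark_page_dirty_in_slot() with fault->slot simply skips the per-fault slot lookup, which is the point of the series.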
On Fri, Aug 13, 2021, David Matlack wrote:
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3352312ab1c9..fb2c95e8df00 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
>
>  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  {
> -        struct kvm_memory_slot *slot;
> +        struct kvm_memory_slot *slot = fault->slot;
>          kvm_pfn_t mask;
>
>          fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>          if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
>                  return;
>
> -        slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> -        if (!slot)
> +        if (kvm_slot_dirty_track_enabled(slot))

This is unnecessarily obfuscated. It relies on the is_error_noslot_pfn() to
ensure fault->slot is valid, but the only reason that helper is used is because
it was the most efficient code when slot wasn't available. IMO, this would be
better:

        if (!slot || kvm_slot_dirty_track_enabled(slot))
                return;

        if (kvm_is_reserved_pfn(fault->pfn))
                return;

On a related topic, a good follow-up to this series would be to pass @fault into
the prefetch helpers, and modify the prefetch logic to re-use fault->slot and
refuse to prefetch across memslot boundaries. That would eliminate all users of
gfn_to_memslot_dirty_bitmap() and allow us to drop that abomination.
On Thu, Aug 19, 2021 at 9:37 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Aug 13, 2021, David Matlack wrote:
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3352312ab1c9..fb2c95e8df00 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
> >
> >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  {
> > -        struct kvm_memory_slot *slot;
> > +        struct kvm_memory_slot *slot = fault->slot;
> >          kvm_pfn_t mask;
> >
> >          fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> > @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >          if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
> >                  return;
> >
> > -        slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> > -        if (!slot)
> > +        if (kvm_slot_dirty_track_enabled(slot))
>
> This is unnecessarily obfuscated.

Ugh. It's pure luck too. I meant to check if the slot is null here.

> It relies on the is_error_noslot_pfn() to
> ensure fault->slot is valid, but the only reason that helper is used is because
> it was the most efficient code when slot wasn't available. IMO, this would be
> better:
>
>         if (!slot || kvm_slot_dirty_track_enabled(slot))
>                 return;
>
>         if (kvm_is_reserved_pfn(fault->pfn))
>                 return;

That looks reasonable to me. I can send a patch next week with this change.

>
> On a related topic, a good follow-up to this series would be to pass @fault into
> the prefetch helpers, and modify the prefetch logic to re-use fault->slot and
> refuse to prefetch across memslot boundaries. That would eliminate all users of
> gfn_to_memslot_dirty_bitmap() and allow us to drop that abomination.
On Fri, Aug 20, 2021, David Matlack wrote:
> On Thu, Aug 19, 2021 at 9:37 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Aug 13, 2021, David Matlack wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 3352312ab1c9..fb2c95e8df00 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
> > >
> > >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > >  {
> > > -        struct kvm_memory_slot *slot;
> > > +        struct kvm_memory_slot *slot = fault->slot;
> > >          kvm_pfn_t mask;
> > >
> > >          fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
> > > @@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > >          if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
> > >                  return;
> > >
> > > -        slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
> > > -        if (!slot)
> > > +        if (kvm_slot_dirty_track_enabled(slot))
> >
> > This is unnecessarily obfuscated.
>
> Ugh. It's pure luck too. I meant to check if the slot is null here.

Ha, better to be lucky than good ;-)
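Taken together, the shape Sean is suggesting for the early exits in kvm_mmu_hugepage_adjust() is roughly the following. This is only a sketch of the proposed ordering of the checks, with the rest of the function elided, not the exact follow-up patch that was eventually sent:

        void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
        {
                struct kvm_memory_slot *slot = fault->slot;
                kvm_pfn_t mask;

                ...

                /* No slot, or dirty logging is enabled on it: keep 4K mappings. */
                if (!slot || kvm_slot_dirty_track_enabled(slot))
                        return;

                /* Reserved pfns (e.g. MMIO) are not candidates for huge pages. */
                if (kvm_is_reserved_pfn(fault->pfn))
                        return;

                ...
        }

This makes the NULL-slot case explicit instead of relying on is_error_noslot_pfn() having already filtered out slot-less faults.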
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2c726b255fa8..8d13333f0345 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -158,6 +158,9 @@ struct kvm_page_fault {
         /* Shifted addr, or result of guest page table walk if addr is a gva. */
         gfn_t gfn;
 
+        /* The memslot containing gfn. May be NULL. */
+        struct kvm_memory_slot *slot;
+
         /* Outputs of kvm_faultin_pfn. */
         kvm_pfn_t pfn;
         hva_t hva;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3352312ab1c9..fb2c95e8df00 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2890,7 +2890,7 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
 
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
-        struct kvm_memory_slot *slot;
+        struct kvm_memory_slot *slot = fault->slot;
         kvm_pfn_t mask;
 
         fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;
@@ -2901,8 +2901,7 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
         if (is_error_noslot_pfn(fault->pfn) || kvm_is_reserved_pfn(fault->pfn))
                 return;
 
-        slot = gfn_to_memslot_dirty_bitmap(vcpu, fault->gfn, true);
-        if (!slot)
+        if (kvm_slot_dirty_track_enabled(slot))
                 return;
 
         /*
@@ -3076,13 +3075,9 @@ static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
  * someone else modified the SPTE from its original value.
  */
 static bool
-fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
+fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
                         u64 *sptep, u64 old_spte, u64 new_spte)
 {
-        gfn_t gfn;
-
-        WARN_ON(!sp->role.direct);
-
         /*
          * Theoretically we could also set dirty bit (and flush TLB) here in
          * order to eliminate unnecessary PML logging. See comments in
@@ -3098,14 +3093,8 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
         if (cmpxchg64(sptep, old_spte, new_spte) != old_spte)
                 return false;
 
-        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
-                /*
-                 * The gfn of direct spte is stable since it is
-                 * calculated by sp->gfn.
-                 */
-                gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
-                kvm_vcpu_mark_page_dirty(vcpu, gfn);
-        }
+        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
+                mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);
 
         return true;
 }
@@ -3233,7 +3222,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
                  * since the gfn is not stable for indirect shadow page. See
                  * Documentation/virt/kvm/locking.rst to get more detail.
                  */
-                if (fast_pf_fix_direct_spte(vcpu, sp, sptep, spte, new_spte)) {
+                if (fast_pf_fix_direct_spte(vcpu, fault, sptep, spte, new_spte)) {
                         ret = RET_PF_FIXED;
                         break;
                 }
@@ -3823,7 +3812,7 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, int *r)
 {
-        struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+        struct kvm_memory_slot *slot = fault->slot;
         bool async;
 
         /*
@@ -3888,6 +3877,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
         int r;
 
         fault->gfn = fault->addr >> PAGE_SHIFT;
+        fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+
         if (page_fault_handle_page_track(vcpu, fault))
                 return RET_PF_EMULATE;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f70afecbf3a2..50ade6450ace 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -847,6 +847,8 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
         }
 
         fault->gfn = walker.gfn;
+        fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
+
         if (page_fault_handle_page_track(vcpu, fault)) {
                 shadow_page_table_clear_flood(vcpu, fault->addr);
                 return RET_PF_EMULATE;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 47ec9f968406..6f733a68d750 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -533,6 +533,7 @@ static inline bool tdp_mmu_set_spte_atomic_no_dirty_log(struct kvm *kvm,
  * TDP page fault.
  *
  * @vcpu: The vcpu instance that took the TDP page fault.
+ * @fault: The kvm_page_fault being resolved by this SPTE.
  * @iter: a tdp_iter instance currently on the SPTE that should be set
  * @new_spte: The value the SPTE should be set to
  *
@@ -540,6 +541,7 @@ static inline bool tdp_mmu_set_spte_atomic_no_dirty_log(struct kvm *kvm,
  * this function will have no side-effects.
  */
 static inline bool tdp_mmu_map_set_spte_atomic(struct kvm_vcpu *vcpu,
+                                               struct kvm_page_fault *fault,
                                                struct tdp_iter *iter,
                                                u64 new_spte)
 {
@@ -553,12 +555,10 @@ static inline bool tdp_mmu_map_set_spte_atomic(struct kvm_vcpu *vcpu,
          * handle_changed_spte_dirty_log() to leverage vcpu->last_used_slot.
          */
         if (is_writable_pte(new_spte)) {
-                struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, iter->gfn);
-
-                if (slot && kvm_slot_dirty_track_enabled(slot)) {
+                if (fault->slot && kvm_slot_dirty_track_enabled(fault->slot)) {
                         /* Enforced by kvm_mmu_hugepage_adjust. */
                         WARN_ON_ONCE(iter->level > PG_LEVEL_4K);
-                        mark_page_dirty_in_slot(kvm, slot, iter->gfn);
+                        mark_page_dirty_in_slot(kvm, fault->slot, iter->gfn);
                 }
         }
 
@@ -934,7 +934,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
         if (new_spte == iter->old_spte)
                 ret = RET_PF_SPURIOUS;
-        else if (!tdp_mmu_map_set_spte_atomic(vcpu, iter, new_spte))
+        else if (!tdp_mmu_map_set_spte_atomic(vcpu, fault, iter, new_spte))
                 return RET_PF_RETRY;
 
         /*
The memslot for the faulting gfn is used throughout the page fault handling
code, so capture it in kvm_page_fault as soon as we know the gfn and use it
in the page fault handling code that has direct access to the
kvm_page_fault struct.

This, in combination with the subsequent patch, improves "Populate memory
time" in dirty_log_perf_test by 5% when using the legacy MMU. There is no
discernible improvement to the performance of the TDP MMU.

No functional change intended.

Suggested-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu.h             |  3 +++
 arch/x86/kvm/mmu/mmu.c         | 27 +++++++++------------------
 arch/x86/kvm/mmu/paging_tmpl.h |  2 ++
 arch/x86/kvm/mmu/tdp_mmu.c     | 10 +++++-----
 4 files changed, 19 insertions(+), 23 deletions(-)
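Condensed down, the pattern the patch introduces is: resolve the memslot once, right after the gfn is known, cache it in the kvm_page_fault, and let every later consumer read fault->slot instead of repeating the gfn-to-memslot lookup. A rough restatement of the moving parts shown in the diff above, not new code:

        /* arch/x86/kvm/mmu.h: cached in struct kvm_page_fault, may be NULL. */
        struct kvm_memory_slot *slot;

        /* Fault entry points: resolve the slot right after the gfn. */
        fault->gfn = fault->addr >> PAGE_SHIFT;
        fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);

        /* Consumers (hugepage adjust, fast PF fix, TDP MMU map) reuse it, e.g.: */
        if (fault->slot && kvm_slot_dirty_track_enabled(fault->slot))
                mark_page_dirty_in_slot(vcpu->kvm, fault->slot, fault->gfn);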