Message ID | 20200206221434.23790-1-sean.j.christopherson@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/mmu: Avoid retpoline on ->page_fault() with TDP |
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Wrap calls to ->page_fault() with a small shim to directly invoke the
> TDP fault handler when the kernel is using retpolines and TDP is being
> used. Denote the TDP fault handler by nullifying mmu->page_fault, and
> annotate the TDP path as likely to coerce the compiler into preferring
> the TDP path.
>
> Rename tdp_page_fault() to kvm_tdp_page_fault() as it's exposed outside
> of mmu.c to allow inlining the shim.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---

Out of pure curiosity, if we do something like

	if (vcpu->arch.mmu->page_fault == tdp_page_fault)
		tdp_page_fault(...)
	else if (vcpu->arch.mmu->page_fault == nonpaging_page_fault)
		nonpaging_page_fault(...)
	...

we also defeat the retpoline, right? Should we use this technique ...
everywhere? :-)
On Fri, Feb 07, 2020 at 10:29:16AM +0100, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>
> > Wrap calls to ->page_fault() with a small shim to directly invoke the
> > TDP fault handler when the kernel is using retpolines and TDP is being
> > used. Denote the TDP fault handler by nullifying mmu->page_fault, and
> > annotate the TDP path as likely to coerce the compiler into preferring
> > the TDP path.
> >
> > Rename tdp_page_fault() to kvm_tdp_page_fault() as it's exposed outside
> > of mmu.c to allow inlining the shim.
> >
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
>
> Out of pure curiosity, if we do something like
>
> 	if (vcpu->arch.mmu->page_fault == tdp_page_fault)
> 		tdp_page_fault(...)
> 	else if (vcpu->arch.mmu->page_fault == nonpaging_page_fault)
> 		nonpaging_page_fault(...)
> 	...
>
> we also defeat the retpoline, right?

Yep.

> Should we use this technique ... everywhere? :-)

It becomes a matter of weighing the maintenance cost and robustness against
the performance benefits.  For the TDP case, almost no one (that cares about
performance) uses shadow paging, the change is very explicit, tiny and
isolated, and TDP page faults are a hot path, e.g. when booting the VM.
I.e. low maintenance overhead, still robust, and IMO worth the shenanigans.

The changes to VMX's VM-Exit handlers follow similar thinking: snipe off
the exit handlers that are performance critical, but use a low maintenance
implementation for the majority of handlers.

There have been multiple attempts to add infrastructure to solve the
maintenance and robustness problems[*], but AFAIK none of them have made
their way upstream.

[*] https://lwn.net/Articles/774743/
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Fri, Feb 07, 2020 at 10:29:16AM +0100, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>>
>> > Wrap calls to ->page_fault() with a small shim to directly invoke the
>> > TDP fault handler when the kernel is using retpolines and TDP is being
>> > used. Denote the TDP fault handler by nullifying mmu->page_fault, and
>> > annotate the TDP path as likely to coerce the compiler into preferring
>> > the TDP path.
>> >
>> > Rename tdp_page_fault() to kvm_tdp_page_fault() as it's exposed outside
>> > of mmu.c to allow inlining the shim.
>> >
>> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> > ---
>>
>> Out of pure curiosity, if we do something like
>>
>> 	if (vcpu->arch.mmu->page_fault == tdp_page_fault)
>> 		tdp_page_fault(...)
>> 	else if (vcpu->arch.mmu->page_fault == nonpaging_page_fault)
>> 		nonpaging_page_fault(...)
>> 	...
>>
>> we also defeat the retpoline, right?
>
> Yep.
>
>> Should we use this technique ... everywhere? :-)
>
> It becomes a matter of weighing the maintenance cost and robustness against
> the performance benefits.  For the TDP case, almost no one (that cares about
> performance) uses shadow paging, the change is very explicit, tiny and
> isolated, and TDP page faults are a hot path, e.g. when booting the VM.
> I.e. low maintenance overhead, still robust, and IMO worth the shenanigans.
>
> The changes to VMX's VM-Exit handlers follow similar thinking: snipe off
> the exit handlers that are performance critical, but use a low maintenance
> implementation for the majority of handlers.
>
> There have been multiple attempts to add infrastructure to solve the
> maintenance and robustness problems[*], but AFAIK none of them have made
> their way upstream.
>
> [*] https://lwn.net/Articles/774743/

Oh I see, missed some of these discussions.
And I actually forgot to say:

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

as the patch itself looks good to me, I was just wondering about the
approach in general.
On 07/02/20 16:55, Sean Christopherson wrote:
> It becomes a matter of weighing the maintenance cost and robustness against
> the performance benefits.  For the TDP case, almost no one (that cares about
> performance) uses shadow paging, the change is very explicit, tiny and
> isolated, and TDP page faults are a hot path, e.g. when booting the VM.
> I.e. low maintenance overhead, still robust, and IMO worth the shenanigans.

The "NULL" trick does not seem needed though.  Any objections to this?

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 9277ee8a54a5..a647601c9e1c 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -109,7 +109,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 					u32 err, bool prefault)
 {
 #ifdef CONFIG_RETPOLINE
-	if (likely(!vcpu->arch.mmu->page_fault))
+	if (likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
 		return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
 #endif
 	return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5267f1440677..87e9ba27ada1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4925,12 +4925,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 		return;
 
 	context->mmu_role.as_u64 = new_role.as_u64;
-#ifdef CONFIG_RETPOLINE
-	/* Nullify ->page_fault() to use direct kvm_tdp_page_fault() call. */
-	context->page_fault = NULL;
-#else
 	context->page_fault = kvm_tdp_page_fault;
-#endif
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = nonpaging_invlpg;
 	context->update_pte = nonpaging_update_pte;

Paolo
On Wed, Feb 12, 2020 at 12:55:09PM +0100, Paolo Bonzini wrote:
> On 07/02/20 16:55, Sean Christopherson wrote:
> > It becomes a matter of weighing the maintenance cost and robustness against
> > the performance benefits.  For the TDP case, almost no one (that cares about
> > performance) uses shadow paging, the change is very explicit, tiny and
> > isolated, and TDP page faults are a hot path, e.g. when booting the VM.
> > I.e. low maintenance overhead, still robust, and IMO worth the shenanigans.
>
> The "NULL" trick does not seem needed though.  Any objections to this?

Nope, no objections.

> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 9277ee8a54a5..a647601c9e1c 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -109,7 +109,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  					u32 err, bool prefault)
>  {
>  #ifdef CONFIG_RETPOLINE
> -	if (likely(!vcpu->arch.mmu->page_fault))
> +	if (likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
>  		return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
>  #endif
>  	return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 5267f1440677..87e9ba27ada1 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4925,12 +4925,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
>  		return;
>  
>  	context->mmu_role.as_u64 = new_role.as_u64;
> -#ifdef CONFIG_RETPOLINE
> -	/* Nullify ->page_fault() to use direct kvm_tdp_page_fault() call. */
> -	context->page_fault = NULL;
> -#else
>  	context->page_fault = kvm_tdp_page_fault;
> -#endif
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = nonpaging_invlpg;
>  	context->update_pte = nonpaging_update_pte;
>
> Paolo
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index d55674f44a18..9277ee8a54a5 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -102,6 +102,19 @@ static inline void kvm_mmu_load_cr3(struct kvm_vcpu *vcpu)
 					      kvm_get_active_pcid(vcpu));
 }
 
+int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
+		       bool prefault);
+
+static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+					u32 err, bool prefault)
+{
+#ifdef CONFIG_RETPOLINE
+	if (likely(!vcpu->arch.mmu->page_fault))
+		return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
+#endif
+	return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
+}
+
 /*
  * Currently, we have two sorts of write-protection, a) the first one
  * write-protects guest page to sync the guest modification, b) another one is
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7011a4e54866..5267f1440677 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4219,8 +4219,8 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 }
 EXPORT_SYMBOL_GPL(kvm_handle_page_fault);
 
-static int tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
-			  bool prefault)
+int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
+		       bool prefault)
 {
 	int max_level;
 
@@ -4925,7 +4925,12 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 		return;
 
 	context->mmu_role.as_u64 = new_role.as_u64;
-	context->page_fault = tdp_page_fault;
+#ifdef CONFIG_RETPOLINE
+	/* Nullify ->page_fault() to use direct kvm_tdp_page_fault() call. */
+	context->page_fault = NULL;
+#else
+	context->page_fault = kvm_tdp_page_fault;
+#endif
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = nonpaging_invlpg;
 	context->update_pte = nonpaging_update_pte;
@@ -5436,9 +5441,8 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 	}
 
 	if (r == RET_PF_INVALID) {
-		r = vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa,
-					       lower_32_bits(error_code),
-					       false);
+		r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa,
+					  lower_32_bits(error_code), false);
 		WARN_ON(r == RET_PF_INVALID);
 	}
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbabb2f06273..39251ecafd2b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10182,7 +10182,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 	    work->arch.cr3 != vcpu->arch.mmu->get_cr3(vcpu))
 		return;
 
-	vcpu->arch.mmu->page_fault(vcpu, work->cr2_or_gpa, 0, true);
+	kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
 }
 
 static inline u32 kvm_async_pf_hash_fn(gfn_t gfn)
Wrap calls to ->page_fault() with a small shim to directly invoke the
TDP fault handler when the kernel is using retpolines and TDP is being
used.  Denote the TDP fault handler by nullifying mmu->page_fault, and
annotate the TDP path as likely to coerce the compiler into preferring
the TDP path.

Rename tdp_page_fault() to kvm_tdp_page_fault() as it's exposed outside
of mmu.c to allow inlining the shim.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
Haven't done any performance testing, this popped into my head when
mucking with the 5-level page table crud as an easy way to shave cycles
in the happy path.

 arch/x86/kvm/mmu.h     | 13 +++++++++++++
 arch/x86/kvm/mmu/mmu.c | 16 ++++++++++------
 arch/x86/kvm/x86.c     |  2 +-
 3 files changed, 24 insertions(+), 7 deletions(-)