Message ID | 9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com
---|---
State | New, archived
Series | KVM: Guest Memory Pre-Population API
On Wed, 2024-04-10 at 15:07 -0700, isaku.yamahata@intel.com wrote:
> 
> +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> +		      u8 *level)
> +{
> +	int r;
> +
> +	/* Restrict to TDP page fault. */
> +	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> +		return -EINVAL;
> +
> +	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, false, NULL, level);

Why not prefetch = true? Doesn't it fit? It looks like the behavior will be
to not set the access bit.

> +	if (r < 0)
> +		return r;
> +
> +	switch (r) {
> +	case RET_PF_RETRY:
> +		return -EAGAIN;
> +
> +	case RET_PF_FIXED:
> +	case RET_PF_SPURIOUS:
> +		return 0;
> +
> +	case RET_PF_EMULATE:
> +		return -EINVAL;
> +
> +	case RET_PF_CONTINUE:
> +	case RET_PF_INVALID:
> +	default:
> +		WARN_ON_ONCE(r);
> +		return -EIO;
> +	}
> +}
On Wed, Apr 10, 2024 at 03:07:31PM -0700, isaku.yamahata@intel.com wrote:
>From: Isaku Yamahata <isaku.yamahata@intel.com>
>
>Introduce a helper function to call the KVM fault handler. It allows a new
>ioctl to invoke the KVM fault handler to populate without seeing RET_PF_*
>enums or other KVM MMU internal definitions because RET_PF_* are internal
>to x86 KVM MMU. The implementation is restricted to two-dimensional paging
>for simplicity. The shadow paging uses GVA for faulting instead of L1 GPA.
>It makes the API difficult to use.
>
>Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>---
>v2:
>- Make the helper function two-dimensional paging specific. (David)
>- Return error when vcpu is in guest mode. (David)
>- Rename goal_level to level in kvm_tdp_mmu_map_page(). (Sean)
>- Update return code conversion. Don't check pfn.
>  RET_PF_EMULATE => EINVAL, RET_PF_CONTINUE => EIO (Sean)
>- Add WARN_ON_ONCE on RET_PF_CONTINUE and RET_PF_INVALID. (Sean)
>- Drop unnecessary EXPORT_SYMBOL_GPL(). (Sean)
>---
> arch/x86/kvm/mmu.h     |  3 +++
> arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++++++++++++++++++
> 2 files changed, 35 insertions(+)
>
>diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>index e8b620a85627..51ff4f67e115 100644
>--- a/arch/x86/kvm/mmu.h
>+++ b/arch/x86/kvm/mmu.h
>@@ -183,6 +183,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
> 	__kvm_mmu_refresh_passthrough_bits(vcpu, mmu);
> }
>
>+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
>+		     u8 *level);
>+
> /*
>  * Check if a given access (described through the I/D, W/R and U/S bits of a
>  * page fault error code pfec) causes a permission fault with the given PTE
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index 91dd4c44b7d8..a34f4af44cbd 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -4687,6 +4687,38 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> 	return direct_page_fault(vcpu, fault);
> }
>
>+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
>+		     u8 *level)
>+{
>+	int r;
>+
>+	/* Restrict to TDP page fault. */

need to explain why. (just as you do in the changelog)

>+	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)

page fault handlers (i.e., vcpu->arch.mmu->page_fault()) will be called
finally. why not let page fault handlers reject the request to get rid of
this ad-hoc check? We just need to plumb a flag indicating this is a
pre-population request into the handlers. I think this way is clearer.

What do you think?
On Tue, Apr 16, 2024 at 02:46:17PM +0000, "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> wrote:
> On Wed, 2024-04-10 at 15:07 -0700, isaku.yamahata@intel.com wrote:
> > 
> > +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > +		      u8 *level)
> > +{
> > +	int r;
> > +
> > +	/* Restrict to TDP page fault. */
> > +	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> > +		return -EINVAL;
> > +
> > +	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, false, NULL, level);
> 
> Why not prefetch = true? Doesn't it fit? It looks like the behavior will be
> to not set the access bit.

Makes sense. Yes, the difference is to set A/D bit or not.
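To make the agreed change concrete, a minimal sketch of the call site, assuming
the v2 signature of __kvm_mmu_do_page_fault() quoted above, where the fourth
argument is the prefetch flag:

	/*
	 * Pass prefetch = true so pre-population installs the mapping
	 * without setting the accessed/dirty bits, per the discussion
	 * above.  Sketch only; not a posted revision of the patch.
	 */
	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);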
On Wed, Apr 17, 2024 at 03:04:08PM +0800, Chao Gao <chao.gao@intel.com> wrote:
> On Wed, Apr 10, 2024 at 03:07:31PM -0700, isaku.yamahata@intel.com wrote:
> >From: Isaku Yamahata <isaku.yamahata@intel.com>
> >
> >Introduce a helper function to call the KVM fault handler. It allows a new
> >ioctl to invoke the KVM fault handler to populate without seeing RET_PF_*
> >enums or other KVM MMU internal definitions because RET_PF_* are internal
> >to x86 KVM MMU. The implementation is restricted to two-dimensional paging
> >for simplicity. The shadow paging uses GVA for faulting instead of L1 GPA.
> >It makes the API difficult to use.
> >
> >Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> >---
> >v2:
> >- Make the helper function two-dimensional paging specific. (David)
> >- Return error when vcpu is in guest mode. (David)
> >- Rename goal_level to level in kvm_tdp_mmu_map_page(). (Sean)
> >- Update return code conversion. Don't check pfn.
> >  RET_PF_EMULATE => EINVAL, RET_PF_CONTINUE => EIO (Sean)
> >- Add WARN_ON_ONCE on RET_PF_CONTINUE and RET_PF_INVALID. (Sean)
> >- Drop unnecessary EXPORT_SYMBOL_GPL(). (Sean)
> >---
> > arch/x86/kvm/mmu.h     |  3 +++
> > arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++++++++++++++++++
> > 2 files changed, 35 insertions(+)
> >
> >diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> >index e8b620a85627..51ff4f67e115 100644
> >--- a/arch/x86/kvm/mmu.h
> >+++ b/arch/x86/kvm/mmu.h
> >@@ -183,6 +183,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
> > 	__kvm_mmu_refresh_passthrough_bits(vcpu, mmu);
> > }
> >
> >+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> >+		     u8 *level);
> >+
> > /*
> >  * Check if a given access (described through the I/D, W/R and U/S bits of a
> >  * page fault error code pfec) causes a permission fault with the given PTE
> >diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> >index 91dd4c44b7d8..a34f4af44cbd 100644
> >--- a/arch/x86/kvm/mmu/mmu.c
> >+++ b/arch/x86/kvm/mmu/mmu.c
> >@@ -4687,6 +4687,38 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > 	return direct_page_fault(vcpu, fault);
> > }
> >
> >+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> >+		     u8 *level)
> >+{
> >+	int r;
> >+
> >+	/* Restrict to TDP page fault. */
> 
> need to explain why. (just as you do in the changelog)

Sure.

> >+	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> 
> page fault handlers (i.e., vcpu->arch.mmu->page_fault()) will be called
> finally. why not let page fault handlers reject the request to get rid of
> this ad-hoc check? We just need to plumb a flag indicating this is a
> pre-population request into the handlers. I think this way is clearer.
> 
> What do you think?

__kvm_mmu_do_page_fault() doesn't check if the mmu mode is TDP or not. If we
don't want to check the page_fault handler, the alternative check would be
if (!vcpu->arch.mmu->direct). Or we will require the caller to guarantee that
the MMU mode is tdp (direct or tdp_mmu).
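A minimal sketch of the alternative check floated above. The field spelling
follows the reply (vcpu->arch.mmu->direct); in kernels of this era the
equivalent state lives in vcpu->arch.mmu->root_role.direct, so treat the exact
field name as an assumption:

	/*
	 * Alternative to comparing the handler pointer: reject any MMU
	 * that is not using direct (TDP) mapping.  Sketch only.
	 */
	if (!vcpu->arch.mmu->root_role.direct)
		return -EINVAL;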
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e8b620a85627..51ff4f67e115 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -183,6 +183,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
 	__kvm_mmu_refresh_passthrough_bits(vcpu, mmu);
 }
 
+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+		     u8 *level);
+
 /*
  * Check if a given access (described through the I/D, W/R and U/S bits of a
  * page fault error code pfec) causes a permission fault with the given PTE
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 91dd4c44b7d8..a34f4af44cbd 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4687,6 +4687,38 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
+int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+		     u8 *level)
+{
+	int r;
+
+	/* Restrict to TDP page fault. */
+	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
+		return -EINVAL;
+
+	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, false, NULL, level);
+	if (r < 0)
+		return r;
+
+	switch (r) {
+	case RET_PF_RETRY:
+		return -EAGAIN;
+
+	case RET_PF_FIXED:
+	case RET_PF_SPURIOUS:
+		return 0;
+
+	case RET_PF_EMULATE:
+		return -EINVAL;
+
+	case RET_PF_CONTINUE:
+	case RET_PF_INVALID:
+	default:
+		WARN_ON_ONCE(r);
+		return -EIO;
+	}
+}
+
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
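For context, a hypothetical caller sketch showing how the pre-population ioctl
mentioned in the changelog might consume kvm_tdp_map_page(). The retry loop on
-EAGAIN and the use of KVM_HPAGE_SIZE() to advance past the mapped level are
assumptions for illustration, not part of this patch:

	u8 level;
	int r;

	do {
		/* -EAGAIN maps RET_PF_RETRY; simply try the fault again. */
		r = kvm_tdp_map_page(vcpu, gpa, error_code, &level);
	} while (r == -EAGAIN);
	if (r)
		return r;

	/* Advance past the region just mapped at the returned level. */
	gpa = (gpa & ~(KVM_HPAGE_SIZE(level) - 1)) + KVM_HPAGE_SIZE(level);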