Message ID | 20240904030751.117579-17-rick.p.edgecombe@intel.com (mailing list archive)
---|---
State | New, archived
Series | TDX MMU Part 2
On 9/4/24 05:07, Rick Edgecombe wrote:
> +static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
> +					  enum pg_level level, kvm_pfn_t pfn)
> +{
> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> +
> +	/* Returning error here to let TDP MMU bail out early. */
> +	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
> +		tdx_unpin(kvm, pfn);
> +		return -EINVAL;
> +	}

Should this "if" already be part of patch 14, and in
tdx_sept_set_private_spte() rather than tdx_mem_page_record_premap_cnt()?

Thanks,

Paolo
On 9/4/24 05:07, Rick Edgecombe wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Update TDX's hook of set_external_spte() to record pre-mapping cnt instead
> of doing nothing and returning when TD is not finalized.
> 
> TDX uses ioctl KVM_TDX_INIT_MEM_REGION to initialize its initial guest
> memory. This ioctl calls kvm_gmem_populate() to get guest pages and in
> tdx_gmem_post_populate(), it will
> (1) Map page table pages into KVM mirror page table and private EPT.
> (2) Map guest pages into KVM mirror page table. In the propagation hook,
>     just record pre-mapping cnt without mapping the guest page into private
>     EPT.
> (3) Map guest pages into private EPT and decrease pre-mapping cnt.
> 
> Do not map guest pages into private EPT directly in step (2), because TDX
> requires TDH.MEM.PAGE.ADD() to add a guest page before TD is finalized,
> which copies page content from a source page from user to target guest page
> to be added. However, source page is not available via common interface
> kvm_tdp_map_page() in step (2).
> 
> Therefore, just pre-map the guest page into KVM mirror page table and
> record the pre-mapping cnt in TDX's propagation hook. The pre-mapping cnt
> would be decreased in ioctl KVM_TDX_INIT_MEM_REGION when the guest page is
> mapped into private EPT.

Stale commit message; squashing all of it into patch 20 is an easy cop
out...

Paolo

> Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> TDX MMU part 2 v1:
>  - Update the code comment and patch log according to latest gmem update.
>    https://lore.kernel.org/kvm/CABgObfa=a3cKcKJHQRrCs-3Ty8ppSRou=dhi6Q+KdZnom0Zegw@mail.gmail.com/
>  - Rename tdx_mem_page_add() to tdx_mem_page_record_premap_cnt() to avoid
>    confusion.
>  - Change the patch title to "KVM: TDX: Premap initial guest memory".
>  - Rename KVM_MEMORY_MAPPING => KVM_MAP_MEMORY (Sean)
>  - Drop issuing TDH.MEM.PAGE.ADD() on KVM_MAP_MEMORY(), defer it to
>    KVM_TDX_INIT_MEM_REGION. (Sean)
>  - Added nr_premapped to track the number of premapped pages
>  - Drop tdx_post_mmu_map_page().
> 
> v19:
>  - Switched to use KVM_MEMORY_MAPPING
>  - Dropped measurement extension
>  - updated commit message. private_page_add() => set_private_spte()
> ---
>  arch/x86/kvm/vmx/tdx.c | 40 +++++++++++++++++++++++++++++++++-------
>  arch/x86/kvm/vmx/tdx.h |  2 +-
>  2 files changed, 34 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 59b627b45475..435112562954 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -488,6 +488,34 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
>  	return 0;
>  }
>  
> +/*
> + * KVM_TDX_INIT_MEM_REGION calls kvm_gmem_populate() to get guest pages and
> + * tdx_gmem_post_populate() to premap page table pages into private EPT.
> + * Mapping guest pages into private EPT before TD is finalized should use a
> + * seamcall TDH.MEM.PAGE.ADD(), which copies page content from a source page
> + * from user to target guest pages to be added. This source page is not
> + * available via common interface kvm_tdp_map_page(). So, currently,
> + * kvm_tdp_map_page() only premaps guest pages into KVM mirrored root.
> + * A counter nr_premapped is increased here to record status. The counter will
> + * be decreased after TDH.MEM.PAGE.ADD() is called after the kvm_tdp_map_page()
> + * in tdx_gmem_post_populate().
> + */
> +static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
> +					  enum pg_level level, kvm_pfn_t pfn)
> +{
> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> +
> +	/* Returning error here to let TDP MMU bail out early. */
> +	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
> +		tdx_unpin(kvm, pfn);
> +		return -EINVAL;
> +	}
> +
> +	/* nr_premapped will be decreased when tdh_mem_page_add() is called. */
> +	atomic64_inc(&kvm_tdx->nr_premapped);
> +	return 0;
> +}
> +
>  int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
>  			      enum pg_level level, kvm_pfn_t pfn)
>  {
> @@ -510,11 +538,7 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
>  	if (likely(is_td_finalized(kvm_tdx)))
>  		return tdx_mem_page_aug(kvm, gfn, level, pfn);
>  
> -	/*
> -	 * TODO: KVM_MAP_MEMORY support to populate before finalize comes
> -	 * here for the initial memory.
> -	 */
> -	return 0;
> +	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
>  }
>  
>  static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
> @@ -546,10 +570,12 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
>  	if (unlikely(!is_td_finalized(kvm_tdx) &&
>  		     err == (TDX_EPT_WALK_FAILED | TDX_OPERAND_ID_RCX))) {
>  		/*
> -		 * This page was mapped with KVM_MAP_MEMORY, but
> -		 * KVM_TDX_INIT_MEM_REGION is not issued yet.
> +		 * Page is mapped by KVM_TDX_INIT_MEM_REGION, but hasn't called
> +		 * tdh_mem_page_add().
>  		 */
>  		if (!is_last_spte(entry, level) || !(entry & VMX_EPT_RWX_MASK)) {
> +			WARN_ON_ONCE(!atomic64_read(&kvm_tdx->nr_premapped));
> +			atomic64_dec(&kvm_tdx->nr_premapped);
>  			tdx_unpin(kvm, pfn);
>  			return 0;
>  		}
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index 66540c57ed61..25a4aaede2ba 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -26,7 +26,7 @@ struct kvm_tdx {
>  
>  	u64 tsc_offset;
>  
> -	/* For KVM_MAP_MEMORY and KVM_TDX_INIT_MEM_REGION. */
> +	/* For KVM_TDX_INIT_MEM_REGION. */
>  	atomic64_t nr_premapped;
>  
>  	struct kvm_cpuid2 *cpuid;
On Tue, 2024-09-10 at 12:24 +0200, Paolo Bonzini wrote:
> On 9/4/24 05:07, Rick Edgecombe wrote:
> > +static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
> > +					  enum pg_level level, kvm_pfn_t pfn)
> > +{
> > +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > +
> > +	/* Returning error here to let TDP MMU bail out early. */
> > +	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
> > +		tdx_unpin(kvm, pfn);
> > +		return -EINVAL;
> > +	}
> 
> Should this "if" already be part of patch 14, and in
> tdx_sept_set_private_spte() rather than tdx_mem_page_record_premap_cnt()?

Hmm, makes sense to me. Thanks.
On Tue, 2024-09-10 at 12:49 +0200, Paolo Bonzini wrote:
> On 9/4/24 05:07, Rick Edgecombe wrote:
> > From: Isaku Yamahata <isaku.yamahata@intel.com>
> > 
> > Update TDX's hook of set_external_spte() to record pre-mapping cnt instead
> > of doing nothing and returning when TD is not finalized.
> > 
> > TDX uses ioctl KVM_TDX_INIT_MEM_REGION to initialize its initial guest
> > memory. This ioctl calls kvm_gmem_populate() to get guest pages and in
> > tdx_gmem_post_populate(), it will
> > (1) Map page table pages into KVM mirror page table and private EPT.
> > (2) Map guest pages into KVM mirror page table. In the propagation hook,
> >     just record pre-mapping cnt without mapping the guest page into
> >     private EPT.
> > (3) Map guest pages into private EPT and decrease pre-mapping cnt.
> > 
> > Do not map guest pages into private EPT directly in step (2), because TDX
> > requires TDH.MEM.PAGE.ADD() to add a guest page before TD is finalized,
> > which copies page content from a source page from user to target guest page
> > to be added. However, source page is not available via common interface
> > kvm_tdp_map_page() in step (2).
> > 
> > Therefore, just pre-map the guest page into KVM mirror page table and
> > record the pre-mapping cnt in TDX's propagation hook. The pre-mapping cnt
> > would be decreased in ioctl KVM_TDX_INIT_MEM_REGION when the guest page is
> > mapped into private EPT.
> 
> Stale commit message; squashing all of it into patch 20 is an easy cop
> out...

Arh, yes this has details that are not relevant to the patch.

Squashing it seems fine, but I wasn't sure about whether we actually needed
this nr_premapped. It was one of the things we decided to punt a decision on
in order to continue our debates on the list. So we need to pick up the
debate again.
On Wed, Sep 11, 2024 at 2:30 AM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
> Arh, yes this has details that are not relevant to the patch.
>
> Squashing it seems fine, but I wasn't sure about whether we actually needed
> this nr_premapped. It was one of the things we decided to punt a decision
> on in order to continue our debates on the list. So we need to pick up the
> debate again.

I think keeping nr_premapped is safer.

Paolo
On Wed, 2024-09-11 at 12:39 +0200, Paolo Bonzini wrote:
> On Wed, Sep 11, 2024 at 2:30 AM Edgecombe, Rick P
> <rick.p.edgecombe@intel.com> wrote:
> > Arh, yes this has details that are not relevant to the patch.
> >
> > Squashing it seems fine, but I wasn't sure about whether we actually
> > needed this nr_premapped. It was one of the things we decided to punt a
> > decision on in order to continue our debates on the list. So we need to
> > pick up the debate again.
>
> I think keeping nr_premapped is safer.

Heh, well it's not hurting anything except adding a small amount of
complexity, so I guess we can cancel the debate. Thanks.
On 11/09/24 03:19, Edgecombe, Rick P wrote:
> On Tue, 2024-09-10 at 12:24 +0200, Paolo Bonzini wrote:
>> On 9/4/24 05:07, Rick Edgecombe wrote:
>>> +static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
>>> +					  enum pg_level level, kvm_pfn_t pfn)
>>> +{
>>> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
>>> +
>>> +	/* Returning error here to let TDP MMU bail out early. */
>>> +	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
>>> +		tdx_unpin(kvm, pfn);
>>> +		return -EINVAL;
>>> +	}
>>
>> Should this "if" already be part of patch 14, and in
>> tdx_sept_set_private_spte() rather than tdx_mem_page_record_premap_cnt()?
>
> Hmm, makes sense to me. Thanks.

It is already in patch 14, so just remove it from this patch presumably.
On Fri, 2024-09-13 at 16:33 +0300, Adrian Hunter wrote:
> It is already in patch 14, so just remove it from this patch
> presumably.

Right, there is an equivalent check in tdx_sept_set_private_spte(), so we
can just drop this check.
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 59b627b45475..435112562954 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -488,6 +488,34 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
+/*
+ * KVM_TDX_INIT_MEM_REGION calls kvm_gmem_populate() to get guest pages and
+ * tdx_gmem_post_populate() to premap page table pages into private EPT.
+ * Mapping guest pages into private EPT before TD is finalized should use a
+ * seamcall TDH.MEM.PAGE.ADD(), which copies page content from a source page
+ * from user to target guest pages to be added. This source page is not
+ * available via common interface kvm_tdp_map_page(). So, currently,
+ * kvm_tdp_map_page() only premaps guest pages into KVM mirrored root.
+ * A counter nr_premapped is increased here to record status. The counter will
+ * be decreased after TDH.MEM.PAGE.ADD() is called after the kvm_tdp_map_page()
+ * in tdx_gmem_post_populate().
+ */
+static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
+					  enum pg_level level, kvm_pfn_t pfn)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+
+	/* Returning error here to let TDP MMU bail out early. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
+		tdx_unpin(kvm, pfn);
+		return -EINVAL;
+	}
+
+	/* nr_premapped will be decreased when tdh_mem_page_add() is called. */
+	atomic64_inc(&kvm_tdx->nr_premapped);
+	return 0;
+}
+
 int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 			      enum pg_level level, kvm_pfn_t pfn)
 {
@@ -510,11 +538,7 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (likely(is_td_finalized(kvm_tdx)))
 		return tdx_mem_page_aug(kvm, gfn, level, pfn);
 
-	/*
-	 * TODO: KVM_MAP_MEMORY support to populate before finalize comes
-	 * here for the initial memory.
-	 */
-	return 0;
+	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
 }
 
 static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
@@ -546,10 +570,12 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (unlikely(!is_td_finalized(kvm_tdx) &&
 		     err == (TDX_EPT_WALK_FAILED | TDX_OPERAND_ID_RCX))) {
 		/*
-		 * This page was mapped with KVM_MAP_MEMORY, but
-		 * KVM_TDX_INIT_MEM_REGION is not issued yet.
+		 * Page is mapped by KVM_TDX_INIT_MEM_REGION, but hasn't called
+		 * tdh_mem_page_add().
 		 */
 		if (!is_last_spte(entry, level) || !(entry & VMX_EPT_RWX_MASK)) {
+			WARN_ON_ONCE(!atomic64_read(&kvm_tdx->nr_premapped));
+			atomic64_dec(&kvm_tdx->nr_premapped);
 			tdx_unpin(kvm, pfn);
 			return 0;
 		}
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 66540c57ed61..25a4aaede2ba 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -26,7 +26,7 @@ struct kvm_tdx {
 
 	u64 tsc_offset;
 
-	/* For KVM_MAP_MEMORY and KVM_TDX_INIT_MEM_REGION. */
+	/* For KVM_TDX_INIT_MEM_REGION. */
 	atomic64_t nr_premapped;
 
 	struct kvm_cpuid2 *cpuid;