Message ID | 20240904030751.117579-21-rick.p.edgecombe@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | TDX MMU Part 2 |
On 4/09/24 06:07, Rick Edgecombe wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Add a new VM-scoped KVM_MEMORY_ENCRYPT_OP IOCTL subcommand,
> KVM_TDX_FINALIZE_VM, to perform TD Measurement Finalization.
>
> Documentation for the API is added in another patch:
> "Documentation/virt/kvm: Document on Trust Domain Extensions(TDX)"
>
> For the purpose of attestation, a measurement must be made of the TDX VM
> initial state. This is referred to as TD Measurement Finalization, and
> uses SEAMCALL TDH.MR.FINALIZE, after which:
> 1. The VMM adding TD private pages with arbitrary content is no longer
>    allowed
> 2. The TDX VM is runnable
>
> Co-developed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> TDX MMU part 2 v1:
>  - Added premapped check.
>  - Update for the wrapper functions for SEAMCALLs. (Sean)
>  - Add check if nr_premapped is zero. If not, return error.
>  - Use KVM_BUG_ON() in tdx_td_finalizer() for consistency.
>  - Change tdx_td_finalizemr() to take struct kvm_tdx_cmd *cmd and return error
>    (Adrian)
>  - Handle TDX_OPERAND_BUSY case (Adrian)
>  - Updates from seamcall overhaul (Kai)
>  - Rename error->hw_error
>
> v18:
>  - Remove the change of tools/arch/x86/include/uapi/asm/kvm.h.
>
> v15:
>  - removed unconditional tdx_track() by tdx_flush_tlb_current() that
>    does tdx_track().
> ---
>  arch/x86/include/uapi/asm/kvm.h |  1 +
>  arch/x86/kvm/vmx/tdx.c          | 28 ++++++++++++++++++++++++++++
>  2 files changed, 29 insertions(+)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 789d1d821b4f..0b4827e39458 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -932,6 +932,7 @@ enum kvm_tdx_cmd_id {
>  	KVM_TDX_INIT_VM,
>  	KVM_TDX_INIT_VCPU,
>  	KVM_TDX_INIT_MEM_REGION,
> +	KVM_TDX_FINALIZE_VM,
>  	KVM_TDX_GET_CPUID,
>
>  	KVM_TDX_CMD_NR_MAX,
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 796d1a495a66..3083a66bb895 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1257,6 +1257,31 @@ void tdx_flush_tlb_current(struct kvm_vcpu *vcpu)
>  	ept_sync_global();
>  }
>
> +static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
> +{
> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> +
> +	if (!is_hkid_assigned(kvm_tdx) || is_td_finalized(kvm_tdx))
> +		return -EINVAL;
> +	/*
> +	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
> +	 * TDH.MEM.PAGE.ADD().
> +	 */
> +	if (atomic64_read(&kvm_tdx->nr_premapped))
> +		return -EINVAL;
> +
> +	cmd->hw_error = tdh_mr_finalize(kvm_tdx);
> +	if ((cmd->hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
> +		return -EAGAIN;
> +	if (KVM_BUG_ON(cmd->hw_error, kvm)) {
> +		pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error);
> +		return -EIO;
> +	}
> +
> +	kvm_tdx->finalized = true;
> +	return 0;
> +}

Isaku was going to lock the mmu. Seems like the change got lost.
To protect against racing with KVM_PRE_FAULT_MEMORY,
KVM_TDX_INIT_MEM_REGION, tdx_sept_set_private_spte() etc
e.g. Rename tdx_td_finalizemr to __tdx_td_finalizemr and add:

static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
	int ret;

	write_lock(&kvm->mmu_lock);
	ret = __tdx_td_finalizemr(kvm, cmd);
	write_unlock(&kvm->mmu_lock);

	return ret;
}

> +
>  int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
>  {
>  	struct kvm_tdx_cmd tdx_cmd;
> @@ -1281,6 +1306,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
>  	case KVM_TDX_INIT_VM:
>  		r = tdx_td_init(kvm, &tdx_cmd);
>  		break;
> +	case KVM_TDX_FINALIZE_VM:
> +		r = tdx_td_finalizemr(kvm, &tdx_cmd);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		goto out;
On Wed, 2024-09-04 at 18:37 +0300, Adrian Hunter wrote:
>
> Isaku was going to lock the mmu. Seems like the change got lost.
> To protect against racing with KVM_PRE_FAULT_MEMORY,
> KVM_TDX_INIT_MEM_REGION, tdx_sept_set_private_spte() etc
> e.g. Rename tdx_td_finalizemr to __tdx_td_finalizemr and add:
>
> static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
> {
> 	int ret;
>
> 	write_lock(&kvm->mmu_lock);
> 	ret = __tdx_td_finalizemr(kvm, cmd);
> 	write_unlock(&kvm->mmu_lock);
>
> 	return ret;
> }

Makes sense. Thanks.
On 9/4/24 05:07, Rick Edgecombe wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Add a new VM-scoped KVM_MEMORY_ENCRYPT_OP IOCTL subcommand,
> KVM_TDX_FINALIZE_VM, to perform TD Measurement Finalization.
>
> Documentation for the API is added in another patch:
> "Documentation/virt/kvm: Document on Trust Domain Extensions(TDX)"
>
> For the purpose of attestation, a measurement must be made of the TDX VM
> initial state. This is referred to as TD Measurement Finalization, and
> uses SEAMCALL TDH.MR.FINALIZE, after which:
> 1. The VMM adding TD private pages with arbitrary content is no longer
>    allowed
> 2. The TDX VM is runnable
>
> Co-developed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> TDX MMU part 2 v1:
>  - Added premapped check.
>  - Update for the wrapper functions for SEAMCALLs. (Sean)
>  - Add check if nr_premapped is zero. If not, return error.
>  - Use KVM_BUG_ON() in tdx_td_finalizer() for consistency.
>  - Change tdx_td_finalizemr() to take struct kvm_tdx_cmd *cmd and return error
>    (Adrian)
>  - Handle TDX_OPERAND_BUSY case (Adrian)
>  - Updates from seamcall overhaul (Kai)
>  - Rename error->hw_error
>
> v18:
>  - Remove the change of tools/arch/x86/include/uapi/asm/kvm.h.
>
> v15:
>  - removed unconditional tdx_track() by tdx_flush_tlb_current() that
>    does tdx_track().
> ---
>  arch/x86/include/uapi/asm/kvm.h |  1 +
>  arch/x86/kvm/vmx/tdx.c          | 28 ++++++++++++++++++++++++++++
>  2 files changed, 29 insertions(+)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 789d1d821b4f..0b4827e39458 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -932,6 +932,7 @@ enum kvm_tdx_cmd_id {
>  	KVM_TDX_INIT_VM,
>  	KVM_TDX_INIT_VCPU,
>  	KVM_TDX_INIT_MEM_REGION,
> +	KVM_TDX_FINALIZE_VM,
>  	KVM_TDX_GET_CPUID,
>
>  	KVM_TDX_CMD_NR_MAX,
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 796d1a495a66..3083a66bb895 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1257,6 +1257,31 @@ void tdx_flush_tlb_current(struct kvm_vcpu *vcpu)
>  	ept_sync_global();
>  }
>
> +static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
> +{
> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> +
> +	if (!is_hkid_assigned(kvm_tdx) || is_td_finalized(kvm_tdx))
> +		return -EINVAL;
> +	/*
> +	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
> +	 * TDH.MEM.PAGE.ADD().
> +	 */
> +	if (atomic64_read(&kvm_tdx->nr_premapped))
> +		return -EINVAL;

I suggest moving all of patch 16, plus the

+	WARN_ON_ONCE(!atomic64_read(&kvm_tdx->nr_premapped));
+	atomic64_dec(&kvm_tdx->nr_premapped);

lines of patch 19, into this patch.

> +	cmd->hw_error = tdh_mr_finalize(kvm_tdx);
> +	if ((cmd->hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
> +		return -EAGAIN;
> +	if (KVM_BUG_ON(cmd->hw_error, kvm)) {
> +		pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error);
> +		return -EIO;
> +	}
> +
> +	kvm_tdx->finalized = true;
> +	return 0;

This should also set pre_fault_allowed to true.

Paolo
On 9/4/24 17:37, Adrian Hunter wrote:
> Isaku was going to lock the mmu. Seems like the change got lost.
> To protect against racing with KVM_PRE_FAULT_MEMORY,
> KVM_TDX_INIT_MEM_REGION, tdx_sept_set_private_spte() etc
> e.g. Rename tdx_td_finalizemr to __tdx_td_finalizemr and add:
>
> static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
> {
> 	int ret;
>
> 	write_lock(&kvm->mmu_lock);
> 	ret = __tdx_td_finalizemr(kvm, cmd);
> 	write_unlock(&kvm->mmu_lock);
>
> 	return ret;
> }

kvm->slots_lock is better. In tdx_vcpu_init_mem_region() you can
take it before the is_td_finalized() so that there is a lock that
is clearly protecting kvm_tdx->finalized between the two. (I also
suggest switching to guard() in tdx_vcpu_init_mem_region()).

Also, I think that in patch 16 (whether merged or not) nr_premapped
should not be incremented, once kvm_tdx->finalized has been set?

Paolo
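For illustration only, the locking shape suggested above can be modelled in userspace: both the finalize path and the init-mem-region path take the same lock before they look at finalized. This is a sketch, not kernel code. pthread_mutex_t stands in for kvm->slots_lock, GUARD_MUTEX() is a hypothetical stand-in for the kernel's guard(mutex)() built on the GCC/Clang cleanup attribute, and struct td_model and the model_*() functions are made-up names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Called automatically when the guard variable goes out of scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical userspace analogue of guard(mutex)(): lock now, unlock at scope exit. */
#define GUARD_MUTEX(m)							\
	pthread_mutex_t *guard_						\
		__attribute__((unused, cleanup(unlock_cleanup))) =	\
		(pthread_mutex_lock(m), (m))

/* Made-up model of the per-VM state; slots_lock protects finalized. */
struct td_model {
	pthread_mutex_t slots_lock;
	bool finalized;
	int pages_added;
};

static void td_model_init(struct td_model *td)
{
	int rc = pthread_mutex_init(&td->slots_lock, NULL);

	assert(rc == 0);
	(void)rc;
	td->finalized = false;
	td->pages_added = 0;
}

/* Models the init-mem-region path: lock first, then check finalized. */
static int model_init_mem_region(struct td_model *td)
{
	GUARD_MUTEX(&td->slots_lock);

	if (td->finalized)
		return -EINVAL;	/* lock is dropped on this early return too */
	td->pages_added++;
	return 0;
}

/* Models the finalize path under the same lock. */
static int model_finalize(struct td_model *td)
{
	GUARD_MUTEX(&td->slots_lock);

	if (td->finalized)
		return -EINVAL;
	td->finalized = true;
	return 0;
}
```

Because every exit path releases the lock through the cleanup attribute, the early -EINVAL returns need no explicit unlock, which is the main attraction of guard() in a function with several error exits.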
On 10/09/24 13:33, Paolo Bonzini wrote:
> On 9/4/24 17:37, Adrian Hunter wrote:
>> Isaku was going to lock the mmu. Seems like the change got lost.
>> To protect against racing with KVM_PRE_FAULT_MEMORY,
>> KVM_TDX_INIT_MEM_REGION, tdx_sept_set_private_spte() etc
>> e.g. Rename tdx_td_finalizemr to __tdx_td_finalizemr and add:
>>
>> static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
>> {
>> 	int ret;
>>
>> 	write_lock(&kvm->mmu_lock);
>> 	ret = __tdx_td_finalizemr(kvm, cmd);
>> 	write_unlock(&kvm->mmu_lock);
>>
>> 	return ret;
>> }
>
> kvm->slots_lock is better. In tdx_vcpu_init_mem_region() you can take it before the is_td_finalized() so that there is a lock that is clearly protecting kvm_tdx->finalized between the two. (I also suggest switching to guard() in tdx_vcpu_init_mem_region()).

Doesn't KVM_PRE_FAULT_MEMORY also need to be protected?

>
> Also, I think that in patch 16 (whether merged or not) nr_premapped should not be incremented, once kvm_tdx->finalized has been set?

tdx_sept_set_private_spte() checks is_td_finalized() to decide
whether to call tdx_mem_page_aug() or tdx_mem_page_record_premap_cnt()
Refer patch 14 "KVM: TDX: Implement hooks to propagate changes
of TDP MMU mirror page table" for the addition of
tdx_sept_set_private_spte()
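A small userspace model may make the nr_premapped lifecycle described in this thread easier to follow: the count only goes up before finalization, each page add brings it back down, and finalization is refused while it is non-zero. Everything here is a sketch under assumptions. struct td_state, enum map_action and the model_*() names are hypothetical, the is_hkid_assigned() check is left out, and assert() merely stands in for the WARN_ON_ONCE() quoted from patch 19.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two fields discussed in the thread. */
struct td_state {
	bool finalized;
	long nr_premapped;
};

enum map_action { MAP_RECORD_PREMAP, MAP_AUG };

/* Models tdx_sept_set_private_spte(): the path depends on finalized. */
static enum map_action model_set_private_spte(struct td_state *td)
{
	if (td->finalized)
		return MAP_AUG;		/* stands in for tdx_mem_page_aug() */

	td->nr_premapped++;		/* stands in for tdx_mem_page_record_premap_cnt() */
	return MAP_RECORD_PREMAP;
}

/* Models the KVM_TDX_INIT_MEM_REGION side consuming one premapped page. */
static void model_page_add(struct td_state *td)
{
	assert(td->nr_premapped > 0);	/* WARN_ON_ONCE() in the quoted lines */
	td->nr_premapped--;
}

/* Models the checks at the top of tdx_td_finalizemr(). */
static int model_finalize(struct td_state *td)
{
	if (td->finalized || td->nr_premapped)
		return -EINVAL;		/* already finalized, or pages still pending */

	td->finalized = true;
	return 0;
}
```

In this model the count cannot grow once finalized is set, because the only increment sits on the not-finalized branch. Whether that also holds under concurrency depends on the locking being discussed, which the model deliberately ignores.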
On 9/10/24 13:15, Adrian Hunter wrote:
>> kvm->slots_lock is better. In tdx_vcpu_init_mem_region() you can
>> take it before the is_td_finalized() so that there is a lock that
>> is clearly protecting kvm_tdx->finalized between the two. (I also
>> suggest switching to guard() in tdx_vcpu_init_mem_region()).
>
> Doesn't KVM_PRE_FAULT_MEMORY also need to be protected?

KVM_PRE_FAULT_MEMORY is forbidden until kvm->arch.pre_fault_allowed is set.

Paolo
On 10/09/24 14:15, Adrian Hunter wrote:
> On 10/09/24 13:33, Paolo Bonzini wrote:
>> On 9/4/24 17:37, Adrian Hunter wrote:
>>> Isaku was going to lock the mmu. Seems like the change got lost.
>>> To protect against racing with KVM_PRE_FAULT_MEMORY,
>>> KVM_TDX_INIT_MEM_REGION, tdx_sept_set_private_spte() etc
>>> e.g. Rename tdx_td_finalizemr to __tdx_td_finalizemr and add:
>>>
>>> static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
>>> {
>>> 	int ret;
>>>
>>> 	write_lock(&kvm->mmu_lock);
>>> 	ret = __tdx_td_finalizemr(kvm, cmd);
>>> 	write_unlock(&kvm->mmu_lock);
>>>
>>> 	return ret;
>>> }
>>
>> kvm->slots_lock is better. In tdx_vcpu_init_mem_region() you can take it before the is_td_finalized() so that there is a lock that is clearly protecting kvm_tdx->finalized between the two. (I also suggest switching to guard() in tdx_vcpu_init_mem_region()).
>
> Doesn't KVM_PRE_FAULT_MEMORY also need to be protected?

Ah, but not if pre_fault_allowed is false.

>
>>
>> Also, I think that in patch 16 (whether merged or not) nr_premapped should not be incremented, once kvm_tdx->finalized has been set?
>
> tdx_sept_set_private_spte() checks is_td_finalized() to decide
> whether to call tdx_mem_page_aug() or tdx_mem_page_record_premap_cnt()
> Refer patch 14 "KVM: TDX: Implement hooks to propagate changes
> of TDP MMU mirror page table" for the addition of
> tdx_sept_set_private_spte()
>
>
On 10/09/24 13:25, Paolo Bonzini wrote:
> On 9/4/24 05:07, Rick Edgecombe wrote:
>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>
>> Add a new VM-scoped KVM_MEMORY_ENCRYPT_OP IOCTL subcommand,
>> KVM_TDX_FINALIZE_VM, to perform TD Measurement Finalization.
>>
>> Documentation for the API is added in another patch:
>> "Documentation/virt/kvm: Document on Trust Domain Extensions(TDX)"
>>
>> For the purpose of attestation, a measurement must be made of the TDX VM
>> initial state. This is referred to as TD Measurement Finalization, and
>> uses SEAMCALL TDH.MR.FINALIZE, after which:
>> 1. The VMM adding TD private pages with arbitrary content is no longer
>>    allowed
>> 2. The TDX VM is runnable
>>
>> Co-developed-by: Adrian Hunter <adrian.hunter@intel.com>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>> ---
>> TDX MMU part 2 v1:
>>  - Added premapped check.
>>  - Update for the wrapper functions for SEAMCALLs. (Sean)
>>  - Add check if nr_premapped is zero. If not, return error.
>>  - Use KVM_BUG_ON() in tdx_td_finalizer() for consistency.
>>  - Change tdx_td_finalizemr() to take struct kvm_tdx_cmd *cmd and return error
>>    (Adrian)
>>  - Handle TDX_OPERAND_BUSY case (Adrian)
>>  - Updates from seamcall overhaul (Kai)
>>  - Rename error->hw_error
>>
>> v18:
>>  - Remove the change of tools/arch/x86/include/uapi/asm/kvm.h.
>>
>> v15:
>>  - removed unconditional tdx_track() by tdx_flush_tlb_current() that
>>    does tdx_track().
>> ---
>>  arch/x86/include/uapi/asm/kvm.h |  1 +
>>  arch/x86/kvm/vmx/tdx.c          | 28 ++++++++++++++++++++++++++++
>>  2 files changed, 29 insertions(+)
>>
>> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>> index 789d1d821b4f..0b4827e39458 100644
>> --- a/arch/x86/include/uapi/asm/kvm.h
>> +++ b/arch/x86/include/uapi/asm/kvm.h
>> @@ -932,6 +932,7 @@ enum kvm_tdx_cmd_id {
>>  	KVM_TDX_INIT_VM,
>>  	KVM_TDX_INIT_VCPU,
>>  	KVM_TDX_INIT_MEM_REGION,
>> +	KVM_TDX_FINALIZE_VM,
>>  	KVM_TDX_GET_CPUID,
>>  	KVM_TDX_CMD_NR_MAX,
>> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
>> index 796d1a495a66..3083a66bb895 100644
>> --- a/arch/x86/kvm/vmx/tdx.c
>> +++ b/arch/x86/kvm/vmx/tdx.c
>> @@ -1257,6 +1257,31 @@ void tdx_flush_tlb_current(struct kvm_vcpu *vcpu)
>>  	ept_sync_global();
>>  }
>> +static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
>> +{
>> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
>> +
>> +	if (!is_hkid_assigned(kvm_tdx) || is_td_finalized(kvm_tdx))
>> +		return -EINVAL;
>> +	/*
>> +	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
>> +	 * TDH.MEM.PAGE.ADD().
>> +	 */
>> +	if (atomic64_read(&kvm_tdx->nr_premapped))
>> +		return -EINVAL;
>
> I suggest moving all of patch 16, plus the
>
> +	WARN_ON_ONCE(!atomic64_read(&kvm_tdx->nr_premapped));
> +	atomic64_dec(&kvm_tdx->nr_premapped);
>
> lines of patch 19, into this patch.
>
>> +	cmd->hw_error = tdh_mr_finalize(kvm_tdx);
>> +	if ((cmd->hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
>> +		return -EAGAIN;
>> +	if (KVM_BUG_ON(cmd->hw_error, kvm)) {
>> +		pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error);
>> +		return -EIO;
>> +	}
>> +
>> +	kvm_tdx->finalized = true;
>> +	return 0;
>
> This should also set pre_fault_allowed to true.

Ideally, need to ensure it is not possible for another CPU
to see kvm_tdx->finalized==false and pre_fault_allowed==true

Perhaps also, to document the dependency, return an error if
pre_fault_allowed is true in tdx_mem_page_record_premap_cnt().
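One way to obtain that visibility guarantee without a shared lock is release/acquire ordering, sketched here in portable C11 rather than with kernel primitives. This is an illustration of the ordering concern only and not something the patch does: struct td_flags and both function names are hypothetical, and kernel code would presumably reach for smp_store_release()/smp_load_acquire() or rely on a common lock instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical pair of flags mirroring finalized and pre_fault_allowed. */
struct td_flags {
	atomic_bool finalized;
	atomic_bool pre_fault_allowed;
};

/*
 * Writer side: set finalized first, then publish pre_fault_allowed with
 * release semantics so the earlier store cannot be observed after it.
 */
static void publish_finalized(struct td_flags *f)
{
	atomic_store_explicit(&f->finalized, true, memory_order_relaxed);
	atomic_store_explicit(&f->pre_fault_allowed, true, memory_order_release);
}

/*
 * Reader side: an acquire load that sees pre_fault_allowed == true also
 * guarantees that finalized == true is visible afterwards.
 */
static bool pre_fault_may_proceed(struct td_flags *f)
{
	if (!atomic_load_explicit(&f->pre_fault_allowed, memory_order_acquire))
		return false;

	/* The forbidden combination: pre-fault allowed but not finalized. */
	assert(atomic_load_explicit(&f->finalized, memory_order_relaxed));
	return true;
}
```

The second suggestion above, returning an error from tdx_mem_page_record_premap_cnt() when pre_fault_allowed is already true, would play the same role as the assert() here: it documents the dependency at the point where it matters.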
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 789d1d821b4f..0b4827e39458 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -932,6 +932,7 @@ enum kvm_tdx_cmd_id {
 	KVM_TDX_INIT_VM,
 	KVM_TDX_INIT_VCPU,
 	KVM_TDX_INIT_MEM_REGION,
+	KVM_TDX_FINALIZE_VM,
 	KVM_TDX_GET_CPUID,
 
 	KVM_TDX_CMD_NR_MAX,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 796d1a495a66..3083a66bb895 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1257,6 +1257,31 @@ void tdx_flush_tlb_current(struct kvm_vcpu *vcpu)
 	ept_sync_global();
 }
 
+static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+
+	if (!is_hkid_assigned(kvm_tdx) || is_td_finalized(kvm_tdx))
+		return -EINVAL;
+	/*
+	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
+	 * TDH.MEM.PAGE.ADD().
+	 */
+	if (atomic64_read(&kvm_tdx->nr_premapped))
+		return -EINVAL;
+
+	cmd->hw_error = tdh_mr_finalize(kvm_tdx);
+	if ((cmd->hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
+		return -EAGAIN;
+	if (KVM_BUG_ON(cmd->hw_error, kvm)) {
+		pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error);
+		return -EIO;
+	}
+
+	kvm_tdx->finalized = true;
+	return 0;
+}
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_tdx_cmd tdx_cmd;
@@ -1281,6 +1306,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_TDX_INIT_VM:
 		r = tdx_td_init(kvm, &tdx_cmd);
 		break;
+	case KVM_TDX_FINALIZE_VM:
+		r = tdx_td_finalizemr(kvm, &tdx_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
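The error handling in tdx_td_finalizemr() can be read as a three-way mapping from the raw SEAMCALL status to an errno, and the -EAGAIN case suggests the caller may simply try again. The sketch below models both in userspace. The two constant values are assumptions about what the kernel's TDX headers define, struct fake_td and fake_finalize() are hypothetical stand-ins for the ioctl, and the bounded retry loop is one possible caller policy, not something this patch prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed values: the status class lives in the upper 32 bits. */
#define TDX_SEAMCALL_STATUS_MASK 0xFFFFFFFF00000000ULL
#define TDX_OPERAND_BUSY         0x8000020000000000ULL

/* Mirrors the mapping in the patch: busy -> -EAGAIN, other error -> -EIO. */
static int finalize_status_to_errno(uint64_t hw_error)
{
	if ((hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
		return -EAGAIN;	/* transient contention */
	if (hw_error)
		return -EIO;	/* the real code also marks the VM as bugged */
	return 0;
}

/* Hypothetical stand-in for the ioctl: reports busy busy_left times. */
struct fake_td {
	int busy_left;
	int calls;
};

static int fake_finalize(struct fake_td *td)
{
	td->calls++;
	if (td->busy_left > 0) {
		td->busy_left--;
		/* Low bits carry operand details and are masked off. */
		return finalize_status_to_errno(TDX_OPERAND_BUSY | 0x1);
	}
	return finalize_status_to_errno(0);
}

/* One possible caller policy: retry on -EAGAIN, at most max_tries times. */
static int finalize_with_retry(struct fake_td *td, int max_tries)
{
	int ret = -EAGAIN;

	assert(max_tries > 0);
	for (int i = 0; i < max_tries && ret == -EAGAIN; i++)
		ret = fake_finalize(td);
	return ret;
}
```

Masking before the comparison matters because a busy status may carry operand information in its low bits, so a plain equality test against TDX_OPERAND_BUSY would miss it.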