Message ID | 880c1016c29624964baee580985b6a736fc7d656.1667110240.git.isaku.yamahata@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM TDX basic feature support | expand |
On Sat, 2022-10-29 at 23:22 -0700, isaku.yamahata@intel.com wrote: > From: Isaku Yamahata <isaku.yamahata@intel.com> > > kvm_unmap_gfn_range() needs to know the reason of the callback for TDX. > mmu notifier, set memattr ioctl or restrictedmem notifier. Based on the > reason, TDX changes the behavior. For mmu notifier, it's the operation on > shared memory slot to zap shared PTE. For set memattr, private<->shared > conversion, zap the original PTE. For restrictedmem, it's a hint that TDX > can ignore. Could you elaborate why restricted memfd notifier can be ignored? IIUC if userspace punch a hole, the pages within the hole will be de-allocated. So why can such notifier be ignored? Btw, I think this patch should be just merged to the patch which consumes those flags (checked it is next patch). It's easier to review when putting them together. > > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> > --- > include/linux/kvm_host.h | 8 +++++++- > virt/kvm/kvm_main.c | 5 ++++- > 2 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 839d98d56632..b658803ea2c7 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -247,12 +247,18 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); > > > #if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || defined(CONFIG_HAVE_KVM_RESTRICTED_MEM) > +#define KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM BIT(0) > +#define KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR BIT(1) > struct kvm_gfn_range { > struct kvm_memory_slot *slot; > gfn_t start; > gfn_t end; > - pte_t pte; > + union { > + pte_t pte; > + int attr; > + }; attribute is u64 in Chao's series. > bool may_block; > + unsigned int flags; > };
On Wed, Dec 14, 2022 at 10:51:31AM +0000, "Huang, Kai" <kai.huang@intel.com> wrote: > On Sat, 2022-10-29 at 23:22 -0700, isaku.yamahata@intel.com wrote: > > From: Isaku Yamahata <isaku.yamahata@intel.com> > > > > kvm_unmap_gfn_range() needs to know the reason of the callback for TDX. > > mmu notifier, set memattr ioctl or restrictedmem notifier. Based on the > > reason, TDX changes the behavior. For mmu notifier, it's the operation on > > shared memory slot to zap shared PTE. For set memattr, private<->shared > > conversion, zap the original PTE. For restrictedmem, it's a hint that TDX > > can ignore. > > Could you elaborate why restricted memfd notifier can be ignored? IIUC if > userspace punch a hole, the pages within the hole will be de-allocated. So why > can such notifier be ignored? Because set-memory-attribute ioctl is expected to follow the callback from restrictedmem. So set memory attributes can do de-allocation. I wanted to avoid zapping twice. With v9 UPM, the restrictedmem callback was triggered for both allocation and punch-hole. With v10 UPM, the callback is triggered only for punch-hole. With v10 callback semantics, probably this can be cleaned up slightly.
On Thu, 2022-12-15 at 14:10 -0800, Isaku Yamahata wrote: > On Wed, Dec 14, 2022 at 10:51:31AM +0000, > "Huang, Kai" <kai.huang@intel.com> wrote: > > > On Sat, 2022-10-29 at 23:22 -0700, isaku.yamahata@intel.com wrote: > > > From: Isaku Yamahata <isaku.yamahata@intel.com> > > > > > > kvm_unmap_gfn_range() needs to know the reason of the callback for TDX. > > > mmu notifier, set memattr ioctl or restrictedmem notifier. Based on the > > > reason, TDX changes the behavior. For mmu notifier, it's the operation on > > > shared memory slot to zap shared PTE. For set memattr, private<->shared > > > conversion, zap the original PTE. For restrictedmem, it's a hint that TDX > > > can ignore. > > > > Could you elaborate why restricted memfd notifier can be ignored? IIUC if > > userspace punch a hole, the pages within the hole will be de-allocated. So why > > can such notifier be ignored? > > Because set-memory-attribute ioctl is expected to follow the callback from > restrictedmem. So set memory attributes can do de-allocation. I wanted to avoid > zapping twice. Even this is true, the punch hole can be done alone w/o being followed by set_memory_attribute(), correct? Your explanation doesn't seem to be reasonable? At least, you need to explain the semantics of how to use "punch hole" and set_memory_attributes() clearly in the changelog. Otherwise it's hard for people to review.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 839d98d56632..b658803ea2c7 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -247,12 +247,18 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || defined(CONFIG_HAVE_KVM_RESTRICTED_MEM) +#define KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM BIT(0) +#define KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR BIT(1) struct kvm_gfn_range { struct kvm_memory_slot *slot; gfn_t start; gfn_t end; - pte_t pte; + union { + pte_t pte; + int attr; + }; bool may_block; + unsigned int flags; }; bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3b05a3396f89..dda2f2ec4faa 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -676,6 +676,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.start = hva_to_gfn_memslot(hva_start, slot); gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); gfn_range.slot = slot; + gfn_range.flags = 0; if (!locked) { locked = true; @@ -947,8 +948,9 @@ static void kvm_unmap_mem_range(struct kvm *kvm, gfn_t start, gfn_t end, int i; int r = 0; - gfn_range.pte = __pte(0); + gfn_range.attr = attr; gfn_range.may_block = true; + gfn_range.flags = KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR; for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); @@ -1074,6 +1076,7 @@ static void kvm_restrictedmem_invalidate_begin(struct restrictedmem_notifier *no gfn_range.slot = slot; gfn_range.pte = __pte(0); gfn_range.may_block = true; + gfn_range.flags = KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM; if (kvm_unmap_gfn_range(kvm, &gfn_range)) kvm_flush_remote_tlbs(kvm);