| Message ID | 1584880579-12178-17-git-send-email-yi.l.liu@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | intel_iommu: expose Shared Virtual Addressing to VMs |
On Sun, Mar 22, 2020 at 05:36:13AM -0700, Liu Yi L wrote:
> This patch replays guest pasid bindings after context cache
> invalidation. This is a behavior to ensure safety. Actually,
> programmer should issue pasid cache invalidation with proper
> granularity after issuing a context cache invalidation.
>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Yi Sun <yi.y.sun@linux.intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> ---
>  hw/i386/intel_iommu.c          | 68 ++++++++++++++++++++++++++++++++++++++++++
>  hw/i386/intel_iommu_internal.h |  6 +++-
>  hw/i386/trace-events           |  1 +
>  3 files changed, 74 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 8ec638f..1e0ccde 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -68,6 +68,10 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s);
>  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
>
>  static void vtd_pasid_cache_reset(IntelIOMMUState *s);
> +static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s,
> +                                            uint16_t *did, bool is_dsi);
> +static void vtd_pasid_cache_devsi(IntelIOMMUState *s,
> +                                  VTDBus *vtd_bus, uint16_t devfn);
>
>  static void vtd_panic_require_caching_mode(void)
>  {
> @@ -1865,6 +1869,8 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
>       * VT-d emulation codes.
>       */
>      vtd_iommu_replay_all(s);
> +
> +    vtd_replay_guest_pasid_bindings(s, NULL, false);

I think the only uncertain thing is whether you still want to rework the
vtd_replay_guest_pasid_bindings() interface. It'll depend on the future
discussion of previous patches. Besides that this patch looks good to me.
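For readers jumping into the series at this patch, below is a rough sketch of what the replay entry point called above might do. Only vtd_dev_to_context_entry(), vtd_sm_pasid_table_walk(), VTD_CE_GET_PASID_DIR_TABLE(), VTD_MAX_HPASID and the walk-info fields are taken from elsewhere in this thread; the per-device list, its field names and the walk flag are assumptions, and the real implementation lives in the earlier patches of the series.

    /* Sketch only; names marked "assumed" are not from the posted series. */
    static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s,
                                                uint16_t *did, bool is_dsi)
    {
        VTDHostIOMMUContext *vtd_dev_icx;
        vtd_pasid_table_walk_info info;
        VTDContextEntry ce;

        /* assumed: a list of devices that have a host IOMMU context bound */
        QLIST_FOREACH(vtd_dev_icx, &s->vtd_dev_icx_list, next) {
            if (vtd_dev_to_context_entry(s,
                                         pci_bus_num(vtd_dev_icx->vtd_bus->bus),
                                         vtd_dev_icx->devfn, &ce)) {
                continue;   /* context entry not present, nothing to replay */
            }
            info.flags = is_dsi ? VTD_PASID_TABLE_DID_SEL_WALK : 0; /* assumed flag */
            info.did = did ? *did : 0;
            info.vtd_bus = vtd_dev_icx->vtd_bus;
            info.devfn = vtd_dev_icx->devfn;
            /* re-walk the guest PASID directory/table and re-issue the binds */
            vtd_sm_pasid_table_walk(s, VTD_CE_GET_PASID_DIR_TABLE(&ce),
                                    0, VTD_MAX_HPASID, &info);
        }
    }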
> From: Peter Xu <peterx@redhat.com>
> Sent: Wednesday, March 25, 2020 2:08 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [PATCH v1 16/22] intel_iommu: replay pasid binds after context cache invalidation
>
> On Sun, Mar 22, 2020 at 05:36:13AM -0700, Liu Yi L wrote:
> > This patch replays guest pasid bindings after context cache
> > invalidation. This is a behavior to ensure safety. Actually,
> > programmer should issue pasid cache invalidation with proper
> > granularity after issuing a context cache invalidation.
> >
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Yi Sun <yi.y.sun@linux.intel.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Richard Henderson <rth@twiddle.net>
> > Cc: Eduardo Habkost <ehabkost@redhat.com>
> > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > ---
> >  hw/i386/intel_iommu.c          | 68 ++++++++++++++++++++++++++++++++++++++++++
> >  hw/i386/intel_iommu_internal.h |  6 +++-
> >  hw/i386/trace-events           |  1 +
> >  3 files changed, 74 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 8ec638f..1e0ccde 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -68,6 +68,10 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s);
> >  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
> >
> >  static void vtd_pasid_cache_reset(IntelIOMMUState *s);
> > +static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s,
> > +                                            uint16_t *did, bool is_dsi);
> > +static void vtd_pasid_cache_devsi(IntelIOMMUState *s,
> > +                                  VTDBus *vtd_bus, uint16_t devfn);
> >
> >  static void vtd_panic_require_caching_mode(void)
> >  {
> > @@ -1865,6 +1869,8 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
> >       * VT-d emulation codes.
> >       */
> >      vtd_iommu_replay_all(s);
> > +
> > +    vtd_replay_guest_pasid_bindings(s, NULL, false);
>
> I think the only uncertain thing is whether you still want to rework the
> vtd_replay_guest_pasid_bindings() interface. It'll depend on the future
> discussion of previous patches. Besides that this patch looks good to me.

Thanks. As I replied in the other thread, I'll try to make the PSI handling
and the replay code aligned; briefly, it will include two steps, like PSI.
The existing replay function actually does everything in one pass and doesn't
miss anything, but splitting it would make the code more readable for future
maintenance.

Regards,
Yi Liu

> From: Peter Xu <peterx@redhat.com>
> Sent: Wednesday, March 25, 2020 1:47 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [PATCH v1 14/22] intel_iommu: bind/unbind guest page table to host
>
> On Sun, Mar 22, 2020 at 05:36:11AM -0700, Liu Yi L wrote:
> > This patch captures the guest PASID table entry modifications and
> > propagates the changes to host to setup dual stage DMA translation.
> > The guest page table is configured as 1st level page table (GVA->GPA)
> > whose translation result would further go through host VT-d 2nd level
> > page table (GPA->HPA) under nested translation mode. This is the key
> > part of vSVA support, and also a key to support IOVA over 1st-level
> > page table for Intel VT-d in virtualization environment.
> >
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Yi Sun <yi.y.sun@linux.intel.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Richard Henderson <rth@twiddle.net>
> > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > ---
> >  hw/i386/intel_iommu.c          | 98 +++++++++++++++++++++++++++++++++++++++---
> >  hw/i386/intel_iommu_internal.h | 25 +++++++++++
> >  2 files changed, 118 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index c985cae..0423c83 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -41,6 +41,7 @@
> >  #include "migration/vmstate.h"
> >  #include "trace.h"
> >  #include "qemu/jhash.h"
> > +#include <linux/iommu.h>
> >
> >  /* context entry operations */
> >  #define VTD_CE_GET_RID2PASID(ce) \
> > @@ -695,6 +696,16 @@ static inline uint16_t vtd_pe_get_domain_id(VTDPASIDEntry *pe)
> >      return VTD_SM_PASID_ENTRY_DID((pe)->val[1]);
> >  }
> >
> > +static inline uint32_t vtd_pe_get_fl_aw(VTDPASIDEntry *pe)
> > +{
> > +    return 48 + ((pe->val[2] >> 2) & VTD_SM_PASID_ENTRY_FLPM) * 9;
> > +}
> > +
> > +static inline dma_addr_t vtd_pe_get_flpt_base(VTDPASIDEntry *pe)
> > +{
> > +    return pe->val[2] & VTD_SM_PASID_ENTRY_FLPTPTR;
> > +}
> > +
> >  static inline bool vtd_pdire_present(VTDPASIDDirEntry *pdire)
> >  {
> >      return pdire->val & 1;
> > @@ -1856,6 +1867,81 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
> >      vtd_iommu_replay_all(s);
> >  }
> >
> > +/**
> > + * Caller should hold iommu_lock.
> > + */
> > +static int vtd_bind_guest_pasid(IntelIOMMUState *s, VTDBus *vtd_bus,
> > +                                int devfn, int pasid, VTDPASIDEntry *pe,
> > +                                VTDPASIDOp op)
> > +{
> > +    VTDHostIOMMUContext *vtd_dev_icx;
> > +    HostIOMMUContext *host_icx;
> > +    DualIOMMUStage1BindData *bind_data;
> > +    struct iommu_gpasid_bind_data *g_bind_data;
> > +    int ret = -1;
> > +
> > +    vtd_dev_icx = vtd_bus->dev_icx[devfn];
> > +    if (!vtd_dev_icx) {
> > +        return -EINVAL;
> > +    }
> > +
> > +    host_icx = vtd_dev_icx->host_icx;
> > +    if (!host_icx) {
> > +        return -EINVAL;
> > +    }
> > +
> > +    if (!(host_icx->stage1_formats
> > +             & IOMMU_PASID_FORMAT_INTEL_VTD)) {
> > +        error_report_once("IOMMU Stage 1 format is not compatible!\n");
>
> Shouldn't we fail with this?

Oh, yes. No need to go further, even though the host would also fail it.

> > +    }
> > +
> > +    bind_data = g_malloc0(sizeof(*bind_data));
> > +    bind_data->pasid = pasid;
> > +    g_bind_data = &bind_data->bind_data.gpasid_bind;
> > +
> > +    g_bind_data->flags = 0;
> > +    g_bind_data->vtd.flags = 0;
> > +    switch (op) {
> > +    case VTD_PASID_BIND:
> > +    case VTD_PASID_UPDATE:
>
> Is VTD_PASID_UPDATE used anywhere?
>
> But since it's called "UPDATE"... I really want to confirm with you that
> the bind() to the kernel will handle the UPDATE case, right? I mean, we
> need to unbind first if there is an existing pgtable pointer.

I guess you mean the host kernel, right? Actually, it's fine. The host kernel
only needs to fill in the latest pgtable pointer and permission configs in the
pasid entry and then issue a cache invalidation. There is no need to unbind
first, since the kernel always needs to flush caches after modifying a pasid
entry (including valid->valid).

> If the answer is yes, then I think we're good, but we really need to
> comment it somewhere about the fact.

> > +        g_bind_data->version = IOMMU_UAPI_VERSION;
> > +        g_bind_data->format = IOMMU_PASID_FORMAT_INTEL_VTD;
> > +        g_bind_data->gpgd = vtd_pe_get_flpt_base(pe);
> > +        g_bind_data->addr_width = vtd_pe_get_fl_aw(pe);
> > +        g_bind_data->hpasid = pasid;
> > +        g_bind_data->gpasid = pasid;
> > +        g_bind_data->flags |= IOMMU_SVA_GPASID_VAL;
> > +        g_bind_data->vtd.flags =
> > +                             (VTD_SM_PASID_ENTRY_SRE_BIT(pe->val[2]) ? 1 : 0)
> > +                           | (VTD_SM_PASID_ENTRY_EAFE_BIT(pe->val[2]) ? 1 : 0)
> > +                           | (VTD_SM_PASID_ENTRY_PCD_BIT(pe->val[1]) ? 1 : 0)
> > +                           | (VTD_SM_PASID_ENTRY_PWT_BIT(pe->val[1]) ? 1 : 0)
> > +                           | (VTD_SM_PASID_ENTRY_EMTE_BIT(pe->val[1]) ? 1 : 0)
> > +                           | (VTD_SM_PASID_ENTRY_CD_BIT(pe->val[1]) ? 1 : 0);
> > +        g_bind_data->vtd.pat = VTD_SM_PASID_ENTRY_PAT(pe->val[1]);
> > +        g_bind_data->vtd.emt = VTD_SM_PASID_ENTRY_EMT(pe->val[1]);
> > +        ret = host_iommu_ctx_bind_stage1_pgtbl(host_icx, bind_data);
> > +        break;
> > +    case VTD_PASID_UNBIND:
> > +        g_bind_data->version = IOMMU_UAPI_VERSION;
> > +        g_bind_data->format = IOMMU_PASID_FORMAT_INTEL_VTD;
> > +        g_bind_data->gpgd = 0;
> > +        g_bind_data->addr_width = 0;
> > +        g_bind_data->hpasid = pasid;
> > +        g_bind_data->gpasid = pasid;
> > +        g_bind_data->flags |= IOMMU_SVA_GPASID_VAL;
> > +        ret = host_iommu_ctx_unbind_stage1_pgtbl(host_icx, bind_data);
> > +        break;
> > +    default:
> > +        error_report_once("Unknown VTDPASIDOp!!!\n");
> > +        break;
> > +    }
> > +
> > +    g_free(bind_data);
> > +
> > +    return ret;
> > +}
> > +
> >  /* Do a context-cache device-selective invalidation.
> >   * @func_mask: FM field after shifting
> >   */
> > @@ -2481,10 +2567,10 @@ static inline void vtd_fill_in_pe_in_cache(IntelIOMMUState *s,
> >
> >      pc_entry->pasid_entry = *pe;
> >      pc_entry->pasid_cache_gen = s->pasid_cache_gen;
> > -    /*
> > -     * TODO:
> > -     * - send pasid bind to host for passthru devices
> > -     */
> > +    vtd_bind_guest_pasid(s, vtd_pasid_as->vtd_bus,
> > +                         vtd_pasid_as->devfn,
> > +                         vtd_pasid_as->pasid,
> > +                         pe, VTD_PASID_BIND);
> >  }
> >
> >  /**
> > @@ -2574,11 +2660,13 @@ static gboolean vtd_flush_pasid(gpointer key, gpointer value,
> >       * - when pasid-base-iotlb(piotlb) infrastructure is ready,
> >       *   should invalidate QEMU piotlb togehter with this change.
> >       */
> > +
> >      return false;
> >  remove:
> > +    vtd_bind_guest_pasid(s, vtd_bus, devfn,
> > +                         pasid, NULL, VTD_PASID_UNBIND);
> >      /*
> >       * TODO:
> > -     * - send pasid bind to host for passthru devices
> >       * - when pasid-base-iotlb(piotlb) infrastructure is ready,
> >       *   should invalidate QEMU piotlb togehter with this change.
> >       */
> > diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> > index 01fd95c..4451acf 100644
> > --- a/hw/i386/intel_iommu_internal.h
> > +++ b/hw/i386/intel_iommu_internal.h
> > @@ -516,6 +516,20 @@ typedef struct VTDRootEntry VTDRootEntry;
> >  #define VTD_SM_CONTEXT_ENTRY_RSVD_VAL0(aw)  (0x1e0ULL | ~VTD_HAW_MASK(aw))
> >  #define VTD_SM_CONTEXT_ENTRY_RSVD_VAL1      0xffffffffffe00000ULL
> >
> > +enum VTD_DUAL_STAGE_UAPI {
> > +    UAPI_BIND_GPASID,
> > +    UAPI_NUM
> > +};
> > +typedef enum VTD_DUAL_STAGE_UAPI VTD_DUAL_STAGE_UAPI;
> > +
> > +enum VTDPASIDOp {
> > +    VTD_PASID_BIND,
> > +    VTD_PASID_UNBIND,
> > +    VTD_PASID_UPDATE,
>
> Same here (whether to drop?).

If the above reply doesn't make sense, we may drop it.

Regards,
Yi Liu
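As a side note on the two first-level helpers quoted above, vtd_pe_get_fl_aw() and vtd_pe_get_flpt_base(): the standalone sketch below shows what they compute. The macro values used here are assumptions (a 2-bit FLPM field at bits 3:2 of the entry's third qword and a 4 KiB-aligned first-level page-table pointer), not values taken from the patch; the point is simply that FLPM selects 4-level vs 5-level first-level paging, which is why the address width comes out as 48 or 57 bits.

    /*
     * Standalone sketch (not QEMU code): decode the first-level paging
     * fields of a scalable-mode PASID entry the way vtd_pe_get_fl_aw()
     * and vtd_pe_get_flpt_base() do.  Macro values are assumptions.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define SM_PASID_ENTRY_FLPM     3ULL          /* assumed: 2-bit field */
    #define SM_PASID_ENTRY_FLPTPTR  (~0xfffULL)   /* assumed: 4K-aligned pointer */

    static uint32_t pe_get_fl_aw(uint64_t val2)
    {
        /* FLPM = 0 -> 4-level (48-bit AW); FLPM = 1 -> 5-level (57-bit AW) */
        return 48 + ((val2 >> 2) & SM_PASID_ENTRY_FLPM) * 9;
    }

    static uint64_t pe_get_flpt_base(uint64_t val2)
    {
        return val2 & SM_PASID_ENTRY_FLPTPTR;
    }

    int main(void)
    {
        uint64_t val2 = 0x123456789000ULL | (1ULL << 2); /* FLPM = 1 */

        printf("addr width: %u\n", pe_get_fl_aw(val2));      /* prints 57 */
        printf("flpt base:  0x%llx\n",
               (unsigned long long)pe_get_flpt_base(val2));  /* 0x123456789000 */
        return 0;
    }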
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 8ec638f..1e0ccde 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -68,6 +68,10 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s);
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
 
 static void vtd_pasid_cache_reset(IntelIOMMUState *s);
+static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s,
+                                            uint16_t *did, bool is_dsi);
+static void vtd_pasid_cache_devsi(IntelIOMMUState *s,
+                                  VTDBus *vtd_bus, uint16_t devfn);
 
 static void vtd_panic_require_caching_mode(void)
 {
@@ -1865,6 +1869,8 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
      * VT-d emulation codes.
      */
     vtd_iommu_replay_all(s);
+
+    vtd_replay_guest_pasid_bindings(s, NULL, false);
 }
 
 /**
@@ -1999,6 +2005,22 @@ static void vtd_context_device_invalidate(IntelIOMMUState *s,
              * happened.
              */
             vtd_sync_shadow_page_table(vtd_as);
+            /*
+             * Per spec, a context flush should also be followed by a PASID
+             * cache and iotlb flush. Regarding a device-selective context
+             * cache invalidation:
+             * if (emulated_device)
+             *    modify the pasid cache gen and pasid-based iotlb gen
+             *    value (will be added in following patches)
+             * else if (assigned_device)
+             *    check if the device has been bound to any pasid
+             *    invoke pasid_unbind for each bound pasid
+             * Here, we have vtd_pasid_cache_devsi() to invalidate pasid
+             * caches, while for piotlb in QEMU, we don't have it yet, so
+             * no handling. For an assigned device, the host iommu driver
+             * will flush the piotlb when a pasid unbind is passed to it.
+             */
+            vtd_pasid_cache_devsi(s, vtd_bus, devfn_it);
         }
     }
 }
@@ -2631,6 +2653,12 @@ static gboolean vtd_flush_pasid(gpointer key, gpointer value,
         /* Fall through */
     case VTD_PASID_CACHE_GLOBAL:
         break;
+    case VTD_PASID_CACHE_DEVSI:
+        if (pc_info->vtd_bus != vtd_bus ||
+            pc_info->devfn != devfn) {
+            return false;
+        }
+        break;
     default:
         error_report("invalid pc_info->flags");
         abort();
@@ -2971,6 +2999,46 @@ static int vtd_pasid_cache_psi(IntelIOMMUState *s,
     return 0;
 }
 
+static void vtd_pasid_cache_devsi(IntelIOMMUState *s,
+                                  VTDBus *vtd_bus, uint16_t devfn)
+{
+    VTDPASIDCacheInfo pc_info;
+    VTDContextEntry ce;
+    VTDHostIOMMUContext *vtd_dev_icx;
+    vtd_pasid_table_walk_info info;
+
+    trace_vtd_pasid_cache_devsi(devfn);
+
+    pc_info.flags = VTD_PASID_CACHE_DEVSI;
+    pc_info.vtd_bus = vtd_bus;
+    pc_info.devfn = devfn;
+
+    vtd_iommu_lock(s);
+    g_hash_table_foreach_remove(s->vtd_pasid_as, vtd_flush_pasid, &pc_info);
+
+    /*
+     * To be safe, after invalidating the pasid caches, the emulator
+     * needs to replay the pasid bindings by walking the guest pasid
+     * dir and pasid table.
+     */
+    vtd_dev_icx = vtd_bus->dev_icx[devfn];
+    if (vtd_dev_icx && vtd_dev_icx->host_icx &&
+        !vtd_dev_to_context_entry(s, pci_bus_num(vtd_bus->bus),
+                                  devfn, &ce)) {
+        info.flags = 0x0;
+        info.did = 0;
+        info.vtd_bus = vtd_bus;
+        info.devfn = devfn;
+        vtd_sm_pasid_table_walk(s,
+                                VTD_CE_GET_PASID_DIR_TABLE(&ce),
+                                0,
+                                VTD_MAX_HPASID,
+                                &info);
+    }
+
+    vtd_iommu_unlock(s);
+}
+
 /**
  * Caller of this function should hold iommu_lock
  */
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index b0a324c..6f32d7b 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -534,13 +534,17 @@ struct VTDPASIDCacheInfo {
 #define VTD_PASID_CACHE_GLOBAL   (1ULL << 0)
 #define VTD_PASID_CACHE_DOMSI    (1ULL << 1)
 #define VTD_PASID_CACHE_PASIDSI  (1ULL << 2)
+#define VTD_PASID_CACHE_DEVSI    (1ULL << 3)
     uint32_t flags;
     uint16_t domain_id;
     uint32_t pasid;
+    VTDBus *vtd_bus;
+    uint16_t devfn;
 };
 #define VTD_PASID_CACHE_INFO_MASK    (VTD_PASID_CACHE_GLOBAL | \
                                       VTD_PASID_CACHE_DOMSI | \
-                                      VTD_PASID_CACHE_PASIDSI)
+                                      VTD_PASID_CACHE_PASIDSI | \
+                                      VTD_PASID_CACHE_DEVSI)
 typedef struct VTDPASIDCacheInfo VTDPASIDCacheInfo;
 
 /* PASID Table Related Definitions */
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index 60d20c1..3853fa8 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -26,6 +26,7 @@ vtd_pasid_cache_gsi(void) ""
 vtd_pasid_cache_reset(void) ""
 vtd_pasid_cache_dsi(uint16_t domain) "Domian slective PC invalidation domain 0x%"PRIx16
 vtd_pasid_cache_psi(uint16_t domain, uint32_t pasid) "PASID slective PC invalidation domain 0x%"PRIx16" pasid 0x%"PRIx32
+vtd_pasid_cache_devsi(uint16_t devfn) "Dev selective PC invalidation dev: 0x%"PRIx16
 vtd_re_not_present(uint8_t bus) "Root entry bus %"PRIu8" not present"
 vtd_ce_not_present(uint8_t bus, uint8_t devfn) "Context entry bus %"PRIu8" devfn %"PRIu8" not present"
 vtd_iotlb_page_hit(uint16_t sid, uint64_t addr, uint64_t slpte, uint16_t domain) "IOTLB page hit sid 0x%"PRIx16" iova 0x%"PRIx64" slpte 0x%"PRIx64" domain 0x%"PRIx16
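Related to the reply above about aligning this path with PSI handling, the sketch below shows how vtd_pasid_cache_devsi() could be split into the two explicit steps mentioned there (flush, then replay). Only names that already appear in this thread are real; the factored-out replay helper is an assumption, not code from the series.

    /*
     * Sketch only, not posted code.
     * Step 1: drop QEMU's cached pasid entries for this device.
     * Step 2: re-walk the guest PASID dir/table for this device and
     *         replay the bindings to the host.
     */
    static void vtd_pasid_cache_devsi(IntelIOMMUState *s,
                                      VTDBus *vtd_bus, uint16_t devfn)
    {
        VTDPASIDCacheInfo pc_info = {
            .flags = VTD_PASID_CACHE_DEVSI,
            .vtd_bus = vtd_bus,
            .devfn = devfn,
        };

        vtd_iommu_lock(s);

        /* Step 1: invalidate the cached pasid entries for this device */
        g_hash_table_foreach_remove(s->vtd_pasid_as, vtd_flush_pasid, &pc_info);

        /* Step 2: replay guest pasid bindings (hypothetical helper wrapping
         * the guest PASID dir/table walk shown in the patch) */
        vtd_replay_guest_pasid_bindings_for_dev(s, vtd_bus, devfn);

        vtd_iommu_unlock(s);
    }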
This patch replays guest pasid bindings after a context cache
invalidation. This is done to ensure safety: strictly speaking, the
programmer should issue a pasid cache invalidation with proper
granularity after issuing a context cache invalidation.

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Yi Sun <yi.y.sun@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
---
 hw/i386/intel_iommu.c          | 68 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/intel_iommu_internal.h |  6 +++-
 hw/i386/trace-events           |  1 +
 3 files changed, 74 insertions(+), 1 deletion(-)