Message ID: 1519900415-30314-6-git-send-email-yi.l.liu@linux.intel.com (mailing list archive)
State:      New, archived
On 01/03/2018 11:33, Liu, Yi L wrote:
> +void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops)
> +{
> +    if (dev) {
> +        dev->sva_ops = ops;
> +    }
> +    return;
> +}
> +

Better:

{
    assert(ops && !dev->sva_ops);
    dev->sva_ops = ops;
}
On Fri, Mar 02, 2018 at 04:10:48PM +0100, Paolo Bonzini wrote:
> On 01/03/2018 11:33, Liu, Yi L wrote:
> > +void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops)
> > +{
> > +    if (dev) {
> > +        dev->sva_ops = ops;
> > +    }
> > +    return;
> > +}
> > +
>
> Better:
>
> {
>     assert(ops && !dev->sva_ops);
>     dev->sva_ops = ops;
> }

Thanks, would apply in next version.

Regards,
Yi Liu
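For reference, a sketch of what the revised helper might look like with the suggested assertion folded in; this assumes the next version keeps the same signature and is not code posted in this thread:

void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops)
{
    /* Registering a NULL ops table, or registering twice, is a caller bug. */
    assert(ops && !dev->sva_ops);
    dev->sva_ops = ops;
}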
On Mon, Mar 05, 2018 at 02:31:44PM +1100, David Gibson wrote: > On Thu, Mar 01, 2018 at 06:31:55PM +0800, Liu, Yi L wrote: > > This patch intoduces PCISVAOps for virt-SVA. > > > > So far, to setup virt-SVA for assigned SVA capable device, needs to > > config host translation structures. e.g. for VT-d, needs to set the > > guest pasid table to host and enable nested translation. Besides, > > vIOMMU emulator needs to forward guest's cache invalidation to host. > > On VT-d, it is guest's invalidation to 1st level translation related > > cache, such invalidation should be forwarded to host. > > > > Proposed PCISVAOps are: > > * sva_bind_guest_pasid_table: set the guest pasid table to host, and > > enable nested translation in host > > * sva_register_notifier: register sva_notifier to forward guest's > > cache invalidation to host > > * sva_unregister_notifier: unregister sva_notifier > > > > The PCISVAOps should be provided by vfio or modules alike. Mainly for > > assigned SVA capable devices. > > > > Take virt-SVA on VT-d as an exmaple: > > If a guest wants to setup virt-SVA for an assigned SVA capable device, > > it programs its context entry. vIOMMU emulator captures guest's context > > entry programming, and figure out the target device. vIOMMU emulator > > use the pci_device_sva_bind_pasid_table() API to bind the guest pasid > > table to host. > > > > Guest would also program its pasid table. vIOMMU emulator captures > > guest's pasid entry programming. In Qemu, needs to allocate an > > AddressSpace to stand for the pasid tagged address space and Qemu also > > needs to register sva_notifier to forward future cache invalidation > > request to host. > > > > Allocating AddressSpace to stand for the pasid tagged address space is > > for the emulation of emulated SVA capable devices. Emulated SVA capable > > devices may issue SVA aware DMAs, Qemu needs to emulate read/write to a > > pasid tagged AddressSpace. Thus needs an abstraction for such address > > space in Qemu. > > > > Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> > > So PCISVAOps is roughly equivalent to the cluster-of-PASIDs context I > was suggesting in my earlier comments,

yes, it is. The purpose is to expose pasid table bind and sva notifier registration/unregistration to vIOMMU emulators.

> however it's only an ops > structure. That means you can't easily share a context between > multiple PCI devices which is unfortunate because: > * The simplest use case for SVA I can see would just put the > same set of PASIDs into place for every SVA capable device

Do you mean for emulated SVA capable device?

> * Sometimes the IOMMU can't determine exactly what device a DMA > came from. Now the bridge cases where this applies are probably > unlikely with SVA devices, but I wouldn't want to bet on it. In > addition, the chances some manufacturer will eventually put out > a buggy multifunction SVA capable device that use the wrong RIDs > for the secondary functions is pretty darn high.

I'm not sure I 100% got your point here. Do you mean physical device? In PCIE TLP, DMA packet should have a RID field? And it looks more like a hardware layer trouble. For this series, it only provides necessary software support to make sure guest's SVA operation is well prepared before the SVA device issues the SVA aware DMA. e.g. link guest's pasid table to host, and config iommu translation in nested mode.
> > So I think instead you want a cluster-of-PASIDs object which has an > ops table including both these and the per-PASID calls from the > earlier patches (but the per-PASID calls would now take an explicit > PASID value). I didn't quite get "including both these and the per-PASID calls". What do you mean by "these"? Do you mean the PCISVAOps? Thanks, Yi Liu > > --- > > hw/pci/pci.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > > include/hw/pci/pci.h | 21 ++++++++++++++++++ > > 2 files changed, 81 insertions(+) > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > index e006b6a..157fe21 100644 > > --- a/hw/pci/pci.c > > +++ b/hw/pci/pci.c > > @@ -2573,6 +2573,66 @@ void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque) > > bus->iommu_opaque = opaque; > > } > > > > +void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops) > > +{ > > + if (dev) { > > + dev->sva_ops = ops; > > + } > > + return; > > +} > > + > > +void pci_device_sva_bind_pasid_table(PCIBus *bus, > > + int32_t devfn, uint64_t addr, uint32_t size) > > +{ > > + PCIDevice *dev; > > + > > + if (!bus) { > > + return; > > + } > > + > > + dev = bus->devices[devfn]; > > + if (dev && dev->sva_ops) { > > + dev->sva_ops->sva_bind_pasid_table(bus, devfn, addr, size); > > + } > > + return; > > +} > > + > > +void pci_device_sva_register_notifier(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx) > > +{ > > + PCIDevice *dev; > > + > > + if (!bus) { > > + return; > > + } > > + > > + dev = bus->devices[devfn]; > > + if (dev && dev->sva_ops) { > > + dev->sva_ops->sva_register_notifier(bus, > > + devfn, > > + sva_ctx); > > + } > > + return; > > +} > > + > > +void pci_device_sva_unregister_notifier(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx) > > +{ > > + PCIDevice *dev; > > + > > + if (!bus) { > > + return; > > + } > > + > > + dev = bus->devices[devfn]; > > + if (dev && dev->sva_ops) { > > + dev->sva_ops->sva_unregister_notifier(bus, > > + devfn, > > + sva_ctx); > > + } > > + return; > > +} > > + > > static void pci_dev_get_w64(PCIBus *b, PCIDevice *dev, void *opaque) > > { > > Range *range = opaque; > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h > > index d8c18c7..32889a4 100644 > > --- a/include/hw/pci/pci.h > > +++ b/include/hw/pci/pci.h > > @@ -10,6 +10,8 @@ > > > > #include "hw/pci/pcie.h" > > > > +#include "hw/core/pasid.h" > > + > > extern bool pci_available; > > > > /* PCI bus */ > > @@ -262,6 +264,16 @@ struct PCIReqIDCache { > > }; > > typedef struct PCIReqIDCache PCIReqIDCache; > > > > +typedef struct PCISVAOps PCISVAOps; > > +struct PCISVAOps { > > + void (*sva_bind_pasid_table)(PCIBus *bus, int32_t devfn, > > + uint64_t pasidt_addr, uint32_t size); > > + void (*sva_register_notifier)(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx); > > + void (*sva_unregister_notifier)(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx); > > +}; > > + > > struct PCIDevice { > > DeviceState qdev; > > > > @@ -351,6 +363,7 @@ struct PCIDevice { > > MSIVectorUseNotifier msix_vector_use_notifier; > > MSIVectorReleaseNotifier msix_vector_release_notifier; > > MSIVectorPollNotifier msix_vector_poll_notifier; > > + PCISVAOps *sva_ops; > > }; > > > > void pci_register_bar(PCIDevice *pci_dev, int region_num, > > @@ -477,6 +490,14 @@ typedef AddressSpace *(*PCIIOMMUFunc)(PCIBus *, void *, int); > > AddressSpace *pci_device_iommu_address_space(PCIDevice *dev); > > void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque); > > > > +void pci_setup_sva_ops(PCIDevice *dev, 
PCISVAOps *ops); > > +void pci_device_sva_bind_pasid_table(PCIBus *bus, int32_t devfn, > > + uint64_t pasidt_addr, uint32_t size); > > +void pci_device_sva_register_notifier(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx); > > +void pci_device_sva_unregister_notifier(PCIBus *bus, int32_t devfn, > > + IOMMUSVAContext *sva_ctx); > > + > > static inline void > > pci_set_byte(uint8_t *config, uint8_t val) > > {
On Tue, Mar 06, 2018 at 06:33:52PM +0800, Liu, Yi L wrote: > On Mon, Mar 05, 2018 at 02:31:44PM +1100, David Gibson wrote: > > On Thu, Mar 01, 2018 at 06:31:55PM +0800, Liu, Yi L wrote: > > > This patch intoduces PCISVAOps for virt-SVA. > > > > > > So far, to setup virt-SVA for assigned SVA capable device, needs to > > > config host translation structures. e.g. for VT-d, needs to set the > > > guest pasid table to host and enable nested translation. Besides, > > > vIOMMU emulator needs to forward guest's cache invalidation to host. > > > On VT-d, it is guest's invalidation to 1st level translation related > > > cache, such invalidation should be forwarded to host. > > > > > > Proposed PCISVAOps are: > > > * sva_bind_guest_pasid_table: set the guest pasid table to host, and > > > enable nested translation in host > > > * sva_register_notifier: register sva_notifier to forward guest's > > > cache invalidation to host > > > * sva_unregister_notifier: unregister sva_notifier > > > > > > The PCISVAOps should be provided by vfio or modules alike. Mainly for > > > assigned SVA capable devices. > > > > > > Take virt-SVA on VT-d as an exmaple: > > > If a guest wants to setup virt-SVA for an assigned SVA capable device, > > > it programs its context entry. vIOMMU emulator captures guest's context > > > entry programming, and figure out the target device. vIOMMU emulator > > > use the pci_device_sva_bind_pasid_table() API to bind the guest pasid > > > table to host. > > > > > > Guest would also program its pasid table. vIOMMU emulator captures > > > guest's pasid entry programming. In Qemu, needs to allocate an > > > AddressSpace to stand for the pasid tagged address space and Qemu also > > > needs to register sva_notifier to forward future cache invalidation > > > request to host. > > > > > > Allocating AddressSpace to stand for the pasid tagged address space is > > > for the emulation of emulated SVA capable devices. Emulated SVA capable > > > devices may issue SVA aware DMAs, Qemu needs to emulate read/write to a > > > pasid tagged AddressSpace. Thus needs an abstraction for such address > > > space in Qemu. > > > > > > Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> > > > > So PCISVAOps is roughly equivalent to the cluster-of-PASIDs context I > > was suggesting in my earlier comments, > > yes, it is. The purpose is to expose pasid table bind and sva notfier > registration/unregistration to vIOMMU emulators. > > > however it's only an ops > > structure. That means you can't easily share a context between > > multiple PCI devices which is unfortunate because: > > * The simplest use case for SVA I can see would just put the > > same set of PASIDs into place for every SVA capable device > > Do you mean for emulated SVA capable device? Not necessarily. I'd expect that model could be useful for both emulated and passthrough SVA capable devices. > > * Sometimes the IOMMU can't determine exactly what device a DMA > > came from. Now the bridge cases where this applies are probably > > unlikely with SVA devices, but I wouldn't want to bet on it. In > > addition, the chances some manufacturer will eventually put out > > a buggy multifunction SVA capable device that use the wrong RIDs > > for the secondary functions is pretty darn high. > > I'm not sure I 100% got your point here. Do yu mean physical device? > In PCIE TLP, DMA packet should have a RID field? Yes, but that RID isn't accurate in all cases. One case is if you have a PCIe device behind both a PCIe->PCI and PCI->PCIe bridge. 
Now obviously SVA won't work in that case, but it would be good to at least detect it and refuse to attempt SVA.

Another case is with a buggy device that just sends the wrong RID. In particular there are some multifunction devices that use function 0's RID for all functions. Obviously that's a hardware bug and we can't expect everything to work in this case. But forcing all the functions to share an SVAContext in this case - like we already force them to share an IOMMU group - allows us to reason about what will and won't work.

> And it looks more like > a hardware layer trouble. For this series, it only provides necessary > software support to make sure guest's SVA operation is well prepared > before the SVA device issues the SVA aware DMA. e.g. link guest's pasid > table to host, and config iommu translation in nested mode. > > > > > So I think instead you want a cluster-of-PASIDs object which has an > > ops table including both these and the per-PASID calls from the > > earlier patches (but the per-PASID calls would now take an explicit > > PASID value). > > I didn't quite get "including both these and the per-PASID calls". > What do you mean by "these"? Do you mean the PCISVAOps?

I mean that I think PCISVAOps should become a full object including an ops table, not just an ops table. That table would include the things currently in PCISVAOps. It would also include callbacks for the things that are in your per-PASID object in this draft, but those callbacks would now need to take an explicit PASID parameter.
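To illustrate the shape of that suggestion, a rough sketch of such a shareable cluster-of-PASIDs object might look like the following; the names (PCISVAContext, PCISVAContextOps, the callback names) are hypothetical and only meant to show the idea, not an agreed interface:

/* Hypothetical sketch only: a shareable PASID-cluster context that carries
 * its own ops table, so several PCI devices (e.g. the functions of one
 * multifunction device) can point at the same object. */
typedef struct PCISVAContext PCISVAContext;

typedef struct PCISVAContextOps {
    /* whole-table operation, as in the current PCISVAOps */
    void (*bind_pasid_table)(PCISVAContext *ctx, uint64_t pasidt_addr,
                             uint32_t size);
    /* per-PASID operations now take an explicit PASID value */
    void (*invalidate_cache)(PCISVAContext *ctx, uint32_t pasid, void *data);
} PCISVAContextOps;

struct PCISVAContext {
    const PCISVAContextOps *ops;
    void *opaque;               /* owner, e.g. a vfio container */
};

A device would then hold a pointer to a PCISVAContext rather than to a bare ops table, and devices that must be treated together could share one context.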
Hi David, > From: David Gibson [mailto:david@gibson.dropbear.id.au] > Sent: Thursday, April 12, 2018 10:36 AM > On Tue, Mar 06, 2018 at 06:33:52PM +0800, Liu, Yi L wrote: > > On Mon, Mar 05, 2018 at 02:31:44PM +1100, David Gibson wrote: > > > On Thu, Mar 01, 2018 at 06:31:55PM +0800, Liu, Yi L wrote: > > > > This patch intoduces PCISVAOps for virt-SVA. > > > > > > > > So far, to setup virt-SVA for assigned SVA capable device, needs to > > > > config host translation structures. e.g. for VT-d, needs to set the > > > > guest pasid table to host and enable nested translation. Besides, > > > > vIOMMU emulator needs to forward guest's cache invalidation to host. > > > > On VT-d, it is guest's invalidation to 1st level translation related > > > > cache, such invalidation should be forwarded to host. > > > > > > > > Proposed PCISVAOps are: > > > > * sva_bind_guest_pasid_table: set the guest pasid table to host, and > > > > enable nested translation in host > > > > * sva_register_notifier: register sva_notifier to forward guest's > > > > cache invalidation to host > > > > * sva_unregister_notifier: unregister sva_notifier > > > > > > > > The PCISVAOps should be provided by vfio or modules alike. Mainly for > > > > assigned SVA capable devices. > > > > > > > > Take virt-SVA on VT-d as an exmaple: > > > > If a guest wants to setup virt-SVA for an assigned SVA capable device, > > > > it programs its context entry. vIOMMU emulator captures guest's context > > > > entry programming, and figure out the target device. vIOMMU emulator > > > > use the pci_device_sva_bind_pasid_table() API to bind the guest pasid > > > > table to host. > > > > > > > > Guest would also program its pasid table. vIOMMU emulator captures > > > > guest's pasid entry programming. In Qemu, needs to allocate an > > > > AddressSpace to stand for the pasid tagged address space and Qemu also > > > > needs to register sva_notifier to forward future cache invalidation > > > > request to host. > > > > > > > > Allocating AddressSpace to stand for the pasid tagged address space is > > > > for the emulation of emulated SVA capable devices. Emulated SVA capable > > > > devices may issue SVA aware DMAs, Qemu needs to emulate read/write to a > > > > pasid tagged AddressSpace. Thus needs an abstraction for such address > > > > space in Qemu. > > > > > > > > Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com> > > > > > > So PCISVAOps is roughly equivalent to the cluster-of-PASIDs context I > > > was suggesting in my earlier comments, > > > > yes, it is. The purpose is to expose pasid table bind and sva notfier > > registration/unregistration to vIOMMU emulators. > > > > > however it's only an ops > > > structure. That means you can't easily share a context between > > > multiple PCI devices which is unfortunate because: > > > * The simplest use case for SVA I can see would just put the > > > same set of PASIDs into place for every SVA capable device > > > > Do you mean for emulated SVA capable device? > > Not necessarily. I'd expect that model could be useful for both > emulated and passthrough SVA capable devices. > > > > * Sometimes the IOMMU can't determine exactly what device a DMA > > > came from. Now the bridge cases where this applies are probably > > > unlikely with SVA devices, but I wouldn't want to bet on it. In > > > addition, the chances some manufacturer will eventually put out > > > a buggy multifunction SVA capable device that use the wrong RIDs > > > for the secondary functions is pretty darn high. 
> > > > I'm not sure I 100% got your point here. Do you mean physical device? > > In PCIE TLP, DMA packet should have a RID field? > > Yes, but that RID isn't accurate in all cases. > > One case is if you have a PCIe device behind both a PCIe->PCI and > PCI->PCIe bridge. Now obviously SVA won't work in that case, but it > would be good to at least detect it and refuse to attempt SVA. > > Another case is with a buggy device that just sends the wrong RID. In > particular there are some multifunction devices that use function 0's > RID for all functions. Obviously that's a hardware bug and we can't > expect everything to work in this case. But forcing all the functions > to share an SVAContext in this case - like we already force them to > share an IOMMU group - allows us to reason about what will and won't work

Agree.

> > > And it looks more like > > a hardware layer trouble. For this series, it only provides necessary > > software support to make sure guest's SVA operation is well prepared > > before the SVA device issues the SVA aware DMA. e.g. link guest's pasid > > table to host, and config iommu translation in nested mode.

Yes, it is.

> > > > > > > > So I think instead you want a cluster-of-PASIDs object which has an > > > ops table including both these and the per-PASID calls from the > > > earlier patches (but the per-PASID calls would now take an explicit > > > PASID value). > > > > I didn't quite get "including both these and the per-PASID calls". > > What do you mean by "these"? Do you mean the PCISVAOps? > > I mean that I think PCISVAOps should become a full object including an > ops table, not just an ops table. That table would include the things > currently in PCISVAOps. It would also include callbacks for the > things that are in your per-PASID object in this draft, but those > callbacks would now need to take an explicit PASID parameter.

Based on some comments from Peter Xu and your comments on "[PATCH v3 03/12] hw/core: introduce IOMMUSVAContext for virt-SVA", I've considered a new approach which might be able to reuse the existing MemoryRegion-based translation logic. I'm preparing some code to show it; before that, I'd like to hear your opinion.

As we discussed, for assigned devices we want to prepare the configuration before the SVA device issues SVA aware DMA. This can be achieved by a PCISVAOps proposed as below:

struct PCISVAOps {
    void (*pasid_bind_table)(PCIBus *bus, int32_t devfn,
                             uint64_t pasidt_addr, uint32_t size);
    void (*pasid_invalidate_extend_iotlb)(PCIBus *bus, int32_t devfn,
                                          void *data);
};

This is no longer notifier based; for further extension it could include more callbacks. Previously I thought a notifier was better, but Peter corrected me in another thread.

For emulated devices, we need to support address translation. That is why I introduced IOMMUSVAContext and a translate callback in this series, but it duplicates much of the MemoryRegion-based translation logic. So I reconsidered whether it is possible to reuse MemoryRegion, and it seems to be. Below is my thought, taking VT-d as an example; please check whether it works based on your understanding.

1) Add "pasid" and "pasid_allocated" fields in the structure below. For each PASID tagged address space, QEMU creates a VTDAddressSpace instance and sets the pasid field. For the ordinary PCI DMA address space, the pasid and pasid_allocated fields are left uninitialized.

struct VTDAddressSpace {
    PCIBus *bus;
    uint8_t devfn;
+   bool pasid_allocated;
+   uint32_t pasid;
    AddressSpace as;
    IOMMUMemoryRegion iommu;
    MemoryRegion root;
    ...
}

When an emulated SVA capable device issues SVA aware DMA, its device model should be able to get a PASID, and from that the correct AddressSpace. The DMA emulation logic would finally call into the imrc->translate() callback provided by the IOMMU emulator. In the callback, it can get the VTDAddressSpace and check whether a PASID is allocated. If yes, do the translation with the corresponding 1st-level page table; if no, walk the I/O page table (2nd-level page table).

Another benefit of reusing MemoryRegion would be: if someone wants to implement SVA with a shadowing solution, he or she could use the MAP/UNMAP API to shadow the SVA mappings to the host IOMMU.

Thanks,
Yi Liu
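A rough sketch of how such a translate callback could branch on the new pasid_allocated field is shown below. The vtd_do_first_level_translation()/vtd_do_second_level_translation() helpers are placeholder names for the first/second-level page-table walks, not functions that exist in this series:

static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu,
                                         hwaddr addr, IOMMUAccessFlags flag)
{
    VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);

    if (vtd_as->pasid_allocated) {
        /* PASID tagged address space: translate through the 1st-level
         * page table selected by vtd_as->pasid (placeholder helper). */
        return vtd_do_first_level_translation(vtd_as, vtd_as->pasid,
                                              addr, flag);
    }

    /* Ordinary PCI DMA address space: walk the I/O (2nd-level) page
     * table as today (placeholder helper). */
    return vtd_do_second_level_translation(vtd_as, addr, flag);
}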
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index e006b6a..157fe21 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2573,6 +2573,66 @@ void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque)
     bus->iommu_opaque = opaque;
 }
 
+void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops)
+{
+    if (dev) {
+        dev->sva_ops = ops;
+    }
+    return;
+}
+
+void pci_device_sva_bind_pasid_table(PCIBus *bus,
+                      int32_t devfn, uint64_t addr, uint32_t size)
+{
+    PCIDevice *dev;
+
+    if (!bus) {
+        return;
+    }
+
+    dev = bus->devices[devfn];
+    if (dev && dev->sva_ops) {
+        dev->sva_ops->sva_bind_pasid_table(bus, devfn, addr, size);
+    }
+    return;
+}
+
+void pci_device_sva_register_notifier(PCIBus *bus, int32_t devfn,
+                                      IOMMUSVAContext *sva_ctx)
+{
+    PCIDevice *dev;
+
+    if (!bus) {
+        return;
+    }
+
+    dev = bus->devices[devfn];
+    if (dev && dev->sva_ops) {
+        dev->sva_ops->sva_register_notifier(bus,
+                                            devfn,
+                                            sva_ctx);
+    }
+    return;
+}
+
+void pci_device_sva_unregister_notifier(PCIBus *bus, int32_t devfn,
+                                        IOMMUSVAContext *sva_ctx)
+{
+    PCIDevice *dev;
+
+    if (!bus) {
+        return;
+    }
+
+    dev = bus->devices[devfn];
+    if (dev && dev->sva_ops) {
+        dev->sva_ops->sva_unregister_notifier(bus,
+                                              devfn,
+                                              sva_ctx);
+    }
+    return;
+}
+
 static void pci_dev_get_w64(PCIBus *b, PCIDevice *dev, void *opaque)
 {
     Range *range = opaque;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d8c18c7..32889a4 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -10,6 +10,8 @@
 
 #include "hw/pci/pcie.h"
 
+#include "hw/core/pasid.h"
+
 extern bool pci_available;
 
 /* PCI bus */
@@ -262,6 +264,16 @@ struct PCIReqIDCache {
 };
 typedef struct PCIReqIDCache PCIReqIDCache;
 
+typedef struct PCISVAOps PCISVAOps;
+struct PCISVAOps {
+    void (*sva_bind_pasid_table)(PCIBus *bus, int32_t devfn,
+                                 uint64_t pasidt_addr, uint32_t size);
+    void (*sva_register_notifier)(PCIBus *bus, int32_t devfn,
+                                  IOMMUSVAContext *sva_ctx);
+    void (*sva_unregister_notifier)(PCIBus *bus, int32_t devfn,
+                                    IOMMUSVAContext *sva_ctx);
+};
+
 struct PCIDevice {
     DeviceState qdev;
 
@@ -351,6 +363,7 @@ struct PCIDevice {
     MSIVectorUseNotifier msix_vector_use_notifier;
     MSIVectorReleaseNotifier msix_vector_release_notifier;
     MSIVectorPollNotifier msix_vector_poll_notifier;
+    PCISVAOps *sva_ops;
 };
 
 void pci_register_bar(PCIDevice *pci_dev, int region_num,
@@ -477,6 +490,14 @@ typedef AddressSpace *(*PCIIOMMUFunc)(PCIBus *, void *, int);
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
 void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque);
 
+void pci_setup_sva_ops(PCIDevice *dev, PCISVAOps *ops);
+void pci_device_sva_bind_pasid_table(PCIBus *bus, int32_t devfn,
+                                     uint64_t pasidt_addr, uint32_t size);
+void pci_device_sva_register_notifier(PCIBus *bus, int32_t devfn,
+                                      IOMMUSVAContext *sva_ctx);
+void pci_device_sva_unregister_notifier(PCIBus *bus, int32_t devfn,
+                                        IOMMUSVAContext *sva_ctx);
+
 static inline void
 pci_set_byte(uint8_t *config, uint8_t val)
 {
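As a usage illustration (not part of the patch), a vfio-like backend would be expected to provide and register an ops table along these lines; the vfio_sva_* callback names, and the pdev variable, are hypothetical placeholders:

/* Hypothetical backend callbacks: a real implementation would issue the
 * corresponding host requests to bind the pasid table and propagate
 * invalidations. */
static void vfio_sva_bind_pasid_table(PCIBus *bus, int32_t devfn,
                                      uint64_t pasidt_addr, uint32_t size);
static void vfio_sva_register_notifier(PCIBus *bus, int32_t devfn,
                                       IOMMUSVAContext *sva_ctx);
static void vfio_sva_unregister_notifier(PCIBus *bus, int32_t devfn,
                                         IOMMUSVAContext *sva_ctx);

static PCISVAOps vfio_sva_ops = {
    .sva_bind_pasid_table    = vfio_sva_bind_pasid_table,
    .sva_register_notifier   = vfio_sva_register_notifier,
    .sva_unregister_notifier = vfio_sva_unregister_notifier,
};

/* Called once per assigned SVA capable device, e.g. from the device's
 * realize path: */
pci_setup_sva_ops(pdev, &vfio_sva_ops);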
This patch introduces PCISVAOps for virt-SVA.

So far, setting up virt-SVA for an assigned SVA capable device requires configuring the host translation structures. E.g. for VT-d, the guest pasid table needs to be set to the host and nested translation enabled. Besides, the vIOMMU emulator needs to forward the guest's cache invalidations to the host. On VT-d, these are the guest's invalidations of 1st-level translation related caches; such invalidations should be forwarded to the host.

Proposed PCISVAOps are:
* sva_bind_guest_pasid_table: set the guest pasid table to the host, and enable nested translation in the host
* sva_register_notifier: register an sva_notifier to forward the guest's cache invalidations to the host
* sva_unregister_notifier: unregister the sva_notifier

The PCISVAOps should be provided by vfio or modules alike, mainly for assigned SVA capable devices.

Take virt-SVA on VT-d as an example: if a guest wants to set up virt-SVA for an assigned SVA capable device, it programs its context entry. The vIOMMU emulator captures the guest's context entry programming and figures out the target device. The vIOMMU emulator then uses the pci_device_sva_bind_pasid_table() API to bind the guest pasid table to the host.

The guest would also program its pasid table. The vIOMMU emulator captures the guest's pasid entry programming. QEMU then needs to allocate an AddressSpace to stand for the pasid tagged address space, and also needs to register an sva_notifier to forward future cache invalidation requests to the host.

Allocating an AddressSpace to stand for the pasid tagged address space is for the emulation of emulated SVA capable devices. Emulated SVA capable devices may issue SVA aware DMAs, and QEMU needs to emulate reads/writes to a pasid tagged AddressSpace; thus an abstraction for such an address space is needed in QEMU.

Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
---
 hw/pci/pci.c         | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h | 21 ++++++++++++++++++
 2 files changed, 81 insertions(+)
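For the vIOMMU side of the flow described above, the call into the new helper might look roughly like the sketch below; vtd_bind_guest_pasid_table() and the way the guest pasid table address/size are obtained are assumptions for illustration, not code from this series:

/* Hypothetical VT-d emulation hook: invoked after the guest programs a
 * context entry and the target device (bus, devfn) has been identified. */
static void vtd_bind_guest_pasid_table(PCIBus *bus, int32_t devfn,
                                       uint64_t g_pasidt_addr,
                                       uint32_t g_pasidt_size)
{
    /* g_pasidt_addr/g_pasidt_size would be decoded from the guest's
     * context entry by the caller; here they are only forwarded to the
     * backend that owns the device. */
    pci_device_sva_bind_pasid_table(bus, devfn, g_pasidt_addr, g_pasidt_size);
}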