Message ID: 1571920483-3382-1-git-send-email-yi.l.liu@intel.com (mailing list archive)
Series: intel_iommu: expose Shared Virtual Addressing to VM
Patchew URL: https://patchew.org/QEMU/1571920483-3382-1-git-send-email-yi.l.liu@intel.com/

Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      hw/pci/pci_host.o
  CC      hw/pci/pcie.o
/tmp/qemu-test/src/hw/pci-host/designware.c: In function 'designware_pcie_host_realize':
/tmp/qemu-test/src/hw/pci-host/designware.c:693:5: error: incompatible type for argument 2 of 'pci_setup_iommu'
     pci_setup_iommu(pci->bus, designware_iommu_ops, s);
     ^
In file included from /tmp/qemu-test/src/include/hw/pci/msi.h:24:0,
---
/tmp/qemu-test/src/include/hw/pci/pci.h:495:6: note: expected 'const struct PCIIOMMUOps *' but argument is of type 'PCIIOMMUOps'
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *iommu_ops, void *opaque);
      ^
make: *** [hw/pci-host/designware.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=092c0f9750e6454780dcead436e6bc2c', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-ih3zhzs3/src/docker-src.2019-10-25-02.18.26.32058:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=092c0f9750e6454780dcead436e6bc2c
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-ih3zhzs3/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    3m8.783s
user    0m8.093s

The full log is available at
http://patchew.org/logs/1571920483-3382-1-git-send-email-yi.l.liu@intel.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Patchew URL: https://patchew.org/QEMU/1571920483-3382-1-git-send-email-yi.l.liu@intel.com/

Hi,

This series failed the docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      hw/pci/pci_host.o
  CC      hw/pci/pcie.o
/tmp/qemu-test/src/hw/pci-host/designware.c: In function 'designware_pcie_host_realize':
/tmp/qemu-test/src/hw/pci-host/designware.c:693:31: error: incompatible type for argument 2 of 'pci_setup_iommu'
     pci_setup_iommu(pci->bus, designware_iommu_ops, s);
                               ^~~~~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/hw/pci/msi.h:24,
---
/tmp/qemu-test/src/include/hw/pci/pci.h:495:54: note: expected 'const PCIIOMMUOps *' {aka 'const struct PCIIOMMUOps *'} but argument is of type 'PCIIOMMUOps' {aka 'const struct PCIIOMMUOps'}
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *iommu_ops, void *opaque);
                                   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
make: *** [/tmp/qemu-test/src/rules.mak:69: hw/pci-host/designware.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=c26679928a9c432d9832978acd80e20b', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-c5ij2tri/src/docker-src.2019-10-25-02.27.27.16595:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=c26679928a9c432d9832978acd80e20b
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-c5ij2tri/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m45.686s
user    0m7.841s

The full log is available at
http://patchew.org/logs/1571920483-3382-1-git-send-email-yi.l.liu@intel.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
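Both reports fail at the same call site: with patch "hw/pci: modify pci_setup_iommu()
to set PCIIOMMUOps" applied, pci_setup_iommu() takes a pointer to a const PCIIOMMUOps
table, while the designware conversion still passes the structure by value. A minimal
sketch of the likely fix, assuming designware_iommu_ops is the static const PCIIOMMUOps
table the compiler output refers to (its initializer is not visible in the log):

    /* hw/pci-host/designware.c, designware_pcie_host_realize() (sketch) */
    -    pci_setup_iommu(pci->bus, designware_iommu_ops, s);
    +    pci_setup_iommu(pci->bus, &designware_iommu_ops, s);

Passing the address of the ops table matches the prototype quoted in the error:
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *iommu_ops, void *opaque);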
On 2019/10/24 8:34 PM, Liu Yi L wrote:
> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
> platforms allow address space sharing between device DMA and applications.

Interesting, so the below figure demonstrates the case of VM. I wonder
how much differences if we compare it with doing SVM between device and
an ordinary process (e.g dpdk)?

Thanks

> SVA can reduce programming complexity and enhance security.
> This series is intended to expose SVA capability to VMs. i.e. shared guest
> application address space with passthru devices. The whole SVA virtualization
> requires QEMU/VFIO/IOMMU changes. This series includes the QEMU changes, for
> VFIO and IOMMU changes, they are in separate series (listed in the "Related
> series").
>
> The high-level architecture for SVA virtualization is as below:
>
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest process CR3, FL only|
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush -
>     '-------------'       |
>     |             |       V
>     |             |     CR3 in GPA
>     '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>     .-------------.  .----------------------.
>     |   pIOMMU    |  | Bind FL for GVA-GPA  |
>     |             |  '----------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.------------------------------.
>     |             |   |SL for GPA-HPA, default domain|
>     |             |   '------------------------------'
>     '-------------'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
>
> The complete vSVA upstream patches are divided into three phases:
>  1. Common APIs and PCI device direct assignment
>  2. Page Request Services (PRS) support
>  3. Mediated device assignment
>
> This RFC patchset is aiming for the phase 1. Works together with the VT-d
> driver[1] changes and VFIO changes[2].
>
> Related series:
> [1] [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support:
>     https://lkml.org/lkml/2019/10/22/953
>     <This series is based on this kernel series from Jacob Pan>
>
> [2] [RFC v2 0/3] vfio: support Shared Virtual Addressing from Yi Liu
>
> There are roughly four parts:
>  1. Introduce IOMMUContext as abstract layer between vIOMMU emulator and
>     VFIO to avoid direct calling between the two
>  2. Passdown PASID allocation and free to host
>  3. Passdown guest PASID binding to host
>  4. Passdown guest IOMMU cache invalidation to host
>
> The full set can be found in below link:
> https://github.com/luxis1999/qemu.git: sva_vtd_v6_qemu_rfc_v2
>
> Changelog:
>   - RFC v1 -> v2:
>     Introduce IOMMUContext to abstract the connection between VFIO
>     and vIOMMU emulator, which is a replacement of the PCIPASIDOps
>     in RFC v1. Modify x-scalable-mode to be string option instead of
>     adding a new option as RFC v1 did. Refined the pasid cache management
>     and addressed the TODOs mentioned in RFC v1.
>
> RFC v1: https://patchwork.kernel.org/cover/11033657/
>
> Eric Auger (1):
>   update-linux-headers: Import iommu.h
>
> Liu Yi L (20):
>   header update VFIO/IOMMU vSVA APIs against 5.4.0-rc3+
>   intel_iommu: modify x-scalable-mode to be string option
>   vfio/common: add iommu_ctx_notifier in container
>   hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
>   hw/pci: introduce pci_device_iommu_context()
>   intel_iommu: provide get_iommu_context() callback
>   vfio/pci: add iommu_context notifier for pasid alloc/free
>   intel_iommu: add virtual command capability support
>   intel_iommu: process pasid cache invalidation
>   intel_iommu: add present bit check for pasid table entries
>   intel_iommu: add PASID cache management infrastructure
>   vfio/pci: add iommu_context notifier for pasid bind/unbind
>   intel_iommu: bind/unbind guest page table to host
>   intel_iommu: replay guest pasid bindings to host
>   intel_iommu: replay pasid binds after context cache invalidation
>   intel_iommu: do not passdown pasid bind for PASID #0
>   vfio/pci: add iommu_context notifier for PASID-based iotlb flush
>   intel_iommu: process PASID-based iotlb invalidation
>   intel_iommu: propagate PASID-based iotlb invalidation to host
>   intel_iommu: process PASID-based Device-TLB invalidation
>
> Peter Xu (1):
>   hw/iommu: introduce IOMMUContext
>
>  hw/Makefile.objs                |    1 +
>  hw/alpha/typhoon.c              |    6 +-
>  hw/arm/smmu-common.c            |    6 +-
>  hw/hppa/dino.c                  |    6 +-
>  hw/i386/amd_iommu.c             |    6 +-
>  hw/i386/intel_iommu.c           | 1249 +++++++++++++++++++++++++++++++++++++--
>  hw/i386/intel_iommu_internal.h  |  109 ++++
>  hw/i386/trace-events            |    6 +
>  hw/iommu/Makefile.objs          |    1 +
>  hw/iommu/iommu.c                |   66 +++
>  hw/pci-host/designware.c        |    6 +-
>  hw/pci-host/ppce500.c           |    6 +-
>  hw/pci-host/prep.c              |    6 +-
>  hw/pci-host/sabre.c             |    6 +-
>  hw/pci/pci.c                    |   27 +-
>  hw/ppc/ppc440_pcix.c            |    6 +-
>  hw/ppc/spapr_pci.c              |    6 +-
>  hw/s390x/s390-pci-bus.c         |    8 +-
>  hw/vfio/common.c                |   10 +
>  hw/vfio/pci.c                   |  149 +++++
>  include/hw/i386/intel_iommu.h   |   58 +-
>  include/hw/iommu/iommu.h        |  113 ++++
>  include/hw/pci/pci.h            |   13 +-
>  include/hw/pci/pci_bus.h        |    2 +-
>  include/hw/vfio/vfio-common.h   |    9 +
>  linux-headers/linux/iommu.h     |  324 ++++++++++
>  linux-headers/linux/vfio.h      |   83 +++
>  scripts/update-linux-headers.sh |    2 +-
>  28 files changed, 2232 insertions(+), 58 deletions(-)
>  create mode 100644 hw/iommu/Makefile.objs
>  create mode 100644 hw/iommu/iommu.c
>  create mode 100644 include/hw/iommu/iommu.h
>  create mode 100644 linux-headers/linux/iommu.h
>
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Friday, October 25, 2019 5:49 PM
>
>
> On 2019/10/24 8:34 PM, Liu Yi L wrote:
> > Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
> > platforms allow address space sharing between device DMA and applications.
>
>
> Interesting, so the below figure demonstrates the case of VM. I wonder
> how much differences if we compare it with doing SVM between device and
> an ordinary process (e.g dpdk)?
>
> Thanks

One difference is that ordinary process requires only stage-1 translation,
while VM requires nested translation.

>
> > SVA can reduce programming complexity and enhance security.
> > This series is intended to expose SVA capability to VMs. i.e. shared guest
> > application address space with passthru devices. The whole SVA virtualization
> > requires QEMU/VFIO/IOMMU changes. This series includes the QEMU changes, for
> > VFIO and IOMMU changes, they are in separate series (listed in the "Related
> > series").
> >
> > The high-level architecture for SVA virtualization is as below:
> >
> >     .-------------.  .---------------------------.
> >     |   vIOMMU    |  | Guest process CR3, FL only|
> >     |             |  '---------------------------'
> >     .----------------/
> >     | PASID Entry |--- PASID cache flush -
> >     '-------------'       |
> >     |             |       V
> >     |             |     CR3 in GPA
> >     '-------------'
> > Guest
> > ------| Shadow |--------------------------|--------
> >       v        v                          v
> > Host
> >     .-------------.  .----------------------.
> >     |   pIOMMU    |  | Bind FL for GVA-GPA  |
> >     |             |  '----------------------'
> >     .----------------/  |
> >     | PASID Entry |     V (Nested xlate)
> >     '----------------\.------------------------------.
> >     |             |   |SL for GPA-HPA, default domain|
> >     |             |   '------------------------------'
> >     '-------------'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> >
> > The complete vSVA upstream patches are divided into three phases:
> >  1. Common APIs and PCI device direct assignment
> >  2. Page Request Services (PRS) support
> >  3. Mediated device assignment
> >
> > This RFC patchset is aiming for the phase 1. Works together with the VT-d
> > driver[1] changes and VFIO changes[2].
> >
> > Related series:
> > [1] [PATCH v6 00/10] Nested Shared Virtual Address (SVA) VT-d support:
> >     https://lkml.org/lkml/2019/10/22/953
> >     <This series is based on this kernel series from Jacob Pan>
> >
> > [2] [RFC v2 0/3] vfio: support Shared Virtual Addressing from Yi Liu
> >
> > There are roughly four parts:
> >  1. Introduce IOMMUContext as abstract layer between vIOMMU emulator and
> >     VFIO to avoid direct calling between the two
> >  2. Passdown PASID allocation and free to host
> >  3. Passdown guest PASID binding to host
> >  4. Passdown guest IOMMU cache invalidation to host
> >
> > The full set can be found in below link:
> > https://github.com/luxis1999/qemu.git: sva_vtd_v6_qemu_rfc_v2
> >
> > Changelog:
> >   - RFC v1 -> v2:
> >     Introduce IOMMUContext to abstract the connection between VFIO
> >     and vIOMMU emulator, which is a replacement of the PCIPASIDOps
> >     in RFC v1. Modify x-scalable-mode to be string option instead of
> >     adding a new option as RFC v1 did. Refined the pasid cache management
> >     and addressed the TODOs mentioned in RFC v1.
> >
> > RFC v1: https://patchwork.kernel.org/cover/11033657/
> >
> > Eric Auger (1):
> >   update-linux-headers: Import iommu.h
> >
> > Liu Yi L (20):
> >   header update VFIO/IOMMU vSVA APIs against 5.4.0-rc3+
> >   intel_iommu: modify x-scalable-mode to be string option
> >   vfio/common: add iommu_ctx_notifier in container
> >   hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
> >   hw/pci: introduce pci_device_iommu_context()
> >   intel_iommu: provide get_iommu_context() callback
> >   vfio/pci: add iommu_context notifier for pasid alloc/free
> >   intel_iommu: add virtual command capability support
> >   intel_iommu: process pasid cache invalidation
> >   intel_iommu: add present bit check for pasid table entries
> >   intel_iommu: add PASID cache management infrastructure
> >   vfio/pci: add iommu_context notifier for pasid bind/unbind
> >   intel_iommu: bind/unbind guest page table to host
> >   intel_iommu: replay guest pasid bindings to host
> >   intel_iommu: replay pasid binds after context cache invalidation
> >   intel_iommu: do not passdown pasid bind for PASID #0
> >   vfio/pci: add iommu_context notifier for PASID-based iotlb flush
> >   intel_iommu: process PASID-based iotlb invalidation
> >   intel_iommu: propagate PASID-based iotlb invalidation to host
> >   intel_iommu: process PASID-based Device-TLB invalidation
> >
> > Peter Xu (1):
> >   hw/iommu: introduce IOMMUContext
> >
> >  hw/Makefile.objs                |    1 +
> >  hw/alpha/typhoon.c              |    6 +-
> >  hw/arm/smmu-common.c            |    6 +-
> >  hw/hppa/dino.c                  |    6 +-
> >  hw/i386/amd_iommu.c             |    6 +-
> >  hw/i386/intel_iommu.c           | 1249 +++++++++++++++++++++++++++++++++++++--
> >  hw/i386/intel_iommu_internal.h  |  109 ++++
> >  hw/i386/trace-events            |    6 +
> >  hw/iommu/Makefile.objs          |    1 +
> >  hw/iommu/iommu.c                |   66 +++
> >  hw/pci-host/designware.c        |    6 +-
> >  hw/pci-host/ppce500.c           |    6 +-
> >  hw/pci-host/prep.c              |    6 +-
> >  hw/pci-host/sabre.c             |    6 +-
> >  hw/pci/pci.c                    |   27 +-
> >  hw/ppc/ppc440_pcix.c            |    6 +-
> >  hw/ppc/spapr_pci.c              |    6 +-
> >  hw/s390x/s390-pci-bus.c         |    8 +-
> >  hw/vfio/common.c                |   10 +
> >  hw/vfio/pci.c                   |  149 +++++
> >  include/hw/i386/intel_iommu.h   |   58 +-
> >  include/hw/iommu/iommu.h        |  113 ++++
> >  include/hw/pci/pci.h            |   13 +-
> >  include/hw/pci/pci_bus.h        |    2 +-
> >  include/hw/vfio/vfio-common.h   |    9 +
> >  linux-headers/linux/iommu.h     |  324 ++++++++++
> >  linux-headers/linux/vfio.h      |   83 +++
> >  scripts/update-linux-headers.sh |    2 +-
> >  28 files changed, 2232 insertions(+), 58 deletions(-)
> >  create mode 100644 hw/iommu/Makefile.objs
> >  create mode 100644 hw/iommu/iommu.c
> >  create mode 100644 include/hw/iommu/iommu.h
> >  create mode 100644 linux-headers/linux/iommu.h
> >
On 2019/10/25 6:12 PM, Tian, Kevin wrote:
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Friday, October 25, 2019 5:49 PM
>>
>>
>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
>>> platforms allow address space sharing between device DMA and applications.
>>
>>
>> Interesting, so the below figure demonstrates the case of VM. I wonder
>> how much differences if we compare it with doing SVM between device and
>> an ordinary process (e.g dpdk)?
>>
>> Thanks
> One difference is that ordinary process requires only stage-1 translation,
> while VM requires nested translation.


A silly question, then I believe there's no need for VFIO DMA API in this
case consider the page table is shared between MMU and IOMMU?

Thanks

>
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Thursday, October 31, 2019 12:33 PM
>
>
> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
> >> From: Jason Wang [mailto:jasowang@redhat.com]
> >> Sent: Friday, October 25, 2019 5:49 PM
> >>
> >>
> >> On 2019/10/24 8:34 PM, Liu Yi L wrote:
> >>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
> >>> platforms allow address space sharing between device DMA and
> >>> applications.
> >>
> >>
> >> Interesting, so the below figure demonstrates the case of VM. I wonder
> >> how much differences if we compare it with doing SVM between device
> >> and an ordinary process (e.g dpdk)?
> >>
> >> Thanks
> > One difference is that ordinary process requires only stage-1 translation,
> > while VM requires nested translation.
>
>
> A silly question, then I believe there's no need for VFIO DMA API in
> this case consider the page table is shared between MMU and IOMMU?
>

yes, only need to intercept guest iotlb invalidation request on stage-1
translation and then forward to IOMMU through new VFIO API. Existing
VFIO DMA API applies to only the stage-2 translation (GPA->HPA) here.

Thanks
Kevin
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Thursday, October 31, 2019 5:33 AM
> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
>
>
> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
> >> From: Jason Wang [mailto:jasowang@redhat.com]
> >> Sent: Friday, October 25, 2019 5:49 PM
> >>
> >>
> >> On 2019/10/24 8:34 PM, Liu Yi L wrote:
> >>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> >>> Intel platforms allow address space sharing between device DMA and
> >>> applications.
> >>
> >>
> >> Interesting, so the below figure demonstrates the case of VM. I
> >> wonder how much differences if we compare it with doing SVM between
> >> device and an ordinary process (e.g dpdk)?
> >>
> >> Thanks
> > One difference is that ordinary process requires only stage-1
> > translation, while VM requires nested translation.
>
>
> A silly question, then I believe there's no need for VFIO DMA API in this case consider
> the page table is shared between MMU and IOMMU?

Echo Kevin's reply. We use nested translation here. For stage-1, yes, no need to use
VFIO DMA API. For stage-2, we still use VFIO DMA API to program the GPA->HPA
mapping to host. :-)

Regards,
Yi Liu

>
> Thanks
>
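To make the stage-2 half of that answer concrete, below is a minimal userspace
sketch of the existing VFIO type1 DMA API that keeps programming the GPA->HPA
mapping. The container fd is assumed to be already set up (group attach and
VFIO_SET_IOMMU done), and error handling is omitted:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Map one range of guest RAM: the IOVA is the GPA, vaddr is the HVA
     * backing it; the host IOMMU ends up with the GPA->HPA mapping in
     * the second-level/default domain. */
    static int map_gpa_range(int container, void *hva,
                             unsigned long long gpa, unsigned long long size)
    {
        struct vfio_iommu_type1_dma_map map;

        memset(&map, 0, sizeof(map));
        map.argsz = sizeof(map);
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
        map.vaddr = (unsigned long long)(uintptr_t)hva;
        map.iova  = gpa;
        map.size  = size;

        return ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
    }

The stage-1 side (guest page table bind and PASID-based invalidation) instead
goes through the new interfaces added in the VFIO series referenced above,
which are not sketched here.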
On 2019/10/31 10:07 PM, Liu, Yi L wrote:
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Thursday, October 31, 2019 5:33 AM
>> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
>>
>>
>> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>> Sent: Friday, October 25, 2019 5:49 PM
>>>>
>>>>
>>>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
>>>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
>>>>> Intel platforms allow address space sharing between device DMA and
>>>>> applications.
>>>>
>>>>
>>>> Interesting, so the below figure demonstrates the case of VM. I
>>>> wonder how much differences if we compare it with doing SVM between
>>>> device and an ordinary process (e.g dpdk)?
>>>>
>>>> Thanks
>>> One difference is that ordinary process requires only stage-1
>>> translation, while VM requires nested translation.
>>
>> A silly question, then I believe there's no need for VFIO DMA API in this
>> case consider the page table is shared between MMU and IOMMU?
> Echo Kevin's reply. We use nested translation here. For stage-1, yes, no
> need to use VFIO DMA API. For stage-2, we still use VFIO DMA API to program
> the GPA->HPA mapping to host. :-)


Cool, two more questions:

- Can EPT shares its page table with IOMMU L2?

- Similar to EPT, when GPA->HPA (actually HVA->HPA) is modified by mm,
VFIO need to use MMU notifier do modify L2 accordingly besides DMA API?

Thanks


>
> Regards,
> Yi Liu
>> Thanks
>>
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Friday, November 1, 2019 3:30 PM
>
>
> On 2019/10/31 10:07 PM, Liu, Yi L wrote:
> >> From: Jason Wang [mailto:jasowang@redhat.com]
> >> Sent: Thursday, October 31, 2019 5:33 AM
> >> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
> >>
> >>
> >> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
> >>>> From: Jason Wang [mailto:jasowang@redhat.com]
> >>>> Sent: Friday, October 25, 2019 5:49 PM
> >>>>
> >>>>
> >>>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
> >>>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> >>>>> Intel platforms allow address space sharing between device DMA and
> >>>>> applications.
> >>>>
> >>>>
> >>>> Interesting, so the below figure demonstrates the case of VM. I
> >>>> wonder how much differences if we compare it with doing SVM between
> >>>> device and an ordinary process (e.g dpdk)?
> >>>>
> >>>> Thanks
> >>> One difference is that ordinary process requires only stage-1
> >>> translation, while VM requires nested translation.
> >>
> >> A silly question, then I believe there's no need for VFIO DMA API in this
> >> case consider the page table is shared between MMU and IOMMU?
> > Echo Kevin's reply. We use nested translation here. For stage-1, yes, no
> > need to use VFIO DMA API. For stage-2, we still use VFIO DMA API to program
> > the GPA->HPA mapping to host. :-)
>
>
> Cool, two more questions:
>
> - Can EPT shares its page table with IOMMU L2?

yes, their formats are compatible.

>
> - Similar to EPT, when GPA->HPA (actually HVA->HPA) is modified by mm,
> VFIO need to use MMU notifier do modify L2 accordingly besides DMA API?
>

VFIO devices need to pin-down guest memory pages that are mapped
in IOMMU. So notifier is not required since mm won't change the mapping
for those pages.

Thanks
Kevin
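As a rough illustration of the pinning point above, a kernel-side sketch
(simplified; the real logic lives in drivers/vfio/vfio_iommu_type1.c and also
handles accounting, huge pages and error paths) of taking references on user
pages before mapping them in the IOMMU:

    #include <linux/mm.h>

    /* Pin nr_pages of user memory starting at hva so that mm cannot change
     * the HVA->HPA mapping while the IOMMU still references those pages. */
    static int pin_guest_pages(unsigned long hva, int nr_pages,
                               struct page **pages)
    {
        return get_user_pages_fast(hva, nr_pages, FOLL_WRITE, pages);
    }

Because the pages stay pinned for the lifetime of the mapping, the GPA->HPA
(HVA->HPA) relationship cannot change underneath the IOMMU, which is why no
MMU notifier is needed on this path.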
On 2019/11/1 3:46 PM, Tian, Kevin wrote:
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Friday, November 1, 2019 3:30 PM
>>
>>
>> On 2019/10/31 10:07 PM, Liu, Yi L wrote:
>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>> Sent: Thursday, October 31, 2019 5:33 AM
>>>> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
>>>>
>>>> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
>>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>>>> Sent: Friday, October 25, 2019 5:49 PM
>>>>>>
>>>>>>
>>>>>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
>>>>>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
>>>>>>> Intel platforms allow address space sharing between device DMA and
>>>>>>> applications.
>>>>>>
>>>>>>
>>>>>> Interesting, so the below figure demonstrates the case of VM. I
>>>>>> wonder how much differences if we compare it with doing SVM between
>>>>>> device and an ordinary process (e.g dpdk)?
>>>>>>
>>>>>> Thanks
>>>>> One difference is that ordinary process requires only stage-1
>>>>> translation, while VM requires nested translation.
>>>> A silly question, then I believe there's no need for VFIO DMA API in this
>>>> case consider the page table is shared between MMU and IOMMU?
>>> Echo Kevin's reply. We use nested translation here. For stage-1, yes, no
>>> need to use VFIO DMA API. For stage-2, we still use VFIO DMA API to program
>>> the GPA->HPA mapping to host. :-)
>>
>> Cool, two more questions:
>>
>> - Can EPT shares its page table with IOMMU L2?
> yes, their formats are compatible.
>
>> - Similar to EPT, when GPA->HPA (actually HVA->HPA) is modified by mm,
>> VFIO need to use MMU notifier do modify L2 accordingly besides DMA API?
>>
> VFIO devices need to pin-down guest memory pages that are mapped
> in IOMMU. So notifier is not required since mm won't change the mapping
> for those pages.

The GUP tends to lead a lot of issues, we may consider to allow
userspace to choose to not pin them in the future.

Thanks

>
> Thanks
> Kevin
On 2019/11/1 4:04 PM, Jason Wang wrote:
>
> On 2019/11/1 3:46 PM, Tian, Kevin wrote:
>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>> Sent: Friday, November 1, 2019 3:30 PM
>>>
>>>
>>> On 2019/10/31 10:07 PM, Liu, Yi L wrote:
>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>>> Sent: Thursday, October 31, 2019 5:33 AM
>>>>> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
>>>>>
>>>>> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
>>>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>>>>> Sent: Friday, October 25, 2019 5:49 PM
>>>>>>>
>>>>>>>
>>>>>>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
>>>>>>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
>>>>>>>> Intel platforms allow address space sharing between device DMA and
>>>>>>>> applications.
>>>>>>>
>>>>>>>
>>>>>>> Interesting, so the below figure demonstrates the case of VM. I
>>>>>>> wonder how much differences if we compare it with doing SVM between
>>>>>>> device and an ordinary process (e.g dpdk)?
>>>>>>>
>>>>>>> Thanks
>>>>>> One difference is that ordinary process requires only stage-1
>>>>>> translation, while VM requires nested translation.
>>>>> A silly question, then I believe there's no need for VFIO DMA API in this
>>>>> case consider the page table is shared between MMU and IOMMU?
>>>> Echo Kevin's reply. We use nested translation here. For stage-1, yes, no
>>>> need to use VFIO DMA API. For stage-2, we still use VFIO DMA API to program
>>>> the GPA->HPA mapping to host. :-)
>>>
>>> Cool, two more questions:
>>>
>>> - Can EPT shares its page table with IOMMU L2?
>> yes, their formats are compatible.
>>
>>> - Similar to EPT, when GPA->HPA (actually HVA->HPA) is modified by mm,
>>> VFIO need to use MMU notifier do modify L2 accordingly besides DMA API?
>>>
>> VFIO devices need to pin-down guest memory pages that are mapped
>> in IOMMU. So notifier is not required since mm won't change the mapping
>> for those pages.
>
>
> The GUP tends to lead a lot of issues, we may consider to allow
> userspace to choose to not pin them in the future.

Btw, I'm asking since I see MMU notifier is used by intel-svm.c to flush
IOTLB. (I don't see any users in kernel source that use that API though
e.g intel_svm_bind_mm()).

Thanks

>
> Thanks
>
>
>>
>> Thanks
>> Kevin
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Friday, November 1, 2019 4:10 PM
>
>
> On 2019/11/1 4:04 PM, Jason Wang wrote:
> >
> > On 2019/11/1 3:46 PM, Tian, Kevin wrote:
> >>> From: Jason Wang [mailto:jasowang@redhat.com]
> >>> Sent: Friday, November 1, 2019 3:30 PM
> >>>
> >>>
> >>> On 2019/10/31 10:07 PM, Liu, Yi L wrote:
> >>>>> From: Jason Wang [mailto:jasowang@redhat.com]
> >>>>> Sent: Thursday, October 31, 2019 5:33 AM
> >>>>> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
> >>>>>
> >>>>> On 2019/10/25 6:12 PM, Tian, Kevin wrote:
> >>>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
> >>>>>>> Sent: Friday, October 25, 2019 5:49 PM
> >>>>>>>
> >>>>>>>
> >>>>>>> On 2019/10/24 8:34 PM, Liu Yi L wrote:
> >>>>>>>> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> >>>>>>>> Intel platforms allow address space sharing between device DMA and
> >>>>>>>> applications.
> >>>>>>>
> >>>>>>>
> >>>>>>> Interesting, so the below figure demonstrates the case of VM. I
> >>>>>>> wonder how much differences if we compare it with doing SVM between
> >>>>>>> device and an ordinary process (e.g dpdk)?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>> One difference is that ordinary process requires only stage-1
> >>>>>> translation, while VM requires nested translation.
> >>>>> A silly question, then I believe there's no need for VFIO DMA API
> >>>>> in this case consider the page table is shared between MMU and IOMMU?
> >>>> Echo Kevin's reply. We use nested translation here. For stage-1, yes, no
> >>>> need to use VFIO DMA API. For stage-2, we still use VFIO DMA API to program
> >>>> the GPA->HPA mapping to host. :-)
> >>>
> >>> Cool, two more questions:
> >>>
> >>> - Can EPT shares its page table with IOMMU L2?
> >> yes, their formats are compatible.
> >>
> >>> - Similar to EPT, when GPA->HPA (actually HVA->HPA) is modified by mm,
> >>> VFIO need to use MMU notifier do modify L2 accordingly besides DMA API?
> >>>
> >> VFIO devices need to pin-down guest memory pages that are mapped
> >> in IOMMU. So notifier is not required since mm won't change the mapping
> >> for those pages.
> >
> >
> > The GUP tends to lead a lot of issues, we may consider to allow
> > userspace to choose to not pin them in the future.
>
>
> Btw, I'm asking since I see MMU notifier is used by intel-svm.c to flush
> IOTLB. (I don't see any users in kernel source that use that API though
> e.g intel_svm_bind_mm()).
>

intel-svm.c requires MMU notifier to invalidate IOTLB upon any change
on the CPU page table, when the latter is shared with device in SVA case.
But for VFIO usage, which is based on stage2, the map/unmap requests
explicitly come from userspace. there is no need to sync with mm.

Thanks
Kevin
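For reference, a heavily simplified kernel-side sketch of the notifier
arrangement being described. The flush helper is hypothetical; intel-svm.c's
real callbacks and queued-invalidation code are considerably more involved:

    #include <linux/mm.h>
    #include <linux/mmu_notifier.h>
    #include <linux/sched.h>

    /* Hypothetical helper: would issue PASID-tagged IOTLB invalidations
     * to the IOMMU for the given virtual address range. */
    static void sva_flush_iotlb(struct mm_struct *mm,
                                unsigned long start, unsigned long end)
    {
    }

    /* When the CPU page table is shared with the device (stage-1 SVA),
     * every mm change has to be mirrored into the IOTLB. */
    static void sva_invalidate_range(struct mmu_notifier *mn,
                                     struct mm_struct *mm,
                                     unsigned long start, unsigned long end)
    {
        sva_flush_iotlb(mm, start, end);
    }

    static const struct mmu_notifier_ops sva_mn_ops = {
        .invalidate_range = sva_invalidate_range,
    };

    static struct mmu_notifier sva_mn = { .ops = &sva_mn_ops };

    static int sva_bind_current_mm(void)
    {
        /* Not needed on the pinned, userspace-driven stage-2 path. */
        return mmu_notifier_register(&sva_mn, current->mm);
    }

The contrast with the previous reply is exactly the point made above: the
stage-2 (GPA->HPA) mappings are pinned and driven by explicit map/unmap
requests from userspace, so no notifier is required there.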
On Thu, Oct 24, 2019 at 08:34:21AM -0400, Liu Yi L wrote:
> Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
> platforms allow address space sharing between device DMA and applications.
> SVA can reduce programming complexity and enhance security.
> This series is intended to expose SVA capability to VMs. i.e. shared guest
> application address space with passthru devices. The whole SVA virtualization
> requires QEMU/VFIO/IOMMU changes. This series includes the QEMU changes, for
> VFIO and IOMMU changes, they are in separate series (listed in the "Related
> series").
>
> The high-level architecture for SVA virtualization is as below:
>
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest process CR3, FL only|
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush -
>     '-------------'       |
>     |             |       V
>     |             |     CR3 in GPA
>     '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>     .-------------.  .----------------------.
>     |   pIOMMU    |  | Bind FL for GVA-GPA  |
>     |             |  '----------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.------------------------------.
>     |             |   |SL for GPA-HPA, default domain|
>     |             |   '------------------------------'
>     '-------------'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables

Yi,

Would you mind to always mention what tests you have been done with the
patchset in the cover letter? It'll be fine to say that you're running this
against FPGAs so no one could really retest it, but still it would be good
to know that as well. It'll even be better to mention that which part of
the series is totally untested if you are aware of.

Thanks,
> From: Peter Xu [mailto:peterx@redhat.com]
> Sent: Tuesday, November 5, 2019 1:23 AM
> To: Liu, Yi L <yi.l.liu@intel.com>
> Subject: Re: [RFC v2 00/22] intel_iommu: expose Shared Virtual Addressing to VM
>
> On Thu, Oct 24, 2019 at 08:34:21AM -0400, Liu Yi L wrote:
> > Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on
> > Intel platforms allow address space sharing between device DMA and applications.
> > SVA can reduce programming complexity and enhance security.
> > This series is intended to expose SVA capability to VMs. i.e. shared
> > guest application address space with passthru devices. The whole SVA
> > virtualization requires QEMU/VFIO/IOMMU changes. This series includes
> > the QEMU changes, for VFIO and IOMMU changes, they are in separate
> > series (listed in the "Related series").
> > [...]
>
> Yi,
>
> Would you mind to always mention what tests you have been done with the
> patchset in the cover letter? It'll be fine to say that you're running this
> against FPGAs so no one could really retest it, but still it would be good
> to know that as well. It'll even be better to mention that which part of
> the series is totally untested if you are aware of.

Sure, I should have included the test parts. Will do in next version.

Thanks,
Yi Liu