Message ID | 0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com |
---|---|
Series | Connect VFIO to IOMMUFD |
On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:

> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd

[...]

> v2:
>  - Rebase to v6.1-rc3, v4 iommufd series
>  - Fixup comments and commit messages from list remarks
>  - Fix leaking of the iommufd for mdevs
>  - New patch to fix vfio modaliases when vfio container is disabled
>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this

I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
On 2022/11/8 17:19, Nicolin Chen wrote:
> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>
>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>
> [...]
>
>> v2:
>>  - Rebase to v6.1-rc3, v4 iommufd series
>>  - Fixup comments and commit messages from list remarks
>>  - Fix leaking of the iommufd for mdevs
>>  - New patch to fix vfio modaliases when vfio container is disabled
>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>
> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.

Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.
Every mail in this series is shown thrice in lore:

https://lore.kernel.org/all/0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com/

Not sure what caused it, but it's annoying to check the conversation there. The iommufd series doesn't have this problem.

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, November 8, 2022 8:53 AM
>
> This series provides an alternative container layer for VFIO implemented using iommufd. This is optional; if CONFIG_IOMMUFD is not set then it will not be compiled in.
>
> At this point iommufd can be injected by passing an iommufd FD to VFIO_GROUP_SET_CONTAINER, which will use the VFIO compat layer in iommufd to obtain the compat IOAS and then connect up all the VFIO drivers as appropriate.
>
> This is a temporary stopping point; a following series will provide a way to directly open a VFIO device FD and directly connect it to IOMMUFD using native ioctls that can expose the IOMMUFD features like hwpt, future vPASID and dynamic attachment.
>
> This series, in compat mode, has passed all the qemu tests we have available, including the test suites for the Intel GVT mdev. Aside from the temporary limitation with P2P memory this is believed to be fully compatible with VFIO.
>
> This is on github:
> https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>
> It requires the iommufd series:
>
> https://lore.kernel.org/r/0-v4-0de2f6c78ed0+9d1-iommufd_jgg@nvidia.com
>
> v2:
>  - Rebase to v6.1-rc3, v4 iommufd series
>  - Fixup comments and commit messages from list remarks
>  - Fix leaking of the iommufd for mdevs
>  - New patch to fix vfio modaliases when vfio container is disabled
>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>
> v1: https://lore.kernel.org/r/0-v1-4991695894d8+211-vfio_iommufd_jgg@nvidia.com
>
> Jason Gunthorpe (11):
>   vfio: Move vfio_device driver open/close code to a function
>   vfio: Move vfio_device_assign_container() into vfio_device_first_open()
>   vfio: Rename vfio_device_assign/unassign_container()
>   vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
>   vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
>   vfio-iommufd: Allow iommufd to be used in place of a container fd
>   vfio-iommufd: Support iommufd for physical VFIO devices
>   vfio-iommufd: Support iommufd for emulated VFIO devices
>   vfio: Move container related MODULE_ALIAS statements into container.c
>   vfio: Make vfio_container optionally compiled
>   iommufd: Allow iommufd to supply /dev/vfio/vfio
>
>  drivers/gpu/drm/i915/gvt/kvmgt.c               |   3 +
>  drivers/iommu/iommufd/Kconfig                  |  12 +
>  drivers/iommu/iommufd/main.c                   |  36 ++
>  drivers/s390/cio/vfio_ccw_ops.c                |   3 +
>  drivers/s390/crypto/vfio_ap_ops.c              |   3 +
>  drivers/vfio/Kconfig                           |  36 +-
>  drivers/vfio/Makefile                          |   5 +-
>  drivers/vfio/container.c                       | 141 ++------
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c              |   3 +
>  drivers/vfio/iommufd.c                         | 157 ++++++++
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c     |   6 +
>  drivers/vfio/pci/mlx5/main.c                   |   3 +
>  drivers/vfio/pci/vfio_pci.c                    |   3 +
>  drivers/vfio/platform/vfio_amba.c              |   3 +
>  drivers/vfio/platform/vfio_platform.c          |   3 +
>  drivers/vfio/vfio.h                            | 100 +++++-
>  drivers/vfio/vfio_iommu_type1.c                |   5 +-
>  drivers/vfio/vfio_main.c                       | 338 ++++++++++++++----
>  include/linux/vfio.h                           |  39 ++
>  19 files changed, 700 insertions(+), 199 deletions(-)
>  create mode 100644 drivers/vfio/iommufd.c
>
>
> base-commit: ca3067007d4f2aa7f3a5375bd256839e08a09453
> --
> 2.38.1
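[Editor's note: as background for the compat flow described in the cover letter, here is a minimal userspace sketch of attaching a VFIO group to iommufd through the existing container ioctls. It is an illustration only, not code from the series; the group number 26 is an arbitrary example and error handling is trimmed.]

/*
 * Sketch only: the legacy VFIO group flow, but with the "container" fd
 * opened from /dev/vfio/vfio being served by iommufd's compat layer
 * (or, with this series, an iommufd fd passed directly to
 * VFIO_GROUP_SET_CONTAINER).
 */
#include <fcntl.h>
#include <linux/vfio.h>
#include <stdio.h>
#include <sys/ioctl.h>

int main(void)
{
	int container_fd, group_fd;

	/* With CONFIG_IOMMUFD_VFIO_CONTAINER this node is provided by iommufd */
	container_fd = open("/dev/vfio/vfio", O_RDWR);
	group_fd = open("/dev/vfio/26", O_RDWR);	/* example group */
	if (container_fd < 0 || group_fd < 0)
		return 1;

	/* Connect the group to the container (compat IOAS underneath) */
	if (ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container_fd))
		return 1;

	/* Existing type1 ioctls keep working against the compat IOAS */
	if (ioctl(container_fd, VFIO_SET_IOMMU, VFIO_TYPE1v2_IOMMU))
		return 1;

	printf("group attached via iommufd compat container\n");
	return 0;
}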
On Wed, Nov 09, 2022 at 09:03:52AM +0000, Tian, Kevin wrote:
> Every mail in this series is shown thrice in lore:
>
> https://lore.kernel.org/all/0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com/
>
> Not sure what caused it, but it's annoying to check the conversation there.

It is sort of a lore issue; it only combines messages that are exactly the same. Several of the mailing lists on CC here mangle the message in various ways, e.g. adding a trailer or whatever. This causes repeated messages in lore.

The trick in lore is to replace "/all/" with a good list, like /kvm/ or /linux-iommu/, that shows the original non-mangled version, and only once.

Jason
On Tue, Nov 08, 2022 at 11:18:03PM +0800, Yi Liu wrote:
> On 2022/11/8 17:19, Nicolin Chen wrote:
>> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>>
>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>
>> [...]
>>
>>> v2:
>>>  - Rebase to v6.1-rc3, v4 iommufd series
>>>  - Fixup comments and commit messages from list remarks
>>>  - Fix leaking of the iommufd for mdevs
>>>  - New patch to fix vfio modaliases when vfio container is disabled
>>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>>
>> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
>
> Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.

I've updated the branches with the two functional fixes discussed on the list plus all the doc updates.

Thanks,
Jason
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, November 9, 2022 8:48 PM
>
> On Wed, Nov 09, 2022 at 09:03:52AM +0000, Tian, Kevin wrote:
>> Every mail in this series is shown thrice in lore:
>>
>> https://lore.kernel.org/all/0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com/
>>
>> Not sure what caused it, but it's annoying to check the conversation there.
>
> It is sort of a lore issue; it only combines messages that are exactly the same. Several of the mailing lists on CC here mangle the message in various ways, e.g. adding a trailer or whatever. This causes repeated messages in lore.
>
> The trick in lore is to replace "/all/" with a good list, like /kvm/ or /linux-iommu/, that shows the original non-mangled version, and only once.

This trick works. Thanks!
On 11/7/22 7:52 PM, Jason Gunthorpe wrote:
> This series provides an alternative container layer for VFIO implemented using iommufd. This is optional; if CONFIG_IOMMUFD is not set then it will not be compiled in.
>
> At this point iommufd can be injected by passing an iommufd FD to VFIO_GROUP_SET_CONTAINER, which will use the VFIO compat layer in iommufd to obtain the compat IOAS and then connect up all the VFIO drivers as appropriate.
>
> This is a temporary stopping point; a following series will provide a way to directly open a VFIO device FD and directly connect it to IOMMUFD using native ioctls that can expose the IOMMUFD features like hwpt, future vPASID and dynamic attachment.
>
> This series, in compat mode, has passed all the qemu tests we have available, including the test suites for the Intel GVT mdev. Aside from the temporary limitation with P2P memory this is believed to be fully compatible with VFIO.

AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.

Was this because you envision the limit not being applicable for iommufd (limits will be enforced via other means, so eventually we won't want it), or was it an oversight?

Thanks,
Matt
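[Editor's note: the knob referred to above is a plain module parameter of the legacy type1 backend. The sketch below approximates how drivers/vfio/vfio_iommu_type1.c declares it, and is shown only for context; treat the exact wording as illustrative rather than authoritative.]

/*
 * Approximate sketch of the existing vfio_iommu_type1 knob being
 * discussed.  The 0644 permissions mean it can also be adjusted at
 * runtime via /sys/module/vfio_iommu_type1/parameters/dma_entry_limit.
 */
#include <linux/limits.h>
#include <linux/module.h>

static unsigned int dma_entry_limit __read_mostly = U16_MAX;
module_param_named(dma_entry_limit, dma_entry_limit, uint, 0644);
MODULE_PARM_DESC(dma_entry_limit,
		 "Maximum number of user DMA mappings per container (65535).");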
On 2022/11/10 00:57, Jason Gunthorpe wrote:
> On Tue, Nov 08, 2022 at 11:18:03PM +0800, Yi Liu wrote:
>> On 2022/11/8 17:19, Nicolin Chen wrote:
>>> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>>>
>>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>
>>> [...]
>>>
>>>> v2:
>>>>  - Rebase to v6.1-rc3, v4 iommufd series
>>>>  - Fixup comments and commit messages from list remarks
>>>>  - Fix leaking of the iommufd for mdevs
>>>>  - New patch to fix vfio modaliases when vfio container is disabled
>>>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>>>
>>> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
>>
>> Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.
>
> I've updated the branches with the two functional fixes discussed on the list plus all the doc updates.

I see. Due to timezone differences, the kernel we grabbed is 37c9e6e44d77a; it has a slight diff in scripts/kernel-doc compared with the latest commit (6bb16a9c67769). I don't think it impacts the tests.

https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/log/?h=for-next (37c9e6e44d77a)

On our side, Yu He and Lixiao Yang have run the tests below on an Intel platform with the above kernel; the results are:

1) GVT-g test suite passed, Intel iGFx passthrough passed.

2) NIC passthrough test with different guest memory sizes (1G/4G) passed.

3) Booting two different QEMUs at the same time, with one QEMU opening the legacy /dev/vfio/vfio and the other opening /dev/iommu. Tests passed.

4) Tried the Kconfig combinations below; the results are as expected.

   VFIO_CONTAINER=y, IOMMUFD=y -- test pass
   VFIO_CONTAINER=y, IOMMUFD=n -- test pass
   VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=y -- test pass
   VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=n -- no /dev/vfio/vfio, so the test fails, as expected

5) Tested devices from a multi-device group. Assigning such devices to the same VM passes; assigning them to different VMs fails; assigning them to a VM with Intel virtual VT-d fails. The results are as expected.

Meanwhile, I also tested the development branch for nesting; the basic functionality looks good.

Tested-by: Yi Liu <yi.l.liu@intel.com>
On Thu, Nov 10, 2022 at 10:01:13PM -0500, Matthew Rosato wrote:
> On 11/7/22 7:52 PM, Jason Gunthorpe wrote:
>> This series provides an alternative container layer for VFIO implemented using iommufd. This is optional; if CONFIG_IOMMUFD is not set then it will not be compiled in.
>>
>> At this point iommufd can be injected by passing an iommufd FD to VFIO_GROUP_SET_CONTAINER, which will use the VFIO compat layer in iommufd to obtain the compat IOAS and then connect up all the VFIO drivers as appropriate.
>>
>> This is a temporary stopping point; a following series will provide a way to directly open a VFIO device FD and directly connect it to IOMMUFD using native ioctls that can expose the IOMMUFD features like hwpt, future vPASID and dynamic attachment.
>>
>> This series, in compat mode, has passed all the qemu tests we have available, including the test suites for the Intel GVT mdev. Aside from the temporary limitation with P2P memory this is believed to be fully compatible with VFIO.
>
> AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.

No, there is no arbitrary limit on iommufd.

> Was this because you envision the limit not being applicable for iommufd (limits will be enforced via other means, so eventually we won't want it), or was it an oversight?

The limit here is primarily about limiting userspace abuse of the interface.

iommufd is using GFP_KERNEL_ACCOUNT, which shifts the responsibility to cgroups; this is similar to how KVM works.

So, for a VM sandbox you'd set a cgroup limit, and if a hostile userspace in the sandbox decides to try to OOM the system it will hit that limit, regardless of which kernel APIs it tries to abuse.

This work is not entirely complete, as we also need the iommu driver to use GFP_KERNEL_ACCOUNT for allocations connected to the iommu_domain, particularly for allocations of the IO page tables themselves - which can be quite big.

Jason
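[Editor's note: to illustrate the accounting model described above, here is a small, purely hypothetical kernel-side sketch; the structure and function names are invented for illustration and are not taken from iommufd. The point is the gfp flag: allocations made on behalf of userspace with GFP_KERNEL_ACCOUNT are charged to the caller's memory cgroup, so the memcg limit, rather than a per-interface counter, bounds what an abusive user can consume.]

/* Hypothetical example only - not iommufd code. */
#include <linux/list.h>
#include <linux/slab.h>

struct demo_mapping {
	struct list_head node;
	unsigned long iova;
	unsigned long length;
};

static struct demo_mapping *demo_mapping_alloc(unsigned long iova,
					       unsigned long length)
{
	struct demo_mapping *map;

	/* Charged to the calling task's memcg, similar to KVM's allocations */
	map = kzalloc(sizeof(*map), GFP_KERNEL_ACCOUNT);
	if (!map)
		return NULL;

	INIT_LIST_HEAD(&map->node);
	map->iova = iova;
	map->length = length;
	return map;
}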
On 2022/11/14 20:51, Yi Liu wrote:
> On 2022/11/10 00:57, Jason Gunthorpe wrote:
>> On Tue, Nov 08, 2022 at 11:18:03PM +0800, Yi Liu wrote:
>>> On 2022/11/8 17:19, Nicolin Chen wrote:
>>>> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>>>>
>>>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>>
>>>> [...]
>>>>
>>>>> v2:
>>>>>  - Rebase to v6.1-rc3, v4 iommufd series
>>>>>  - Fixup comments and commit messages from list remarks
>>>>>  - Fix leaking of the iommufd for mdevs
>>>>>  - New patch to fix vfio modaliases when vfio container is disabled
>>>>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>>>>
>>>> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
>>>
>>> Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.
>>
>> I've updated the branches with the two functional fixes discussed on the list plus all the doc updates.
>
> I see. Due to timezone differences, the kernel we grabbed is 37c9e6e44d77a; it has a slight diff in scripts/kernel-doc compared with the latest commit (6bb16a9c67769). I don't think it impacts the tests.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/log/?h=for-next (37c9e6e44d77a)
>
> On our side, Yu He and Lixiao Yang have run the tests below on an Intel platform with the above kernel; the results are:
>
> 1) GVT-g test suite passed, Intel iGFx passthrough passed.
>
> 2) NIC passthrough test with different guest memory sizes (1G/4G) passed.
>
> 3) Booting two different QEMUs at the same time, with one QEMU opening the legacy /dev/vfio/vfio and the other opening /dev/iommu. Tests passed.
>
> 4) Tried the Kconfig combinations below; the results are as expected.
>
>    VFIO_CONTAINER=y, IOMMUFD=y -- test pass
>    VFIO_CONTAINER=y, IOMMUFD=n -- test pass
>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=y -- test pass
>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=n -- no /dev/vfio/vfio, so the test fails, as expected
>
> 5) Tested devices from a multi-device group. Assigning such devices to the same VM passes; assigning them to different VMs fails; assigning them to a VM with Intel virtual VT-d fails. The results are as expected.
>
> Meanwhile, I also tested the development branch for nesting; the basic functionality looks good.
>
> Tested-by: Yi Liu <yi.l.liu@intel.com>

Tested-by: Lixiao Yang <lixiao.yang@intel.com>

--
Regards,
Lixiao Yang
On Mon, Nov 14, 2022 at 08:51:58PM +0800, Yi Liu wrote:

> On our side, Yu He and Lixiao Yang have run the tests below on an Intel platform with the above kernel; the results are:
>
> 1) GVT-g test suite passed, Intel iGFx passthrough passed.
>
> 2) NIC passthrough test with different guest memory sizes (1G/4G) passed.
>
> 3) Booting two different QEMUs at the same time, with one QEMU opening the legacy /dev/vfio/vfio and the other opening /dev/iommu. Tests passed.
>
> 4) Tried the Kconfig combinations below; the results are as expected.
>
>    VFIO_CONTAINER=y, IOMMUFD=y -- test pass
>    VFIO_CONTAINER=y, IOMMUFD=n -- test pass
>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=y -- test pass
>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=n -- no /dev/vfio/vfio, so the test fails, as expected
>
> 5) Tested devices from a multi-device group. Assigning such devices to the same VM passes; assigning them to different VMs fails; assigning them to a VM with Intel virtual VT-d fails. The results are as expected.
>
> Meanwhile, I also tested the development branch for nesting; the basic functionality looks good.
>
> Tested-by: Yi Liu <yi.l.liu@intel.com>

Great, thanks!

In future I also recommend running tests with CONFIG_IOMMUFD_TEST turned on; it enables a bunch more fast path assertions that might catch something interesting.

Jason
On 2022/11/14 22:38, Jason Gunthorpe wrote:
> On Mon, Nov 14, 2022 at 08:51:58PM +0800, Yi Liu wrote:
>
>> On our side, Yu He and Lixiao Yang have run the tests below on an Intel platform with the above kernel; the results are:
>>
>> 1) GVT-g test suite passed, Intel iGFx passthrough passed.
>>
>> 2) NIC passthrough test with different guest memory sizes (1G/4G) passed.
>>
>> 3) Booting two different QEMUs at the same time, with one QEMU opening the legacy /dev/vfio/vfio and the other opening /dev/iommu. Tests passed.
>>
>> 4) Tried the Kconfig combinations below; the results are as expected.
>>
>>    VFIO_CONTAINER=y, IOMMUFD=y -- test pass
>>    VFIO_CONTAINER=y, IOMMUFD=n -- test pass
>>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=y -- test pass
>>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=n -- no /dev/vfio/vfio, so the test fails, as expected
>>
>> 5) Tested devices from a multi-device group. Assigning such devices to the same VM passes; assigning them to different VMs fails; assigning them to a VM with Intel virtual VT-d fails. The results are as expected.
>>
>> Meanwhile, I also tested the development branch for nesting; the basic functionality looks good.
>>
>> Tested-by: Yi Liu <yi.l.liu@intel.com>
>
> Great, thanks!

You are welcome. This is a team effort. :)

> In future I also recommend running tests with CONFIG_IOMMUFD_TEST turned on; it enables a bunch more fast path assertions that might catch something interesting.

Oh, sure, will try.
On 11/14/22 9:23 AM, Jason Gunthorpe wrote:
> On Thu, Nov 10, 2022 at 10:01:13PM -0500, Matthew Rosato wrote:
>> On 11/7/22 7:52 PM, Jason Gunthorpe wrote:
>>> This series provides an alternative container layer for VFIO implemented using iommufd. This is optional; if CONFIG_IOMMUFD is not set then it will not be compiled in.
>>>
>>> At this point iommufd can be injected by passing an iommufd FD to VFIO_GROUP_SET_CONTAINER, which will use the VFIO compat layer in iommufd to obtain the compat IOAS and then connect up all the VFIO drivers as appropriate.
>>>
>>> This is a temporary stopping point; a following series will provide a way to directly open a VFIO device FD and directly connect it to IOMMUFD using native ioctls that can expose the IOMMUFD features like hwpt, future vPASID and dynamic attachment.
>>>
>>> This series, in compat mode, has passed all the qemu tests we have available, including the test suites for the Intel GVT mdev. Aside from the temporary limitation with P2P memory this is believed to be fully compatible with VFIO.
>>
>> AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.
>
> No, there is no arbitrary limit on iommufd.

Yeah, that's what I suspected. But FWIW, userspace checks the advertised limit via VFIO_IOMMU_GET_INFO / VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, and this is still being advertised as 65535 when using iommufd. I don't think there is a defined way to return 'ignore this value'.

This should go away later when we bind to iommufd directly, since QEMU would not be sharing the type1 codepath in userspace.

>> Was this because you envision the limit not being applicable for iommufd (limits will be enforced via other means, so eventually we won't want it), or was it an oversight?
>
> The limit here is primarily about limiting userspace abuse of the interface.
>
> iommufd is using GFP_KERNEL_ACCOUNT, which shifts the responsibility to cgroups; this is similar to how KVM works.
>
> So, for a VM sandbox you'd set a cgroup limit, and if a hostile userspace in the sandbox decides to try to OOM the system it will hit that limit, regardless of which kernel APIs it tries to abuse.
>
> This work is not entirely complete, as we also need the iommu driver to use GFP_KERNEL_ACCOUNT for allocations connected to the iommu_domain, particularly for allocations of the IO page tables themselves - which can be quite big.
>
> Jason
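[Editor's note: for readers unfamiliar with the interface being discussed, a minimal userspace sketch of how the advertised limit is read follows. It is not taken from QEMU; it assumes container_fd is an open /dev/vfio/vfio (or iommufd compat) fd with a group attached and the IOMMU set, and error handling is trimmed.]

/*
 * Walk the VFIO_IOMMU_GET_INFO capability chain and print
 * VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, the value the discussion is about.
 */
#include <linux/vfio.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>

static void print_dma_avail(int container_fd)
{
	struct vfio_iommu_type1_info probe = { .argsz = sizeof(probe) };
	struct vfio_iommu_type1_info *info;
	struct vfio_info_cap_header *cap;

	/* First call just learns the size including the capability chain */
	if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, &probe))
		return;

	info = calloc(1, probe.argsz);
	if (!info)
		return;
	info->argsz = probe.argsz;

	if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, info) == 0 &&
	    (info->flags & VFIO_IOMMU_INFO_CAPS) && info->cap_offset) {
		/* cap_offset and next are offsets from the start of info */
		cap = (void *)((char *)info + info->cap_offset);
		while (1) {
			if (cap->id == VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL) {
				struct vfio_iommu_type1_info_dma_avail *avail =
					(void *)cap;
				printf("dma avail: %u\n", avail->avail);
				break;
			}
			if (!cap->next)
				break;
			cap = (void *)((char *)info + cap->next);
		}
	}
	free(info);
}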
On Mon, Nov 14, 2022 at 09:55:21AM -0500, Matthew Rosato wrote:
>>> AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.
>>
>> No, there is no arbitrary limit on iommufd.
>
> Yeah, that's what I suspected. But FWIW, userspace checks the advertised limit via VFIO_IOMMU_GET_INFO / VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, and this is still being advertised as 65535 when using iommufd. I don't think there is a defined way to return 'ignore this value'.

Is something using this? Should we make it much bigger?

Jason
On 11/14/22 9:59 AM, Jason Gunthorpe wrote:
> On Mon, Nov 14, 2022 at 09:55:21AM -0500, Matthew Rosato wrote:
>>>> AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.
>>>
>>> No, there is no arbitrary limit on iommufd.
>>
>> Yeah, that's what I suspected. But FWIW, userspace checks the advertised limit via VFIO_IOMMU_GET_INFO / VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, and this is still being advertised as 65535 when using iommufd. I don't think there is a defined way to return 'ignore this value'.
>
> Is something using this? Should we make it much bigger?

Yes, s390 when doing lazy unmapping likes to use larger amounts of concurrent DMA, so there can be cases where we want to raise this limit.

The initial value of 65535 is already pretty arbitrary (U16_MAX) -- If iommufd is doing its own management and this value becomes deprecated in this scenario, and we can't set it to a magic value that says 'ignore me', then maybe it just makes sense for now to set it arbitrarily larger when using iommufd, e.g. U32_MAX?
On Mon, Nov 14, 2022 at 10:21:50AM -0500, Matthew Rosato wrote:
> On 11/14/22 9:59 AM, Jason Gunthorpe wrote:
>> On Mon, Nov 14, 2022 at 09:55:21AM -0500, Matthew Rosato wrote:
>>>>> AFAICT there is no equivalent means to specify vfio_iommu_type1.dma_entry_limit when using iommufd; it looks like we'll just always get the default 65535.
>>>>
>>>> No, there is no arbitrary limit on iommufd.
>>>
>>> Yeah, that's what I suspected. But FWIW, userspace checks the advertised limit via VFIO_IOMMU_GET_INFO / VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, and this is still being advertised as 65535 when using iommufd. I don't think there is a defined way to return 'ignore this value'.
>>
>> Is something using this? Should we make it much bigger?
>
> Yes, s390 when doing lazy unmapping likes to use larger amounts of concurrent DMA, so there can be cases where we want to raise this limit.
>
> The initial value of 65535 is already pretty arbitrary (U16_MAX) --

It was chosen to match VFIO's default.

> If iommufd is doing its own management and this value becomes deprecated in this scenario, and we can't set it to a magic value that says 'ignore me', then maybe it just makes sense for now to set it arbitrarily larger when using iommufd, e.g. U32_MAX?

Sure:

	/*
	 * iommufd's limit is based on the cgroup's memory limit.
	 * Normally vfio would return U16_MAX here, and provide a module
	 * parameter to adjust it. Since S390 qemu userspace actually
	 * pays attention and needs a value bigger than U16_MAX return
	 * U32_MAX.
	 */
	.avail = U32_MAX,

Thanks,
Jason
On 11/9/22 11:57 AM, Jason Gunthorpe wrote:
> On Tue, Nov 08, 2022 at 11:18:03PM +0800, Yi Liu wrote:
>> On 2022/11/8 17:19, Nicolin Chen wrote:
>>> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>>>
>>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>
>>> [...]
>>>
>>>> v2:
>>>>  - Rebase to v6.1-rc3, v4 iommufd series
>>>>  - Fixup comments and commit messages from list remarks
>>>>  - Fix leaking of the iommufd for mdevs
>>>>  - New patch to fix vfio modaliases when vfio container is disabled
>>>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>>>
>>> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
>>
>> Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.
>
> I've updated the branches with the two functional fixes discussed on the list plus all the doc updates.

For s390, I tested vfio-pci against some data mover workloads using QEMU on s390x with CONFIG_VFIO_CONTAINER=y and =n, using zPCI interpretation assists (over ism/SMC-D, mlx5 and NVMe) and without zPCI interpretation assists (over mlx5 and NVMe) - and will continue testing with more aggressive workloads. (I did not run with CONFIG_IOMMUFD_TEST other than when building the selftest, but I see you mentioned this to Yi -- I'll incorporate that setting into future runs.)

Ran the self-tests on s390 in LPAR and within a QEMU guest -- all tests pass (used 1M hugepages).

Did light regression testing of vfio-ap and vfio-ccw on s390x with CONFIG_VFIO_CONTAINER=y and =n.

Didn't see it in your branch yet, but I also verified that the proposed change to iommufd_fill_cap_dma_avail (.avail = U32_MAX) would work as expected.

Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
On 2022/11/14 22:37, Yang, Lixiao wrote:
> On 2022/11/14 20:51, Yi Liu wrote:
>> On 2022/11/10 00:57, Jason Gunthorpe wrote:
>>> On Tue, Nov 08, 2022 at 11:18:03PM +0800, Yi Liu wrote:
>>>> On 2022/11/8 17:19, Nicolin Chen wrote:
>>>>> On Mon, Nov 07, 2022 at 08:52:44PM -0400, Jason Gunthorpe wrote:
>>>>>
>>>>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>>>
>>>>> [...]
>>>>>
>>>>>> v2:
>>>>>>  - Rebase to v6.1-rc3, v4 iommufd series
>>>>>>  - Fixup comments and commit messages from list remarks
>>>>>>  - Fix leaking of the iommufd for mdevs
>>>>>>  - New patch to fix vfio modaliases when vfio container is disabled
>>>>>>  - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened to signal that iommufd is providing this
>>>>>
>>>>> I've redone my previous sanity tests. Except for those reported bugs, things look fine. Once we fix those issues, GVT and other modules can run some more stressful tests, I think.
>>>>
>>>> Our side is also starting tests (GVT, NIC passthrough) on this version; we need to wait a while for the results.
>>>
>>> I've updated the branches with the two functional fixes discussed on the list plus all the doc updates.
>>
>> I see. Due to timezone differences, the kernel we grabbed is 37c9e6e44d77a; it has a slight diff in scripts/kernel-doc compared with the latest commit (6bb16a9c67769). I don't think it impacts the tests.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/log/?h=for-next (37c9e6e44d77a)
>>
>> On our side, Yu He and Lixiao Yang have run the tests below on an Intel platform with the above kernel; the results are:
>>
>> 1) GVT-g test suite passed, Intel iGFx passthrough passed.
>>
>> 2) NIC passthrough test with different guest memory sizes (1G/4G) passed.
>>
>> 3) Booting two different QEMUs at the same time, with one QEMU opening the legacy /dev/vfio/vfio and the other opening /dev/iommu. Tests passed.
>>
>> 4) Tried the Kconfig combinations below; the results are as expected.
>>
>>    VFIO_CONTAINER=y, IOMMUFD=y -- test pass
>>    VFIO_CONTAINER=y, IOMMUFD=n -- test pass
>>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=y -- test pass
>>    VFIO_CONTAINER=n, IOMMUFD=y, IOMMUFD_VFIO_CONTAINER=n -- no /dev/vfio/vfio, so the test fails, as expected
>>
>> 5) Tested devices from a multi-device group. Assigning such devices to the same VM passes; assigning them to different VMs fails; assigning them to a VM with Intel virtual VT-d fails. The results are as expected.
>>
>> Meanwhile, I also tested the development branch for nesting; the basic functionality looks good.
>>
>> Tested-by: Yi Liu <yi.l.liu@intel.com>
>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>

Tested-by: Yu He <yu.he@intel.com>

--
Best regards,
He,Yu