
[v3,00/10] Update vfio_pin/unpin_pages API

Message ID 20220708224427.1245-1-nicolinc@nvidia.com (mailing list archive)

Message

Nicolin Chen July 8, 2022, 10:44 p.m. UTC
This is a preparatory series for the IOMMUFD v2 patches. It prepares for
replacing the vfio_iommu_type1 implementations of vfio_pin/unpin_pages()
with an IOMMUFD version.

There's a gap between these two versions: the vfio_iommu_type1 version
takes a non-contiguous PFN list as input and outputs another PFN list
for the pinned physical pages, while the IOMMUFD version supports only
a contiguous input: it accepts the starting IO virtual address of a set
of pages to pin and outputs a physical page list.
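Expressed as rough C prototypes, the gap looks like the sketch below. This is an illustration based on the description above; the exact kernel signatures may differ in detail.

```c
/*
 * Sketch of the two calling conventions (illustrative, not the exact
 * kernel prototypes).
 *
 * vfio_iommu_type1 version: the caller passes an arbitrary, possibly
 * non-contiguous list of user PFNs and gets back a list of host PFNs
 * for the pinned pages:
 */
int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
		   int npage, int prot, unsigned long *phys_pfn);

/*
 * IOMMUFD-style version: the caller passes the starting IOVA of a
 * contiguous range and gets back an array of struct page pointers:
 */
int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
		   int npage, int prot, struct page **pages);
```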

Most existing callers already align with the IOMMUFD version, except
s390's vfio_ccw_cp code, where some additional changes are needed along
with this series. Overall, updating to "iova" and "phys_page" does
improve the caller side to some extent.

This series also fixes a misuse of physical vs. virtual addresses in
s390's crypto code, and updates the input naming of the adjacent
vfio_dma_rw().

This is on github:
https://github.com/nicolinc/iommufd/commits/vfio_pin_pages

Terrence has tested this series on i915; Eric has tested on s390.

Thanks!

Changelog
v3:
 * Added a patch to replace roundup with DIV_ROUND_UP in i915 gvt
 * Dropped the "driver->ops->unpin_pages" and NULL checks in PATCH-1
 * Changed to use WARN_ON and separate into lines in PATCH-1
 * Replaced "guest" words with "user" and fixed a typo in PATCH-5
 * Updated commit log of PATCH-1, PATCH-6, and PATCH-10
 * Added Reviewed/Acked-by from Christoph, Jason, Kirti, Kevin and Eric
 * Added Tested-by from Terrence (i915) and Eric (s390)
v2: https://lore.kernel.org/kvm/20220706062759.24946-1-nicolinc@nvidia.com/
 * Added a patch to make vfio_unpin_pages return void
 * Added two patches to remove PFN list from two s390 callers
 * Renamed "phys_page" parameter to "pages" for vfio_pin_pages
 * Updated commit log of kmap_local_page() patch
 * Added Harald's "Reviewed-by" to pa_ind patch
 * Rebased on top of Alex's extern removal patch
v1: https://lore.kernel.org/kvm/20220616235212.15185-1-nicolinc@nvidia.com/

Nicolin Chen (10):
  vfio: Make vfio_unpin_pages() return void
  drm/i915/gvt: Replace roundup with DIV_ROUND_UP
  vfio/ap: Pass in physical address of ind to ap_aqic()
  vfio/ccw: Only pass in contiguous pages
  vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
  vfio/ap: Change saved_pfn to saved_iova
  vfio/ccw: Change pa_pfn list to pa_iova list
  vfio: Rename user_iova of vfio_dma_rw()
  vfio/ccw: Add kmap_local_page() for memcpy
  vfio: Replace phys_pfn with pages for vfio_pin_pages()

 .../driver-api/vfio-mediated-device.rst       |   6 +-
 arch/s390/include/asm/ap.h                    |   6 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c              |  49 ++---
 drivers/s390/cio/vfio_ccw_cp.c                | 195 +++++++++++-------
 drivers/s390/crypto/ap_queue.c                |   2 +-
 drivers/s390/crypto/vfio_ap_ops.c             |  54 +++--
 drivers/s390/crypto/vfio_ap_private.h         |   4 +-
 drivers/vfio/vfio.c                           |  54 ++---
 drivers/vfio/vfio.h                           |   8 +-
 drivers/vfio/vfio_iommu_type1.c               |  45 ++--
 include/linux/vfio.h                          |   9 +-
 11 files changed, 215 insertions(+), 217 deletions(-)

Comments

Alex Williamson July 22, 2022, 10:11 p.m. UTC | #1
On Fri, 8 Jul 2022 15:44:18 -0700
Nicolin Chen <nicolinc@nvidia.com> wrote:

> This is a preparatory series for IOMMUFD v2 patches. It prepares for
> replacing vfio_iommu_type1 implementations of vfio_pin/unpin_pages()
> with IOMMUFD version.
> 
> There's a gap between these two versions: the vfio_iommu_type1 version
> inputs a non-contiguous PFN list and outputs another PFN list for the
> pinned physical page list, while the IOMMUFD version only supports a
> contiguous address input by accepting the starting IO virtual address
> of a set of pages to pin and by outputting to a physical page list.
> 
> The nature of existing callers mostly aligns with the IOMMUFD version,
> except s390's vfio_ccw_cp code where some additional change is needed
> along with this series. Overall, updating to "iova" and "phys_page"
> does improve the caller side to some extent.
> 
> Also fix a misuse of physical address and virtual address in the s390's
> crypto code. And update the input naming at the adjacent vfio_dma_rw().
> 
> This is on github:
> https://github.com/nicolinc/iommufd/commits/vfio_pin_pages
> 
> Terrence has tested this series on i915; Eric has tested on s390.
> 
> Thanks!
> 
> Changelog
> v3:
>  * Added a patch to replace roundup with DIV_ROUND_UP in i915 gvt
>  * Dropped the "driver->ops->unpin_pages" and NULL checks in PATCH-1
>  * Changed to use WARN_ON and separate into lines in PATCH-1
>  * Replaced "guest" words with "user" and fix typo in PATCH-5
>  * Updated commit log of PATCH-1, PATCH-6, and PATCH-10
>  * Added Reviewed/Acked-by from Christoph, Jason, Kirti, Kevin and Eric
>  * Added Tested-by from Terrence (i915) and Eric (s390)
> v2: https://lore.kernel.org/kvm/20220706062759.24946-1-nicolinc@nvidia.com/
>  * Added a patch to make vfio_unpin_pages return void
>  * Added two patches to remove PFN list from two s390 callers
>  * Renamed "phys_page" parameter to "pages" for vfio_pin_pages
>  * Updated commit log of kmap_local_page() patch
>  * Added Harald's "Reviewed-by" to pa_ind patch
>  * Rebased on top of Alex's extern removal path
> v1: https://lore.kernel.org/kvm/20220616235212.15185-1-nicolinc@nvidia.com/
> 
> Nicolin Chen (10):
>   vfio: Make vfio_unpin_pages() return void
>   drm/i915/gvt: Replace roundup with DIV_ROUND_UP
>   vfio/ap: Pass in physical address of ind to ap_aqic()
>   vfio/ccw: Only pass in contiguous pages
>   vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
>   vfio/ap: Change saved_pfn to saved_iova
>   vfio/ccw: Change pa_pfn list to pa_iova list
>   vfio: Rename user_iova of vfio_dma_rw()
>   vfio/ccw: Add kmap_local_page() for memcpy
>   vfio: Replace phys_pfn with pages for vfio_pin_pages()
> 
>  .../driver-api/vfio-mediated-device.rst       |   6 +-
>  arch/s390/include/asm/ap.h                    |   6 +-
>  drivers/gpu/drm/i915/gvt/kvmgt.c              |  49 ++---
>  drivers/s390/cio/vfio_ccw_cp.c                | 195 +++++++++++-------
>  drivers/s390/crypto/ap_queue.c                |   2 +-
>  drivers/s390/crypto/vfio_ap_ops.c             |  54 +++--
>  drivers/s390/crypto/vfio_ap_private.h         |   4 +-
>  drivers/vfio/vfio.c                           |  54 ++---
>  drivers/vfio/vfio.h                           |   8 +-
>  drivers/vfio/vfio_iommu_type1.c               |  45 ++--
>  include/linux/vfio.h                          |   9 +-
>  11 files changed, 215 insertions(+), 217 deletions(-)
> 

GVT-g explodes for me with this series on my Broadwell test system,
continuously spewing the following:

[   47.344126] ------------[ cut here ]------------
[   47.348778] WARNING: CPU: 3 PID: 501 at drivers/vfio/vfio_iommu_type1.c:978 vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]
[   47.360871] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc rfkill sunrpc vfat fat intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt at24 mei_wdt mei_hdcp intel_pmc_bxt mei_pxp rapl iTCO_vendor_support intel_cstate pcspkr e1000e mei_me intel_uncore i2c_i801 mei lpc_ich i2c_smbus acpi_pad fuse zram ip_tables kvmgt mdev vfio_iommu_type1 vfio kvm irqbypass i915 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pinctrl_lynxpoint i2c_algo_bit drm_buddy video drm_display_helper drm_kms_helper cec ttm drm
[   47.423398] CPU: 3 PID: 501 Comm: gvt:rcs0 Tainted: G        W         5.19.0-rc4+ #3
[   47.431228] Hardware name:  /NUC5i5MYBE, BIOS MYBDWi5v.86A.0054.2019.0520.1531 05/20/2019
[   47.439408] RIP: 0010:vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]
[   47.446818] Code: 10 00 00 45 31 ed 48 8b 7b 40 48 85 ff 74 12 48 8b 47 18 49 39 c6 77 23 48 8b 7f 10 48 85 ff 75 ee 48 8b 3c 24 e8 45 57 92 e4 <0f> 0b 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 03 47 28 49
[   47.465573] RSP: 0018:ffff9ac5806cfbe0 EFLAGS: 00010246
[   47.470807] RAX: ffff8cb42f4c5180 RBX: ffff8cb4145c03c0 RCX: 0000000000000000
[   47.477948] RDX: 0000000000000000 RSI: 0000163802000000 RDI: ffff8cb4145c03e0
[   47.485088] RBP: 0000000000000001 R08: 0000000000000000 R09: ffff9ac581aed000
[   47.492230] R10: ffff9ac5806cfc58 R11: 00000001b2202000 R12: 0000000000000001
[   47.499370] R13: 0000000000000000 R14: 0000163802001000 R15: 0000163802000000
[   47.506513] FS:  0000000000000000(0000) GS:ffff8cb776d80000(0000) knlGS:0000000000000000
[   47.514608] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   47.520361] CR2: ffffdc0933f76192 CR3: 0000000118118003 CR4: 00000000003726e0
[   47.527510] Call Trace:
[   47.529976]  <TASK>
[   47.532091]  intel_gvt_dma_unmap_guest_page+0xd5/0x110 [kvmgt]
[   47.537948]  ppgtt_invalidate_spt+0x323/0x340 [kvmgt]
[   47.543017]  ppgtt_invalidate_spt+0x173/0x340 [kvmgt]
[   47.548088]  ppgtt_invalidate_spt+0x173/0x340 [kvmgt]
[   47.553159]  ppgtt_invalidate_spt+0x173/0x340 [kvmgt]
[   47.558228]  invalidate_ppgtt_mm+0x5f/0x110 [kvmgt]
[   47.563124]  _intel_vgpu_mm_release+0xd6/0xe0 [kvmgt]
[   47.568193]  intel_vgpu_destroy_workload+0x1b7/0x1e0 [kvmgt]
[   47.573872]  workload_thread+0xa4c/0x19a0 [kvmgt]
[   47.578613]  ? _raw_spin_rq_lock_irqsave+0x20/0x20
[   47.583422]  ? dequeue_task_stop+0x70/0x70
[   47.587530]  ? _raw_spin_lock_irqsave+0x24/0x50
[   47.592072]  ? intel_vgpu_reset_submission+0x40/0x40 [kvmgt]
[   47.597746]  kthread+0xe7/0x110
[   47.600902]  ? kthread_complete_and_exit+0x20/0x20
[   47.605702]  ret_from_fork+0x22/0x30
[   47.609293]  </TASK>
[   47.611503] ---[ end trace 0000000000000000 ]---

Line 978 is the WARN_ON(i != npage) line.  For the cases where we don't
find a matching vfio_dma, I'm seeing addresses that look like maybe
we're shifting a value that's already an iova by PAGE_SHIFT somewhere.
Thanks,

Alex
Nicolin Chen July 22, 2022, 11:12 p.m. UTC | #2
On Fri, Jul 22, 2022 at 04:11:29PM -0600, Alex Williamson wrote:

> GVT-g explodes for me with this series on my Broadwell test system,
> continuously spewing the following:

Thank you for running additional tests.

> [   47.348778] WARNING: CPU: 3 PID: 501 at drivers/vfio/vfio_iommu_type1.c:978 vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]
 
> Line 978 is the WARN_ON(i != npage) line.  For the cases where we don't
> find a matching vfio_dma, I'm seeing addresses that look maybe like
> we're shifting  a value that's already an iova by PAGE_SHIFT somewhere.

Hmm.. I don't understand the PAGE_SHIFT part. Do you mind clarifying?

And the GVT code initiates an unpin request from gvt_unpin_guest_page(),
which, prior to this series, unpins one page at a time across a
contiguous IOVA range. After this series, that per-page routine moves
into the internal loop of vfio_iommu_type1_unpin_pages(), which is
supposed to do the same thing.
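That caller-side change can be sketched roughly as below. The function names and loop details are illustrative (loosely modeled on the gvt helpers), not the exact driver source.

```c
/*
 * Before the series: the driver loops over the contiguous range and
 * unpins one PFN per call (illustrative sketch).
 */
static void gvt_unpin_guest_page_old(struct vfio_device *vdev,
				     unsigned long gfn, unsigned long size)
{
	unsigned long npage = DIV_ROUND_UP(size, PAGE_SIZE);
	unsigned long i;

	for (i = 0; i < npage; i++, gfn++)
		vfio_unpin_pages(vdev, &gfn, 1);	/* one PFN per call */
}

/*
 * After the series: a single call with the starting IOVA; the per-page
 * loop lives inside vfio_iommu_type1_unpin_pages() instead.
 */
static void gvt_unpin_guest_page_new(struct vfio_device *vdev,
				     dma_addr_t iova, unsigned long size)
{
	vfio_unpin_pages(vdev, iova, DIV_ROUND_UP(size, PAGE_SIZE));
}
```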

So the warning either resulted from a wrong npage input, or from some
other factor that invoked vfio_remove_dma() on those IOVAs?

Thanks
Nic
Alex Williamson July 23, 2022, 12:18 a.m. UTC | #3
On Fri, 22 Jul 2022 16:12:19 -0700
Nicolin Chen <nicolinc@nvidia.com> wrote:

> On Fri, Jul 22, 2022 at 04:11:29PM -0600, Alex Williamson wrote:
> 
> > GVT-g explodes for me with this series on my Broadwell test system,
> > continuously spewing the following:  
> 
> Thank you for running additional tests.
> 
> > [   47.348778] WARNING: CPU: 3 PID: 501 at drivers/vfio/vfio_iommu_type1.c:978 vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]  
>  
> > Line 978 is the WARN_ON(i != npage) line.  For the cases where we don't
> > find a matching vfio_dma, I'm seeing addresses that look maybe like
> > we're shifting  a value that's already an iova by PAGE_SHIFT somewhere.  
> 
> Hmm..I don't understand the PAGE_SHIFT part. Do you mind clarifying?

The iova was a very large address for a 4GB VM with a lot of zeros on
the low order bits, ex. 0x162459000000.  Thanks,

Alex
 
> And GVT code initiated an unpin request from gvt_unpin_guest_pag()
> that is currently unpinning one page at a time on a contiguous IOVA
> range, prior to this series. After this series, it leaves the per-
> page routine to the internal loop of vfio_iommu_type1_unpin_pages(),
> which is supposed to do the same.
> 
> So, either resulted from the npage input being wrong or some other
> factor weighed in that invoked a vfio_remove_dma on those iovas?
> 
> Thanks
> Nic
>
Nicolin Chen July 23, 2022, 12:38 a.m. UTC | #4
On Fri, Jul 22, 2022 at 06:18:00PM -0600, Alex Williamson wrote:
> On Fri, 22 Jul 2022 16:12:19 -0700
> Nicolin Chen <nicolinc@nvidia.com> wrote:
> 
> > On Fri, Jul 22, 2022 at 04:11:29PM -0600, Alex Williamson wrote:
> >
> > > GVT-g explodes for me with this series on my Broadwell test system,
> > > continuously spewing the following:
> >
> > Thank you for running additional tests.
> >
> > > [   47.348778] WARNING: CPU: 3 PID: 501 at drivers/vfio/vfio_iommu_type1.c:978 vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]
> >
> > > Line 978 is the WARN_ON(i != npage) line.  For the cases where we don't
> > > find a matching vfio_dma, I'm seeing addresses that look maybe like
> > > we're shifting  a value that's already an iova by PAGE_SHIFT somewhere.
> >
> > Hmm..I don't understand the PAGE_SHIFT part. Do you mind clarifying?
> 
> The iova was a very large address for a 4GB VM with a lot of zeros on
> the low order bits, ex. 0x162459000000.  Thanks,

Ah! Thanks for the hint. The following commit does a double shift:
   "vfio: Pass in starting IOVA to vfio_pin/unpin_pages API"

And the following change should fix it:
-------------------
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 481dd2aeb40e..4790c7f35b88 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -293,7 +293,7 @@ static int gvt_dma_map_page(struct intel_vgpu *vgpu, unsigned long gfn,
        if (dma_mapping_error(dev, *dma_addr)) {
                gvt_vgpu_err("DMA mapping failed for pfn 0x%lx, ret %d\n",
                             page_to_pfn(page), ret);
-               gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size);
+               gvt_unpin_guest_page(vgpu, gfn, size);
                return -ENOMEM;
        }

@@ -306,7 +306,7 @@ static void gvt_dma_unmap_page(struct intel_vgpu *vgpu, unsigned long gfn,
        struct device *dev = vgpu->gvt->gt->i915->drm.dev;

        dma_unmap_page(dev, dma_addr, size, DMA_BIDIRECTIONAL);
-       gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size);
+       gvt_unpin_guest_page(vgpu, gfn, size);
 }

 static struct gvt_dma *__gvt_cache_find_dma_addr(struct intel_vgpu *vgpu,
-------------------


So I think I should send a v4, given that the patches aren't
officially applied yet?
Alex Williamson July 23, 2022, 1:09 a.m. UTC | #5
On Fri, 22 Jul 2022 17:38:25 -0700
Nicolin Chen <nicolinc@nvidia.com> wrote:

> On Fri, Jul 22, 2022 at 06:18:00PM -0600, Alex Williamson wrote:
> > On Fri, 22 Jul 2022 16:12:19 -0700
> > Nicolin Chen <nicolinc@nvidia.com> wrote:
> >   
> > > On Fri, Jul 22, 2022 at 04:11:29PM -0600, Alex Williamson wrote:
> > >  
> > > > GVT-g explodes for me with this series on my Broadwell test system,
> > > > continuously spewing the following:  
> > >
> > > Thank you for running additional tests.
> > >  
> > > > [   47.348778] WARNING: CPU: 3 PID: 501 at drivers/vfio/vfio_iommu_type1.c:978 vfio_iommu_type1_unpin_pages+0x7b/0x100 [vfio_iommu_type1]  
> > >  
> > > > Line 978 is the WARN_ON(i != npage) line.  For the cases where we don't
> > > > find a matching vfio_dma, I'm seeing addresses that look maybe like
> > > > we're shifting  a value that's already an iova by PAGE_SHIFT somewhere.  
> > >
> > > Hmm..I don't understand the PAGE_SHIFT part. Do you mind clarifying?  
> > 
> > The iova was a very large address for a 4GB VM with a lot of zeros on
> > the low order bits, ex. 0x162459000000.  Thanks,  
> 
> Ah! Thanks for the hint. The following commit did a double shifting:
>    "vfio: Pass in starting IOVA to vfio_pin/unpin_pages AP"
> 
> And the following change should fix:
> -------------------
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index 481dd2aeb40e..4790c7f35b88 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -293,7 +293,7 @@ static int gvt_dma_map_page(struct intel_vgpu *vgpu, unsigned long gfn,
>         if (dma_mapping_error(dev, *dma_addr)) {
>                 gvt_vgpu_err("DMA mapping failed for pfn 0x%lx, ret %d\n",
>                              page_to_pfn(page), ret);
> -               gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size);
> +               gvt_unpin_guest_page(vgpu, gfn, size);
>                 return -ENOMEM;
>         }
> 
> @@ -306,7 +306,7 @@ static void gvt_dma_unmap_page(struct intel_vgpu *vgpu, unsigned long gfn,
>         struct device *dev = vgpu->gvt->gt->i915->drm.dev;
> 
>         dma_unmap_page(dev, dma_addr, size, DMA_BIDIRECTIONAL);
> -       gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size);
> +       gvt_unpin_guest_page(vgpu, gfn, size);
>  }
> 
>  static struct gvt_dma *__gvt_cache_find_dma_addr(struct intel_vgpu *vgpu,
> -------------------

Looks likely.  Not sure how Terrence was able to test this successfully
though.

> So, I think that I should send a v4, given that the patches aren't
> officially applied?

Yep, please rebase on current vfio next branch.  Thanks,

Alex
Nicolin Chen July 23, 2022, 2:10 a.m. UTC | #6
On Fri, Jul 22, 2022 at 07:09:01PM -0600, Alex Williamson wrote:

> > So, I think that I should send a v4, given that the patches aren't
> > officially applied?
> 
> Yep, please rebase on current vfio next branch.  Thanks,

Sent. And they are on GitHub, based on the linux-vfio next branch too:
https://github.com/nicolinc/iommufd/commits/vfio_pin_pages-v4

Thanks!
Nic