diff mbox series

[v9,16/25] iommufd/device: Add iommufd_access_detach() API

Message ID 20230401151833.124749-17-yi.l.liu@intel.com (mailing list archive)
State New, archived
Headers show
Series Add vfio_device cdev for iommufd support | expand

Commit Message

Yi Liu April 1, 2023, 3:18 p.m. UTC
From: Nicolin Chen <nicolinc@nvidia.com>

Previously, the detach routine is only done by the destroy(). And it was
called by vfio_iommufd_emulated_unbind() when the device runs close(), so
all the mappings in iopt were cleaned in that setup, when the call trace
reaches this detach() routine.

Now, there's a need of a detach uAPI, meaning that it does not only need
a new iommufd_access_detach() API, but also requires access->ops->unmap()
call as a cleanup. So add one.

However, leaving that unprotected can introduce some potential of a race
condition during the pin_/unpin_pages() call, where access->ioas->iopt is
getting referenced. So, add an ioas_lock to protect the context of iopt
referencings.

Also, to allow the iommufd_access_unpin_pages() callback to happen via
this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
be affected by the "access->ioas = NULL" trick.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Yanting Jiang <yanting.jiang@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c          | 76 +++++++++++++++++++++++--
 drivers/iommu/iommufd/iommufd_private.h |  2 +
 include/linux/iommufd.h                 |  1 +
 3 files changed, 74 insertions(+), 5 deletions(-)

Comments

Alex Williamson April 4, 2023, 10:45 p.m. UTC | #1
On Sat,  1 Apr 2023 08:18:24 -0700
Yi Liu <yi.l.liu@intel.com> wrote:

> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> Previously, the detach routine is only done by the destroy(). And it was
> called by vfio_iommufd_emulated_unbind() when the device runs close(), so
> all the mappings in iopt were cleaned in that setup, when the call trace
> reaches this detach() routine.
> 
> Now, there's a need of a detach uAPI, meaning that it does not only need
> a new iommufd_access_detach() API, but also requires access->ops->unmap()
> call as a cleanup. So add one.
> 
> However, leaving that unprotected can introduce some potential of a race
> condition during the pin_/unpin_pages() call, where access->ioas->iopt is
> getting referenced. So, add an ioas_lock to protect the context of iopt
> referencings.
> 
> Also, to allow the iommufd_access_unpin_pages() callback to happen via
> this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
> be affected by the "access->ioas = NULL" trick.
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Tested-by: Terrence Xu <terrence.xu@intel.com>
> Tested-by: Yanting Jiang <yanting.jiang@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/iommu/iommufd/device.c          | 76 +++++++++++++++++++++++--
>  drivers/iommu/iommufd/iommufd_private.h |  2 +
>  include/linux/iommufd.h                 |  1 +
>  3 files changed, 74 insertions(+), 5 deletions(-)

Does this need to go in via iommufd first?  There seems to be quite a
bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it
doesn't apply). Thanks,

Alex
Jason Gunthorpe April 5, 2023, 11:56 a.m. UTC | #2
On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote:
> On Sat,  1 Apr 2023 08:18:24 -0700
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > 
> > Previously, the detach routine is only done by the destroy(). And it was
> > called by vfio_iommufd_emulated_unbind() when the device runs close(), so
> > all the mappings in iopt were cleaned in that setup, when the call trace
> > reaches this detach() routine.
> > 
> > Now, there's a need of a detach uAPI, meaning that it does not only need
> > a new iommufd_access_detach() API, but also requires access->ops->unmap()
> > call as a cleanup. So add one.
> > 
> > However, leaving that unprotected can introduce some potential of a race
> > condition during the pin_/unpin_pages() call, where access->ioas->iopt is
> > getting referenced. So, add an ioas_lock to protect the context of iopt
> > referencings.
> > 
> > Also, to allow the iommufd_access_unpin_pages() callback to happen via
> > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
> > be affected by the "access->ioas = NULL" trick.
> > 
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Tested-by: Terrence Xu <terrence.xu@intel.com>
> > Tested-by: Yanting Jiang <yanting.jiang@intel.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > ---
> >  drivers/iommu/iommufd/device.c          | 76 +++++++++++++++++++++++--
> >  drivers/iommu/iommufd/iommufd_private.h |  2 +
> >  include/linux/iommufd.h                 |  1 +
> >  3 files changed, 74 insertions(+), 5 deletions(-)
> 
> Does this need to go in via iommufd first?  There seems to be quite a
> bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it
> doesn't apply). Thanks,

I think it is best to stay with this series, Yi has to rebase it

Jason
Yi Liu April 5, 2023, 2:10 p.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 5, 2023 7:56 PM
> 
> On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote:
> > On Sat,  1 Apr 2023 08:18:24 -0700
> > Yi Liu <yi.l.liu@intel.com> wrote:
> >
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > >
> > > Previously, the detach routine is only done by the destroy(). And it was
> > > called by vfio_iommufd_emulated_unbind() when the device runs close(), so
> > > all the mappings in iopt were cleaned in that setup, when the call trace
> > > reaches this detach() routine.
> > >
> > > Now, there's a need of a detach uAPI, meaning that it does not only need
> > > a new iommufd_access_detach() API, but also requires access->ops->unmap()
> > > call as a cleanup. So add one.
> > >
> > > However, leaving that unprotected can introduce some potential of a race
> > > condition during the pin_/unpin_pages() call, where access->ioas->iopt is
> > > getting referenced. So, add an ioas_lock to protect the context of iopt
> > > referencings.
> > >
> > > Also, to allow the iommufd_access_unpin_pages() callback to happen via
> > > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
> > > be affected by the "access->ioas = NULL" trick.
> > >
> > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > > Tested-by: Terrence Xu <terrence.xu@intel.com>
> > > Tested-by: Yanting Jiang <yanting.jiang@intel.com>
> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > > ---
> > >  drivers/iommu/iommufd/device.c          | 76 +++++++++++++++++++++++--
> > >  drivers/iommu/iommufd/iommufd_private.h |  2 +
> > >  include/linux/iommufd.h                 |  1 +
> > >  3 files changed, 74 insertions(+), 5 deletions(-)
> >
> > Does this need to go in via iommufd first?  There seems to be quite a
> > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it
> > doesn't apply). Thanks,
> 
> I think it is best to stay with this series, Yi has to rebase it

The rebased version is here. Shall I resend a version which is rebased on
top of vfio_mdev_ops?

https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a0971daa9

Regards,
Yi Liu
Jason Gunthorpe April 5, 2023, 2:28 p.m. UTC | #4
On Wed, Apr 05, 2023 at 02:10:19PM +0000, Liu, Yi L wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 5, 2023 7:56 PM
> > 
> > On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote:
> > > On Sat,  1 Apr 2023 08:18:24 -0700
> > > Yi Liu <yi.l.liu@intel.com> wrote:
> > >
> > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > >
> > > > Previously, the detach routine is only done by the destroy(). And it was
> > > > called by vfio_iommufd_emulated_unbind() when the device runs close(), so
> > > > all the mappings in iopt were cleaned in that setup, when the call trace
> > > > reaches this detach() routine.
> > > >
> > > > Now, there's a need of a detach uAPI, meaning that it does not only need
> > > > a new iommufd_access_detach() API, but also requires access->ops->unmap()
> > > > call as a cleanup. So add one.
> > > >
> > > > However, leaving that unprotected can introduce some potential of a race
> > > > condition during the pin_/unpin_pages() call, where access->ioas->iopt is
> > > > getting referenced. So, add an ioas_lock to protect the context of iopt
> > > > referencings.
> > > >
> > > > Also, to allow the iommufd_access_unpin_pages() callback to happen via
> > > > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
> > > > be affected by the "access->ioas = NULL" trick.
> > > >
> > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > > > Tested-by: Terrence Xu <terrence.xu@intel.com>
> > > > Tested-by: Yanting Jiang <yanting.jiang@intel.com>
> > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > > > ---
> > > >  drivers/iommu/iommufd/device.c          | 76 +++++++++++++++++++++++--
> > > >  drivers/iommu/iommufd/iommufd_private.h |  2 +
> > > >  include/linux/iommufd.h                 |  1 +
> > > >  3 files changed, 74 insertions(+), 5 deletions(-)
> > >
> > > Does this need to go in via iommufd first?  There seems to be quite a
> > > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it
> > > doesn't apply). Thanks,
> > 
> > I think it is best to stay with this series, Yi has to rebase it
> 
> The rebased version is here. Shall I resend a version which is rebased on
> top of vfio_mdev_ops?
> 
> https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a0971daa9

When you post the v10 it should be based on top of the vfio_mdev_ops
and the hot reset series.

Jason
Yi Liu April 5, 2023, 2:46 p.m. UTC | #5
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 5, 2023 10:29 PM
> > > >
> > > > Does this need to go in via iommufd first?  There seems to be quite a
> > > > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it
> > > > doesn't apply). Thanks,
> > >
> > > I think it is best to stay with this series, Yi has to rebase it
> >
> > The rebased version is here. Shall I resend a version which is rebased on
> > top of vfio_mdev_ops?
> >
> >
> https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a
> 0971daa9
> 
> When you post the v10 it should be based on top of the vfio_mdev_ops
> and the hot reset series.

yes. At least, I see the hot reset series needs to be refreshed w.r.t. the
comments from Alex and Eric.

Regards,
Yi Liu
diff mbox series

Patch

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 04a57aa1ae2c..0eaae60f3537 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -474,6 +474,7 @@  iommufd_access_create(struct iommufd_ctx *ictx,
 	iommufd_ctx_get(ictx);
 	iommufd_object_finalize(ictx, &access->obj);
 	*id = access->obj.id;
+	mutex_init(&access->ioas_lock);
 	return access;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD);
@@ -493,26 +494,66 @@  void iommufd_access_destroy(struct iommufd_access *access)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
 
+static void __iommufd_access_detach(struct iommufd_access *access)
+{
+	struct iommufd_ioas *cur_ioas = access->ioas;
+
+	lockdep_assert_held(&access->ioas_lock);
+	/*
+	 * Set ioas to NULL to block any further iommufd_access_pin_pages().
+	 * iommufd_access_unpin_pages() can continue using access->ioas_unpin.
+	 */
+	access->ioas = NULL;
+
+	if (access->ops->unmap) {
+		mutex_unlock(&access->ioas_lock);
+		access->ops->unmap(access->data, 0, ULONG_MAX);
+		mutex_lock(&access->ioas_lock);
+	}
+	iopt_remove_access(&cur_ioas->iopt, access);
+	refcount_dec(&cur_ioas->obj.users);
+}
+
+void iommufd_access_detach(struct iommufd_access *access)
+{
+	mutex_lock(&access->ioas_lock);
+	if (WARN_ON(!access->ioas))
+		goto out;
+	__iommufd_access_detach(access);
+out:
+	access->ioas_unpin = NULL;
+	mutex_unlock(&access->ioas_lock);
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_access_detach, IOMMUFD);
+
 int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id)
 {
 	struct iommufd_ioas *new_ioas;
 	int rc = 0;
 
-	if (access->ioas != NULL && access->ioas->obj.id != ioas_id)
+	mutex_lock(&access->ioas_lock);
+	if (access->ioas != NULL && access->ioas->obj.id != ioas_id) {
+		mutex_unlock(&access->ioas_lock);
 		return -EINVAL;
+	}
 
 	new_ioas = iommufd_get_ioas(access->ictx, ioas_id);
-	if (IS_ERR(new_ioas))
+	if (IS_ERR(new_ioas)) {
+		mutex_unlock(&access->ioas_lock);
 		return PTR_ERR(new_ioas);
+	}
 
 	rc = iopt_add_access(&new_ioas->iopt, access);
 	if (rc) {
+		mutex_unlock(&access->ioas_lock);
 		iommufd_put_object(&new_ioas->obj);
 		return rc;
 	}
 	iommufd_ref_to_users(&new_ioas->obj);
 
 	access->ioas = new_ioas;
+	access->ioas_unpin = new_ioas;
+	mutex_unlock(&access->ioas_lock);
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, IOMMUFD);
@@ -567,8 +608,8 @@  void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova,
 void iommufd_access_unpin_pages(struct iommufd_access *access,
 				unsigned long iova, unsigned long length)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	unsigned long last_iova;
 	struct iopt_area *area;
 
@@ -576,6 +617,13 @@  void iommufd_access_unpin_pages(struct iommufd_access *access,
 	    WARN_ON(check_add_overflow(iova, length - 1, &last_iova)))
 		return;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas_unpin) {
+		mutex_unlock(&access->ioas_lock);
+		return;
+	}
+	iopt = &access->ioas_unpin->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova)
 		iopt_area_remove_access(
@@ -585,6 +633,7 @@  void iommufd_access_unpin_pages(struct iommufd_access *access,
 				min(last_iova, iopt_area_last_iova(area))));
 	up_read(&iopt->iova_rwsem);
 	WARN_ON(!iopt_area_contig_done(&iter));
+	mutex_unlock(&access->ioas_lock);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD);
 
@@ -630,8 +679,8 @@  int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 			     unsigned long length, struct page **out_pages,
 			     unsigned int flags)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	unsigned long last_iova;
 	struct iopt_area *area;
 	int rc;
@@ -646,6 +695,13 @@  int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 	if (check_add_overflow(iova, length - 1, &last_iova))
 		return -EOVERFLOW;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas) {
+		mutex_unlock(&access->ioas_lock);
+		return -ENOENT;
+	}
+	iopt = &access->ioas->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) {
 		unsigned long last = min(last_iova, iopt_area_last_iova(area));
@@ -676,6 +732,7 @@  int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 	}
 
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return 0;
 
 err_remove:
@@ -690,6 +747,7 @@  int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
 						  iopt_area_last_iova(area))));
 	}
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD);
@@ -709,8 +767,8 @@  EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD);
 int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 		      void *data, size_t length, unsigned int flags)
 {
-	struct io_pagetable *iopt = &access->ioas->iopt;
 	struct iopt_area_contig_iter iter;
+	struct io_pagetable *iopt;
 	struct iopt_area *area;
 	unsigned long last_iova;
 	int rc;
@@ -720,6 +778,13 @@  int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 	if (check_add_overflow(iova, length - 1, &last_iova))
 		return -EOVERFLOW;
 
+	mutex_lock(&access->ioas_lock);
+	if (!access->ioas) {
+		mutex_unlock(&access->ioas_lock);
+		return -ENOENT;
+	}
+	iopt = &access->ioas->iopt;
+
 	down_read(&iopt->iova_rwsem);
 	iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) {
 		unsigned long last = min(last_iova, iopt_area_last_iova(area));
@@ -746,6 +811,7 @@  int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
 		rc = -ENOENT;
 err_out:
 	up_read(&iopt->iova_rwsem);
+	mutex_unlock(&access->ioas_lock);
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 2e6e8e217cce..ec2ce3ef187d 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -263,6 +263,8 @@  struct iommufd_access {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
 	struct iommufd_ioas *ioas;
+	struct iommufd_ioas *ioas_unpin;
+	struct mutex ioas_lock;
 	const struct iommufd_access_ops *ops;
 	void *data;
 	unsigned long iova_alignment;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index ac96df406833..9e0e8894dacc 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -47,6 +47,7 @@  iommufd_access_create(struct iommufd_ctx *ictx,
 		      const struct iommufd_access_ops *ops, void *data, u32 *id);
 void iommufd_access_destroy(struct iommufd_access *access);
 int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id);
+void iommufd_access_detach(struct iommufd_access *access);
 
 void iommufd_ctx_get(struct iommufd_ctx *ictx);