diff mbox series

[v13,11/15] s390/vfio-ap: implement in-use callback for vfio_ap driver

Message ID 20201223011606.5265-12-akrowiak@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series s390/vfio-ap: dynamic configuration support | expand

Commit Message

Tony Krowiak Dec. 23, 2020, 1:16 a.m. UTC
Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed are assigned to
any of the matrix mdevs under the driver's control.

There is potential for a deadlock condition between the matrix_dev->lock
used to lock the matrix device during assignment of adapters and domains
and the ap_perms_mutex locked by the AP bus when changes are made to the
sysfs apmask/aqmask attributes.

Consider following scenario (courtesy of Halil Pasic):
1) apmask_store() takes ap_perms_mutex
2) assign_adapter_store() takes matrix_dev->lock
3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries
   to take matrix_dev->lock
4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv
   which tries to take ap_perms_mutex

BANG!

To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock)
function to lock the matrix device during assignment of an adapter or
domain to a matrix_mdev as well as during the in_use callback, the
mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not
obtained, then the assignment and in_use functions will terminate with
-EBUSY.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_drv.c     |  1 +
 drivers/s390/crypto/vfio_ap_ops.c     | 21 ++++++++++++++++++---
 drivers/s390/crypto/vfio_ap_private.h |  2 ++
 3 files changed, 21 insertions(+), 3 deletions(-)

Comments

Halil Pasic Jan. 12, 2021, 1:20 a.m. UTC | #1
On Tue, 22 Dec 2020 20:16:02 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> Let's implement the callback to indicate when an APQN
> is in use by the vfio_ap device driver. The callback is
> invoked whenever a change to the apmask or aqmask would
> result in one or more queue devices being removed from the driver. The
> vfio_ap device driver will indicate a resource is in use
> if the APQN of any of the queue devices to be removed are assigned to
> any of the matrix mdevs under the driver's control.
> 
> There is potential for a deadlock condition between the matrix_dev->lock
> used to lock the matrix device during assignment of adapters and domains
> and the ap_perms_mutex locked by the AP bus when changes are made to the
> sysfs apmask/aqmask attributes.
> 
> Consider following scenario (courtesy of Halil Pasic):
> 1) apmask_store() takes ap_perms_mutex
> 2) assign_adapter_store() takes matrix_dev->lock
> 3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries
>    to take matrix_dev->lock
> 4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv
>    which tries to take ap_perms_mutex
> 
> BANG!
> 
> To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock)
> function to lock the matrix device during assignment of an adapter or
> domain to a matrix_mdev as well as during the in_use callback, the
> mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not
> obtained, then the assignment and in_use functions will terminate with
> -EBUSY.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_drv.c     |  1 +
>  drivers/s390/crypto/vfio_ap_ops.c     | 21 ++++++++++++++++++---
>  drivers/s390/crypto/vfio_ap_private.h |  2 ++
>  3 files changed, 21 insertions(+), 3 deletions(-)
> 
[..]
>  }
> +
> +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
> +{
> +	int ret;
> +
> +	if (!mutex_trylock(&matrix_dev->lock))
> +		return -EBUSY;
> +	ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);

If we detect that resources are in use, then we spit warnings to the
message log, right?

@Matt: Is your userspace tooling going to guarantee that this will never
happen?

> +	mutex_unlock(&matrix_dev->lock);
> +
> +	return ret;
> +}
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index d2d26ba18602..15b7cd74843b 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -107,4 +107,6 @@ struct vfio_ap_queue {
>  int vfio_ap_mdev_probe_queue(struct ap_device *queue);
>  void vfio_ap_mdev_remove_queue(struct ap_device *queue);
>  
> +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm);
> +
>  #endif /* _VFIO_AP_PRIVATE_H_ */
Matthew Rosato Jan. 12, 2021, 2:14 p.m. UTC | #2
On 1/11/21 8:20 PM, Halil Pasic wrote:
> On Tue, 22 Dec 2020 20:16:02 -0500
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> 
>> Let's implement the callback to indicate when an APQN
>> is in use by the vfio_ap device driver. The callback is
>> invoked whenever a change to the apmask or aqmask would
>> result in one or more queue devices being removed from the driver. The
>> vfio_ap device driver will indicate a resource is in use
>> if the APQN of any of the queue devices to be removed are assigned to
>> any of the matrix mdevs under the driver's control.
>>
>> There is potential for a deadlock condition between the matrix_dev->lock
>> used to lock the matrix device during assignment of adapters and domains
>> and the ap_perms_mutex locked by the AP bus when changes are made to the
>> sysfs apmask/aqmask attributes.
>>
>> Consider following scenario (courtesy of Halil Pasic):
>> 1) apmask_store() takes ap_perms_mutex
>> 2) assign_adapter_store() takes matrix_dev->lock
>> 3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries
>>     to take matrix_dev->lock
>> 4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv
>>     which tries to take ap_perms_mutex
>>
>> BANG!
>>
>> To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock)
>> function to lock the matrix device during assignment of an adapter or
>> domain to a matrix_mdev as well as during the in_use callback, the
>> mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not
>> obtained, then the assignment and in_use functions will terminate with
>> -EBUSY.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_drv.c     |  1 +
>>   drivers/s390/crypto/vfio_ap_ops.c     | 21 ++++++++++++++++++---
>>   drivers/s390/crypto/vfio_ap_private.h |  2 ++
>>   3 files changed, 21 insertions(+), 3 deletions(-)
>>
> [..]
>>   }
>> +
>> +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
>> +{
>> +	int ret;
>> +
>> +	if (!mutex_trylock(&matrix_dev->lock))
>> +		return -EBUSY;
>> +	ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
> 
> If we detect that resources are in use, then we spit warnings to the
> message log, right?
> 
> @Matt: Is your userspace tooling going to guarantee that this will never
> happen?

Yes, but only when using the tooling to modify apmask/aqmask.  You would 
still be able to create such a scenario by bypassing the tooling and 
invoking the sysfs interfaces directly.
Halil Pasic Jan. 12, 2021, 4:49 p.m. UTC | #3
On Tue, 12 Jan 2021 09:14:07 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 1/11/21 8:20 PM, Halil Pasic wrote:
> > On Tue, 22 Dec 2020 20:16:02 -0500
> > Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> >   
> >> Let's implement the callback to indicate when an APQN
> >> is in use by the vfio_ap device driver. The callback is
> >> invoked whenever a change to the apmask or aqmask would
> >> result in one or more queue devices being removed from the driver. The
> >> vfio_ap device driver will indicate a resource is in use
> >> if the APQN of any of the queue devices to be removed are assigned to
> >> any of the matrix mdevs under the driver's control.
> >>
> >> There is potential for a deadlock condition between the matrix_dev->lock
> >> used to lock the matrix device during assignment of adapters and domains
> >> and the ap_perms_mutex locked by the AP bus when changes are made to the
> >> sysfs apmask/aqmask attributes.
> >>
> >> Consider following scenario (courtesy of Halil Pasic):
> >> 1) apmask_store() takes ap_perms_mutex
> >> 2) assign_adapter_store() takes matrix_dev->lock
> >> 3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries
> >>     to take matrix_dev->lock
> >> 4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv
> >>     which tries to take ap_perms_mutex
> >>
> >> BANG!
> >>
> >> To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock)
> >> function to lock the matrix device during assignment of an adapter or
> >> domain to a matrix_mdev as well as during the in_use callback, the
> >> mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not
> >> obtained, then the assignment and in_use functions will terminate with
> >> -EBUSY.
> >>
> >> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> >> ---
> >>   drivers/s390/crypto/vfio_ap_drv.c     |  1 +
> >>   drivers/s390/crypto/vfio_ap_ops.c     | 21 ++++++++++++++++++---
> >>   drivers/s390/crypto/vfio_ap_private.h |  2 ++
> >>   3 files changed, 21 insertions(+), 3 deletions(-)
> >>  
> > [..]  
> >>   }
> >> +
> >> +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
> >> +{
> >> +	int ret;
> >> +
> >> +	if (!mutex_trylock(&matrix_dev->lock))
> >> +		return -EBUSY;
> >> +	ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);  
> > 
> > If we detect that resources are in use, then we spit warnings to the
> > message log, right?
> > 
> > @Matt: Is your userspace tooling going to guarantee that this will never
> > happen?  
> 
> Yes, but only when using the tooling to modify apmask/aqmask.  You would 
> still be able to create such a scenario by bypassing the tooling and 
> invoking the sysfs interfaces directly.
> 
> 

Since, I suppose, the tooling is going to catch this anyway, and produce
much better feedback to the user, I believe we should be fine degrading
the severity to info or debug. 

I would prefer not producing a warning here, because I believe it is
likely to do more harm, than good (by implying a kernel problem, as I
don't think based on the message one will think that it is an userspace
problem). But if everybody else agrees, that we want a warning here, then
I can live with that as well.

Regards,
Halil
diff mbox series

Patch

diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 73bd073fd5d3..8934471b7944 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -147,6 +147,7 @@  static int __init vfio_ap_init(void)
 	memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv));
 	vfio_ap_drv.probe = vfio_ap_mdev_probe_queue;
 	vfio_ap_drv.remove = vfio_ap_mdev_remove_queue;
+	vfio_ap_drv.in_use = vfio_ap_mdev_resource_in_use;
 	vfio_ap_drv.ids = ap_queue_ids;
 
 	ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME);
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 843862c88379..6bc2e80cc565 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -644,7 +644,8 @@  static ssize_t assign_adapter_store(struct device *dev,
 	memset(apm, 0, sizeof(apm));
 	set_bit_inv(apid, apm);
 
-	mutex_lock(&matrix_dev->lock);
+	if (!mutex_trylock(&matrix_dev->lock))
+		return -EBUSY;
 
 	ret = vfio_ap_mdev_validate_masks(matrix_mdev, apm,
 					  matrix_mdev->matrix.aqm);
@@ -777,7 +778,8 @@  static ssize_t assign_domain_store(struct device *dev,
 	memset(aqm, 0, sizeof(aqm));
 	set_bit_inv(apqi, aqm);
 
-	mutex_lock(&matrix_dev->lock);
+	if (!mutex_trylock(&matrix_dev->lock))
+		return -EBUSY;
 
 	ret = vfio_ap_mdev_validate_masks(matrix_mdev, matrix_mdev->matrix.apm,
 					  aqm);
@@ -896,7 +898,8 @@  static ssize_t assign_control_domain_store(struct device *dev,
 	 * least significant, correspond to IDs 0 up to the one less than the
 	 * number of control domains that can be assigned.
 	 */
-	mutex_lock(&matrix_dev->lock);
+	if (!mutex_trylock(&matrix_dev->lock))
+		return -EBUSY;
 	set_bit_inv(id, matrix_mdev->matrix.adm);
 	vfio_ap_mdev_hot_plug_cdom(matrix_mdev, id);
 	mutex_unlock(&matrix_dev->lock);
@@ -1446,3 +1449,15 @@  void vfio_ap_mdev_remove_queue(struct ap_device *apdev)
 	kfree(q);
 	mutex_unlock(&matrix_dev->lock);
 }
+
+int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm)
+{
+	int ret;
+
+	if (!mutex_trylock(&matrix_dev->lock))
+		return -EBUSY;
+	ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm);
+	mutex_unlock(&matrix_dev->lock);
+
+	return ret;
+}
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index d2d26ba18602..15b7cd74843b 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -107,4 +107,6 @@  struct vfio_ap_queue {
 int vfio_ap_mdev_probe_queue(struct ap_device *queue);
 void vfio_ap_mdev_remove_queue(struct ap_device *queue);
 
+int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm);
+
 #endif /* _VFIO_AP_PRIVATE_H_ */