diff mbox series

[12/12] crypto: caam - change JR device ownership scheme

Message ID 20190904023515.7107-13-andrew.smirnov@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Headers show
Series CAAM bugfixes, small improvements | expand

Commit Message

Andrey Smirnov Sept. 4, 2019, 2:35 a.m. UTC
Returning -EBUSY from platform device's .remove() callback won't stop
the removal process, so the code in caam_jr_remove() is not going to
have the desired effect of preventing JR from being removed.

In order to be able to deal with removal of the JR device, change the
code as follows:

  1. To make sure that underlying struct device remains valid for as
     long as we have a reference to it, add appropriate device
     refcount management to caam_jr_alloc() and caam_jr_free()

  2. To make sure that device removal doesn't happen in parallel to
      use using the device in caam_jr_enqueue() augment the latter to
      acquire/release device lock for the duration of the subroutine

  3. In order to handle the case when caam_jr_enqueue() is executed
     right after corresponding caam_jr_remove(), add code to check
     that driver data has not been set to NULL.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Horia Geantă <horia.geanta@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Iuliana Prodan <iuliana.prodan@nxp.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/crypto/caam/jr.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

Comments

Leonard Crestez Sept. 13, 2019, 7:16 p.m. UTC | #1
On 04.09.2019 05:35, Andrey Smirnov wrote:
> Returning -EBUSY from platform device's .remove() callback won't stop
> the removal process, so the code in caam_jr_remove() is not going to
> have the desired effect of preventing JR from being removed.
> 
> In order to be able to deal with removal of the JR device, change the
> code as follows:
> 
>    1. To make sure that underlying struct device remains valid for as
>       long as we have a reference to it, add appropriate device
>       refcount management to caam_jr_alloc() and caam_jr_free()
> 
>    2. To make sure that device removal doesn't happen in parallel to
>        use using the device in caam_jr_enqueue() augment the latter to
>        acquire/release device lock for the duration of the subroutine
> 
>    3. In order to handle the case when caam_jr_enqueue() is executed
>       right after corresponding caam_jr_remove(), add code to check
>       that driver data has not been set to NULL.

What happens if a driver is unbound while a transform is executing 
asynchronously?

Doing get_device and put_device in caam_jr_alloc and caam_jr_free 
doesn't protect you against an explicit unbind of caam_jr, that path 
doesn't care about device reference counts. For example on imx6qp:

# echo 2102000.jr1 > /sys/bus/platform/drivers/caam_jr/unbind

CPU: 2 PID: 561 Comm: bash Not tainted 5.3.0-rc7-next-20190904
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[<c010f8a4>] (dump_backtrace) from [<c010fc5c>] (show_stack+0x20/0x24)
[<c010fc3c>] (show_stack) from [<c0d21600>] (dump_stack+0xdc/0x114)
[<c0d21524>] (dump_stack) from [<c0955774>] (caam_jr_remove+0x24/0xf8)
[<c0955750>] (caam_jr_remove) from [<c06e2d98>] 
(platform_drv_remove+0x34/0x4c)
[<c06e2d64>] (platform_drv_remove) from [<c06e0b74>] 
(device_release_driver_internal+0xf8/0x1b0)
[<c06e0a7c>] (device_release_driver_internal) from [<c06e0c70>] 
(device_driver_detach+0x20/0x24)
[<c06e0c50>] (device_driver_detach) from [<c06de5a4>] 
(unbind_store+0x6c/0xe0)
[<c06de538>] (unbind_store) from [<c06dd950>] (drv_attr_store+0x30/0x3c)
[<c06dd920>] (drv_attr_store) from [<c0364c18>] (sysfs_kf_write+0x50/0x5c)
[<c0364bc8>] (sysfs_kf_write) from [<c0363e0c>] 
(kernfs_fop_write+0x10c/0x1f8)
[<c0363d00>] (kernfs_fop_write) from [<c02c3a30>] (__vfs_write+0x44/0x1d0)
[<c02c39ec>] (__vfs_write) from [<c02c68ec>] (vfs_write+0xb0/0x194)
[<c02c683c>] (vfs_write) from [<c02c6b7c>] (ksys_write+0x64/0xd8)
[<c02c6b18>] (ksys_write) from [<c02c6c08>] (sys_write+0x18/0x1c)
[<c02c6bf0>] (sys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)

I think you need to do some form of slow wait loop in jrpriv until 
jrpriv->tfm_count reaches zero. Preventing new operations from starting 
would help but isn't strictly necessary for correctness.

> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 47b389cb1c62..8a30bbd7f2aa 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -124,14 +124,6 @@ static int caam_jr_remove(struct platform_device *pdev)
>   	jrdev = &pdev->dev;
>   	jrpriv = dev_get_drvdata(jrdev);
>   
> -	/*
> -	 * Return EBUSY if job ring already allocated.
> -	 */
> -	if (atomic_read(&jrpriv->tfm_count)) {
> -		dev_err(jrdev, "Device is busy\n");
> -		return -EBUSY;
> -	}
> -
>   	/* Unregister JR-based RNG & crypto algorithms */
>   	unregister_algs();
>   
> @@ -300,7 +292,7 @@ struct device *caam_jr_alloc(void)
>   
>   	if (min_jrpriv) {
>   		atomic_inc(&min_jrpriv->tfm_count);
> -		dev = min_jrpriv->dev;
> +		dev = get_device(min_jrpriv->dev);
>   	}
>   	spin_unlock(&driver_data.jr_alloc_lock);
>   
> @@ -318,13 +310,16 @@ void caam_jr_free(struct device *rdev)
>   	struct caam_drv_private_jr *jrpriv = dev_get_drvdata(rdev);
>   
>   	atomic_dec(&jrpriv->tfm_count);
> +	put_device(rdev);
>   }
>   EXPORT_SYMBOL(caam_jr_free);
>   
>   /**
>    * caam_jr_enqueue() - Enqueue a job descriptor head. Returns 0 if OK,
>    * -EBUSY if the queue is full, -EIO if it cannot map the caller's
> - * descriptor.
> + * descriptor, -ENODEV if given device was removed and is no longer
> + * valid
> + *
>    * @dev:  device of the job ring to be used. This device should have
>    *        been assigned prior by caam_jr_register().
>    * @desc: points to a job descriptor that execute our request. All
> @@ -354,15 +349,32 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
>   				u32 status, void *areq),
>   		    void *areq)
>   {
> -	struct caam_drv_private_jr *jrp = dev_get_drvdata(dev);
> +	struct caam_drv_private_jr *jrp;
>   	struct caam_jrentry_info *head_entry;
>   	int head, tail, desc_size;
>   	dma_addr_t desc_dma;
>   
> +	/*
> +	 * Lock the device to prevent it from being removed while we
> +	 * are using it
> +	 */
> +	device_lock(dev);
> +
> +	/*
> +	 * If driver data is NULL, it is very likely that this device
> +	 * was removed already. Nothing we can do here but bail out.
> +	 */
> +	jrp = dev_get_drvdata(dev);
> +	if (!jrp) {
> +		device_unlock(dev);
> +		return -ENODEV;
> +	}
> +
>   	desc_size = (caam32_to_cpu(*desc) & HDR_JD_LENGTH_MASK) * sizeof(u32);
>   	desc_dma = dma_map_single(dev, desc, desc_size, DMA_TO_DEVICE);
>   	if (dma_mapping_error(dev, desc_dma)) {
>   		dev_err(dev, "caam_jr_enqueue(): can't map jobdesc\n");
> +		device_unlock(dev);
>   		return -EIO;
>   	}
>   
> @@ -375,6 +387,7 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
>   	    CIRC_SPACE(head, tail, JOBR_DEPTH) <= 0) {
>   		spin_unlock_bh(&jrp->inplock);
>   		dma_unmap_single(dev, desc_dma, desc_size, DMA_TO_DEVICE);
> +		device_unlock(dev);
>   		return -EBUSY;
>   	}
>   
> @@ -411,6 +424,7 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
>   		jrp->inpring_avail = rd_reg32(&jrp->rregs->inpring_avail);
>   
>   	spin_unlock_bh(&jrp->inplock);
> +	device_unlock(dev);
>   
>   	return 0;
>   }
Andrey Smirnov Sept. 18, 2019, 3:13 a.m. UTC | #2
On Fri, Sep 13, 2019 at 12:16 PM Leonard Crestez
<leonard.crestez@nxp.com> wrote:
>
> On 04.09.2019 05:35, Andrey Smirnov wrote:
> > Returning -EBUSY from platform device's .remove() callback won't stop
> > the removal process, so the code in caam_jr_remove() is not going to
> > have the desired effect of preventing JR from being removed.
> >
> > In order to be able to deal with removal of the JR device, change the
> > code as follows:
> >
> >    1. To make sure that underlying struct device remains valid for as
> >       long as we have a reference to it, add appropriate device
> >       refcount management to caam_jr_alloc() and caam_jr_free()
> >
> >    2. To make sure that device removal doesn't happen in parallel to
> >        use using the device in caam_jr_enqueue() augment the latter to
> >        acquire/release device lock for the duration of the subroutine
> >
> >    3. In order to handle the case when caam_jr_enqueue() is executed
> >       right after corresponding caam_jr_remove(), add code to check
> >       that driver data has not been set to NULL.
>
> What happens if a driver is unbound while a transform is executing
> asynchronously?
>

AFAICT caam_jr_shutdown() is going to disable JR's interrupt and kill
corresponding tasklet, so I think that transform will never finish. We
probably could handle that better by walking the list of unfinished
jobs and calling their callbacks with an appropriate error code.

> Doing get_device and put_device in caam_jr_alloc and caam_jr_free
> doesn't protect you against an explicit unbind of caam_jr, that path
> doesn't care about device reference counts. For example on imx6qp:
>

My thinking with get_device() and put_device() was to prevent struct
device * from being freed while we are using it. The intent with
explicit unbind was to protect against it by
device_lock()/device_unlock() as a well as check that device data is
not NULL. I think I did miss code to do that in caam_jr_free() before
accessing tfm_count.


> # echo 2102000.jr1 > /sys/bus/platform/drivers/caam_jr/unbind
>

A bit of a silly question, but is this with my patches applied or
against the vanilla driver?

> CPU: 2 PID: 561 Comm: bash Not tainted 5.3.0-rc7-next-20190904
> Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Backtrace:
> [<c010f8a4>] (dump_backtrace) from [<c010fc5c>] (show_stack+0x20/0x24)
> [<c010fc3c>] (show_stack) from [<c0d21600>] (dump_stack+0xdc/0x114)
> [<c0d21524>] (dump_stack) from [<c0955774>] (caam_jr_remove+0x24/0xf8)
> [<c0955750>] (caam_jr_remove) from [<c06e2d98>]
> (platform_drv_remove+0x34/0x4c)
> [<c06e2d64>] (platform_drv_remove) from [<c06e0b74>]
> (device_release_driver_internal+0xf8/0x1b0)
> [<c06e0a7c>] (device_release_driver_internal) from [<c06e0c70>]
> (device_driver_detach+0x20/0x24)
> [<c06e0c50>] (device_driver_detach) from [<c06de5a4>]
> (unbind_store+0x6c/0xe0)
> [<c06de538>] (unbind_store) from [<c06dd950>] (drv_attr_store+0x30/0x3c)
> [<c06dd920>] (drv_attr_store) from [<c0364c18>] (sysfs_kf_write+0x50/0x5c)
> [<c0364bc8>] (sysfs_kf_write) from [<c0363e0c>]
> (kernfs_fop_write+0x10c/0x1f8)
> [<c0363d00>] (kernfs_fop_write) from [<c02c3a30>] (__vfs_write+0x44/0x1d0)
> [<c02c39ec>] (__vfs_write) from [<c02c68ec>] (vfs_write+0xb0/0x194)
> [<c02c683c>] (vfs_write) from [<c02c6b7c>] (ksys_write+0x64/0xd8)
> [<c02c6b18>] (ksys_write) from [<c02c6c08>] (sys_write+0x18/0x1c)
> [<c02c6bf0>] (sys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
>
> I think you need to do some form of slow wait loop in jrpriv until
> jrpriv->tfm_count reaches zero.

Hmm, what do we do if it never does? Why do you think it would be
better than cancelling all outstanding jobs and resetting the HW?

> Preventing new operations from starting
> would help but isn't strictly necessary for correctness.

Wouldn't it be strictly necessary if you want to wait for tfm_count
reaching zero?

Thanks,
Andrey Smirnov
Horia Geanta Sept. 19, 2019, 11:19 a.m. UTC | #3
On 9/18/2019 6:13 AM, Andrey Smirnov wrote:
>> I think you need to do some form of slow wait loop in jrpriv until
>> jrpriv->tfm_count reaches zero.
> Hmm, what do we do if it never does? Why do you think it would be
> better than cancelling all outstanding jobs and resetting the HW?
> 
Herbert,

What should a driver do when:
-user tries to unbind it AND
-there are tfms referencing algorithms registered by this driver

1. If driver tries to unregister the algorithms during its .remove()
callback, then this BUG_ON is hit:

int crypto_unregister_alg(struct crypto_alg *alg)
{
[...]
        BUG_ON(refcount_read(&alg->cra_refcnt) != 1);

2. If driver exits without unregistering the algorithms,
next time one of the tfms referencing those algorithms will be used
bad things will happen.

3. There is no mechanism in crypto core for notifying users
to stop using a tfm.

Thanks,
Horia
Herbert Xu Sept. 19, 2019, 1:45 p.m. UTC | #4
On Thu, Sep 19, 2019 at 11:19:22AM +0000, Horia Geanta wrote:
>
> What should a driver do when:
> -user tries to unbind it AND
> -there are tfms referencing algorithms registered by this driver
> 
> 1. If driver tries to unregister the algorithms during its .remove()
> callback, then this BUG_ON is hit:
> 
> int crypto_unregister_alg(struct crypto_alg *alg)
> {
> [...]
>         BUG_ON(refcount_read(&alg->cra_refcnt) != 1);

You must not unregister the algorithm until it's no longer in use.
Normally we enforce this using module reference counts.

For hardware devices that need to be unregistered without unloading
the module at the same time, you will need your own reference
counting to determine when unregistration can be carried out.  But
it must be carefully done so that it is not racy.

> 2. If driver exits without unregistering the algorithms,
> next time one of the tfms referencing those algorithms will be used
> bad things will happen.

Well remember hardware devices can always go away (e.g., dead
or unplugged) so the driver must do something sensible when that
happens.  If this happened on a live tfm then further operations
should ideally fail and any outstanding requests should also fail
in a timely manner.

> 3. There is no mechanism in crypto core for notifying users
> to stop using a tfm.

For software implementations we use the module reference count
as the mechanism to signal that an algorithm is going away.  In
particular try_module_get (and consequently crypto_mod_get) will
fail when the module is being unregistered.

We should extend this to cover hardware devices.  Perhaps we can
introduce a callback in crypto_alg that crypto_mod_get can invoke
which would then fail if the device is going away (or has gone away).

Cheers,
diff mbox series

Patch

diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index 47b389cb1c62..8a30bbd7f2aa 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -124,14 +124,6 @@  static int caam_jr_remove(struct platform_device *pdev)
 	jrdev = &pdev->dev;
 	jrpriv = dev_get_drvdata(jrdev);
 
-	/*
-	 * Return EBUSY if job ring already allocated.
-	 */
-	if (atomic_read(&jrpriv->tfm_count)) {
-		dev_err(jrdev, "Device is busy\n");
-		return -EBUSY;
-	}
-
 	/* Unregister JR-based RNG & crypto algorithms */
 	unregister_algs();
 
@@ -300,7 +292,7 @@  struct device *caam_jr_alloc(void)
 
 	if (min_jrpriv) {
 		atomic_inc(&min_jrpriv->tfm_count);
-		dev = min_jrpriv->dev;
+		dev = get_device(min_jrpriv->dev);
 	}
 	spin_unlock(&driver_data.jr_alloc_lock);
 
@@ -318,13 +310,16 @@  void caam_jr_free(struct device *rdev)
 	struct caam_drv_private_jr *jrpriv = dev_get_drvdata(rdev);
 
 	atomic_dec(&jrpriv->tfm_count);
+	put_device(rdev);
 }
 EXPORT_SYMBOL(caam_jr_free);
 
 /**
  * caam_jr_enqueue() - Enqueue a job descriptor head. Returns 0 if OK,
  * -EBUSY if the queue is full, -EIO if it cannot map the caller's
- * descriptor.
+ * descriptor, -ENODEV if given device was removed and is no longer
+ * valid
+ *
  * @dev:  device of the job ring to be used. This device should have
  *        been assigned prior by caam_jr_register().
  * @desc: points to a job descriptor that execute our request. All
@@ -354,15 +349,32 @@  int caam_jr_enqueue(struct device *dev, u32 *desc,
 				u32 status, void *areq),
 		    void *areq)
 {
-	struct caam_drv_private_jr *jrp = dev_get_drvdata(dev);
+	struct caam_drv_private_jr *jrp;
 	struct caam_jrentry_info *head_entry;
 	int head, tail, desc_size;
 	dma_addr_t desc_dma;
 
+	/*
+	 * Lock the device to prevent it from being removed while we
+	 * are using it
+	 */
+	device_lock(dev);
+
+	/*
+	 * If driver data is NULL, it is very likely that this device
+	 * was removed already. Nothing we can do here but bail out.
+	 */
+	jrp = dev_get_drvdata(dev);
+	if (!jrp) {
+		device_unlock(dev);
+		return -ENODEV;
+	}
+
 	desc_size = (caam32_to_cpu(*desc) & HDR_JD_LENGTH_MASK) * sizeof(u32);
 	desc_dma = dma_map_single(dev, desc, desc_size, DMA_TO_DEVICE);
 	if (dma_mapping_error(dev, desc_dma)) {
 		dev_err(dev, "caam_jr_enqueue(): can't map jobdesc\n");
+		device_unlock(dev);
 		return -EIO;
 	}
 
@@ -375,6 +387,7 @@  int caam_jr_enqueue(struct device *dev, u32 *desc,
 	    CIRC_SPACE(head, tail, JOBR_DEPTH) <= 0) {
 		spin_unlock_bh(&jrp->inplock);
 		dma_unmap_single(dev, desc_dma, desc_size, DMA_TO_DEVICE);
+		device_unlock(dev);
 		return -EBUSY;
 	}
 
@@ -411,6 +424,7 @@  int caam_jr_enqueue(struct device *dev, u32 *desc,
 		jrp->inpring_avail = rd_reg32(&jrp->rregs->inpring_avail);
 
 	spin_unlock_bh(&jrp->inplock);
+	device_unlock(dev);
 
 	return 0;
 }