Message ID | 55BF9E4E.6020006@intel.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Herbert Xu |
Headers | show |
On Mon, Aug 03, 2015 at 10:01:02AM -0700, Tadeusz Struk wrote: > > There is one problem that I missed before. > We can shutdown a qat device and bring it back up via ioct. > Before we shut the last device we also unregister the algorithms. > Then we can bring them up again, but it fails without resetting the flags. Good catch. However it's not as simple as that. We have a bigger problem here. For software implementations it's really easy because we use the module reference count to prevent unregisters from happening while there are tfms using it. However for hardware devices this breaks down because you can remove the device at any time. So we'll have to come up with a way of handling this without causing live tfms to explode. FWIW the current qat code (and possibly other drivers) is buggy because if you call unregister while there are still tfms using it, it will BUG in the unregister call. Cheers,
On Tue, Aug 04, 2015 at 05:59:09PM +0800, Herbert Xu wrote: > On Mon, Aug 03, 2015 at 10:01:02AM -0700, Tadeusz Struk wrote: > > > > There is one problem that I missed before. > > We can shutdown a qat device and bring it back up via ioct. > > Before we shut the last device we also unregister the algorithms. > > Then we can bring them up again, but it fails without resetting the flags. > > Good catch. However it's not as simple as that. We have a bigger > problem here. In order to make progress I'm going to fold your patch into my series and push it into cryptodev. I'll try to come up with a proper fix for this after I'm done with the AEAD stuff. Cheers,
On 08/04/2015 02:59 AM, Herbert Xu wrote: > Good catch. However it's not as simple as that. We have a bigger > problem here. > > For software implementations it's really easy because we use the > module reference count to prevent unregisters from happening while > there are tfms using it. > > However for hardware devices this breaks down because you can remove > the device at any time. So we'll have to come up with a way of > handling this without causing live tfms to explode. > > FWIW the current qat code (and possibly other drivers) is buggy > because if you call unregister while there are still tfms using > it, it will BUG in the unregister call. The way we handle it in qat is as follows - when tfm allocates a crypto "instance" on a qat dev it then calls qat_crypto_put_instance(), which calls also adf_dev_put() on the appropriate qat device. This then calls module_put() on accel_dev->owner. This way, when any tfm is allocated on a device the attempts to rmmod or ioctl shutdown returns -EBUSY. See adf_ctl_is_device_in_use() One can only successfully shut down a device or rmmod qat_dh895 when all tfms are freed. The only possibility when we may try to restart a device when tfms are allocated is when we handle HW uncorrectable errors, which in theory should never happen :) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/04/2015 05:47 AM, Herbert Xu wrote: > In order to make progress I'm going to fold your patch into my > series and push it into cryptodev. > > I'll try to come up with a proper fix for this after I'm done > with the AEAD stuff. Thanks. Also thanks for converting this. It looks much cleaner now. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 04, 2015 at 06:26:58AM -0700, Tadeusz Struk wrote: > > The way we handle it in qat is as follows - when tfm allocates a > crypto "instance" on a qat dev it then calls qat_crypto_put_instance(), > which calls also adf_dev_put() on the appropriate qat device. > This then calls module_put() on accel_dev->owner. > This way, when any tfm is allocated on a device the attempts to rmmod > or ioctl shutdown returns -EBUSY. See adf_ctl_is_device_in_use() > One can only successfully shut down a device or rmmod qat_dh895 when > all tfms are freed. > > The only possibility when we may try to restart a device when tfms are > allocated is when we handle HW uncorrectable errors, which in theory > should never happen :) What if someone calls adf_remove? For a software implementation you can prevent the algorithm from going away by holding module reference counts. But a device can always be removed and you're not going to stop that no matter how many reference counts you hold :) Cheers,
On 08/04/2015 06:32 AM, Herbert Xu wrote: > What if someone calls adf_remove? For a software implementation > you can prevent the algorithm from going away by holding module > reference counts. But a device can always be removed and you're > not going to stop that no matter how many reference counts you > hold :) No way! It is defined as static ;) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 04, 2015 at 06:42:04AM -0700, Tadeusz Struk wrote: > On 08/04/2015 06:32 AM, Herbert Xu wrote: > > What if someone calls adf_remove? For a software implementation > > you can prevent the algorithm from going away by holding module > > reference counts. But a device can always be removed and you're > > not going to stop that no matter how many reference counts you > > hold :) > > No way! It is defined as static ;) It's exported as a PCI device, no? PCI devices can be removed at any time through sysfs. Cheers,
On 08/04/2015 06:44 AM, Herbert Xu wrote: > On Tue, Aug 04, 2015 at 06:42:04AM -0700, Tadeusz Struk wrote: >> > On 08/04/2015 06:32 AM, Herbert Xu wrote: >>> > > What if someone calls adf_remove? For a software implementation >>> > > you can prevent the algorithm from going away by holding module >>> > > reference counts. But a device can always be removed and you're >>> > > not going to stop that no matter how many reference counts you >>> > > hold :) >> > >> > No way! It is defined as static ;) > It's exported as a PCI device, no? PCI devices can be removed > at any time through sysfs. right. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c index 45420a6..1411e4c 100644 --- a/drivers/crypto/qat/qat_common/qat_algs.c +++ b/drivers/crypto/qat/qat_common/qat_algs.c @@ -1183,16 +1183,22 @@ static struct crypto_alg qat_algs[] = { { int qat_algs_register(void) { - int ret = 0; + int ret = 0, i; mutex_lock(&algs_lock); if (++active_devs != 1) goto unlock; + for (i = 0; i < ARRAY_SIZE(qat_algs); i++) + qat_algs[i].cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC; + ret = crypto_register_algs(qat_algs, ARRAY_SIZE(qat_algs)); if (ret) goto unlock; + for (i = 0; i < ARRAY_SIZE(qat_aeads); i++) + qat_aeads[i].base.cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_AEAD_NEW; + ret = crypto_register_aeads(qat_aeads, ARRAY_SIZE(qat_aeads)); if (ret) goto unreg_algs;