diff mbox

[2/3] ARM: mxs: crypto: Add Freescale MXS DCP driver

Message ID 1380194306-5243-2-git-send-email-marex@denx.de (mailing list archive)
State New, archived
Headers show

Commit Message

Marek Vasut Sept. 26, 2013, 11:18 a.m. UTC
Add support for the MXS DCP block. The driver currently supports
SHA-1/SHA-256 hashing and AES-128 CBC/ECB modes. The non-standard
CRC32 is not yet supported.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 drivers/crypto/Kconfig   |   17 +
 drivers/crypto/Makefile  |    1 +
 drivers/crypto/mxs-dcp.c | 1082 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1100 insertions(+)
 create mode 100644 drivers/crypto/mxs-dcp.c

Comments

Fabio Estevam Sept. 26, 2013, 11:27 a.m. UTC | #1
Hi Marek,

On Thu, Sep 26, 2013 at 8:18 AM, Marek Vasut <marex@denx.de> wrote:
> Add support for the MXS DCP block. The driver currently supports
> SHA-1/SHA-256 hashing and AES-128 CBC/ECB modes. The non-standard
> CRC32 is not yet supported.
>
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: David S. Miller <davem@davemloft.net>
> ---
>  drivers/crypto/Kconfig   |   17 +
>  drivers/crypto/Makefile  |    1 +
>  drivers/crypto/mxs-dcp.c | 1082 ++++++++++++++++++++++++++++++++++++++++++++++

What about the existing DCP driver at drivers/crypto/dcp.c ?

Why do we need to have two drivers for the same IP block? It looks
confusing to have both.

Regards,

Fabio Estevam
Marek Vasut Sept. 26, 2013, 12:07 p.m. UTC | #2
Dear Fabio Estevam,

> Hi Marek,
> 
> On Thu, Sep 26, 2013 at 8:18 AM, Marek Vasut <marex@denx.de> wrote:
> > Add support for the MXS DCP block. The driver currently supports
> > SHA-1/SHA-256 hashing and AES-128 CBC/ECB modes. The non-standard
> > CRC32 is not yet supported.
> > 
> > Signed-off-by: Marek Vasut <marex@denx.de>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: David S. Miller <davem@davemloft.net>
> > ---
> > 
> >  drivers/crypto/Kconfig   |   17 +
> >  drivers/crypto/Makefile  |    1 +
> >  drivers/crypto/mxs-dcp.c | 1082
> >  ++++++++++++++++++++++++++++++++++++++++++++++
> 
> What about the existing DCP driver at drivers/crypto/dcp.c ?

I was not aware of that one.

> Why do we need to have two drivers for the same IP block? It looks
> confusing to have both.

Sure, I agree. I reviewed the one in mainline just now and I see some 
deficiencies of the dcp.c driver:

1) It only supports AES_CBC (mine does support AES_ECB, AES_CBC, SHA1 and SH256)
2) The driver was apparently never ran behind anyone working with MXS. ie.:
   -> Restarting the DCP block is not done via mxs_reset_block()
   -> The DT name is not "fsl,dcp" or "fsl,mxs-dcp" as other MXS drivers
3) What are those ugly new IOCTLs in the dcp.c driver?
4) The VMI IRQ is never used, yet it even calls the IRQ handler, this is bogus
   -> The DCP always triggers the dcp_irq upon DMA completion
   -> VMI IRQ is not handled in the handler at all as I see it
5) The IRQ handler can't use usual completion() in the driver because that'd 
trigger "scheduling while atomic" oops, yes?

Finally, because the dcp.c driver only supports AES128 CBC, it depends on kernel 
_always_ passing the DCP scatterlist such that each of it's elements is 16-bytes 
long. If each of the elements is 16 bytes, there is no problem and the DCP will 
operate correctly. That is because the DCP has the following limitations:

-> For AES128, each buffer in the DMA chain must be multiple of 16 bytes.
-> For SHA1/SHA256, each buffer in the DMA chain must by multiple of 64 bytes 
BUT only the last buffer in the DMA chain can be shorter.

So, in the AES128 case, if the hardware is passed two (4 bytes + 12 bytes for 
example) DMA descriptors instead of single 16 bytes descriptor, the DCP will 
simply stall or produce incorrect result. This can happen if the user of the 
async crypto API passes such a scatterlist.

It is true this is not caught by the in-kernel crypto tests, because they mostly 
trigger the software fallback in this driver, since this driver only supports 16 
byte key (klen == 16 , see crypto/testmgr.h). It can be triggered by modifying 
the crypto tests so that they pass two buffers, multiple of 16 bytes in total, 
but each of them not multiple of 16 bytes.

I ran into many such problems when I was developing this driver I submitted 
here, I managed to trigger those by running the Cryptodev [1] tests [2] against 
this driver as well as testing the performance of this driver via 
Cryptodev/OpenSSL combination:

$ openssl speed sha1
$ openssl speed sha256
$ openssl speed aes-128-cbc

I also measured the performance via OpenSSL encryption/decryption. On larger 
files, the performance of encryption/decryption via DCP, even with fixups for 
unaligned buffers and the overhead of userspace-kernel-userspace switches due to 
cryptodev is still higher than software implementation:

$ time openssl enc -aes-128-ecb -in $IFILE -out $OFILE -k "test" -nosalt
$ time openssl enc -d -aes-128-ecb -in $OFILE -out $DFILE -k "test" -nosalt

Also, since OpenSSH uses OpenSSL, this helped my testing too.

[1] http://cryptodev-linux.org/
[2] https://github.com/nmav/cryptodev-linux/tree/master/tests

Best regards,
Marek Vasut
Lothar Waßmann Sept. 26, 2013, 12:13 p.m. UTC | #3
Hi Marek,

some small comments below.

Marek Vasut writes:
> diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
> new file mode 100644
> index 0000000..c2b35c7
> --- /dev/null
> +++ b/drivers/crypto/mxs-dcp.c
[...]
> +/* AES 128 ECB and AES 128 CBC */
> +static struct crypto_alg dcp_aes_algs[] = {
> +	[0] = {
> +		.cra_name		= "ecb(aes)",
> +		.cra_driver_name	= "ecb-aes-dcp",
> +		.cra_priority		= 400,
> +		.cra_alignmask		= 15,
> +		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
> +					  CRYPTO_ALG_ASYNC |
> +					  CRYPTO_ALG_NEED_FALLBACK,
> +		.cra_init		= mxs_dcp_aes_fallback_init,
> +		.cra_exit		= mxs_dcp_aes_fallback_exit,
> +		.cra_blocksize		= AES_BLOCK_SIZE,
> +		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> +		.cra_type		= &crypto_ablkcipher_type,
> +		.cra_module		= THIS_MODULE,
> +		.cra_u	= {
> +			.ablkcipher = {
> +				.min_keysize	= AES_MIN_KEY_SIZE,
> +				.max_keysize	= AES_MAX_KEY_SIZE,
> +				.setkey		= mxs_dcp_aes_setkey,
> +				.encrypt	= mxs_dcp_aes_ecb_encrypt,
> +				.decrypt	= mxs_dcp_aes_ecb_decrypt
> +			}
missing ',' after '}'
> +		}
dto.

> +	},
> +	[1] = {
> +		.cra_name		= "cbc(aes)",
> +		.cra_driver_name	= "cbc-aes-dcp",
> +		.cra_priority		= 400,
> +		.cra_alignmask		= 15,
> +		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
> +					  CRYPTO_ALG_ASYNC |
> +					  CRYPTO_ALG_NEED_FALLBACK,
> +		.cra_init		= mxs_dcp_aes_fallback_init,
> +		.cra_exit		= mxs_dcp_aes_fallback_exit,
> +		.cra_blocksize		= AES_BLOCK_SIZE,
> +		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> +		.cra_type		= &crypto_ablkcipher_type,
> +		.cra_module		= THIS_MODULE,
> +		.cra_u = {
> +			.ablkcipher = {
> +				.min_keysize	= AES_MIN_KEY_SIZE,
> +				.max_keysize	= AES_MAX_KEY_SIZE,
> +				.setkey		= mxs_dcp_aes_setkey,
> +				.encrypt	= mxs_dcp_aes_cbc_encrypt,
> +				.decrypt	= mxs_dcp_aes_cbc_decrypt,
> +				.ivsize		= AES_BLOCK_SIZE,
> +			}
dto.
> +		}
dto.
> +	},
> +};
> +
> +/* SHA1 */
> +static struct ahash_alg dcp_sha1_alg = {
> +	.init	= dcp_sha_init,
> +	.update	= dcp_sha_update,
> +	.final	= dcp_sha_final,
> +	.finup	= dcp_sha_finup,
> +	.digest	= dcp_sha_digest,
> +	.halg	= {
> +		.digestsize	= SHA1_DIGEST_SIZE,
> +		.base		= {
> +			.cra_name		= "sha1",
> +			.cra_driver_name	= "sha1-dcp",
> +			.cra_priority		= 400,
> +			.cra_alignmask		= 63,
> +			.cra_flags		= CRYPTO_ALG_ASYNC,
> +			.cra_blocksize		= SHA1_BLOCK_SIZE,
> +			.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> +			.cra_module		= THIS_MODULE,
> +			.cra_init		= dcp_sha_cra_init,
> +			.cra_exit		= dcp_sha_cra_exit,
> +		}
dto.
> +	}
dto.

> +};
> +
> +/* SHA256 */
> +static struct ahash_alg dcp_sha256_alg = {
> +	.init	= dcp_sha_init,
> +	.update	= dcp_sha_update,
> +	.final	= dcp_sha_final,
> +	.finup	= dcp_sha_finup,
> +	.digest	= dcp_sha_digest,
> +	.halg	= {
> +		.digestsize	= SHA256_DIGEST_SIZE,
> +		.base		= {
> +			.cra_name		= "sha256",
> +			.cra_driver_name	= "sha256-dcp",
> +			.cra_priority		= 400,
> +			.cra_alignmask		= 63,
> +			.cra_flags		= CRYPTO_ALG_ASYNC,
> +			.cra_blocksize		= SHA256_BLOCK_SIZE,
> +			.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> +			.cra_module		= THIS_MODULE,
> +			.cra_init		= dcp_sha_cra_init,
> +			.cra_exit		= dcp_sha_cra_exit,
> +		}
dto.
> +	}
dto.

> +static const struct of_device_id mxs_dcp_dt_ids[] = {
> +	{.compatible = "fsl,mxs-dcp", .data = NULL,},
>
missing spaces after '{' and before '}'


Lothar Waßmann
Veli-Pekka Peltola Sept. 26, 2013, 12:37 p.m. UTC | #4
Hi Marek,

> +         To compile this driver as a module, choose M here: the module
> +         will be called atmel-sha.

This is probably wrong?

--
Veli-Pekka Peltola
Fabio Estevam Sept. 26, 2013, 12:48 p.m. UTC | #5
On Thu, Sep 26, 2013 at 8:18 AM, Marek Vasut <marex@denx.de> wrote:

> +config CRYPTO_DEV_MXS_DCP
> +       tristate "Support for Freescale MXS DCP"
> +       depends on ARCH_MXS
> +       select CRYPTO_SHA1
> +       select CRYPTO_SHA256
> +       select CRYPTO_CBC
> +       select CRYPTO_ECB
> +       select CRYPTO_AES
> +       select CRYPTO_BLKCIPHER
> +       select CRYPTO_ALGAPI
> +       help
> +         The Freescale i.MX23/i.MX28 has SHA1/SHA256 and AES128 CBC/ECB
> +         co-processor on the die.
> +
> +         To compile this driver as a module, choose M here: the module
> +         will be called atmel-sha.

Actually it will be called 'mxs-dcp'.

> + * There can even be only one instance of the MXS DCP due to the
> + * design of Linux Crypto API.

Is this true? Usually we don't want to create a global struct.

> +
> +/* AES 128 ECB and AES 128 CBC */
> +static struct crypto_alg dcp_aes_algs[] = {
> +       [0] = {

No need for explicitely add this [0]

> +               .cra_name               = "ecb(aes)",
> +               .cra_driver_name        = "ecb-aes-dcp",
> +               .cra_priority           = 400,
> +               .cra_alignmask          = 15,
> +               .cra_flags              = CRYPTO_ALG_TYPE_ABLKCIPHER |
> +                                         CRYPTO_ALG_ASYNC |
> +                                         CRYPTO_ALG_NEED_FALLBACK,
> +               .cra_init               = mxs_dcp_aes_fallback_init,
> +               .cra_exit               = mxs_dcp_aes_fallback_exit,
> +               .cra_blocksize          = AES_BLOCK_SIZE,
> +               .cra_ctxsize            = sizeof(struct dcp_async_ctx),
> +               .cra_type               = &crypto_ablkcipher_type,
> +               .cra_module             = THIS_MODULE,
> +               .cra_u  = {
> +                       .ablkcipher = {
> +                               .min_keysize    = AES_MIN_KEY_SIZE,
> +                               .max_keysize    = AES_MAX_KEY_SIZE,
> +                               .setkey         = mxs_dcp_aes_setkey,
> +                               .encrypt        = mxs_dcp_aes_ecb_encrypt,
> +                               .decrypt        = mxs_dcp_aes_ecb_decrypt
> +                       }
> +               }
> +       },
> +       [1] = {

Same here.

> +static int mxs_dcp_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct dcp *sdcp = NULL;
> +       int i, ret;
> +
> +       struct resource *iores;
> +       int dcp_vmi_irq, dcp_irq;
> +
> +       mutex_lock(&global_mutex);
> +       if (global_sdcp) {
> +               dev_err(dev, "Only one DCP instance allowed!\n");
> +               ret = -ENODEV;
> +               goto err_mutex;
> +       }
> +
> +       iores = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       dcp_vmi_irq = platform_get_irq(pdev, 0);
> +       dcp_irq = platform_get_irq(pdev, 1);
> +       if (!iores || dcp_vmi_irq < 0 || dcp_irq < 0) {

No need to check for !iores here.

You use it inside devm_ioremap_resource, which already does this checking.


> +       /*
> +        * We do not enable context switching. Give the context buffer a
> +        * pointer to an illegal address so if context switching is
> +        * inadvertantly enabled, the DCP will return an error instead of
> +        * trashing good memory. The DCP DMA cannot access ROM, so any ROM
> +        * address will do.
> +        */
> +       writel(0xffff0000, sdcp->base + MXS_DCP_CONTEXT);

Can you use a define instead of this hardcoded number?

> +static int mxs_dcp_remove(struct platform_device *pdev)
> +{
> +       struct dcp *sdcp;
> +       int i;
> +
> +       sdcp = platform_get_drvdata(pdev);
> +
> +       kthread_stop(sdcp->thread[DCP_CHAN_HASH_SHA]);
> +       kthread_stop(sdcp->thread[DCP_CHAN_CRYPTO]);
> +
> +       platform_set_drvdata(pdev, NULL);
> +
> +       dma_free_coherent(sdcp->dev, sizeof(struct dcp_coherent_block),
> +                         sdcp->coh, sdcp->coh_phys);
> +
> +       if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA256)
> +               crypto_unregister_ahash(&dcp_sha256_alg);
> +
> +       if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA1)
> +               crypto_unregister_ahash(&dcp_sha1_alg);
> +
> +       if (sdcp->caps & MXS_DCP_CAPABILITY1_AES128) {
> +               for (i = ARRAY_SIZE(dcp_aes_algs); i >= 0; i--)
> +                       crypto_unregister_alg(&dcp_aes_algs[i]);
> +       }
> +
> +       mutex_lock(&global_mutex);
> +       global_sdcp = NULL;
> +       mutex_unlock(&global_mutex);


The order of the resources removal does not look correct here.

It should match the order of the error path in probe.

> +
> +       return 0;
> +}
> +
> +static const struct of_device_id mxs_dcp_dt_ids[] = {
> +       {.compatible = "fsl,mxs-dcp", .data = NULL,},

In the other mxs/imx drivers we use:

.compatible = "fsl,<soc>-dcp"

You also need to provide a devicetree documentation for this binding.

> +       { /* sentinel */ }
> +};
> +
> +MODULE_DEVICE_TABLE(of, mxs_dcp_dt_ids);
> +
> +static struct platform_driver mxs_dcp_driver = {
> +       .probe  = mxs_dcp_probe,
> +       .remove = mxs_dcp_remove,
> +       .driver = {
> +               .name           = "mxs-dcp",
> +               .owner          = THIS_MODULE,
> +               .of_match_table = mxs_dcp_dt_ids,
> +       },
> +};
> +
> +module_platform_driver(mxs_dcp_driver);
> +
> +MODULE_AUTHOR("Marek Vasut <marex@denx.de>");
> +MODULE_DESCRIPTION("Freescale MXS DCP Driver");
> +MODULE_LICENSE("GPL");


Could also add MODULE_ALIAS.
Marek Vasut Sept. 26, 2013, 2:02 p.m. UTC | #6
Dear Veli-Pekka Peltola,

> Hi Marek,
> 
> > +         To compile this driver as a module, choose M here: the module
> > +         will be called atmel-sha.
> 
> This is probably wrong?

Certainly. Nice to have you back btw. ;-)

Best regards,
Marek Vasut
Marek Vasut Sept. 26, 2013, 2:04 p.m. UTC | #7
Dear Lothar Waßmann,

> Hi Marek,
> 
> some small comments below.
> 
> Marek Vasut writes:
> > diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
> > new file mode 100644
> > index 0000000..c2b35c7
> > --- /dev/null
> > +++ b/drivers/crypto/mxs-dcp.c
> 
> [...]
> 
> > +/* AES 128 ECB and AES 128 CBC */
> > +static struct crypto_alg dcp_aes_algs[] = {
> > +	[0] = {
> > +		.cra_name		= "ecb(aes)",
> > +		.cra_driver_name	= "ecb-aes-dcp",
> > +		.cra_priority		= 400,
> > +		.cra_alignmask		= 15,
> > +		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
> > +					  CRYPTO_ALG_ASYNC |
> > +					  CRYPTO_ALG_NEED_FALLBACK,
> > +		.cra_init		= mxs_dcp_aes_fallback_init,
> > +		.cra_exit		= mxs_dcp_aes_fallback_exit,
> > +		.cra_blocksize		= AES_BLOCK_SIZE,
> > +		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> > +		.cra_type		= &crypto_ablkcipher_type,
> > +		.cra_module		= THIS_MODULE,
> > +		.cra_u	= {
> > +			.ablkcipher = {
> > +				.min_keysize	= AES_MIN_KEY_SIZE,
> > +				.max_keysize	= AES_MAX_KEY_SIZE,
> > +				.setkey		= mxs_dcp_aes_setkey,
> > +				.encrypt	= mxs_dcp_aes_ecb_encrypt,
> > +				.decrypt	= mxs_dcp_aes_ecb_decrypt
> > +			}
> 
> missing ',' after '}'

Is this a problem? The last ',' is not needed by the C standard.

[...]


> 
> > +static const struct of_device_id mxs_dcp_dt_ids[] = {
> > +	{.compatible = "fsl,mxs-dcp", .data = NULL,},
> 
> missing spaces after '{' and before '}'

Thanks!

Best regards,
Marek Vasut
Lothar Waßmann Sept. 26, 2013, 2:09 p.m. UTC | #8
Hi,

Marek Vasut writes:
> Dear Lothar Waßmann,
> 
> > Hi Marek,
> > 
> > some small comments below.
> > 
> > Marek Vasut writes:
> > > diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
> > > new file mode 100644
> > > index 0000000..c2b35c7
> > > --- /dev/null
> > > +++ b/drivers/crypto/mxs-dcp.c
> > 
> > [...]
> > 
> > > +/* AES 128 ECB and AES 128 CBC */
> > > +static struct crypto_alg dcp_aes_algs[] = {
> > > +	[0] = {
> > > +		.cra_name		= "ecb(aes)",
> > > +		.cra_driver_name	= "ecb-aes-dcp",
> > > +		.cra_priority		= 400,
> > > +		.cra_alignmask		= 15,
> > > +		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
> > > +					  CRYPTO_ALG_ASYNC |
> > > +					  CRYPTO_ALG_NEED_FALLBACK,
> > > +		.cra_init		= mxs_dcp_aes_fallback_init,
> > > +		.cra_exit		= mxs_dcp_aes_fallback_exit,
> > > +		.cra_blocksize		= AES_BLOCK_SIZE,
> > > +		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> > > +		.cra_type		= &crypto_ablkcipher_type,
> > > +		.cra_module		= THIS_MODULE,
> > > +		.cra_u	= {
> > > +			.ablkcipher = {
> > > +				.min_keysize	= AES_MIN_KEY_SIZE,
> > > +				.max_keysize	= AES_MAX_KEY_SIZE,
> > > +				.setkey		= mxs_dcp_aes_setkey,
> > > +				.encrypt	= mxs_dcp_aes_ecb_encrypt,
> > > +				.decrypt	= mxs_dcp_aes_ecb_decrypt
> > > +			}
> > 
> > missing ',' after '}'
> 
> Is this a problem? The last ',' is not needed by the C standard.
> 
The problem arises when someone wants to append another item to the
list. Without the comma he has to change two lines which may cause
merge conflicts if two people add different items to the same struct.

That's why we usually have (unnecessary) commas on the last element of
a struct initializer (except when they are meant to be the last
element of course).


Lothar Waßmann
Tobias Rauter Sept. 27, 2013, 4:03 p.m. UTC | #9
Hi,

Here are some thoughts of the design decisions I made when I wrote
the dcp.c driver. Maybe it helps.

On 2013-09-26 14:07, Marek Vasut wrote:
> Dear Fabio Estevam,
>
>> Hi Marek,
>>
>> Why do we need to have two drivers for the same IP block? It looks
>> confusing to have both.
>
> Sure, I agree. I reviewed the one in mainline just now and I see some
> deficiencies of the dcp.c driver:
>
> 1) It only supports AES_CBC (mine does support AES_ECB, AES_CBC, SHA1 and SH256)

Right, but for ECP only the interface is missing (and it is no real
mode of operation) and hashes should be generally faster in SW.

> 2) The driver was apparently never ran behind anyone working with MXS.


That is probably right.


> 3) What are those ugly new IOCTLs in the dcp.c driver?

When I firstly posted the driver in the mailinglist, there where one
person who actually used this interface (it was introduced in
Freescale's SDK) to use the OTP keys for crypto. As far as I have
seen, the crypto API does not support such keys (i.e. there seems to
be no way to tell a driver to use some kind of special keys - which
are not delivered by the user - via the API).
Therefore I added this miscdevice and adopted Freescale's interface.

> 4) The VMI IRQ is never used, yet it even calls the IRQ handler, this is bogus

That's absolutely right.

>     -> The DCP always triggers the dcp_irq upon DMA completion

The IRQ is triggered after every packet, to enable simultaneous work
for CPU/DCP: While the DCP is computing, the CPU is able to fill more
packets. I don't know how far this is useful, because the 20 Packets
which are enabled by default can address up to 80kB of
plain-/ciphertext. However, I think it is better to do the work 
simultaneously to safe time (actual real world time, not CPU time).

> 5) The IRQ handler can't use usual completion() in the driver because that'd
> trigger "scheduling while atomic" oops, yes?

I decided to use the tasklets because of performance reasons. I don't
remember numbers but a workqueue was significantly slower.  The
use of a kernel thread may reduce the overhead compared to the wq. I
was not sure if it is appropriate to create an extra thread for a
crypto-driver, without real reason (IMHO).

> Finally, because the dcp.c driver only supports AES128 CBC, it depends on kernel
> _always_ passing the DCP scatterlist such that each of it's elements is 16-bytes
> long. [...]
> So, in the AES128 case, if the hardware is passed two (4 bytes + 12 bytes for
> example) DMA descriptors instead of single 16 bytes descriptor, the DCP will
> simply stall or produce incorrect result. This can happen if the user of the
> async crypto API passes such a scatterlist.

The scatterlist alignment and bounce-buffering to get full 16 Byte
blocks is done by the ablkcipher_walk API (with the error
parameter) when needed. As far as I see, you are copying the whole 
buffer to your coherent block and back. Wouldn't it be better to do that 
just for unaligned blocks?


kind regards,
tr
Marek Vasut Sept. 28, 2013, 3:35 a.m. UTC | #10
Hi Tobias,

> Hi,
> 
> Here are some thoughts of the design decisions I made when I wrote
> the dcp.c driver. Maybe it helps.

Was the driver not rather forward-ported from some old FSL BSP kernel?

> On 2013-09-26 14:07, Marek Vasut wrote:
> > Dear Fabio Estevam,
> > 
> >> Hi Marek,
> >> 
> >> Why do we need to have two drivers for the same IP block? It looks
> >> confusing to have both.
> > 
> > Sure, I agree. I reviewed the one in mainline just now and I see some
> > deficiencies of the dcp.c driver:
> > 
> > 1) It only supports AES_CBC (mine does support AES_ECB, AES_CBC, SHA1 and
> > SH256)
> 
> Right, but for ECP only the interface is missing (and it is no real
> mode of operation) and hashes should be generally faster in SW.

The ECB is slightly different from CBC, for example the IV handling differs. As 
for the performance, the SHA* hashing was more performant in hardware starting 
with ~8MB chunks. I checked this via the "openssl -speed" test via the linux 
cryptodev patch.

> > 2) The driver was apparently never ran behind anyone working with MXS.
> 
> That is probably right.
> 
> > 3) What are those ugly new IOCTLs in the dcp.c driver?
> 
> When I firstly posted the driver in the mailinglist, there where one
> person who actually used this interface (it was introduced in
> Freescale's SDK) to use the OTP keys for crypto. As far as I have
> seen, the crypto API does not support such keys (i.e. there seems to
> be no way to tell a driver to use some kind of special keys - which
> are not delivered by the user - via the API).
> Therefore I added this miscdevice and adopted Freescale's interface.

The keys are programmed into the OTP registers, correct? There is OCOTP driver 
for the MX23/MX28 OTP hardware. This is what should have been used then.

NOTE: This IOCTL interface seems like quite an abusive way to allow userland to 
access the crypto API in kernel. I understand this is used by some Freescale 
tool, but won't it be better to fix the Freescale tool instead ?

> > 4) The VMI IRQ is never used, yet it even calls the IRQ handler, this is
> > bogus
> 
> That's absolutely right.
> 
> >     -> The DCP always triggers the dcp_irq upon DMA completion
> 
> The IRQ is triggered after every packet, to enable simultaneous work
> for CPU/DCP: While the DCP is computing, the CPU is able to fill more
> packets. I don't know how far this is useful, because the 20 Packets
> which are enabled by default can address up to 80kB of
> plain-/ciphertext. However, I think it is better to do the work
> simultaneously to safe time (actual real world time, not CPU time).

This really depends on how you program the DCP's own DMA controller, or more 
precisely what you write into the DMA descriptors. Each descriptor has a bit 
that signalizes whether the DCP should generate an IRQ upon it's completion. You 
can thus program it to generate IRQ upon competing each descriptor only for the 
last descriptor in a chain or what fits you.

NOTE that chaining multiple DMA descriptors of small size is less performant 
than filling up a large buffer and then running it through the DCP in one longer 
DMA descriptor. This is actually why I have such a large bounce-buffer and not 
only a small one.

> > 5) The IRQ handler can't use usual completion() in the driver because
> > that'd trigger "scheduling while atomic" oops, yes?
> 
> I decided to use the tasklets because of performance reasons. I don't
> remember numbers but a workqueue was significantly slower.  The
> use of a kernel thread may reduce the overhead compared to the wq. I
> was not sure if it is appropriate to create an extra thread for a
> crypto-driver, without real reason (IMHO).

Certainly, my remark was that the FSL driver had this issue and the code here 
didn't change much from the FSL driver. I was thus curious why you don't use 
completion in the driver as usual and whether this was to work around the issue 
the FSL driver had.

> > Finally, because the dcp.c driver only supports AES128 CBC, it depends on
> > kernel _always_ passing the DCP scatterlist such that each of it's
> > elements is 16-bytes long. [...]
> > So, in the AES128 case, if the hardware is passed two (4 bytes + 12 bytes
> > for example) DMA descriptors instead of single 16 bytes descriptor, the
> > DCP will simply stall or produce incorrect result. This can happen if
> > the user of the async crypto API passes such a scatterlist.
> 
> The scatterlist alignment and bounce-buffering to get full 16 Byte
> blocks is done by the ablkcipher_walk API (with the error
> parameter) when needed. As far as I see, you are copying the whole
> buffer to your coherent block and back. Wouldn't it be better to do that
> just for unaligned blocks?

I tried writing a proper driver for this DCP piece of crap hardware about 7 
times. I tried to squeeze as much power from it as possible, among others tried 
to avoid the copying, but there was always some hidden problem somewhere. The 
DCP is really _horribly_ designed hardware. But to answer your question, the 
copying of stuff back and forth is actually not a problem here.

The real question here is, how do we handle the current situation? I think this 
might be a question for Dave or Herbert to answer though.

Best regards,
Marek Vasut
Marek Vasut Sept. 29, 2013, 10:05 p.m. UTC | #11
Dear Lothar Waßmann,

> Hi,
> 
> Marek Vasut writes:
> > Dear Lothar Waßmann,
> > 
> > > Hi Marek,
> > > 
> > > some small comments below.
> > > 
> > > Marek Vasut writes:
> > > > diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
> > > > new file mode 100644
> > > > index 0000000..c2b35c7
> > > > --- /dev/null
> > > > +++ b/drivers/crypto/mxs-dcp.c
> > > 
> > > [...]
> > > 
> > > > +/* AES 128 ECB and AES 128 CBC */
> > > > +static struct crypto_alg dcp_aes_algs[] = {
> > > > +	[0] = {
> > > > +		.cra_name		= "ecb(aes)",
> > > > +		.cra_driver_name	= "ecb-aes-dcp",
> > > > +		.cra_priority		= 400,
> > > > +		.cra_alignmask		= 15,
> > > > +		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
> > > > +					  CRYPTO_ALG_ASYNC |
> > > > +					  CRYPTO_ALG_NEED_FALLBACK,
> > > > +		.cra_init		= mxs_dcp_aes_fallback_init,
> > > > +		.cra_exit		= mxs_dcp_aes_fallback_exit,
> > > > +		.cra_blocksize		= AES_BLOCK_SIZE,
> > > > +		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
> > > > +		.cra_type		= &crypto_ablkcipher_type,
> > > > +		.cra_module		= THIS_MODULE,
> > > > +		.cra_u	= {
> > > > +			.ablkcipher = {
> > > > +				.min_keysize	= AES_MIN_KEY_SIZE,
> > > > +				.max_keysize	= AES_MAX_KEY_SIZE,
> > > > +				.setkey		= mxs_dcp_aes_setkey,
> > > > +				.encrypt	= 
mxs_dcp_aes_ecb_encrypt,
> > > > +				.decrypt	= 
mxs_dcp_aes_ecb_decrypt
> > > > +			}
> > > 
> > > missing ',' after '}'
> > 
> > Is this a problem? The last ',' is not needed by the C standard.
> 
> The problem arises when someone wants to append another item to the
> list. Without the comma he has to change two lines which may cause
> merge conflicts if two people add different items to the same struct.
> 
> That's why we usually have (unnecessary) commas on the last element of
> a struct initializer (except when they are meant to be the last
> element of course).

Good point.

Best regards,
Marek Vasut
Marek Vasut Sept. 29, 2013, 10:09 p.m. UTC | #12
Dear Fabio Estevam,

[...]

> > + * There can even be only one instance of the MXS DCP due to the
> > + * design of Linux Crypto API.
> 
> Is this true? Usually we don't want to create a global struct.

Yeah, unfortunatelly :-(

> 
> > +
> > +/* AES 128 ECB and AES 128 CBC */
> > +static struct crypto_alg dcp_aes_algs[] = {
> > +       [0] = {
> 
> No need for explicitely add this [0]

I wanted to be explicit here, but OK.

[...]

Rest is fixed, thanks!

Best regards,
Marek Vasut
Christoph G. Baumann Oct. 7, 2013, 9:50 a.m. UTC | #13
Hello Marek,

> Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35 geschrieben:
> [...]
> > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > 
> > When I firstly posted the driver in the mailinglist, there where one
> > person who actually used this interface (it was introduced in
> > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > seen, the crypto API does not support such keys (i.e. there seems to
> > be no way to tell a driver to use some kind of special keys - which
> > are not delivered by the user - via the API).
> > Therefore I added this miscdevice and adopted Freescale's interface.

> The keys are programmed into the OTP registers, correct? There is OCOTP driver 
> for the MX23/MX28 OTP hardware. This is what should have been used then.

> NOTE: This IOCTL interface seems like quite an abusive way to allow userland to 
> access the crypto API in kernel. I understand this is used by some Freescale 
> tool, but won't it be better to fix the Freescale tool instead ?


the IOCTL interface is used to AES encrypt a bootstream with the AES key in
OCOTP.
The idea is that only the DCP can read/access the key once it has been
programmed
into the OCOTP. If the crypto API has means to tell the DCP to use the key from
OCOTP, the tool from Freescale is a minor problem.


Regards,
Christoph
Marek Vasut Oct. 7, 2013, 3:48 p.m. UTC | #14
Hello Christoph,

> Hello Marek,
> 
> > Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35 geschrieben:
> > [...]
> >
> > > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > > 
> > > When I firstly posted the driver in the mailinglist, there where one
> > > person who actually used this interface (it was introduced in
> > > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > > seen, the crypto API does not support such keys (i.e. there seems to
> > > be no way to tell a driver to use some kind of special keys - which
> > > are not delivered by the user - via the API).
> > > Therefore I added this miscdevice and adopted Freescale's interface.
> > 
> > The keys are programmed into the OTP registers, correct? There is OCOTP d
> >river 
> >for the MX23/MX28 OTP hardware. This is what should have been used then. 
> > NOTE: This IOCTL interface seems like quite an abusive way to allow userl
> >and to 
> >access the crypto API in kernel. I understand this is used by some Freesc
> >ale tool, but won't it be better to fix the Freescale tool instead ?
> 
> the IOCTL interface is used to AES encrypt a bootstream with the AES key in
> OCOTP.
> The idea is that only the DCP can read/access the key once it has been
> programmed
> into the OCOTP. If the crypto API has means to tell the DCP to use the key
> from OCOTP, the tool from Freescale is a minor problem.

Ah right. I suspect the crypto API services shall not be exported into userland 
at all, yes ? So there has to be some kind of workaround here for this freescale 
tool, which is rather unfortunate.

Thanks for clearing this up.

Best regards,
Marek Vasut
Herbert Xu Oct. 8, 2013, 4:36 a.m. UTC | #15
On Mon, Oct 07, 2013 at 05:48:26PM +0200, Marek Vasut wrote:
> Hello Christoph,
> 
> > Hello Marek,
> > 
> > > Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35 geschrieben:
> > > [...]
> > >
> > > > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > > > 
> > > > When I firstly posted the driver in the mailinglist, there where one
> > > > person who actually used this interface (it was introduced in
> > > > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > > > seen, the crypto API does not support such keys (i.e. there seems to
> > > > be no way to tell a driver to use some kind of special keys - which
> > > > are not delivered by the user - via the API).
> > > > Therefore I added this miscdevice and adopted Freescale's interface.
> > > 
> > > The keys are programmed into the OTP registers, correct? There is OCOTP d
> > >river 
> > >for the MX23/MX28 OTP hardware. This is what should have been used then. 
> > > NOTE: This IOCTL interface seems like quite an abusive way to allow userl
> > >and to 
> > >access the crypto API in kernel. I understand this is used by some Freesc
> > >ale tool, but won't it be better to fix the Freescale tool instead ?
> > 
> > the IOCTL interface is used to AES encrypt a bootstream with the AES key in
> > OCOTP.
> > The idea is that only the DCP can read/access the key once it has been
> > programmed
> > into the OCOTP. If the crypto API has means to tell the DCP to use the key
> > from OCOTP, the tool from Freescale is a minor problem.
> 
> Ah right. I suspect the crypto API services shall not be exported into userland 
> at all, yes ? So there has to be some kind of workaround here for this freescale 
> tool, which is rather unfortunate.

These ioctls have to go.  I should have never let them through in
the first place.  Can someone cook up a patch to kill them please?

Thanks,
Marek Vasut Oct. 8, 2013, 10:33 a.m. UTC | #16
Hello Herbert,

> On Mon, Oct 07, 2013 at 05:48:26PM +0200, Marek Vasut wrote:
> > Hello Christoph,
> > 
> > > Hello Marek,
> > > 
> > > > Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35
> > > > geschrieben: [...]
> > > > 
> > > > > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > > > > 
> > > > > When I firstly posted the driver in the mailinglist, there where
> > > > > one person who actually used this interface (it was introduced in
> > > > > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > > > > seen, the crypto API does not support such keys (i.e. there seems
> > > > > to be no way to tell a driver to use some kind of special keys -
> > > > > which are not delivered by the user - via the API).
> > > > > Therefore I added this miscdevice and adopted Freescale's
> > > > > interface.
> > > > 
> > > > The keys are programmed into the OTP registers, correct? There is
> > > > OCOTP d
> > > >
> > > >river
> > > >for the MX23/MX28 OTP hardware. This is what should have been used
> > > >then.
> > > >
> > > > NOTE: This IOCTL interface seems like quite an abusive way to allow
> > > > userl
> > > >
> > > >and to
> > > >access the crypto API in kernel. I understand this is used by some
> > > >Freesc ale tool, but won't it be better to fix the Freescale tool
> > > >instead ?
> > > 
> > > the IOCTL interface is used to AES encrypt a bootstream with the AES
> > > key in OCOTP.
> > > The idea is that only the DCP can read/access the key once it has been
> > > programmed
> > > into the OCOTP. If the crypto API has means to tell the DCP to use the
> > > key from OCOTP, the tool from Freescale is a minor problem.
> > 
> > Ah right. I suspect the crypto API services shall not be exported into
> > userland at all, yes ? So there has to be some kind of workaround here
> > for this freescale tool, which is rather unfortunate.
> 
> These ioctls have to go.  I should have never let them through in
> the first place.  Can someone cook up a patch to kill them please?

I can do that. I wonder if we can't agree to nuke the in-tree driver altogether 
instead and replace it by this one though. Does it not sound more reasonable?

Best regards,
Marek Vasut
Marek Vasut Nov. 10, 2013, 5:48 p.m. UTC | #17
Hi,

> Hello Herbert,
> 
> > On Mon, Oct 07, 2013 at 05:48:26PM +0200, Marek Vasut wrote:
> > > Hello Christoph,
> > > 
> > > > Hello Marek,
> > > > 
> > > > > Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35
> > > > > geschrieben: [...]
> > > > > 
> > > > > > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > > > > > 
> > > > > > When I firstly posted the driver in the mailinglist, there where
> > > > > > one person who actually used this interface (it was introduced in
> > > > > > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > > > > > seen, the crypto API does not support such keys (i.e. there seems
> > > > > > to be no way to tell a driver to use some kind of special keys -
> > > > > > which are not delivered by the user - via the API).
> > > > > > Therefore I added this miscdevice and adopted Freescale's
> > > > > > interface.
> > > > > 
> > > > > The keys are programmed into the OTP registers, correct? There is
> > > > > OCOTP d
> > > > >
> > > > >river
> > > > >for the MX23/MX28 OTP hardware. This is what should have been used
> > > > >then.
> > > > >
> > > > > NOTE: This IOCTL interface seems like quite an abusive way to allow
> > > > > userl
> > > > >
> > > > >and to
> > > > >access the crypto API in kernel. I understand this is used by some
> > > > >Freesc ale tool, but won't it be better to fix the Freescale tool
> > > > >instead ?
> > > > 
> > > > the IOCTL interface is used to AES encrypt a bootstream with the AES
> > > > key in OCOTP.
> > > > The idea is that only the DCP can read/access the key once it has
> > > > been programmed
> > > > into the OCOTP. If the crypto API has means to tell the DCP to use
> > > > the key from OCOTP, the tool from Freescale is a minor problem.
> > > 
> > > Ah right. I suspect the crypto API services shall not be exported into
> > > userland at all, yes ? So there has to be some kind of workaround here
> > > for this freescale tool, which is rather unfortunate.
> > 
> > These ioctls have to go.  I should have never let them through in
> > the first place.  Can someone cook up a patch to kill them please?
> 
> I can do that. I wonder if we can't agree to nuke the in-tree driver
> altogether instead and replace it by this one though. Does it not sound
> more reasonable?

Bump?

Best regards,
Marek Vasut
Herbert Xu Nov. 20, 2013, 11:20 a.m. UTC | #18
On Sun, Nov 10, 2013 at 06:48:11PM +0100, Marek Vasut wrote:
> Hi,
> 
> > Hello Herbert,
> > 
> > > On Mon, Oct 07, 2013 at 05:48:26PM +0200, Marek Vasut wrote:
> > > > Hello Christoph,
> > > > 
> > > > > Hello Marek,
> > > > > 
> > > > > > Marek Vasut <marex@denx.de> hat am 28. September 2013 um 05:35
> > > > > > geschrieben: [...]
> > > > > > 
> > > > > > > > 3) What are those ugly new IOCTLs in the dcp.c driver?
> > > > > > > 
> > > > > > > When I firstly posted the driver in the mailinglist, there where
> > > > > > > one person who actually used this interface (it was introduced in
> > > > > > > Freescale's SDK) to use the OTP keys for crypto. As far as I have
> > > > > > > seen, the crypto API does not support such keys (i.e. there seems
> > > > > > > to be no way to tell a driver to use some kind of special keys -
> > > > > > > which are not delivered by the user - via the API).
> > > > > > > Therefore I added this miscdevice and adopted Freescale's
> > > > > > > interface.
> > > > > > 
> > > > > > The keys are programmed into the OTP registers, correct? There is
> > > > > > OCOTP d
> > > > > >
> > > > > >river
> > > > > >for the MX23/MX28 OTP hardware. This is what should have been used
> > > > > >then.
> > > > > >
> > > > > > NOTE: This IOCTL interface seems like quite an abusive way to allow
> > > > > > userl
> > > > > >
> > > > > >and to
> > > > > >access the crypto API in kernel. I understand this is used by some
> > > > > >Freesc ale tool, but won't it be better to fix the Freescale tool
> > > > > >instead ?
> > > > > 
> > > > > the IOCTL interface is used to AES encrypt a bootstream with the AES
> > > > > key in OCOTP.
> > > > > The idea is that only the DCP can read/access the key once it has
> > > > > been programmed
> > > > > into the OCOTP. If the crypto API has means to tell the DCP to use
> > > > > the key from OCOTP, the tool from Freescale is a minor problem.
> > > > 
> > > > Ah right. I suspect the crypto API services shall not be exported into
> > > > userland at all, yes ? So there has to be some kind of workaround here
> > > > for this freescale tool, which is rather unfortunate.
> > > 
> > > These ioctls have to go.  I should have never let them through in
> > > the first place.  Can someone cook up a patch to kill them please?
> > 
> > I can do that. I wonder if we can't agree to nuke the in-tree driver
> > altogether instead and replace it by this one though. Does it not sound
> > more reasonable?
> 
> Bump?

Well as it's been months and nobody has stepped up to maintain
the in-tree driver then yes we can get rid of it.

Thanks,
Marek Vasut Dec. 1, 2013, 10:20 p.m. UTC | #19
Hi Herbert,

[...]

> > > I can do that. I wonder if we can't agree to nuke the in-tree driver
> > > altogether instead and replace it by this one though. Does it not sound
> > > more reasonable?
> > 
> > Bump?
> 
> Well as it's been months and nobody has stepped up to maintain
> the in-tree driver then yes we can get rid of it.

OK, I updated the DCP patches and re-posted them. Thanks!

Best regards,
Marek Vasut
diff mbox

Patch

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index f4fd837..4aa6686 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -399,4 +399,21 @@  config CRYPTO_DEV_ATMEL_SHA
 	  To compile this driver as a module, choose M here: the module
 	  will be called atmel-sha.
 
+config CRYPTO_DEV_MXS_DCP
+	tristate "Support for Freescale MXS DCP"
+	depends on ARCH_MXS
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	select CRYPTO_CBC
+	select CRYPTO_ECB
+	select CRYPTO_AES
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_ALGAPI
+	help
+	  The Freescale i.MX23/i.MX28 has SHA1/SHA256 and AES128 CBC/ECB
+	  co-processor on the die.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called atmel-sha.
+
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index b4946dd..56cce04 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -22,3 +22,4 @@  obj-$(CONFIG_CRYPTO_DEV_NX) += nx/
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
+obj-$(CONFIG_CRYPTO_DEV_MXS_DCP) += mxs-dcp.o
diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
new file mode 100644
index 0000000..c2b35c7
--- /dev/null
+++ b/drivers/crypto/mxs-dcp.c
@@ -0,0 +1,1082 @@ 
+/*
+ * Freescale i.MX23/i.MX28 Data Co-Processor driver
+ *
+ * Copyright (C) 2013 Marek Vasut <marex@denx.de>
+ *
+ * The code contained herein is licensed under the GNU General Public
+ * License. You may obtain a copy of the GNU General Public License
+ * Version 2 or later at the following locations:
+ *
+ * http://www.opensource.org/licenses/gpl-license.html
+ * http://www.gnu.org/copyleft/gpl.html
+ */
+
+#include <linux/crypto.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/stmp_device.h>
+
+#include <crypto/aes.h>
+#include <crypto/sha.h>
+#include <crypto/internal/hash.h>
+
+#define DCP_MAX_CHANS	4
+#define DCP_BUF_SZ	PAGE_SIZE
+
+/* DCP DMA descriptor. */
+struct dcp_dma_desc {
+	uint32_t	next_cmd_addr;
+	uint32_t	control0;
+	uint32_t	control1;
+	uint32_t	source;
+	uint32_t	destination;
+	uint32_t	size;
+	uint32_t	payload;
+	uint32_t	status;
+};
+
+/* Coherent aligned block for bounce buffering. */
+struct dcp_coherent_block {
+	uint8_t			aes_in_buf[DCP_BUF_SZ];
+	uint8_t			aes_out_buf[DCP_BUF_SZ];
+	uint8_t			sha_in_buf[DCP_BUF_SZ];
+
+	uint8_t			aes_key[2 * AES_KEYSIZE_128];
+	uint8_t			sha_digest[SHA256_DIGEST_SIZE];
+
+	struct dcp_dma_desc	desc[DCP_MAX_CHANS];
+};
+
+struct dcp {
+	struct device			*dev;
+	void __iomem			*base;
+
+	uint32_t			caps;
+
+	struct dcp_coherent_block	*coh;
+	dma_addr_t			coh_phys;
+
+	struct completion		completion[DCP_MAX_CHANS];
+	struct mutex			mutex[DCP_MAX_CHANS];
+	struct task_struct		*thread[DCP_MAX_CHANS];
+	struct crypto_queue		queue[DCP_MAX_CHANS];
+};
+
+enum dcp_chan {
+	DCP_CHAN_HASH_SHA	= 0,
+	DCP_CHAN_CRYPTO		= 2,
+};
+
+struct dcp_async_ctx {
+	/* Common context */
+	enum dcp_chan	chan;
+	uint32_t	fill;
+
+	/* SHA Hash-specific context */
+	struct mutex			mutex;
+	uint32_t			alg;
+	unsigned int			hot:1;
+
+	/* Crypto-specific context */
+	unsigned int			enc:1;
+	unsigned int			ecb:1;
+	struct crypto_ablkcipher	*fallback;
+	unsigned int			key_len;
+	uint8_t				key[AES_KEYSIZE_128];
+};
+
+struct dcp_sha_req_ctx {
+	unsigned int	init:1;
+	unsigned int	fini:1;
+};
+
+/*
+ * There can even be only one instance of the MXS DCP due to the
+ * design of Linux Crypto API.
+ */
+static struct dcp *global_sdcp;
+DEFINE_MUTEX(global_mutex);
+
+/* DCP register layout. */
+#define MXS_DCP_CTRL				0x00
+#define MXS_DCP_CTRL_GATHER_RESIDUAL_WRITES	(1 << 23)
+#define MXS_DCP_CTRL_ENABLE_CONTEXT_CACHING	(1 << 22)
+
+#define MXS_DCP_STAT				0x10
+#define MXS_DCP_STAT_CLR			0x18
+#define MXS_DCP_STAT_IRQ_MASK			0xf
+
+#define MXS_DCP_CHANNELCTRL			0x20
+#define MXS_DCP_CHANNELCTRL_ENABLE_CHANNEL_MASK	0xff
+
+#define MXS_DCP_CAPABILITY1			0x40
+#define MXS_DCP_CAPABILITY1_SHA256		(4 << 16)
+#define MXS_DCP_CAPABILITY1_SHA1		(1 << 16)
+#define MXS_DCP_CAPABILITY1_AES128		(1 << 0)
+
+#define MXS_DCP_CONTEXT				0x50
+
+#define MXS_DCP_CH_N_CMDPTR(n)			(0x100 + ((n) * 0x40))
+
+#define MXS_DCP_CH_N_SEMA(n)			(0x110 + ((n) * 0x40))
+
+#define MXS_DCP_CH_N_STAT(n)			(0x120 + ((n) * 0x40))
+#define MXS_DCP_CH_N_STAT_CLR(n)		(0x128 + ((n) * 0x40))
+
+/* DMA descriptor bits. */
+#define MXS_DCP_CONTROL0_HASH_TERM		(1 << 13)
+#define MXS_DCP_CONTROL0_HASH_INIT		(1 << 12)
+#define MXS_DCP_CONTROL0_PAYLOAD_KEY		(1 << 11)
+#define MXS_DCP_CONTROL0_CIPHER_ENCRYPT		(1 << 8)
+#define MXS_DCP_CONTROL0_CIPHER_INIT		(1 << 9)
+#define MXS_DCP_CONTROL0_ENABLE_HASH		(1 << 6)
+#define MXS_DCP_CONTROL0_ENABLE_CIPHER		(1 << 5)
+#define MXS_DCP_CONTROL0_DECR_SEMAPHORE		(1 << 1)
+#define MXS_DCP_CONTROL0_INTERRUPT		(1 << 0)
+
+#define MXS_DCP_CONTROL1_HASH_SELECT_SHA256	(2 << 16)
+#define MXS_DCP_CONTROL1_HASH_SELECT_SHA1	(0 << 16)
+#define MXS_DCP_CONTROL1_CIPHER_MODE_CBC	(1 << 4)
+#define MXS_DCP_CONTROL1_CIPHER_MODE_ECB	(0 << 4)
+#define MXS_DCP_CONTROL1_CIPHER_SELECT_AES128	(0 << 0)
+
+static int mxs_dcp_start_dma(struct dcp_async_ctx *actx)
+{
+	struct dcp *sdcp = global_sdcp;
+	const int chan = actx->chan;
+	uint32_t stat;
+	int ret;
+	dma_addr_t desc_phys = sdcp->coh_phys +
+		offsetof(struct dcp_coherent_block, desc[actx->chan]);
+
+	INIT_COMPLETION(sdcp->completion[chan]);
+
+	/* Clear status register. */
+	writel(0xffffffff, sdcp->base + MXS_DCP_CH_N_STAT_CLR(chan));
+
+	/* Load the DMA descriptor. */
+	writel(desc_phys, sdcp->base + MXS_DCP_CH_N_CMDPTR(chan));
+
+	/* Increment the semaphore to start the DMA transfer. */
+	writel(1, sdcp->base + MXS_DCP_CH_N_SEMA(chan));
+
+	ret = wait_for_completion_timeout(&sdcp->completion[chan],
+					  msecs_to_jiffies(1000));
+	if (!ret) {
+		dev_err(sdcp->dev, "Channel %i timeout (DCP_STAT=0x%08x)\n",
+			chan, readl(sdcp->base + MXS_DCP_STAT));
+		return -ETIMEDOUT;
+	}
+
+	stat = readl(sdcp->base + MXS_DCP_CH_N_STAT(chan));
+	if (stat & 0xff) {
+		dev_err(sdcp->dev, "Channel %i error (CH_STAT=0x%08x)\n",
+			chan, stat);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * Encryption (AES128)
+ */
+static void mxs_dcp_assemble_desc_aes(struct dcp_async_ctx *actx, int init)
+{
+	struct dcp *sdcp = global_sdcp;
+	struct dcp_dma_desc *desc = &sdcp->coh->desc[actx->chan];
+
+	dma_addr_t key_phys = sdcp->coh_phys +
+		offsetof(struct dcp_coherent_block, aes_key);
+	dma_addr_t src_phys = sdcp->coh_phys +
+		   offsetof(struct dcp_coherent_block, aes_in_buf);
+	dma_addr_t dst_phys = sdcp->coh_phys +
+		   offsetof(struct dcp_coherent_block, aes_out_buf);
+
+	/* Fill in the DMA descriptor. */
+	desc->control0 = MXS_DCP_CONTROL0_DECR_SEMAPHORE |
+		    MXS_DCP_CONTROL0_INTERRUPT |
+		    MXS_DCP_CONTROL0_ENABLE_CIPHER;
+
+	/* Payload contains the key. */
+	desc->control0 |= MXS_DCP_CONTROL0_PAYLOAD_KEY;
+
+	if (actx->enc)
+		desc->control0 |= MXS_DCP_CONTROL0_CIPHER_ENCRYPT;
+	if (init)
+		desc->control0 |= MXS_DCP_CONTROL0_CIPHER_INIT;
+
+	desc->control1 = MXS_DCP_CONTROL1_CIPHER_SELECT_AES128;
+
+	if (actx->ecb)
+		desc->control1 |= MXS_DCP_CONTROL1_CIPHER_MODE_ECB;
+	else
+		desc->control1 |= MXS_DCP_CONTROL1_CIPHER_MODE_CBC;
+
+	desc->next_cmd_addr = 0;
+	desc->source = src_phys;
+	desc->destination = dst_phys;
+	desc->size = actx->fill;
+	desc->payload = key_phys;
+	desc->status = 0;
+}
+
+static int mxs_dcp_aes_block_crypt(struct crypto_async_request *arq)
+{
+	struct dcp *sdcp = global_sdcp;
+
+	struct ablkcipher_request *req = ablkcipher_request_cast(arq);
+	struct dcp_async_ctx *actx = crypto_tfm_ctx(arq->tfm);
+
+	struct scatterlist *dst = req->dst;
+	struct scatterlist *src = req->src;
+	const int nents = sg_nents(req->src);
+
+	const int out_off = DCP_BUF_SZ;
+	uint8_t *in_buf = sdcp->coh->aes_in_buf;
+	uint8_t *out_buf = sdcp->coh->aes_out_buf;
+
+	uint8_t *out_tmp, *src_buf, *dst_buf = NULL;
+	uint32_t dst_off = 0;
+
+	uint8_t *key = sdcp->coh->aes_key;
+
+	int ret = 0;
+	int split = 0;
+	unsigned int i, len, clen, rem = 0;
+	int init = 0;
+
+	actx->fill = 0;
+
+	/* Copy the key from the temporary location. */
+	memcpy(key, actx->key, actx->key_len);
+
+	if (!actx->ecb) {
+		/* Copy the CBC IV just past the key. */
+		memcpy(key + AES_KEYSIZE_128, req->info, AES_KEYSIZE_128);
+		/* CBC needs the INIT set. */
+		init = 1;
+	} else {
+		memset(key + AES_KEYSIZE_128, 0, AES_KEYSIZE_128);
+	}
+
+	for_each_sg(req->src, src, nents, i) {
+		src_buf = sg_virt(src);
+		len = sg_dma_len(src);
+
+		do {
+			if (actx->fill + len > out_off)
+				clen = out_off - actx->fill;
+			else
+				clen = len;
+
+			memcpy(in_buf + actx->fill, src_buf, clen);
+			len -= clen;
+			src_buf += clen;
+			actx->fill += clen;
+
+			/*
+			 * If we filled the buffer or this is the last SG,
+			 * submit the buffer.
+			 */
+			if (actx->fill == out_off || sg_is_last(src)) {
+				mxs_dcp_assemble_desc_aes(actx, init);
+				ret = mxs_dcp_start_dma(actx);
+				if (ret)
+					return ret;
+				init = 0;
+
+				out_tmp = out_buf;
+				while (dst && actx->fill) {
+					if (!split) {
+						dst_buf = sg_virt(dst);
+						dst_off = 0;
+					}
+					rem = min(sg_dma_len(dst) - dst_off,
+						  actx->fill);
+
+					memcpy(dst_buf + dst_off, out_tmp, rem);
+					out_tmp += rem;
+					dst_off += rem;
+					actx->fill -= rem;
+
+					if (dst_off == sg_dma_len(dst)) {
+						dst = sg_next(dst);
+						split = 0;
+					} else {
+						split = 1;
+					}
+				}
+			}
+		} while (len);
+	}
+
+	return ret;
+}
+
+static int dcp_chan_thread_aes(void *data)
+{
+	struct dcp *sdcp = global_sdcp;
+	const int chan = DCP_CHAN_CRYPTO;
+
+	struct crypto_async_request *backlog;
+	struct crypto_async_request *arq;
+
+	int ret;
+
+	do {
+		__set_current_state(TASK_INTERRUPTIBLE);
+
+		mutex_lock(&sdcp->mutex[chan]);
+		backlog = crypto_get_backlog(&sdcp->queue[chan]);
+		arq = crypto_dequeue_request(&sdcp->queue[chan]);
+		mutex_unlock(&sdcp->mutex[chan]);
+
+		if (backlog)
+			backlog->complete(backlog, -EINPROGRESS);
+
+		if (arq) {
+			ret = mxs_dcp_aes_block_crypt(arq);
+			arq->complete(arq, ret);
+			continue;
+		}
+
+		schedule();
+	} while (!kthread_should_stop());
+
+	return 0;
+}
+
+static int mxs_dcp_block_fallback(struct ablkcipher_request *req, int enc)
+{
+	struct crypto_tfm *tfm =
+		crypto_ablkcipher_tfm(crypto_ablkcipher_reqtfm(req));
+	struct dcp_async_ctx *ctx = crypto_ablkcipher_ctx(
+		crypto_ablkcipher_reqtfm(req));
+	int ret;
+
+	ablkcipher_request_set_tfm(req, ctx->fallback);
+
+	if (enc)
+		ret = crypto_ablkcipher_encrypt(req);
+	else
+		ret = crypto_ablkcipher_decrypt(req);
+
+	ablkcipher_request_set_tfm(req, __crypto_ablkcipher_cast(tfm));
+
+	return ret;
+}
+
+static int mxs_dcp_aes_enqueue(struct ablkcipher_request *req, int enc, int ecb)
+{
+	struct dcp *sdcp = global_sdcp;
+	struct crypto_async_request *arq = &req->base;
+	struct dcp_async_ctx *actx = crypto_tfm_ctx(arq->tfm);
+	int ret;
+
+	if (unlikely(actx->key_len != AES_KEYSIZE_128))
+		return mxs_dcp_block_fallback(req, enc);
+
+	actx->enc = enc;
+	actx->ecb = ecb;
+	actx->chan = DCP_CHAN_CRYPTO;
+
+	mutex_lock(&sdcp->mutex[actx->chan]);
+	ret = crypto_enqueue_request(&sdcp->queue[actx->chan], &req->base);
+	mutex_unlock(&sdcp->mutex[actx->chan]);
+
+	wake_up_process(sdcp->thread[actx->chan]);
+
+	return -EINPROGRESS;
+}
+
+static int mxs_dcp_aes_ecb_decrypt(struct ablkcipher_request *req)
+{
+	return mxs_dcp_aes_enqueue(req, 0, 1);
+}
+
+static int mxs_dcp_aes_ecb_encrypt(struct ablkcipher_request *req)
+{
+	return mxs_dcp_aes_enqueue(req, 1, 1);
+}
+
+static int mxs_dcp_aes_cbc_decrypt(struct ablkcipher_request *req)
+{
+	return mxs_dcp_aes_enqueue(req, 0, 0);
+}
+
+static int mxs_dcp_aes_cbc_encrypt(struct ablkcipher_request *req)
+{
+	return mxs_dcp_aes_enqueue(req, 1, 0);
+}
+
+static int mxs_dcp_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+			      unsigned int len)
+{
+	struct dcp_async_ctx *actx = crypto_ablkcipher_ctx(tfm);
+	unsigned int ret;
+
+	/*
+	 * AES 128 is supposed by the hardware, store key into temporary
+	 * buffer and exit. We must use the temporary buffer here, since
+	 * there can still be an operation in progress.
+	 */
+	actx->key_len = len;
+	if (len == AES_KEYSIZE_128) {
+		memcpy(actx->key, key, len);
+		return 0;
+	}
+
+	/* Check if the key size is supported by kernel at all. */
+	if (len != AES_KEYSIZE_192 && len != AES_KEYSIZE_256) {
+		tfm->base.crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
+		return -EINVAL;
+	}
+
+	/*
+	 * If the requested AES key size is not supported by the hardware,
+	 * but is supported by in-kernel software implementation, we use
+	 * software fallback.
+	 */
+	actx->fallback->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
+	actx->fallback->base.crt_flags |=
+		tfm->base.crt_flags & CRYPTO_TFM_REQ_MASK;
+
+	ret = crypto_ablkcipher_setkey(actx->fallback, key, len);
+	if (!ret)
+		return 0;
+
+	tfm->base.crt_flags &= ~CRYPTO_TFM_RES_MASK;
+	tfm->base.crt_flags |=
+		actx->fallback->base.crt_flags & CRYPTO_TFM_RES_MASK;
+
+	return ret;
+}
+
+static int mxs_dcp_aes_fallback_init(struct crypto_tfm *tfm)
+{
+	const char *name = tfm->__crt_alg->cra_name;
+	const uint32_t flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK;
+	struct dcp_async_ctx *actx = crypto_tfm_ctx(tfm);
+	struct crypto_ablkcipher *blk;
+
+	blk = crypto_alloc_ablkcipher(name, 0, flags);
+	if (IS_ERR(blk))
+		return PTR_ERR(blk);
+
+	actx->fallback = blk;
+	tfm->crt_ablkcipher.reqsize = sizeof(struct dcp_async_ctx);
+	return 0;
+}
+
+static void mxs_dcp_aes_fallback_exit(struct crypto_tfm *tfm)
+{
+	struct dcp_async_ctx *actx = crypto_tfm_ctx(tfm);
+
+	crypto_free_ablkcipher(actx->fallback);
+	actx->fallback = NULL;
+}
+
+/*
+ * Hashing (SHA1/SHA256)
+ */
+static void mxs_dcp_assemble_desc_sha(struct ahash_request *req)
+{
+	struct dcp *sdcp = global_sdcp;
+
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct dcp_async_ctx *actx = crypto_ahash_ctx(tfm);
+	struct dcp_sha_req_ctx *rctx = ahash_request_ctx(req);
+
+	struct dcp_dma_desc *desc = &sdcp->coh->desc[actx->chan];
+	dma_addr_t digest_phys = sdcp->coh_phys +
+		      offsetof(struct dcp_coherent_block, sha_digest);
+	dma_addr_t buf_phys = sdcp->coh_phys +
+		   offsetof(struct dcp_coherent_block, sha_in_buf);
+
+	/* Fill in the DMA descriptor. */
+	desc->control0 = MXS_DCP_CONTROL0_DECR_SEMAPHORE |
+		    MXS_DCP_CONTROL0_INTERRUPT |
+		    MXS_DCP_CONTROL0_ENABLE_HASH;
+	if (rctx->init)
+		desc->control0 |= MXS_DCP_CONTROL0_HASH_INIT;
+
+	desc->control1 = actx->alg;
+	desc->next_cmd_addr = 0;
+	desc->source = buf_phys;
+	desc->destination = 0;
+	desc->size = actx->fill;
+	desc->payload = 0;
+	desc->status = 0;
+
+	/* Set HASH_TERM bit for last transfer block. */
+	if (rctx->fini) {
+		desc->control0 |= MXS_DCP_CONTROL0_HASH_TERM;
+		desc->payload = digest_phys;
+	}
+}
+
+static int dcp_sha_req_to_buf(struct crypto_async_request *arq)
+{
+	struct dcp *sdcp = global_sdcp;
+
+	struct ahash_request *req = ahash_request_cast(arq);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct dcp_async_ctx *actx = crypto_ahash_ctx(tfm);
+	struct dcp_sha_req_ctx *rctx = ahash_request_ctx(req);
+	struct hash_alg_common *halg = crypto_hash_alg_common(tfm);
+	const int nents = sg_nents(req->src);
+
+	uint8_t *digest = sdcp->coh->sha_digest;
+	uint8_t *in_buf = sdcp->coh->sha_in_buf;
+
+	uint8_t *src_buf;
+
+	struct scatterlist *src;
+
+	unsigned int i, len, clen;
+	int ret;
+
+	int fin = rctx->fini;
+	if (fin)
+		rctx->fini = 0;
+
+	for_each_sg(req->src, src, nents, i) {
+		src_buf = sg_virt(src);
+		len = sg_dma_len(src);
+
+		do {
+			if (actx->fill + len > DCP_BUF_SZ)
+				clen = DCP_BUF_SZ - actx->fill;
+			else
+				clen = len;
+
+			memcpy(in_buf + actx->fill, src_buf, clen);
+			len -= clen;
+			src_buf += clen;
+			actx->fill += clen;
+
+			/*
+			 * If we filled the buffer and still have some
+			 * more data, submit the buffer.
+			 */
+			if (len && actx->fill == DCP_BUF_SZ) {
+				mxs_dcp_assemble_desc_sha(req);
+				ret = mxs_dcp_start_dma(actx);
+				if (ret)
+					return ret;
+				actx->fill = 0;
+				rctx->init = 0;
+			}
+		} while (len);
+	}
+
+	if (fin) {
+		rctx->fini = 1;
+
+		/* Submit whatever is left. */
+		mxs_dcp_assemble_desc_sha(req);
+		ret = mxs_dcp_start_dma(actx);
+		if (ret || !req->result)
+			return ret;
+
+		/* For some reason, the result is flipped. */
+		for (i = 0; i < halg->digestsize; i++)
+			req->result[i] = digest[halg->digestsize - i - 1];
+	}
+
+	return 0;
+}
+
+static int dcp_chan_thread_sha(void *data)
+{
+	struct dcp *sdcp = global_sdcp;
+	const int chan = DCP_CHAN_HASH_SHA;
+
+	struct crypto_async_request *backlog;
+	struct crypto_async_request *arq;
+
+	struct dcp_sha_req_ctx *rctx;
+
+	struct ahash_request *req;
+	int ret, fini;
+
+	do {
+		__set_current_state(TASK_INTERRUPTIBLE);
+
+		mutex_lock(&sdcp->mutex[chan]);
+		backlog = crypto_get_backlog(&sdcp->queue[chan]);
+		arq = crypto_dequeue_request(&sdcp->queue[chan]);
+		mutex_unlock(&sdcp->mutex[chan]);
+
+		if (backlog)
+			backlog->complete(backlog, -EINPROGRESS);
+
+		if (arq) {
+			req = ahash_request_cast(arq);
+			rctx = ahash_request_ctx(req);
+
+			ret = dcp_sha_req_to_buf(arq);
+			fini = rctx->fini;
+			arq->complete(arq, ret);
+			if (!fini)
+				continue;
+		}
+
+		schedule();
+	} while (!kthread_should_stop());
+
+	return 0;
+}
+
+static int dcp_sha_init(struct ahash_request *req)
+{
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct dcp_async_ctx *actx = crypto_ahash_ctx(tfm);
+
+	struct hash_alg_common *halg = crypto_hash_alg_common(tfm);
+
+	/*
+	 * Start hashing session. The code below only inits the
+	 * hashing session context, nothing more.
+	 */
+	memset(actx, 0, sizeof(*actx));
+
+	if (strcmp(halg->base.cra_name, "sha1") == 0)
+		actx->alg = MXS_DCP_CONTROL1_HASH_SELECT_SHA1;
+	else
+		actx->alg = MXS_DCP_CONTROL1_HASH_SELECT_SHA256;
+
+	actx->fill = 0;
+	actx->hot = 0;
+	actx->chan = DCP_CHAN_HASH_SHA;
+
+	mutex_init(&actx->mutex);
+
+	return 0;
+}
+
+static int dcp_sha_update(struct ahash_request *req)
+{
+	struct dcp *sdcp = global_sdcp;
+
+	struct dcp_sha_req_ctx *rctx = ahash_request_ctx(req);
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct dcp_async_ctx *actx = crypto_ahash_ctx(tfm);
+
+	int ret;
+
+	/*
+	 * Ignore requests that have no data in them and are not
+	 * the trailing requests in the stream of requests.
+	 */
+	if (!req->nbytes && !rctx->fini)
+		return 0;
+
+	mutex_lock(&actx->mutex);
+	if (!actx->hot) {
+		actx->hot = 1;
+		rctx->init = 1;
+	}
+
+	mutex_lock(&sdcp->mutex[actx->chan]);
+	ret = crypto_enqueue_request(&sdcp->queue[actx->chan], &req->base);
+	mutex_unlock(&sdcp->mutex[actx->chan]);
+
+	wake_up_process(sdcp->thread[actx->chan]);
+	mutex_unlock(&actx->mutex);
+
+	return -EINPROGRESS;
+}
+
+static int dcp_sha_final(struct ahash_request *req)
+{
+	struct dcp_sha_req_ctx *rctx = ahash_request_ctx(req);
+
+	ahash_request_set_crypt(req, NULL, req->result, 0);
+
+	rctx->fini = 1;
+	return dcp_sha_update(req);
+}
+
+static int dcp_sha_finup(struct ahash_request *req)
+{
+	struct dcp_sha_req_ctx *rctx = ahash_request_ctx(req);
+
+	rctx->fini = 1;
+	return dcp_sha_update(req);
+}
+
+static int dcp_sha_digest(struct ahash_request *req)
+{
+	int ret;
+
+	ret = dcp_sha_init(req);
+	if (ret)
+		return ret;
+
+	return dcp_sha_finup(req);
+}
+
+static int dcp_sha_cra_init(struct crypto_tfm *tfm)
+{
+	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
+				 sizeof(struct dcp_sha_req_ctx));
+	return 0;
+}
+
+static void dcp_sha_cra_exit(struct crypto_tfm *tfm)
+{
+}
+
+/* AES 128 ECB and AES 128 CBC */
+static struct crypto_alg dcp_aes_algs[] = {
+	[0] = {
+		.cra_name		= "ecb(aes)",
+		.cra_driver_name	= "ecb-aes-dcp",
+		.cra_priority		= 400,
+		.cra_alignmask		= 15,
+		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
+					  CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_init		= mxs_dcp_aes_fallback_init,
+		.cra_exit		= mxs_dcp_aes_fallback_exit,
+		.cra_blocksize		= AES_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
+		.cra_type		= &crypto_ablkcipher_type,
+		.cra_module		= THIS_MODULE,
+		.cra_u	= {
+			.ablkcipher = {
+				.min_keysize	= AES_MIN_KEY_SIZE,
+				.max_keysize	= AES_MAX_KEY_SIZE,
+				.setkey		= mxs_dcp_aes_setkey,
+				.encrypt	= mxs_dcp_aes_ecb_encrypt,
+				.decrypt	= mxs_dcp_aes_ecb_decrypt
+			}
+		}
+	},
+	[1] = {
+		.cra_name		= "cbc(aes)",
+		.cra_driver_name	= "cbc-aes-dcp",
+		.cra_priority		= 400,
+		.cra_alignmask		= 15,
+		.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER |
+					  CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_init		= mxs_dcp_aes_fallback_init,
+		.cra_exit		= mxs_dcp_aes_fallback_exit,
+		.cra_blocksize		= AES_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct dcp_async_ctx),
+		.cra_type		= &crypto_ablkcipher_type,
+		.cra_module		= THIS_MODULE,
+		.cra_u = {
+			.ablkcipher = {
+				.min_keysize	= AES_MIN_KEY_SIZE,
+				.max_keysize	= AES_MAX_KEY_SIZE,
+				.setkey		= mxs_dcp_aes_setkey,
+				.encrypt	= mxs_dcp_aes_cbc_encrypt,
+				.decrypt	= mxs_dcp_aes_cbc_decrypt,
+				.ivsize		= AES_BLOCK_SIZE,
+			}
+		}
+	},
+};
+
+/* SHA1 */
+static struct ahash_alg dcp_sha1_alg = {
+	.init	= dcp_sha_init,
+	.update	= dcp_sha_update,
+	.final	= dcp_sha_final,
+	.finup	= dcp_sha_finup,
+	.digest	= dcp_sha_digest,
+	.halg	= {
+		.digestsize	= SHA1_DIGEST_SIZE,
+		.base		= {
+			.cra_name		= "sha1",
+			.cra_driver_name	= "sha1-dcp",
+			.cra_priority		= 400,
+			.cra_alignmask		= 63,
+			.cra_flags		= CRYPTO_ALG_ASYNC,
+			.cra_blocksize		= SHA1_BLOCK_SIZE,
+			.cra_ctxsize		= sizeof(struct dcp_async_ctx),
+			.cra_module		= THIS_MODULE,
+			.cra_init		= dcp_sha_cra_init,
+			.cra_exit		= dcp_sha_cra_exit,
+		}
+	}
+};
+
+/* SHA256 */
+static struct ahash_alg dcp_sha256_alg = {
+	.init	= dcp_sha_init,
+	.update	= dcp_sha_update,
+	.final	= dcp_sha_final,
+	.finup	= dcp_sha_finup,
+	.digest	= dcp_sha_digest,
+	.halg	= {
+		.digestsize	= SHA256_DIGEST_SIZE,
+		.base		= {
+			.cra_name		= "sha256",
+			.cra_driver_name	= "sha256-dcp",
+			.cra_priority		= 400,
+			.cra_alignmask		= 63,
+			.cra_flags		= CRYPTO_ALG_ASYNC,
+			.cra_blocksize		= SHA256_BLOCK_SIZE,
+			.cra_ctxsize		= sizeof(struct dcp_async_ctx),
+			.cra_module		= THIS_MODULE,
+			.cra_init		= dcp_sha_cra_init,
+			.cra_exit		= dcp_sha_cra_exit,
+		}
+	}
+};
+
+static irqreturn_t mxs_dcp_irq(int irq, void *context)
+{
+	struct dcp *sdcp = context;
+	uint32_t stat;
+	int i;
+
+	stat = readl(sdcp->base + MXS_DCP_STAT);
+	stat &= MXS_DCP_STAT_IRQ_MASK;
+	if (!stat)
+		return IRQ_NONE;
+
+	/* Clear the interrupts. */
+	writel(stat, sdcp->base + MXS_DCP_STAT_CLR);
+
+	/* Complete the DMA requests that finished. */
+	for (i = 0; i < DCP_MAX_CHANS; i++)
+		if (stat & (1 << i))
+			complete(&sdcp->completion[i]);
+
+	return IRQ_HANDLED;
+}
+
+static int mxs_dcp_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct dcp *sdcp = NULL;
+	int i, ret;
+
+	struct resource *iores;
+	int dcp_vmi_irq, dcp_irq;
+
+	mutex_lock(&global_mutex);
+	if (global_sdcp) {
+		dev_err(dev, "Only one DCP instance allowed!\n");
+		ret = -ENODEV;
+		goto err_mutex;
+	}
+
+	iores = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	dcp_vmi_irq = platform_get_irq(pdev, 0);
+	dcp_irq = platform_get_irq(pdev, 1);
+	if (!iores || dcp_vmi_irq < 0 || dcp_irq < 0) {
+		ret = -EINVAL;
+		goto err_mutex;
+	}
+
+	sdcp = devm_kzalloc(dev, sizeof(*sdcp), GFP_KERNEL);
+	if (!sdcp) {
+		ret = -ENOMEM;
+		goto err_mutex;
+	}
+
+	sdcp->dev = dev;
+	sdcp->base = devm_ioremap_resource(dev, iores);
+	if (IS_ERR(sdcp->base)) {
+		ret = PTR_ERR(sdcp->base);
+		goto err_mutex;
+	}
+
+	ret = devm_request_irq(dev, dcp_vmi_irq, mxs_dcp_irq, 0,
+			       "dcp-vmi-irq", sdcp);
+	if (ret) {
+		dev_err(dev, "Failed to claim DCP VMI IRQ!\n");
+		goto err_mutex;
+	}
+
+	ret = devm_request_irq(dev, dcp_irq, mxs_dcp_irq, 0,
+			       "dcp-irq", sdcp);
+	if (ret) {
+		dev_err(dev, "Failed to claim DCP IRQ!\n");
+		goto err_mutex;
+	}
+
+	/* Allocate coherent helper block. */
+	sdcp->coh = dma_alloc_coherent(dev, sizeof(struct dcp_coherent_block),
+				       &sdcp->coh_phys, GFP_KERNEL);
+	if (!sdcp->coh) {
+		dev_err(dev, "Error allocating coherent block\n");
+		ret = -ENOMEM;
+		goto err_mutex;
+	}
+
+	/* Restart the DCP block. */
+	stmp_reset_block(sdcp->base);
+
+	/* Initialize control register. */
+	writel(MXS_DCP_CTRL_GATHER_RESIDUAL_WRITES |
+	       MXS_DCP_CTRL_ENABLE_CONTEXT_CACHING | 0xf,
+	       sdcp->base + MXS_DCP_CTRL);
+
+	/* Enable all DCP DMA channels. */
+	writel(MXS_DCP_CHANNELCTRL_ENABLE_CHANNEL_MASK,
+	       sdcp->base + MXS_DCP_CHANNELCTRL);
+
+	/*
+	 * We do not enable context switching. Give the context buffer a
+	 * pointer to an illegal address so if context switching is
+	 * inadvertantly enabled, the DCP will return an error instead of
+	 * trashing good memory. The DCP DMA cannot access ROM, so any ROM
+	 * address will do.
+	 */
+	writel(0xffff0000, sdcp->base + MXS_DCP_CONTEXT);
+	for (i = 0; i < DCP_MAX_CHANS; i++)
+		writel(0xffffffff, sdcp->base + MXS_DCP_CH_N_STAT_CLR(i));
+	writel(0xffffffff, sdcp->base + MXS_DCP_STAT_CLR);
+
+	global_sdcp = sdcp;
+
+	platform_set_drvdata(pdev, sdcp);
+
+	for (i = 0; i < DCP_MAX_CHANS; i++) {
+		mutex_init(&sdcp->mutex[i]);
+		init_completion(&sdcp->completion[i]);
+		crypto_init_queue(&sdcp->queue[i], 50);
+	}
+
+	/* Create the SHA and AES handler threads. */
+	sdcp->thread[DCP_CHAN_HASH_SHA] = kthread_create(dcp_chan_thread_sha,
+						      NULL, "mxs_dcp_chan/sha");
+	if (IS_ERR(sdcp->thread[DCP_CHAN_HASH_SHA])) {
+		dev_err(dev, "Error starting SHA thread!\n");
+		ret = PTR_ERR(sdcp->thread[DCP_CHAN_HASH_SHA]);
+		goto err_free_coherent;
+	}
+
+	sdcp->thread[DCP_CHAN_CRYPTO] = kthread_create(dcp_chan_thread_aes,
+						    NULL, "mxs_dcp_chan/aes");
+	if (IS_ERR(sdcp->thread[DCP_CHAN_CRYPTO])) {
+		dev_err(dev, "Error starting SHA thread!\n");
+		ret = PTR_ERR(sdcp->thread[DCP_CHAN_CRYPTO]);
+		goto err_destroy_sha_thread;
+	}
+
+	/* Register the various crypto algorithms. */
+	sdcp->caps = readl(sdcp->base + MXS_DCP_CAPABILITY1);
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_AES128) {
+		ret = crypto_register_algs(dcp_aes_algs,
+					   ARRAY_SIZE(dcp_aes_algs));
+		if (ret) {
+			/* Failed to register algorithm. */
+			dev_err(dev, "Failed to register AES crypto!\n");
+			goto err_destroy_aes_thread;
+		}
+	}
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA1) {
+		ret = crypto_register_ahash(&dcp_sha1_alg);
+		if (ret) {
+			dev_err(dev, "Failed to register %s hash!\n",
+				dcp_sha1_alg.halg.base.cra_name);
+			goto err_unregister_aes;
+		}
+	}
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA256) {
+		ret = crypto_register_ahash(&dcp_sha256_alg);
+		if (ret) {
+			dev_err(dev, "Failed to register %s hash!\n",
+				dcp_sha256_alg.halg.base.cra_name);
+			goto err_unregister_sha1;
+		}
+	}
+
+	return 0;
+
+err_unregister_sha1:
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA1)
+		crypto_unregister_ahash(&dcp_sha1_alg);
+
+err_unregister_aes:
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_AES128)
+		crypto_unregister_algs(dcp_aes_algs, ARRAY_SIZE(dcp_aes_algs));
+
+err_destroy_aes_thread:
+	kthread_stop(sdcp->thread[DCP_CHAN_CRYPTO]);
+
+err_destroy_sha_thread:
+	kthread_stop(sdcp->thread[DCP_CHAN_HASH_SHA]);
+
+err_free_coherent:
+	dma_free_coherent(sdcp->dev,
+			  4 * sizeof(struct dcp_coherent_block),
+			  sdcp->coh, sdcp->coh_phys);
+err_mutex:
+	mutex_unlock(&global_mutex);
+	return ret;
+}
+
+static int mxs_dcp_remove(struct platform_device *pdev)
+{
+	struct dcp *sdcp;
+	int i;
+
+	sdcp = platform_get_drvdata(pdev);
+
+	kthread_stop(sdcp->thread[DCP_CHAN_HASH_SHA]);
+	kthread_stop(sdcp->thread[DCP_CHAN_CRYPTO]);
+
+	platform_set_drvdata(pdev, NULL);
+
+	dma_free_coherent(sdcp->dev, sizeof(struct dcp_coherent_block),
+			  sdcp->coh, sdcp->coh_phys);
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA256)
+		crypto_unregister_ahash(&dcp_sha256_alg);
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_SHA1)
+		crypto_unregister_ahash(&dcp_sha1_alg);
+
+	if (sdcp->caps & MXS_DCP_CAPABILITY1_AES128) {
+		for (i = ARRAY_SIZE(dcp_aes_algs); i >= 0; i--)
+			crypto_unregister_alg(&dcp_aes_algs[i]);
+	}
+
+	mutex_lock(&global_mutex);
+	global_sdcp = NULL;
+	mutex_unlock(&global_mutex);
+
+	return 0;
+}
+
+static const struct of_device_id mxs_dcp_dt_ids[] = {
+	{.compatible = "fsl,mxs-dcp", .data = NULL,},
+	{ /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(of, mxs_dcp_dt_ids);
+
+static struct platform_driver mxs_dcp_driver = {
+	.probe	= mxs_dcp_probe,
+	.remove	= mxs_dcp_remove,
+	.driver	= {
+		.name		= "mxs-dcp",
+		.owner		= THIS_MODULE,
+		.of_match_table	= mxs_dcp_dt_ids,
+	},
+};
+
+module_platform_driver(mxs_dcp_driver);
+
+MODULE_AUTHOR("Marek Vasut <marex@denx.de>");
+MODULE_DESCRIPTION("Freescale MXS DCP Driver");
+MODULE_LICENSE("GPL");