diff mbox

[PATCHv3,03/11] crypto: omap-sham: implement context export/import APIs

Message ID 1470306526-27219-4-git-send-email-t-kristo@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tero Kristo Aug. 4, 2016, 10:28 a.m. UTC
Context export/import are now required for ahash algorithms due to
required support in algif_hash. Implement these for OMAP SHA driver,
saving and restoring the internal state of the driver.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
---
 drivers/crypto/omap-sham.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

Comments

Herbert Xu Aug. 9, 2016, 10:06 a.m. UTC | #1
On Thu, Aug 04, 2016 at 01:28:38PM +0300, Tero Kristo wrote:
> Context export/import are now required for ahash algorithms due to
> required support in algif_hash. Implement these for OMAP SHA driver,
> saving and restoring the internal state of the driver.
> 
> Signed-off-by: Tero Kristo <t-kristo@ti.com>
> ---
>  drivers/crypto/omap-sham.c | 31 +++++++++++++++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
> index 6e53944..aa71e61 100644
> --- a/drivers/crypto/omap-sham.c
> +++ b/drivers/crypto/omap-sham.c
> @@ -1379,6 +1379,27 @@ exit_unlock:
>  	return ret;
>  }
>  
> +static int omap_sham_export(struct ahash_request *req, void *out)
> +{
> +	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
> +
> +	while (omap_sham_flush(req) == -EINPROGRESS)
> +		msleep(10);

Do we really need this? You must not call export until the previous
operation has completed.

Cheers,
Tero Kristo Aug. 29, 2016, 2:11 p.m. UTC | #2
On 09/08/16 13:06, Herbert Xu wrote:
> On Thu, Aug 04, 2016 at 01:28:38PM +0300, Tero Kristo wrote:
>> Context export/import are now required for ahash algorithms due to
>> required support in algif_hash. Implement these for OMAP SHA driver,
>> saving and restoring the internal state of the driver.
>>
>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
>> ---
>>  drivers/crypto/omap-sham.c | 31 +++++++++++++++++++++++++++++--
>>  1 file changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>> index 6e53944..aa71e61 100644
>> --- a/drivers/crypto/omap-sham.c
>> +++ b/drivers/crypto/omap-sham.c
>> @@ -1379,6 +1379,27 @@ exit_unlock:
>>  	return ret;
>>  }
>>
>> +static int omap_sham_export(struct ahash_request *req, void *out)
>> +{
>> +	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
>> +
>> +	while (omap_sham_flush(req) == -EINPROGRESS)
>> +		msleep(10);
>
> Do we really need this? You must not call export until the previous
> operation has completed.
>
> Cheers,
>

Sorry about a late reply, I was out on vacation.

For OMAP SHAM, this is actually needed, because the driver still has a 
very large internal buffer for performance reasons, and the whole buffer 
can't be exported. The flush functionality pushes out sufficient amount 
of data to the hardware, so that the rest of the buffer can be exported 
to the available space.

This is pretty much related to the discussion we had previously here:

https://patchwork.kernel.org/patch/9192881/

Basically I decided to keep the driver buffer the same size as 
previously, but flush out any extra data.

-Tero
Herbert Xu Sept. 1, 2016, 3:33 a.m. UTC | #3
On Mon, Aug 29, 2016 at 05:11:35PM +0300, Tero Kristo wrote:
>
> >>+static int omap_sham_export(struct ahash_request *req, void *out)
> >>+{
> >>+	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
> >>+
> >>+	while (omap_sham_flush(req) == -EINPROGRESS)
> >>+		msleep(10);
> >
> >Do we really need this? You must not call export until the previous
> >operation has completed.
> >
> >Cheers,
> >
> 
> Sorry about a late reply, I was out on vacation.
> 
> For OMAP SHAM, this is actually needed, because the driver still has
> a very large internal buffer for performance reasons, and the whole
> buffer can't be exported. The flush functionality pushes out
> sufficient amount of data to the hardware, so that the rest of the
> buffer can be exported to the available space.

It doesn't matter whether you have a buffer or not.  The point
is that the completion function should not be called until the
operation is actually complete.  This is the whole point of the
async interface.

As the user must not call export until the completion function
has been called, there should be no need to wait in the export
function.

Cheers,
Tero Kristo Sept. 1, 2016, 6:12 a.m. UTC | #4
On 01/09/16 06:33, Herbert Xu wrote:
> On Mon, Aug 29, 2016 at 05:11:35PM +0300, Tero Kristo wrote:
>>
>>>> +static int omap_sham_export(struct ahash_request *req, void *out)
>>>> +{
>>>> +	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
>>>> +
>>>> +	while (omap_sham_flush(req) == -EINPROGRESS)
>>>> +		msleep(10);
>>>
>>> Do we really need this? You must not call export until the previous
>>> operation has completed.
>>>
>>> Cheers,
>>>
>>
>> Sorry about a late reply, I was out on vacation.
>>
>> For OMAP SHAM, this is actually needed, because the driver still has
>> a very large internal buffer for performance reasons, and the whole
>> buffer can't be exported. The flush functionality pushes out
>> sufficient amount of data to the hardware, so that the rest of the
>> buffer can be exported to the available space.
>
> It doesn't matter whether you have a buffer or not.  The point
> is that the completion function should not be called until the
> operation is actually complete.  This is the whole point of the
> async interface.
>
> As the user must not call export until the completion function
> has been called, there should be no need to wait in the export
> function.

Well, but the driver doesn't flush its buffers automatically, it caches 
data until it has sufficient amount available. So, assuming you want to 
do this:

sham_init
   sham_update 256 bytes
   sham_update 256 bytes
   wait until two above updates are complete
   sham_export

... the execution hangs at the wait phase as the driver is still waiting 
for more data to cache, and will never complete the two update requests. 
Currently, the driver is written in such way that it waits until it has 
enough data cached before starting to push it out to hardware, or waits 
until sham_final to be called. Pushing out small pieces of data causes 
severe performance degradation on the driver, as setting up the DMA 
operation itself is rather costly.

Either way, flush for the buffers is needed, I wonder if automatic flush 
should be added also based on some timer.

-Tero
Herbert Xu Sept. 1, 2016, 6:16 a.m. UTC | #5
On Thu, Sep 01, 2016 at 09:12:59AM +0300, Tero Kristo wrote:
> 
> Well, but the driver doesn't flush its buffers automatically, it
> caches data until it has sufficient amount available. So, assuming
> you want to do this:
> 
> sham_init
>   sham_update 256 bytes
>   sham_update 256 bytes
>   wait until two above updates are complete
>   sham_export
> 
> ... the execution hangs at the wait phase as the driver is still

Well that's a bug in the driver.  While it's not illegal to wait
for more data, it's usually unnecessary.  Because we instead try
to get our users to generate as big a request as possible, e.g.,
one packet for IPsec.

If you really have to do the hold thing, then you must install a
timer like sha1-mb does on x86 to do the flush.

In any case, the completion function must not be called until
you're actually complete.

Cheers,
Tero Kristo Sept. 1, 2016, 6:56 a.m. UTC | #6
On 01/09/16 09:16, Herbert Xu wrote:
> On Thu, Sep 01, 2016 at 09:12:59AM +0300, Tero Kristo wrote:
>>
>> Well, but the driver doesn't flush its buffers automatically, it
>> caches data until it has sufficient amount available. So, assuming
>> you want to do this:
>>
>> sham_init
>>   sham_update 256 bytes
>>   sham_update 256 bytes
>>   wait until two above updates are complete
>>   sham_export
>>
>> ... the execution hangs at the wait phase as the driver is still
>
> Well that's a bug in the driver.  While it's not illegal to wait
> for more data, it's usually unnecessary.  Because we instead try
> to get our users to generate as big a request as possible, e.g.,
> one packet for IPsec.
>
> If you really have to do the hold thing, then you must install a
> timer like sha1-mb does on x86 to do the flush.
>
> In any case, the completion function must not be called until
> you're actually complete.

Hmm, looking at the driver, sham_update returns 0 immediately if it just 
caches data. In a sense, the update is not completed at this point. Are 
you saying this is illegal and can't be done?

 From my understanding, valid results are expected from the driver only 
after ->final is called.

-Tero
Herbert Xu Sept. 1, 2016, 7:19 a.m. UTC | #7
On Thu, Sep 01, 2016 at 09:56:06AM +0300, Tero Kristo wrote:
>
> Hmm, looking at the driver, sham_update returns 0 immediately if it
> just caches data. In a sense, the update is not completed at this
> point. Are you saying this is illegal and can't be done?

Once you call the completion function (and returning zero from
the update itself is equivalent to calling the completion function)
the hardware must not touch the request anymore.

> From my understanding, valid results are expected from the driver
> only after ->final is called.

That's because you never implemented export/import :)

Cheers,
Tero Kristo Sept. 1, 2016, 7:28 a.m. UTC | #8
On 01/09/16 10:19, Herbert Xu wrote:
> On Thu, Sep 01, 2016 at 09:56:06AM +0300, Tero Kristo wrote:
>>
>> Hmm, looking at the driver, sham_update returns 0 immediately if it
>> just caches data. In a sense, the update is not completed at this
>> point. Are you saying this is illegal and can't be done?
>
> Once you call the completion function (and returning zero from
> the update itself is equivalent to calling the completion function)
> the hardware must not touch the request anymore.

Yeah, it is not touching it anymore. All the data has been copied to the 
local buffer at this point, and the driver isn't retaining any kind of a 
handle to the request itself anymore. It is just that it has not been 
processed by the hardware yet.

>
>> From my understanding, valid results are expected from the driver
>> only after ->final is called.
>
> That's because you never implemented export/import :)

Yeah, the flush should do the trick now. Kind of a chicken-egg problem 
here. :P How do you see the situation with the above explanation?


-Tero
Herbert Xu Sept. 1, 2016, 7:31 a.m. UTC | #9
On Thu, Sep 01, 2016 at 10:28:47AM +0300, Tero Kristo wrote:
>
> Yeah, the flush should do the trick now. Kind of a chicken-egg
> problem here. :P How do you see the situation with the above
> explanation?

The export function is not allowed to sleep so you must not wait
for the hardware to complete in it.

If you need to wait, then you must use the completion mechanism.

Cheers,
Tero Kristo Sept. 1, 2016, 7:46 a.m. UTC | #10
On 01/09/16 10:31, Herbert Xu wrote:
> On Thu, Sep 01, 2016 at 10:28:47AM +0300, Tero Kristo wrote:
>>
>> Yeah, the flush should do the trick now. Kind of a chicken-egg
>> problem here. :P How do you see the situation with the above
>> explanation?
>
> The export function is not allowed to sleep so you must not wait
> for the hardware to complete in it.
>
> If you need to wait, then you must use the completion mechanism.

I don't think export is allowed to return -EINPROGRESS either? At least 
currently the kernel pieces using this functionality won't work if I do 
that.

If thats the case, I need to think of something else to handle this...

-Tero
Tero Kristo Sept. 5, 2016, 12:06 p.m. UTC | #11
On 01/09/16 10:46, Tero Kristo wrote:
> On 01/09/16 10:31, Herbert Xu wrote:
>> On Thu, Sep 01, 2016 at 10:28:47AM +0300, Tero Kristo wrote:
>>>
>>> Yeah, the flush should do the trick now. Kind of a chicken-egg
>>> problem here. :P How do you see the situation with the above
>>> explanation?
>>
>> The export function is not allowed to sleep so you must not wait
>> for the hardware to complete in it.
>>
>> If you need to wait, then you must use the completion mechanism.
>
> I don't think export is allowed to return -EINPROGRESS either? At least
> currently the kernel pieces using this functionality won't work if I do
> that.
>
> If thats the case, I need to think of something else to handle this...
>
> -Tero
>

Hi Herbert,

Additional request, would it be possible for you to check the rest of 
the series and just ignore patches #2 and #3 for now, the rest don't 
have any dependencies against these and can be applied cleanly without.

I would like to see these move forward while I figure out how to handle 
the buffer / export+import...

-Tero
Herbert Xu Sept. 7, 2016, 1:29 p.m. UTC | #12
On Mon, Sep 05, 2016 at 03:06:05PM +0300, Tero Kristo wrote:
>
> Additional request, would it be possible for you to check the rest
> of the series and just ignore patches #2 and #3 for now, the rest
> don't have any dependencies against these and can be applied cleanly
> without.
> 
> I would like to see these move forward while I figure out how to
> handle the buffer / export+import...

Sure I'll review them again.

Cheers,
diff mbox

Patch

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 6e53944..aa71e61 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -1379,6 +1379,27 @@  exit_unlock:
 	return ret;
 }
 
+static int omap_sham_export(struct ahash_request *req, void *out)
+{
+	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	while (omap_sham_flush(req) == -EINPROGRESS)
+		msleep(10);
+
+	memcpy(out, rctx, sizeof(*rctx) + SHA512_BLOCK_SIZE);
+
+	return 0;
+}
+
+static int omap_sham_import(struct ahash_request *req, const void *in)
+{
+	struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
+
+	memcpy(rctx, in, sizeof(*rctx) + SHA512_BLOCK_SIZE);
+
+	return 0;
+}
+
 static struct ahash_alg algs_sha1_md5[] = {
 {
 	.init		= omap_sham_init,
@@ -2037,8 +2058,14 @@  static int omap_sham_probe(struct platform_device *pdev)
 
 	for (i = 0; i < dd->pdata->algs_info_size; i++) {
 		for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
-			err = crypto_register_ahash(
-					&dd->pdata->algs_info[i].algs_list[j]);
+			struct ahash_alg *alg;
+
+			alg = &dd->pdata->algs_info[i].algs_list[j];
+			alg->export = omap_sham_export;
+			alg->import = omap_sham_import;
+			alg->halg.statesize = sizeof(struct omap_sham_reqctx) +
+				SHA512_BLOCK_SIZE;
+			err = crypto_register_ahash(alg);
 			if (err)
 				goto err_algs;