Patchwork [v3,1/4] crypto: AF_ALG AIO - lock context IV

login
register
mail settings
Submitter Stephan Mueller
Date Feb. 9, 2018, 10:03 p.m.
Message ID <4507333.JelVKMRNjF@positron.chronox.de>
Download mbox | patch
Permalink /patch/10210049/
State Changes Requested
Delegated to: Herbert Xu
Headers show

Comments

Stephan Mueller - Feb. 9, 2018, 10:03 p.m.
The kernel crypto API requires the caller to set an IV in the request data
structure. That request data structure shall define one particular cipher
operation. During the cipher operation, the IV is read by the cipher
implementation and eventually the potentially updated IV (e.g. in case of
CBC) is written back to the memory location the request data structure
points to.

AF_ALG allows setting the IV with a sendmsg request, where the IV is stored
in the AF_ALG context that is unique to one particular AF_ALG socket. Note
the analogy: an AF_ALG socket is like a TFM where one recvmsg operation
uses one request with the TFM from the socket.

AF_ALG these days supports AIO operations with multiple IOCBs. I.e. with
one recvmsg call, multiple IOVECs can be specified. Each individual IOCB
(derived from one IOVEC) implies that one request data structure is
created with the data to be processed by the cipher implementation. The
IV that was set with the sendmsg call is registered with the request data
structure before the cipher operation.

In case of an AIO operation, the cipher operation invocation returns
immediately, queuing the request to the hardware. While the AIO request is
processed by the hardware, recvmsg processes the next IOVEC for which
another request is created. Again, the IV buffer from the AF_ALG socket
context is registered with the new request and the cipher operation is
invoked.

You may now see that there is a potential race condition regarding the IV
handling, because there is *no* separate IV buffer for the different
requests. This is nicely demonstrated with libkcapi using the following
command which creates an AIO request with two IOCBs each encrypting one
AES block in CBC mode:

kcapi  -d 2 -x 9  -e -c "cbc(aes)" -k
8d7dd9b0170ce0b5f2f8e1aa768e01e91da8bfc67fd486d081b28254c99eb423 -i
7fbc02ebf5b93322329df9bfccb635af -p 48981da18e4bb9ef7e2e3162d16b1910

When the first AIO request finishes before the 2nd AIO request is
processed, the returned value is:

8b19050f66582cb7f7e4b6c873819b7108afa0eaa7de29bac7d903576b674c32

I.e. two blocks where the IV output from the first request is the IV input
to the 2nd block.

In case the first AIO request is not completed before the 2nd request
commences, the result is two identical AES blocks (i.e. both use the same
IV):

8b19050f66582cb7f7e4b6c873819b718b19050f66582cb7f7e4b6c873819b71

This inconsistent result may even lead to the conclusion that there can be
a memory corruption in the IV buffer if both AIO requests write to the IV
buffer at the same time.

As the AF_ALG interface is used by user space, a mutex provides the
serialization which puts the caller to sleep in case a previous IOCB
processing is not yet finished.

If multiple IOCBs arrive that all are blocked, the mutex' FIFO operation
of processing arriving requests ensures that the blocked IOCBs are
unblocked in the right order.

CC: <stable@vger.kernel.org> #4.14
Fixes: e870456d8e7c8 ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae43 ("crypto: algif_aead - overhaul memory management")
Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 crypto/af_alg.c         | 31 +++++++++++++++++++++++++++++++
 crypto/algif_aead.c     | 20 +++++++++++++-------
 crypto/algif_skcipher.c | 12 +++++++++---
 include/crypto/if_alg.h |  5 +++++
 4 files changed, 58 insertions(+), 10 deletions(-)
Harsh Jain - Feb. 14, 2018, 5:43 a.m.
On 10-02-2018 03:33, Stephan Müller wrote:
> The kernel crypto API requires the caller to set an IV in the request data
> structure. That request data structure shall define one particular cipher
> operation. During the cipher operation, the IV is read by the cipher
> implementation and eventually the potentially updated IV (e.g. in case of
> CBC) is written back to the memory location the request data structure
> points to.
>
> AF_ALG allows setting the IV with a sendmsg request, where the IV is stored
> in the AF_ALG context that is unique to one particular AF_ALG socket. Note
> the analogy: an AF_ALG socket is like a TFM where one recvmsg operation
> uses one request with the TFM from the socket.
>
> AF_ALG these days supports AIO operations with multiple IOCBs. I.e. with
> one recvmsg call, multiple IOVECs can be specified. Each individual IOCB
> (derived from one IOVEC) implies that one request data structure is
> created with the data to be processed by the cipher implementation. The
> IV that was set with the sendmsg call is registered with the request data
> structure before the cipher operation.
>
> In case of an AIO operation, the cipher operation invocation returns
> immediately, queuing the request to the hardware. While the AIO request is
> processed by the hardware, recvmsg processes the next IOVEC for which
> another request is created. Again, the IV buffer from the AF_ALG socket
> context is registered with the new request and the cipher operation is
> invoked.
>
> You may now see that there is a potential race condition regarding the IV
> handling, because there is *no* separate IV buffer for the different
> requests. This is nicely demonstrated with libkcapi using the following
> command which creates an AIO request with two IOCBs each encrypting one
> AES block in CBC mode:
>
> kcapi  -d 2 -x 9  -e -c "cbc(aes)" -k
> 8d7dd9b0170ce0b5f2f8e1aa768e01e91da8bfc67fd486d081b28254c99eb423 -i
> 7fbc02ebf5b93322329df9bfccb635af -p 48981da18e4bb9ef7e2e3162d16b1910
>
> When the first AIO request finishes before the 2nd AIO request is
> processed, the returned value is:
>
> 8b19050f66582cb7f7e4b6c873819b7108afa0eaa7de29bac7d903576b674c32
>
> I.e. two blocks where the IV output from the first request is the IV input
> to the 2nd block.
>
> In case the first AIO request is not completed before the 2nd request
> commences, the result is two identical AES blocks (i.e. both use the same
> IV):
>
> 8b19050f66582cb7f7e4b6c873819b718b19050f66582cb7f7e4b6c873819b71
>
> This inconsistent result may even lead to the conclusion that there can be
> a memory corruption in the IV buffer if both AIO requests write to the IV
> buffer at the same time.
>
> As the AF_ALG interface is used by user space, a mutex provides the
> serialization which puts the caller to sleep in case a previous IOCB
> processing is not yet finished.
>
> If multiple IOCBs arrive that all are blocked, the mutex' FIFO operation
> of processing arriving requests ensures that the blocked IOCBs are
> unblocked in the right order.
Hi Stephen,

Patch set is working fine with chelsio Driver.
Do we really need IV locking mechanism for AEAD algo because AEAD algo's don't support Partial mode operation and Driver are not updating(atleast Chelsio) IV's on AEAD request completions.
>
> CC: <stable@vger.kernel.org> #4.14
> Fixes: e870456d8e7c8 ("crypto: algif_skcipher - overhaul memory management")
> Fixes: d887c52d6ae43 ("crypto: algif_aead - overhaul memory management")
> Signed-off-by: Stephan Mueller <smueller@chronox.de>
> ---
>  crypto/af_alg.c         | 31 +++++++++++++++++++++++++++++++
>  crypto/algif_aead.c     | 20 +++++++++++++-------
>  crypto/algif_skcipher.c | 12 +++++++++---
>  include/crypto/if_alg.h |  5 +++++
>  4 files changed, 58 insertions(+), 10 deletions(-)
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index 5231f421ad00..e7887621aa44 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -1051,6 +1051,8 @@ void af_alg_async_cb(struct crypto_async_request *_req, int err)
>  	struct kiocb *iocb = areq->iocb;
>  	unsigned int resultlen;
>  
> +	af_alg_put_iv(sk);
> +
>  	/* Buffer size written by crypto operation. */
>  	resultlen = areq->outlen;
>  
> @@ -1175,6 +1177,35 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
>  }
>  EXPORT_SYMBOL_GPL(af_alg_get_rsgl);
>  
> +/**
> + * af_alg_get_iv
> + *
> + * @sk [in] AF_ALG socket
> + * @return 0 on success, < 0 on error
> + */
> +int af_alg_get_iv(struct sock *sk)
> +{
> +	struct alg_sock *ask = alg_sk(sk);
> +	struct af_alg_ctx *ctx = ask->private;
> +
> +	return mutex_lock_interruptible(&ctx->ivlock);
> +}
> +EXPORT_SYMBOL_GPL(af_alg_get_iv);
> +
> +/**
> + * af_alg_put_iv - release lock on IV in case CTX IV is used
> + *
> + * @sk [in] AF_ALG socket
> + */
> +void af_alg_put_iv(struct sock *sk)
> +{
> +	struct alg_sock *ask = alg_sk(sk);
> +	struct af_alg_ctx *ctx = ask->private;
> +
> +	mutex_unlock(&ctx->ivlock);
> +}
> +EXPORT_SYMBOL_GPL(af_alg_put_iv);
> +
>  static int __init af_alg_init(void)
>  {
>  	int err = proto_register(&alg_proto, 0);
> diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
> index 4b07edd5a9ff..402de50d4a58 100644
> --- a/crypto/algif_aead.c
> +++ b/crypto/algif_aead.c
> @@ -159,10 +159,14 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  	if (IS_ERR(areq))
>  		return PTR_ERR(areq);
>  
> +	err = af_alg_get_iv(sk);
> +	if (err)
> +		goto free;
> +
>  	/* convert iovecs of output buffers into RX SGL */
>  	err = af_alg_get_rsgl(sk, msg, flags, areq, outlen, &usedpages);
>  	if (err)
> -		goto free;
> +		goto unlock;
>  
>  	/*
>  	 * Ensure output buffer is sufficiently large. If the caller provides
> @@ -176,7 +180,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  
>  		if (used < less) {
>  			err = -EINVAL;
> -			goto free;
> +			goto unlock;
>  		}
>  		used -= less;
>  		outlen -= less;
> @@ -197,7 +201,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  	}
>  	if (processed && !tsgl_src) {
>  		err = -EFAULT;
> -		goto free;
> +		goto unlock;
>  	}
>  
>  	/*
> @@ -230,7 +234,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  		err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
>  					   areq->first_rsgl.sgl.sg, processed);
>  		if (err)
> -			goto free;
> +			goto unlock;
>  		af_alg_pull_tsgl(sk, processed, NULL, 0);
>  	} else {
>  		/*
> @@ -248,7 +252,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  		err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
>  					   areq->first_rsgl.sgl.sg, outlen);
>  		if (err)
> -			goto free;
> +			goto unlock;
>  
>  		/* Create TX SGL for tag and chain it to RX SGL. */
>  		areq->tsgl_entries = af_alg_count_tsgl(sk, processed,
> @@ -260,7 +264,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  					  GFP_KERNEL);
>  		if (!areq->tsgl) {
>  			err = -ENOMEM;
> -			goto free;
> +			goto unlock;
>  		}
>  		sg_init_table(areq->tsgl, areq->tsgl_entries);
>  
> @@ -316,7 +320,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  				&ctx->wait);
>  	}
>  
> -
> +unlock:
> +	af_alg_put_iv(sk);
>  free:
>  	af_alg_free_resources(areq);
>  
> @@ -562,6 +567,7 @@ static int aead_accept_parent_nokey(void *private, struct sock *sk)
>  		return -ENOMEM;
>  	}
>  	memset(ctx->iv, 0, ivlen);
> +	mutex_init(&ctx->ivlock);
>  
>  	INIT_LIST_HEAD(&ctx->tsgl_list);
>  	ctx->len = len;
> diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
> index c88e5e4cd6a6..b65b9a89d6bb 100644
> --- a/crypto/algif_skcipher.c
> +++ b/crypto/algif_skcipher.c
> @@ -77,10 +77,14 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
>  	if (IS_ERR(areq))
>  		return PTR_ERR(areq);
>  
> +	err = af_alg_get_iv(sk);
> +	if (err)
> +		goto free;
> +
>  	/* convert iovecs of output buffers into RX SGL */
>  	err = af_alg_get_rsgl(sk, msg, flags, areq, -1, &len);
>  	if (err)
> -		goto free;
> +		goto unlock;
>  
>  	/* Process only as much RX buffers for which we have TX data */
>  	if (len > ctx->used)
> @@ -104,7 +108,7 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
>  				  GFP_KERNEL);
>  	if (!areq->tsgl) {
>  		err = -ENOMEM;
> -		goto free;
> +		goto unlock;
>  	}
>  	sg_init_table(areq->tsgl, areq->tsgl_entries);
>  	af_alg_pull_tsgl(sk, len, areq->tsgl, 0);
> @@ -146,7 +150,8 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
>  						 &ctx->wait);
>  	}
>  
> -
> +unlock:
> +	af_alg_put_iv(sk);
>  free:
>  	af_alg_free_resources(areq);
>  
> @@ -353,6 +358,7 @@ static int skcipher_accept_parent_nokey(void *private, struct sock *sk)
>  	}
>  
>  	memset(ctx->iv, 0, crypto_skcipher_ivsize(tfm));
> +	mutex_init(&ctx->ivlock);
>  
>  	INIT_LIST_HEAD(&ctx->tsgl_list);
>  	ctx->len = len;
> diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
> index f38227a78eae..c48eff27e5a9 100644
> --- a/include/crypto/if_alg.h
> +++ b/include/crypto/if_alg.h
> @@ -19,6 +19,7 @@
>  #include <linux/scatterlist.h>
>  #include <linux/types.h>
>  #include <linux/atomic.h>
> +#include <linux/mutex.h>
>  #include <net/sock.h>
>  
>  #include <crypto/aead.h>
> @@ -127,6 +128,7 @@ struct af_alg_async_req {
>   *
>   * @tsgl_list:		Link to TX SGL
>   * @iv:			IV for cipher operation
> + * @ivlock:		Mutex to serialize access to the IV
>   * @aead_assoclen:	Length of AAD for AEAD cipher operations
>   * @completion:		Work queue for synchronous operation
>   * @used:		TX bytes sent to kernel. This variable is used to
> @@ -146,6 +148,7 @@ struct af_alg_ctx {
>  	struct list_head tsgl_list;
>  
>  	void *iv;
> +	struct mutex ivlock;
>  	size_t aead_assoclen;
>  
>  	struct crypto_wait wait;
> @@ -252,5 +255,7 @@ struct af_alg_async_req *af_alg_alloc_areq(struct sock *sk,
>  int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
>  		    struct af_alg_async_req *areq, size_t maxsize,
>  		    size_t *outlen);
> +int af_alg_get_iv(struct sock *sk);
> +void af_alg_put_iv(struct sock *sk);
>  
>  #endif	/* _CRYPTO_IF_ALG_H */
Stephan Mueller - Feb. 14, 2018, 12:52 p.m.
Am Mittwoch, 14. Februar 2018, 06:43:53 CET schrieb Harsh Jain:

Hi Harsh,

> 
> Patch set is working fine with chelsio Driver.

Thank you.

> Do we really need IV locking mechanism for AEAD algo because AEAD algo's
> don't support Partial mode operation and Driver are not updating(atleast
> Chelsio) IV's on AEAD request completions.

Yes, I think we would need it. It is technically possible to have multiple 
IOCBs for AEAD ciphers. Even though your implementation may not write the IV 
back, others may do that. At least I do not see a guarantee that the IV is 
*not* written back by a driver.

In case your driver does not write the IV back and thus does not need to 
serialize, the driver can report CRYPTO_ALG_SERIALIZES_IV_ACCESS. In this 
case, the higher level functions would not serialize as the driver serializes 
the requests (or the driver deems it appropriate that no serialization is 
needed as is the case with your driver).

Ciao
Stephan
Harsh Jain - Feb. 15, 2018, 5:30 a.m.
On 14-02-2018 18:22, Stephan Mueller wrote:
> Am Mittwoch, 14. Februar 2018, 06:43:53 CET schrieb Harsh Jain:
>
> Hi Harsh,
>
>> Patch set is working fine with chelsio Driver.
> Thank you.
>
>> Do we really need IV locking mechanism for AEAD algo because AEAD algo's
>> don't support Partial mode operation and Driver are not updating(atleast
>> Chelsio) IV's on AEAD request completions.
> Yes, I think we would need it. It is technically possible to have multiple 
> IOCBs for AEAD ciphers. Even though your implementation may not write the IV 
> back, others may do that. At least I do not see a guarantee that the IV is 
> *not* written back by a driver.
There is no  use of writing IV back in AEAD algo till Framework starts supporting Partial mode.
Even if Driver starts updating IV for AEAD, Multiple IOCB's in both cases will yield wrong results only.

Case 1 : If we have AEAD IV serialization  applied,  Encryption will be wrong if same IV gets used.
Case 2: If we do not have IV serialization for AEAD. Encryption will be fine but user will have multiple Authentication  tag (that too with final block processed).  Its like 2nd Block encryption is based on IV received from 1st block  and Authentication Tag value is based on 2nd block content only.

>
> In case your driver does not write the IV back and thus does not need to 
> serialize, the driver can report CRYPTO_ALG_SERIALIZES_IV_ACCESS. In this 
> case, the higher level functions would not serialize as the driver serializes 
> the requests (or the driver deems it appropriate that no serialization is 
> needed as is the case with your driver).
>
> Ciao
> Stephan
>
>
Stephan Mueller - Feb. 15, 2018, 6:28 a.m.
Am Donnerstag, 15. Februar 2018, 06:30:36 CET schrieb Harsh Jain:

Hi Harsh,

> On 14-02-2018 18:22, Stephan Mueller wrote:
> > Am Mittwoch, 14. Februar 2018, 06:43:53 CET schrieb Harsh Jain:
> > 
> > Hi Harsh,
> > 
> >> Patch set is working fine with chelsio Driver.
> > 
> > Thank you.
> > 
> >> Do we really need IV locking mechanism for AEAD algo because AEAD algo's
> >> don't support Partial mode operation and Driver are not updating(atleast
> >> Chelsio) IV's on AEAD request completions.
> > 
> > Yes, I think we would need it. It is technically possible to have multiple
> > IOCBs for AEAD ciphers. Even though your implementation may not write the
> > IV back, others may do that. At least I do not see a guarantee that the
> > IV is *not* written back by a driver.
> 
> There is no  use of writing IV back in AEAD algo till Framework starts
> supporting Partial mode.

I agree.

> Even if Driver starts updating IV for AEAD,
> Multiple IOCB's in both cases will yield wrong results only.

This would only be the case if the driver would not implicitly or explicitly 
serialize the requests.
> 
> Case 1 : If we have AEAD IV serialization  applied,  Encryption will be
> wrong if same IV gets used.

Agreed.

> Case 2: If we do not have IV serialization for
> AEAD. Encryption will be fine but user will have multiple Authentication 
> tag (that too with final block processed).  Its like 2nd Block encryption
> is based on IV received from 1st block  and Authentication Tag value is
> based on 2nd block content only.

Agreed.

But are we sure that all drivers behave correctly? Before you notified us of 
the issue, I was not even aware of the fact that this serialization may not be 
done in the driver. And we only have seen that issue with AF_ALG where we test 
for multiple concurrent AIO operations.

Besides, when we do not have the locking for AEAD, what would we gain: one 
less lock to take vs. guarantee that the AEAD operation is always properly 
serialized.

Ciao
Stephan
Harsh Jain - Feb. 15, 2018, 7:03 a.m.
On 15-02-2018 11:58, Stephan Mueller wrote:
> Am Donnerstag, 15. Februar 2018, 06:30:36 CET schrieb Harsh Jain:
>
> Hi Harsh,
>
>> On 14-02-2018 18:22, Stephan Mueller wrote:
>>> Am Mittwoch, 14. Februar 2018, 06:43:53 CET schrieb Harsh Jain:
>>>
>>> Hi Harsh,
>>>
>>>> Patch set is working fine with chelsio Driver.
>>> Thank you.
>>>
>>>> Do we really need IV locking mechanism for AEAD algo because AEAD algo's
>>>> don't support Partial mode operation and Driver are not updating(atleast
>>>> Chelsio) IV's on AEAD request completions.
>>> Yes, I think we would need it. It is technically possible to have multiple
>>> IOCBs for AEAD ciphers. Even though your implementation may not write the
>>> IV back, others may do that. At least I do not see a guarantee that the
>>> IV is *not* written back by a driver.
>> There is no  use of writing IV back in AEAD algo till Framework starts
>> supporting Partial mode.
> I agree.
>
>> Even if Driver starts updating IV for AEAD,
>> Multiple IOCB's in both cases will yield wrong results only.
> This would only be the case if the driver would not implicitly or explicitly 
> serialize the requests.
>> Case 1 : If we have AEAD IV serialization  applied,  Encryption will be
>> wrong if same IV gets used.
> Agreed.
>
>> Case 2: If we do not have IV serialization for
>> AEAD. Encryption will be fine but user will have multiple Authentication 
>> tag (that too with final block processed).  Its like 2nd Block encryption
>> is based on IV received from 1st block  and Authentication Tag value is
>> based on 2nd block content only.
> Agreed.
>
> But are we sure that all drivers behave correctly? Before you notified us of 
> the issue, I was not even aware of the fact that this serialization may not be 
> done in the driver. And we only have seen that issue with AF_ALG where we test 
> for multiple concurrent AIO operations.
I am sure other H/W will have similar problem, It's just that we tested it first.

>
> Besides, when we do not have the locking for AEAD, what would we gain: one 
> less lock to take vs. guarantee that the AEAD operation is always properly 
> serialized.
Even after guarantee of serialization, In the end we will get wrong result as mentioned above. which destination side cannot decrypt it.
What I feel is scenario of sending 2 of more IOCB in case of AEAD itself is wrong.  We should not allow this type of requests for AEAD.
Can you think of any use case it is going to solve?
Can receiver decrypt(with 2 IOCB) the same request successfully without knowing  sender has done the operation in 2 request with size "x" each?
>
> Ciao
> Stephan
>
>
Stephan Mueller - Feb. 15, 2018, 7:17 a.m.
Am Donnerstag, 15. Februar 2018, 08:03:20 CET schrieb Harsh Jain:

Hi Harsh,

> Even after guarantee of serialization, In the end we will get wrong result
> as mentioned above. which destination side cannot decrypt it. What I feel
> is scenario of sending 2 of more IOCB in case of AEAD itself is wrong.

Without the inline IV handling, I would concur.

> We
> should not allow this type of requests for AEAD.

"Not allow" as in "technically block"? As a user would only shoot itself when 
he does that not knowing the consequences, I am not in favor of such an 
artificial block.

> Can you think of any use
> case it is going to solve?

Well, I could fathom a use case of this. In FIPS 140-2 (yes, a term not well 
received by some here), NIST insists for GCM that the IV is handled by the 
cryptographic implementation.

So, when using GCM for TLS, for example, the GCM implementation would know a 
bit about how the IV is updated as a session ID. I.e. after the end of one 
AEAD operation, the IV is written back but modified such to comply with the 
rules of some higher level proto. Thus, if such a scenarios is implemented by 
a driver here, multiple IOCBs could be used with such "TLSified" GCM, for 
example.

And such "TLSification" could be as simple as implementing an IV generator 
that can be used with every (AEAD) cipher implementation.

> Can receiver decrypt(with 2 IOCB) the same request successfully without
> knowing  sender has done the operation in 2 request with size "x" each?
> > Ciao
> > Stephan



Ciao
Stephan
Harsh Jain - Feb. 15, 2018, 11:38 a.m.
On 15-02-2018 12:47, Stephan Mueller wrote:
> Am Donnerstag, 15. Februar 2018, 08:03:20 CET schrieb Harsh Jain:
>
> Hi Harsh,
>
>> Even after guarantee of serialization, In the end we will get wrong result
>> as mentioned above. which destination side cannot decrypt it. What I feel
>> is scenario of sending 2 of more IOCB in case of AEAD itself is wrong.
> Without the inline IV handling, I would concur.
Even with Inline IV, We will have 2 Auth Tag. can we authenticate the data with 2 Auth Tags?
>
>> We
>> should not allow this type of requests for AEAD.
> "Not allow" as in "technically block"? As a user would only shoot itself when 
> he does that not knowing the consequences, I am not in favor of such an 
> artificial block.
Agreed, It may not be right thing to do, but if we allowed it, What he will do with Auth( each processed with final Block) tags received in each request.

I personally feels AEAD IV serialization logic is incomplete without partial tag support.

>> Can you think of any use
>> case it is going to solve?
> Well, I could fathom a use case of this. In FIPS 140-2 (yes, a term not well 
> received by some here), NIST insists for GCM that the IV is handled by the 
> cryptographic implementation.
>
> So, when using GCM for TLS, for example, the GCM implementation would know a 
> bit about how the IV is updated as a session ID. I.e. after the end of one 
> AEAD operation, the IV is written back but modified such to comply with the 
> rules of some higher level proto. Thus, if such a scenarios is implemented by 
> a driver here, multiple IOCBs could be used with such "TLSified" GCM, for 
> example.
>
> And such "TLSification" could be as simple as implementing an IV generator 
> that can be used with every (AEAD) cipher implementation.
>
>> Can receiver decrypt(with 2 IOCB) the same request successfully without
>> knowing  sender has done the operation in 2 request with size "x" each?
>>> Ciao
>>> Stephan
>
>
> Ciao
> Stephan
>
>
Stephan Mueller - Feb. 15, 2018, 11:45 a.m.
Am Donnerstag, 15. Februar 2018, 12:38:12 CET schrieb Harsh Jain:

Hi Harsh,

> On 15-02-2018 12:47, Stephan Mueller wrote:
> > Am Donnerstag, 15. Februar 2018, 08:03:20 CET schrieb Harsh Jain:
> > 
> > Hi Harsh,
> > 
> >> Even after guarantee of serialization, In the end we will get wrong
> >> result
> >> as mentioned above. which destination side cannot decrypt it. What I feel
> >> is scenario of sending 2 of more IOCB in case of AEAD itself is wrong.
> > 
> > Without the inline IV handling, I would concur.
> 
> Even with Inline IV, We will have 2 Auth Tag. can we authenticate the data
> with 2 Auth Tags?

The AAD and the tag are both sent to the kernel like in the inline IV case as 
part of the data via sendmsg.

Thus, when you use inline IV support, an entire self-contained AEAD cipher 
operation can be defined with one IOCB. Thus, if you have multiple IOCBs, 
inline IV allow fully parallel execution of different AEAD requests.

See the following kernel comment:

                /*
                 * Encryption operation - The in-place cipher operation is
                 * achieved by the following operation:
                 *
                 * TX SGL: AAD || PT
                 *          |      |
                 *          | copy |
                 *          v      v
                 * RX SGL: AAD || PT || Tag

                /*
                 * Decryption operation - To achieve an in-place cipher
                 * operation, the following  SGL structure is used:
                 *
                 * TX SGL: AAD || CT || Tag
                 *          |      |     ^
                 *          | copy |     | Create SGL link.
                 *          v      v     |
                 * RX SGL: AAD || CT ----+
                 */

Note, the TX SGL is what the caller submitted via sendmsg and the RX SGL is 
the data the caller obtains via recvmsg.

Hence, in inline IV support, the caller sends the following:

encryption: IV || AAD || PT

decryption: IV || AAD || CT || Tag

> >> We
> >> should not allow this type of requests for AEAD.
> > 
> > "Not allow" as in "technically block"? As a user would only shoot itself
> > when he does that not knowing the consequences, I am not in favor of such
> > an artificial block.
> 
> Agreed, It may not be right thing to do, but if we allowed it, What he will
> do with Auth( each processed with final Block) tags received in each
> request.

Each tag is returned as part of the recvmsg data. Thus, AEAD cipher operations 
can commence fully parallel if inline IV handling is used.
> 
> I personally feels AEAD IV serialization logic is incomplete without partial
> tag support.

Could you please elaborate what you mean with "partial tag" support?

Ciao
Stephan
Harsh Jain - Feb. 15, 2018, 12:45 p.m.
On 15-02-2018 17:15, Stephan Mueller wrote:
> Am Donnerstag, 15. Februar 2018, 12:38:12 CET schrieb Harsh Jain:
>
> Hi Harsh,
>
>> On 15-02-2018 12:47, Stephan Mueller wrote:
>>> Am Donnerstag, 15. Februar 2018, 08:03:20 CET schrieb Harsh Jain:
>>>
>>> Hi Harsh,
>>>
>>>> Even after guarantee of serialization, In the end we will get wrong
>>>> result
>>>> as mentioned above. which destination side cannot decrypt it. What I feel
>>>> is scenario of sending 2 of more IOCB in case of AEAD itself is wrong.
>>> Without the inline IV handling, I would concur.
>> Even with Inline IV, We will have 2 Auth Tag. can we authenticate the data
>> with 2 Auth Tags?
> The AAD and the tag are both sent to the kernel like in the inline IV case as 
> part of the data via sendmsg.
>
> Thus, when you use inline IV support, an entire self-contained AEAD cipher 
> operation can be defined with one IOCB. Thus, if you have multiple IOCBs, 
> inline IV allow fully parallel execution of different AEAD requests.
>
> See the following kernel comment:
>
>                 /*
>                  * Encryption operation - The in-place cipher operation is
>                  * achieved by the following operation:
>                  *
>                  * TX SGL: AAD || PT
>                  *          |      |
>                  *          | copy |
>                  *          v      v
>                  * RX SGL: AAD || PT || Tag
>
>                 /*
>                  * Decryption operation - To achieve an in-place cipher
>                  * operation, the following  SGL structure is used:
>                  *
>                  * TX SGL: AAD || CT || Tag
>                  *          |      |     ^
>                  *          | copy |     | Create SGL link.
>                  *          v      v     |
>                  * RX SGL: AAD || CT ----+
>                  */
>
> Note, the TX SGL is what the caller submitted via sendmsg and the RX SGL is 
> the data the caller obtains via recvmsg.
>
> Hence, in inline IV support, the caller sends the following:
>
> encryption: IV || AAD || PT
>
> decryption: IV || AAD || CT || Tag
>
>>>> We
>>>> should not allow this type of requests for AEAD.
>>> "Not allow" as in "technically block"? As a user would only shoot itself
>>> when he does that not knowing the consequences, I am not in favor of such
>>> an artificial block.
>> Agreed, It may not be right thing to do, but if we allowed it, What he will
>> do with Auth( each processed with final Block) tags received in each
>> request.
> Each tag is returned as part of the recvmsg data. Thus, AEAD cipher operations 
> can commence fully parallel if inline IV handling is used.
>> I personally feels AEAD IV serialization logic is incomplete without partial
>> tag support.
> Could you please elaborate what you mean with "partial tag" support?
Here is the catch, Calculation of tag depends on total payload length atleast for shaX, gcm,ccm mode on which I have worked.

If we take an example of shaX. It appends 1 special block at the end of user data which includes total input length in bit. Refer "sha1_base_do_finalize"
Suppose we have 32 byte and we break this in 2 IOCB of 16 bytes each.
Expected result : 32 encrypted bytes + sha auth tag considering length 32 bytes.
What we will  get : 16 bytes + sha auth tag considering length 16 bytes + 16 encrypted bytes + another sha tag considering 16 bytes.



>
> Ciao
> Stephan
>
>
Stephan Mueller - Feb. 15, 2018, 1:04 p.m.
Am Donnerstag, 15. Februar 2018, 13:45:53 CET schrieb Harsh Jain:

Hi Harsh,

> > Could you please elaborate what you mean with "partial tag" support?
> 
> Here is the catch, Calculation of tag depends on total payload length
> atleast for shaX, gcm,ccm mode on which I have worked.
> 
> If we take an example of shaX. It appends 1 special block at the end of user
> data which includes total input length in bit. Refer
> "sha1_base_do_finalize" Suppose we have 32 byte and we break this in 2 IOCB
> of 16 bytes each. Expected result : 32 encrypted bytes + sha auth tag
> considering length 32 bytes. What we will  get : 16 bytes + sha auth tag
> considering length 16 bytes + 16 encrypted bytes + another sha tag
> considering 16 bytes.

As AF_ALG for AEAD is implemented, there is no stream support where the hash 
is calculated at the end. This is even not supported in the current AEAD API 
of the kernel crypto API as far as I see. The only "stream-like" support is 
that you can invoke multiple separate sendmsg calls to provide the input data 
for the AEAD. But once you call recvmsg, the ciphertext and the tag is 
calculated and thus the recvmsg is akin to a hash_final operation.

Complete parallel execution of multiple independent AEAD cipher operations 
with AIO is possible using the inline IV as suggested in the patch set.

The locking of the ctx->iv is only there to support a driver not to whack the 
IV buffer. However, if others also agree that the ctx->ivlock should not be 
taken in the AEAD code path because a potential whacking the IV buffer by a 
driver does not need to be prevented, I would change the patch.

Ciao
Stephan
Jeffrey Walton - Feb. 15, 2018, 1:26 p.m.
On Thu, Feb 15, 2018 at 8:04 AM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 15. Februar 2018, 13:45:53 CET schrieb Harsh Jain:
>
>> > Could you please elaborate what you mean with "partial tag" support?
>>
>> Here is the catch, Calculation of tag depends on total payload length
>> atleast for shaX, gcm,ccm mode on which I have worked.
>>
>> If we take an example of shaX. It appends 1 special block at the end of user
>> data which includes total input length in bit. Refer
>> "sha1_base_do_finalize" Suppose we have 32 byte and we break this in 2 IOCB
>> of 16 bytes each. Expected result : 32 encrypted bytes + sha auth tag
>> considering length 32 bytes. What we will  get : 16 bytes + sha auth tag
>> considering length 16 bytes + 16 encrypted bytes + another sha tag
>> considering 16 bytes.
>
> As AF_ALG for AEAD is implemented, there is no stream support where the hash
> is calculated at the end. This is even not supported in the current AEAD API
> of the kernel crypto API as far as I see. The only "stream-like" support is
> that you can invoke multiple separate sendmsg calls to provide the input data
> for the AEAD. But once you call recvmsg, the ciphertext and the tag is
> calculated and thus the recvmsg is akin to a hash_final operation.

If you follow Bernstein's protocol design philosophy, then messages
should be no larger than about 4K in size. From
https://nacl.cr.yp.to/valid.html:

    This is one of several reasons [1] that callers should (1) split
    all data into packets sent through the network; (2) put a
    small global limit on packet length; and (3) separately
    encrypt and authenticate each packet.

With the [1] link being
https://groups.google.com/forum/#!original/boring-crypto/BpUmNMXKMYQ/EEwAIeQdjacJ

Jeff
Stephan Mueller - Feb. 15, 2018, 6:09 p.m.
Am Donnerstag, 15. Februar 2018, 14:04:53 CET schrieb Stephan Mueller:

Hi Stephan,

> Am Donnerstag, 15. Februar 2018, 13:45:53 CET schrieb Harsh Jain:
> 
> Hi Harsh,
> 
> > > Could you please elaborate what you mean with "partial tag" support?
> > 
> > Here is the catch, Calculation of tag depends on total payload length
> > atleast for shaX, gcm,ccm mode on which I have worked.
> > 
> > If we take an example of shaX. It appends 1 special block at the end of
> > user data which includes total input length in bit. Refer
> > "sha1_base_do_finalize" Suppose we have 32 byte and we break this in 2
> > IOCB
> > of 16 bytes each. Expected result : 32 encrypted bytes + sha auth tag
> > considering length 32 bytes. What we will  get : 16 bytes + sha auth tag
> > considering length 16 bytes + 16 encrypted bytes + another sha tag
> > considering 16 bytes.
> 
> As AF_ALG for AEAD is implemented, there is no stream support where the hash
> is calculated at the end. This is even not supported in the current AEAD
> API of the kernel crypto API as far as I see. The only "stream-like"
> support is that you can invoke multiple separate sendmsg calls to provide
> the input data for the AEAD. But once you call recvmsg, the ciphertext and
> the tag is calculated and thus the recvmsg is akin to a hash_final
> operation.
> 
> Complete parallel execution of multiple independent AEAD cipher operations
> with AIO is possible using the inline IV as suggested in the patch set.
> 
> The locking of the ctx->iv is only there to support a driver not to whack
> the IV buffer. However, if others also agree that the ctx->ivlock should
> not be taken in the AEAD code path because a potential whacking the IV
> buffer by a driver does not need to be prevented, I would change the patch.

I was about to change the implementation. However, I already found an 
implementation where the driver changes the IV of the request buffer: 
crypto_ccm_init_crypt modifies the original buffer of the IV. I expect that 
there are more.

Therefore, I would think that keeping the patch set as it is right now is the 
right thing to do. Thus, we should cover the AEAD ciphers with the lock and 
thus serialize the AEAD request. This guarantees that the IV buffer is at 
least not mangled while the cipher operation is ongoing.

Again, if a particular AEAD driver guarantees the IV is not mangled, it can 
very well set CRYPTO_ALG_SERIALIZES_IV_ACCESS in its flag set. This would 
imply the lock is not set.

Ciao
Stephan

Patch

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5231f421ad00..e7887621aa44 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1051,6 +1051,8 @@  void af_alg_async_cb(struct crypto_async_request *_req, int err)
 	struct kiocb *iocb = areq->iocb;
 	unsigned int resultlen;
 
+	af_alg_put_iv(sk);
+
 	/* Buffer size written by crypto operation. */
 	resultlen = areq->outlen;
 
@@ -1175,6 +1177,35 @@  int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
 }
 EXPORT_SYMBOL_GPL(af_alg_get_rsgl);
 
+/**
+ * af_alg_get_iv
+ *
+ * @sk [in] AF_ALG socket
+ * @return 0 on success, < 0 on error
+ */
+int af_alg_get_iv(struct sock *sk)
+{
+	struct alg_sock *ask = alg_sk(sk);
+	struct af_alg_ctx *ctx = ask->private;
+
+	return mutex_lock_interruptible(&ctx->ivlock);
+}
+EXPORT_SYMBOL_GPL(af_alg_get_iv);
+
+/**
+ * af_alg_put_iv - release lock on IV in case CTX IV is used
+ *
+ * @sk [in] AF_ALG socket
+ */
+void af_alg_put_iv(struct sock *sk)
+{
+	struct alg_sock *ask = alg_sk(sk);
+	struct af_alg_ctx *ctx = ask->private;
+
+	mutex_unlock(&ctx->ivlock);
+}
+EXPORT_SYMBOL_GPL(af_alg_put_iv);
+
 static int __init af_alg_init(void)
 {
 	int err = proto_register(&alg_proto, 0);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 4b07edd5a9ff..402de50d4a58 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -159,10 +159,14 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	if (IS_ERR(areq))
 		return PTR_ERR(areq);
 
+	err = af_alg_get_iv(sk);
+	if (err)
+		goto free;
+
 	/* convert iovecs of output buffers into RX SGL */
 	err = af_alg_get_rsgl(sk, msg, flags, areq, outlen, &usedpages);
 	if (err)
-		goto free;
+		goto unlock;
 
 	/*
 	 * Ensure output buffer is sufficiently large. If the caller provides
@@ -176,7 +180,7 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 
 		if (used < less) {
 			err = -EINVAL;
-			goto free;
+			goto unlock;
 		}
 		used -= less;
 		outlen -= less;
@@ -197,7 +201,7 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	}
 	if (processed && !tsgl_src) {
 		err = -EFAULT;
-		goto free;
+		goto unlock;
 	}
 
 	/*
@@ -230,7 +234,7 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 		err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
 					   areq->first_rsgl.sgl.sg, processed);
 		if (err)
-			goto free;
+			goto unlock;
 		af_alg_pull_tsgl(sk, processed, NULL, 0);
 	} else {
 		/*
@@ -248,7 +252,7 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 		err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
 					   areq->first_rsgl.sgl.sg, outlen);
 		if (err)
-			goto free;
+			goto unlock;
 
 		/* Create TX SGL for tag and chain it to RX SGL. */
 		areq->tsgl_entries = af_alg_count_tsgl(sk, processed,
@@ -260,7 +264,7 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 					  GFP_KERNEL);
 		if (!areq->tsgl) {
 			err = -ENOMEM;
-			goto free;
+			goto unlock;
 		}
 		sg_init_table(areq->tsgl, areq->tsgl_entries);
 
@@ -316,7 +320,8 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 				&ctx->wait);
 	}
 
-
+unlock:
+	af_alg_put_iv(sk);
 free:
 	af_alg_free_resources(areq);
 
@@ -562,6 +567,7 @@  static int aead_accept_parent_nokey(void *private, struct sock *sk)
 		return -ENOMEM;
 	}
 	memset(ctx->iv, 0, ivlen);
+	mutex_init(&ctx->ivlock);
 
 	INIT_LIST_HEAD(&ctx->tsgl_list);
 	ctx->len = len;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index c88e5e4cd6a6..b65b9a89d6bb 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -77,10 +77,14 @@  static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
 	if (IS_ERR(areq))
 		return PTR_ERR(areq);
 
+	err = af_alg_get_iv(sk);
+	if (err)
+		goto free;
+
 	/* convert iovecs of output buffers into RX SGL */
 	err = af_alg_get_rsgl(sk, msg, flags, areq, -1, &len);
 	if (err)
-		goto free;
+		goto unlock;
 
 	/* Process only as much RX buffers for which we have TX data */
 	if (len > ctx->used)
@@ -104,7 +108,7 @@  static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
 				  GFP_KERNEL);
 	if (!areq->tsgl) {
 		err = -ENOMEM;
-		goto free;
+		goto unlock;
 	}
 	sg_init_table(areq->tsgl, areq->tsgl_entries);
 	af_alg_pull_tsgl(sk, len, areq->tsgl, 0);
@@ -146,7 +150,8 @@  static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
 						 &ctx->wait);
 	}
 
-
+unlock:
+	af_alg_put_iv(sk);
 free:
 	af_alg_free_resources(areq);
 
@@ -353,6 +358,7 @@  static int skcipher_accept_parent_nokey(void *private, struct sock *sk)
 	}
 
 	memset(ctx->iv, 0, crypto_skcipher_ivsize(tfm));
+	mutex_init(&ctx->ivlock);
 
 	INIT_LIST_HEAD(&ctx->tsgl_list);
 	ctx->len = len;
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index f38227a78eae..c48eff27e5a9 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -19,6 +19,7 @@ 
 #include <linux/scatterlist.h>
 #include <linux/types.h>
 #include <linux/atomic.h>
+#include <linux/mutex.h>
 #include <net/sock.h>
 
 #include <crypto/aead.h>
@@ -127,6 +128,7 @@  struct af_alg_async_req {
  *
  * @tsgl_list:		Link to TX SGL
  * @iv:			IV for cipher operation
+ * @ivlock:		Mutex to serialize access to the IV
  * @aead_assoclen:	Length of AAD for AEAD cipher operations
  * @completion:		Work queue for synchronous operation
  * @used:		TX bytes sent to kernel. This variable is used to
@@ -146,6 +148,7 @@  struct af_alg_ctx {
 	struct list_head tsgl_list;
 
 	void *iv;
+	struct mutex ivlock;
 	size_t aead_assoclen;
 
 	struct crypto_wait wait;
@@ -252,5 +255,7 @@  struct af_alg_async_req *af_alg_alloc_areq(struct sock *sk,
 int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
 		    struct af_alg_async_req *areq, size_t maxsize,
 		    size_t *outlen);
+int af_alg_get_iv(struct sock *sk);
+void af_alg_put_iv(struct sock *sk);
 
 #endif	/* _CRYPTO_IF_ALG_H */