diff mbox

crypto: AF_ALG - remove locking in async callback

Message ID 5260911.ZuXsrgFD2R@positron.chronox.de (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Headers show

Commit Message

Stephan Mueller Oct. 29, 2017, 8:39 p.m. UTC
Am Mittwoch, 25. Oktober 2017, 17:26:31 CET schrieb Romain Izard:

Hi Romain,

the patch below should cover the issue you see. Would you mind testing it?

Thanks
Stephan

---8<---

The code paths protected by the socket-lock do not use or modify the
socket in a non-atomic fashion. The actions pertaining the socket do not
even need to be handled as an atomic operation. Thus, the socket-lock
can be safely ignored.

This fixes a bug regarding scheduling in atomic as the callback function
may be invoked in interrupt context.

Fixes: 2d97591ef43d0 ("crypto: af_alg - consolidation of duplicate code")
Reported-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 crypto/af_alg.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Romain Izard Oct. 30, 2017, 5:15 p.m. UTC | #1
2017-10-29 21:39 GMT+01:00 Stephan Müller <smueller@chronox.de>:
> Am Mittwoch, 25. Oktober 2017, 17:26:31 CET schrieb Romain Izard:
>
> Hi Romain,
>
> the patch below should cover the issue you see. Would you mind testing it?
>
> Thanks
> Stephan
>
> ---8<---
>
> The code paths protected by the socket-lock do not use or modify the
> socket in a non-atomic fashion. The actions pertaining the socket do not
> even need to be handled as an atomic operation. Thus, the socket-lock
> can be safely ignored.
>
> This fixes a bug regarding scheduling in atomic as the callback function
> may be invoked in interrupt context.
>
> Fixes: 2d97591ef43d0 ("crypto: af_alg - consolidation of duplicate code")
> Reported-by: Romain Izard <romain.izard.pro@gmail.com>
> Signed-off-by: Stephan Mueller <smueller@chronox.de>

Tested-by: Romain Izard <romain.izard.pro@gmail.com>

The issue observed with atmel-aes is not reproduced anymore.

> ---
>  crypto/af_alg.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index 337cf382718e..a41f08642eee 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -1063,8 +1063,6 @@ void af_alg_async_cb(struct crypto_async_request *_req, int err)
>         struct kiocb *iocb = areq->iocb;
>         unsigned int resultlen;
>
> -       lock_sock(sk);
> -
>         /* Buffer size written by crypto operation. */
>         resultlen = areq->outlen;
>
> @@ -1073,8 +1071,6 @@ void af_alg_async_cb(struct crypto_async_request *_req, int err)
>         __sock_put(sk);
>
>         iocb->ki_complete(iocb, err ? err : resultlen, 0);
> -
> -       release_sock(sk);
>  }
>  EXPORT_SYMBOL_GPL(af_alg_async_cb);
>
> --
> 2.13.6
>
>
Herbert Xu Nov. 3, 2017, 1:20 p.m. UTC | #2
On Sun, Oct 29, 2017 at 09:39:30PM +0100, Stephan Müller wrote:
> Am Mittwoch, 25. Oktober 2017, 17:26:31 CET schrieb Romain Izard:
> 
> Hi Romain,
> 
> the patch below should cover the issue you see. Would you mind testing it?
> 
> Thanks
> Stephan
> 
> ---8<---
> 
> The code paths protected by the socket-lock do not use or modify the
> socket in a non-atomic fashion. The actions pertaining the socket do not
> even need to be handled as an atomic operation. Thus, the socket-lock
> can be safely ignored.

Are you sure about that? In particular is the callback function still
sane without the socket lock if a concurrent recvmsg/sendmsg call is
made?

Your fixes header is wrong too as the locks weren't introduced in that
commit, they just got moved around.

Cheers,
Stephan Mueller Nov. 3, 2017, 1:34 p.m. UTC | #3
Am Freitag, 3. November 2017, 14:20:16 CET schrieb Herbert Xu:

Hi Herbert,

> On Sun, Oct 29, 2017 at 09:39:30PM +0100, Stephan Müller wrote:
> > Am Mittwoch, 25. Oktober 2017, 17:26:31 CET schrieb Romain Izard:
> > 
> > Hi Romain,
> > 
> > the patch below should cover the issue you see. Would you mind testing it?
> > 
> > Thanks
> > Stephan
> > 
> > ---8<---
> > 
> > The code paths protected by the socket-lock do not use or modify the
> > socket in a non-atomic fashion. The actions pertaining the socket do not
> > even need to be handled as an atomic operation. Thus, the socket-lock
> > can be safely ignored.
> 
> Are you sure about that? In particular is the callback function still
> sane without the socket lock if a concurrent recvmsg/sendmsg call is
> made?

resultlen receives its data from the async_request -> no socket

af_alg_free_areq_sgls(areq) does not require a socket, but it uses the socket 
to find the data structures -> I do not see that the socket is operated on 
though. The socket will always be alive as the sk_refcnt is not yet 
decremented by __sock_put.

sock_kfree_s uses an atomic operation

__sock_put uses an atomic operation

iocb->ki_complete does not use the socket

Where would you think that the lock is needed?
> 
> Your fixes header is wrong too as the locks weren't introduced in that
> commit, they just got moved around.

Neither the skcipher_async_cb nor aead_async_cb up to and including 4.13 
contain any lock.

Ciao
Stephan
Stephan Mueller Nov. 6, 2017, 4:06 p.m. UTC | #4
Am Freitag, 3. November 2017, 14:20:16 CET schrieb Herbert Xu:

Hi Herbert,

> > 
> > The code paths protected by the socket-lock do not use or modify the
> > socket in a non-atomic fashion. The actions pertaining the socket do not
> > even need to be handled as an atomic operation. Thus, the socket-lock
> > can be safely ignored.
> 
> Are you sure about that? In particular is the callback function still
> sane without the socket lock if a concurrent recvmsg/sendmsg call is
> made?

I reviewed the code again and I cannot find a reason for keeping the lock. All 
we need to ensure is that the socket exists. This is ensured with the refcount 
of the socket released by __sock_put().

Would you mind helping me where you think the lock is needed.
> 
> Your fixes header is wrong too as the locks weren't introduced in that
> commit, they just got moved around.

Correct, the initial introduction was in 
e870456d8e7c8d57c059ea479b5aadbb55ff4c3a (algif_skcipher) and 
d887c52d6ae43aeebd249b5f2f1333e60236aa60 (algif_aead)

Thanks a lot.

Ciao
Stephan
Herbert Xu Nov. 7, 2017, 5:22 a.m. UTC | #5
On Mon, Nov 06, 2017 at 05:06:09PM +0100, Stephan Mueller wrote:
> Am Freitag, 3. November 2017, 14:20:16 CET schrieb Herbert Xu:
>
> > Are you sure about that? In particular is the callback function still
> > sane without the socket lock if a concurrent recvmsg/sendmsg call is
> > made?
> 
> I reviewed the code again and I cannot find a reason for keeping the lock. All 
> we need to ensure is that the socket exists. This is ensured with the refcount 
> of the socket released by __sock_put().

OK, I can't see why we need a lock there either.  However, the call
to __sock_put looks suspicious.  Why isn't this using sock_put?

Also the sock_hold on the caller side looks buggy.  Surely it needs
to be made before we even call the encrypt/decrypt functions rather
than after it returns EINPROGRESS at which point it may well be too
late?

Cheers,
Stephan Mueller Nov. 7, 2017, 6:19 a.m. UTC | #6
Am Dienstag, 7. November 2017, 06:22:35 CET schrieb Herbert Xu:

Hi Herbert,

> On Mon, Nov 06, 2017 at 05:06:09PM +0100, Stephan Mueller wrote:
> > Am Freitag, 3. November 2017, 14:20:16 CET schrieb Herbert Xu:
> > > Are you sure about that? In particular is the callback function still
> > > sane without the socket lock if a concurrent recvmsg/sendmsg call is
> > > made?
> > 
> > I reviewed the code again and I cannot find a reason for keeping the lock.
> > All we need to ensure is that the socket exists. This is ensured with the
> > refcount of the socket released by __sock_put().
> 
> OK, I can't see why we need a lock there either.  However, the call
> to __sock_put looks suspicious.  Why isn't this using sock_put?

I simply ported the existing code from algif_aead over -- but I think you are 
right that sock_put is more appropriate.
> 
> Also the sock_hold on the caller side looks buggy.  Surely it needs
> to be made before we even call the encrypt/decrypt functions rather
> than after it returns EINPROGRESS at which point it may well be too
> late?

I would concur. The sock_hold would need to be moved from the EINPROGRESS 
conditional to before the AIO enc/dec operation is invoked.

Where I am not fully sure is whether af_alg_async_cb is called in any case. 
I.e. when we invoke an AIO operation with a cipher that completes 
synchronously (e.g. AES-NI), is this callback triggered?

Ciao
Stephan
Herbert Xu Nov. 7, 2017, 6:32 a.m. UTC | #7
On Tue, Nov 07, 2017 at 07:19:32AM +0100, Stephan Müller wrote:
> 
> Where I am not fully sure is whether af_alg_async_cb is called in any case. 
> I.e. when we invoke an AIO operation with a cipher that completes 
> synchronously (e.g. AES-NI), is this callback triggered?

It's the same with any other callback in the crypto API.  The
completion function is called if the return value is EINPROGRESS
or EBUSY.

In fact the algif_aead/algif_skcipher code already frees the SG lists
and areq on a sync return.  So perhaps you can merge this with the
completion function.

Cheers,
diff mbox

Patch

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 337cf382718e..a41f08642eee 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1063,8 +1063,6 @@  void af_alg_async_cb(struct crypto_async_request *_req, int err)
 	struct kiocb *iocb = areq->iocb;
 	unsigned int resultlen;
 
-	lock_sock(sk);
-
 	/* Buffer size written by crypto operation. */
 	resultlen = areq->outlen;
 
@@ -1073,8 +1071,6 @@  void af_alg_async_cb(struct crypto_async_request *_req, int err)
 	__sock_put(sk);
 
 	iocb->ki_complete(iocb, err ? err : resultlen, 0);
-
-	release_sock(sk);
 }
 EXPORT_SYMBOL_GPL(af_alg_async_cb);