
crypto: x86/aes-gcm: Disable FPU around skcipher_walk_done().

Message ID 20240802102333.itejxOsJ@linutronix.de (mailing list archive)
State Superseded
Delegated to: Herbert Xu
Series crypto: x86/aes-gcm: Disable FPU around skcipher_walk_done().

Commit Message

Sebastian Andrzej Siewior Aug. 2, 2024, 10:23 a.m. UTC
kernel_fpu_begin() disables preemption. gcm_crypt() has a
skcipher_walk_done() invocation within a preempt disabled section.
skcipher_walk_done() can invoke kfree() which requires sleeping locks on
PREEMPT_RT and must not be invoked with disabled preemption.

Keep FPU access enabled while skcipher_walk_done() is invoked.

Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/crypto/aesni-intel_glue.c | 2 ++
 1 file changed, 2 insertions(+)
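
For context, a minimal sketch of the call pattern this fixes, abridged
from gcm_crypt() (illustration only, not part of the diff):

	kernel_fpu_begin();	/* implies preempt_disable() */
	...
	/* last segment */
	aes_gcm_update(key, le_ctr, ghash_acc,
		       walk.src.virt.addr, walk.dst.virt.addr,
		       nbytes, flags);
	/* may kfree() the walk buffer -> sleeping locks on PREEMPT_RT */
	err = skcipher_walk_done(&walk, 0);
	...
	kernel_fpu_end();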

Comments

Eric Biggers Aug. 2, 2024, 4:28 p.m. UTC | #1
Hi Sebastian,

On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> kernel_fpu_begin() disables preemption. gcm_crypt() has a
> skcipher_walk_done() invocation within a preempt disabled section.
> skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> PREEMPT_RT and must not be invoked with disabled preemption.
> 
> Keep FPU access enabled while skcipher_walk_done() is invoked.
> 
> Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/crypto/aesni-intel_glue.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> index cd37de5ec4046..be92e4c3f9c7f 100644
> --- a/arch/x86/crypto/aesni-intel_glue.c
> +++ b/arch/x86/crypto/aesni-intel_glue.c
> @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
>  			aes_gcm_update(key, le_ctr, ghash_acc,
>  				       walk.src.virt.addr, walk.dst.virt.addr,
>  				       nbytes, flags);
> +			kernel_fpu_end();
>  			err = skcipher_walk_done(&walk, 0);
> +			kernel_fpu_begin();
>  			/*
>  			 * The low word of the counter isn't used by the
>  			 * finalize, so there's no need to increment it here.

Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
performance for everyone else?
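
For illustration, one possible (untested) shape of that, assuming an
IS_ENABLED(CONFIG_PREEMPT_RT) guard is acceptable here:

			if (IS_ENABLED(CONFIG_PREEMPT_RT))
				kernel_fpu_end();
			err = skcipher_walk_done(&walk, 0);
			if (IS_ENABLED(CONFIG_PREEMPT_RT))
				kernel_fpu_begin();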

Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
can sleep.  Have you checked for other instances of this same problem?  It seems
it would be quite common kernel-wide.  Is it really necessary that kfree() takes
a sleepable lock on PREEMPT_RT?

- Eric
Eric Biggers Aug. 2, 2024, 4:49 p.m. UTC | #2
On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
> Hi Sebastian,
> 
> On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> > kernel_fpu_begin() disables preemption. gcm_crypt() has a
> > skcipher_walk_done() invocation within a preempt disabled section.
> > skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> > PREEMPT_RT and must not be invoked with disabled preemption.
> > 
> > Keep FPU access enabled while skcipher_walk_done() is invoked.
> > 
> > Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  arch/x86/crypto/aesni-intel_glue.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
> >  			/*
> >  			 * The low word of the counter isn't used by the
> >  			 * finalize, so there's no need to increment it here.
> 
> Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> performance for everyone else?
> 
> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep.  Have you checked for other instances of this same problem?  It seems
> it would be quite common kernel-wide.  Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?
> 

This would work too, I think:

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec4046..2d6bcf7fc7c51 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1401,11 +1401,12 @@ gcm_crypt(struct aead_request *req, int flags)
 		} else {
 			/* Last segment: process all remaining data. */
 			aes_gcm_update(key, le_ctr, ghash_acc,
 				       walk.src.virt.addr, walk.dst.virt.addr,
 				       nbytes, flags);
-			err = skcipher_walk_done(&walk, 0);
+			err = 0;
+			break;
 			/*
 			 * The low word of the counter isn't used by the
 			 * finalize, so there's no need to increment it here.
 			 */
 		}
@@ -1439,10 +1440,12 @@ gcm_crypt(struct aead_request *req, int flags)
 				       datalen, tag, taglen, flags))
 			err = -EBADMSG;
 	}
 out:
 	kernel_fpu_end();
+	if (nbytes)
+		skcipher_walk_done(&walk, 0);
 	return err;
 }
Herbert Xu Aug. 3, 2024, 12:34 a.m. UTC | #3
On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
>
> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep.  Have you checked for other instances of this same problem?  It seems
> it would be quite common kernel-wide.  Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?

Agreed.  kfree() gets called in all sorts of places under softirq
context in the network stack which would presumably have the same
issue.

Please give a bit of context of when kfree started doing this.

Cheers,
Herbert Xu Aug. 3, 2024, 12:37 a.m. UTC | #4
On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> kernel_fpu_begin() disables preemption. gcm_crypt() has a
> skcipher_walk_done() invocation within a preempt disabled section.
> skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> PREEMPT_RT and must not be invoked with disabled preemption.
> 
> Keep FPU access enabled while skcipher_walk_done() is invoked.
> 
> Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/crypto/aesni-intel_glue.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> index cd37de5ec4046..be92e4c3f9c7f 100644
> --- a/arch/x86/crypto/aesni-intel_glue.c
> +++ b/arch/x86/crypto/aesni-intel_glue.c
> @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
>  			aes_gcm_update(key, le_ctr, ghash_acc,
>  				       walk.src.virt.addr, walk.dst.virt.addr,
>  				       nbytes, flags);
> +			kernel_fpu_end();
>  			err = skcipher_walk_done(&walk, 0);
> +			kernel_fpu_begin();

What if the user already did a preempt_disable()? This would still
be buggy, right?

The Crypto API allows this to be called with preemption disabled.

Cheers,
Sebastian Andrzej Siewior Aug. 5, 2024, 8:21 a.m. UTC | #5
On 2024-08-03 08:34:05 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
> >
> > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > can sleep.  Have you checked for other instances of this same problem?  It seems
> > it would be quite common kernel-wide.  Is it really necessary that kfree() takes
> > a sleepable lock on PREEMPT_RT?
> 
> Agreed.  kfree() gets called in all sorts of places under softirq
> context in the network stack which would presumably have the same
> issue.
> 
> Please give a bit of context of when kfree started doing this.

I added "under PREEMPT_RT" in the patch description. The softirq under
PREEMPT_RT does not disable preemption so it is fully preemptible.
The other accelerated versions drop exclusive FPU access/ enable
preemption during these operations even on ARM/ ARM64.

> Cheers,

Sebastian
Sebastian Andrzej Siewior Aug. 5, 2024, 8:41 a.m. UTC | #6
On 2024-08-02 09:28:32 [-0700], Eric Biggers wrote:
> Hi Sebastian,
Hi Eric,

> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
> >  			/*
> >  			 * The low word of the counter isn't used by the
> >  			 * finalize, so there's no need to increment it here.
> 
> Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> performance for everyone else?

Every other instance in this file had a kernel_fpu_end()/begin() before
skcipher_walk_done(), so I thought this one was just missed by chance.

> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep.  Have you checked for other instances of this same problem?  It seems
> it would be quite common kernel-wide.  

kfree() can't have a might_sleep() because it does not qualify for it:
you can use kfree() in softirq context, for instance with a spinlock_t
acquired, on !RT, and that would trigger the warning.
On PREEMPT_RT interrupts are threaded, softirq is preemptible and
spinlock_t is a sleeping lock, so all the places where kfree() would
have been invoked in a preempt-disabled context on !PREEMPT_RT are
actually preemptible on PREEMPT_RT.
This is of course not true in cases where preemption is explicitly
disabled, like in this case.
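
As an illustration (hypothetical lock and object names), this pattern
is perfectly valid on !PREEMPT_RT and would false-positive if kfree()
carried a might_sleep():

	spin_lock_bh(&obj_lock);	/* softirq-safe, non-sleeping on !RT */
	kfree(obj);			/* legal here; might_sleep() would WARN */
	spin_unlock_bh(&obj_lock);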

> Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?
Yes. The locking in SLUB and page allocator is spinlock_t.

> - Eric

Sebastian
Herbert Xu Aug. 5, 2024, 9:02 a.m. UTC | #7
On Fri, Aug 02, 2024 at 09:49:04AM -0700, Eric Biggers wrote:
>
> This would work too, I think:

Yes, and we can go a bit further like this:

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec404..149bc6beae51 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1381,8 +1381,9 @@ gcm_crypt(struct aead_request *req, int flags)
 	gcm_process_assoc(key, ghash_acc, req->src, assoclen, flags);
 
 	/* En/decrypt the data and pass the ciphertext through GHASH. */
-	while ((nbytes = walk.nbytes) != 0) {
-		if (unlikely(nbytes < walk.total)) {
+	nbytes = walk.nbytes;
+	if (nbytes) {
+		while (unlikely(nbytes < walk.total)) {
 			/*
 			 * Non-last segment.  In this case, the assembly
 			 * function requires that the length be a multiple of 16
@@ -1397,21 +1398,24 @@ gcm_crypt(struct aead_request *req, int flags)
 			le_ctr[0] += nbytes / AES_BLOCK_SIZE;
 			kernel_fpu_end();
 			err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+			if (err)
+				return err;
+			nbytes = walk.nbytes;
 			kernel_fpu_begin();
-		} else {
-			/* Last segment: process all remaining data. */
-			aes_gcm_update(key, le_ctr, ghash_acc,
-				       walk.src.virt.addr, walk.dst.virt.addr,
-				       nbytes, flags);
-			err = skcipher_walk_done(&walk, 0);
-			/*
-			 * The low word of the counter isn't used by the
-			 * finalize, so there's no need to increment it here.
-			 */
 		}
+
+		/* Last segment: process all remaining data. */
+		aes_gcm_update(key, le_ctr, ghash_acc,
+			       walk.src.virt.addr, walk.dst.virt.addr,
+			       nbytes, flags);
+		/*
+		 * The low word of the counter isn't used by the
+		 * finalize, so there's no need to increment it here.
+		 */
+	} else if (err) {
+		kernel_fpu_end();
+		return err;
 	}
-	if (err)
-		goto out;
 
 	/* Finalize */
 	taglen = crypto_aead_authsize(tfm);
@@ -1439,9 +1443,8 @@ gcm_crypt(struct aead_request *req, int flags)
 				       datalen, tag, taglen, flags))
 			err = -EBADMSG;
 	}
-out:
 	kernel_fpu_end();
-	return err;
+	return skcipher_walk_done(&walk, 0);
 }
 
 #define DEFINE_GCM_ALGS(suffix, flags, generic_driver_name, rfc_driver_name,   \
Sebastian Andrzej Siewior Aug. 5, 2024, 9:34 a.m. UTC | #8
On 2024-08-03 08:37:05 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> > kernel_fpu_begin() disables preemption. gcm_crypt() has a
> > skcipher_walk_done() invocation within a preempt disabled section.
> > skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> > PREEMPT_RT and must not be invoked with disabled preemption.
> > 
> > Keep FPU access enabled while skcipher_walk_done() is invoked.
> > 
> > Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  arch/x86/crypto/aesni-intel_glue.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
> 
> What if the user already did a preempt_disable()? This would still
> be buggy, right?

Yes, if it has been done explicitly via preempt_disable(). I am looking
into explicit cases of disabling preemption and trying to get rid of
them whenever I stumble upon one. This one just popped up on one of my
boxes.

> The Crypto API allows this to be called with preemption disabled.
> 
> Cheers,

Sebastian
Sebastian Andrzej Siewior Aug. 5, 2024, 9:56 a.m. UTC | #9
On 2024-08-05 17:02:35 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 09:49:04AM -0700, Eric Biggers wrote:
> >
> > This would work too, I think:
> 
> Yes, and we can go a bit further like this:

Yes,

Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian
Eric Biggers Aug. 5, 2024, 5:38 p.m. UTC | #10
On Mon, Aug 05, 2024 at 10:41:21AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-08-02 09:28:32 [-0700], Eric Biggers wrote:
> > Hi Sebastian,
> Hi Eric,
> 
> > > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > > index cd37de5ec4046..be92e4c3f9c7f 100644
> > > --- a/arch/x86/crypto/aesni-intel_glue.c
> > > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> > >  			aes_gcm_update(key, le_ctr, ghash_acc,
> > >  				       walk.src.virt.addr, walk.dst.virt.addr,
> > >  				       nbytes, flags);
> > > +			kernel_fpu_end();
> > >  			err = skcipher_walk_done(&walk, 0);
> > > +			kernel_fpu_begin();
> > >  			/*
> > >  			 * The low word of the counter isn't used by the
> > >  			 * finalize, so there's no need to increment it here.
> > 
> > Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> > performance for everyone else?
> 
> Every other instance in this file had a kernel_fpu_end()/begin() before
> skcipher_walk_done(), so I thought this one was just missed by chance.

No, it was intentional.  See the comment above the first kernel_fpu_begin() in
gcm_crypt():

	/*
	 * Since the AES-GCM assembly code requires that at least three assembly
	 * functions be called to process any message (this is needed to support
	 * incremental updates cleanly), to reduce overhead we try to do all
	 * three calls in the same kernel FPU section if possible.  We close the
	 * section and start a new one if there are multiple data segments or if
	 * rescheduling is needed while processing the associated data.
	 */
	kernel_fpu_begin();

> > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > can sleep.  Have you checked for other instances of this same problem?  It seems
> > it would be quite common kernel-wide.  
> 
> kfree() can't have a might_sleep() because it does not qualify for it:
> you can use kfree() in softirq context, for instance with a spinlock_t
> acquired, on !RT, and that would trigger the warning.
> On PREEMPT_RT interrupts are threaded, softirq is preemptible and
> spinlock_t is a sleeping lock, so all the places where kfree() would
> have been invoked in a preempt-disabled context on !PREEMPT_RT are
> actually preemptible on PREEMPT_RT.
> This is of course not true in cases where preemption is explicitly
> disabled, like in this case.

WARN_ON(!preemptible()) then?

If I add that to kfree(), it triggers from lots of other places.  Are those
problems on PREEMPT_RT too?

What I am trying to get at is what debugging options do I need to detect issues
like this.  Is there really no option other than actually running a PREEMPT_RT
kernel?

I had tested this code with lots of debug options pre-merge and nothing came up.

If there was something in CONFIG_SLUB_DEBUG, for example, I would have seen
that, and you would never have had to deal with this issue at all as it would
never have been introduced.

- Eric
Sebastian Andrzej Siewior Aug. 6, 2024, 7:46 a.m. UTC | #11
On 2024-08-05 10:38:26 [-0700], Eric Biggers wrote:
> No, it was intentional.  See the comment above the first kernel_fpu_begin() in
> gcm_crypt():
> 
> 	/*
> 	 * Since the AES-GCM assembly code requires that at least three assembly
> 	 * functions be called to process any message (this is needed to support
> 	 * incremental updates cleanly), to reduce overhead we try to do all
> 	 * three calls in the same kernel FPU section if possible.  We close the
> 	 * section and start a new one if there are multiple data segments or if
> 	 * rescheduling is needed while processing the associated data.
> 	 */
> 	kernel_fpu_begin();

Okay.

> > > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > > can sleep.  Have you checked for other instances of this same problem?  It seems
> > > it would be quite common kernel-wide.  
> > 
> > kfree() can't have a might_sleep() because it does not qualify for it:
> > you can use kfree() in softirq context, for instance with a spinlock_t
> > acquired, on !RT, and that would trigger the warning.
> > On PREEMPT_RT interrupts are threaded, softirq is preemptible and
> > spinlock_t is a sleeping lock, so all the places where kfree() would
> > have been invoked in a preempt-disabled context on !PREEMPT_RT are
> > actually preemptible on PREEMPT_RT.
> > This is of course not true in cases where preemption is explicitly
> > disabled, like in this case.
> 
> WARN_ON(!preemptible()) then?
> 
> If I add that to kfree(), it triggers from lots of other places.  Are those
> problems on PREEMPT_RT too?

Nope, they shouldn't be. On PREEMPT_RT you can only invoke kfree() or
kmalloc() from preemptible context. This is not a problem since
interrupts are threaded, softirqs are preemptible, …

> What I am trying to get at is what debugging options do I need to detect issues
> like this.  Is there really no option other than actually running a PREEMPT_RT
> kernel?

There is PROVE_RAW_LOCK_NESTING, but it only catches something like
raw_spinlock_t -> spinlock_t nesting or the use of a spinlock_t in an
interrupt handler that won't be threaded. It won't find anything where
you disable interrupts or preemption on purpose.
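
For example it would flag nesting like this (hypothetical lock names):

	raw_spin_lock(&raw_lock);	/* raw_spinlock_t: spins even on RT */
	spin_lock(&normal_lock);	/* spinlock_t: sleeps on RT -> flagged */
	...
	spin_unlock(&normal_lock);
	raw_spin_unlock(&raw_lock);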

> I had tested this code with lots of debug options pre-merge and nothing came up.
> 
> If there was something in CONFIG_SLUB_DEBUG, for example, I would have seen
> that, and you would never have had to deal with this issue at all as it would
> never have been introduced.

Don't worry. I didn't think for a second that there was a lack of
testing on your side.

> - Eric

Sebastian

Patch

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec4046..be92e4c3f9c7f 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
 			aes_gcm_update(key, le_ctr, ghash_acc,
 				       walk.src.virt.addr, walk.dst.virt.addr,
 				       nbytes, flags);
+			kernel_fpu_end();
 			err = skcipher_walk_done(&walk, 0);
+			kernel_fpu_begin();
 			/*
 			 * The low word of the counter isn't used by the
 			 * finalize, so there's no need to increment it here.