Message ID | 20240802102333.itejxOsJ@linutronix.de |
---|---|
State | Superseded |
Delegated to: | Herbert Xu |
Series | crypto: x86/aes-gcm: Disable FPU around skcipher_walk_done(). |
Hi Sebastian,

On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> kernel_fpu_begin() disables preemption. gcm_crypt() has a
> skcipher_walk_done() invocation within a preempt disabled section.
> skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> PREEMPT_RT and must not be invoked with disabled preemption.
>
> Keep FPU access enabled while skcipher_walk_done() is invoked.
>
> Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/crypto/aesni-intel_glue.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> index cd37de5ec4046..be92e4c3f9c7f 100644
> --- a/arch/x86/crypto/aesni-intel_glue.c
> +++ b/arch/x86/crypto/aesni-intel_glue.c
> @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
>  			aes_gcm_update(key, le_ctr, ghash_acc,
>  				       walk.src.virt.addr, walk.dst.virt.addr,
>  				       nbytes, flags);
> +			kernel_fpu_end();
>  			err = skcipher_walk_done(&walk, 0);
> +			kernel_fpu_begin();
>  			/*
>  			 * The low word of the counter isn't used by the
>  			 * finalize, so there's no need to increment it here.

Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
performance for everyone else?

Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
can sleep. Have you checked for other instances of this same problem? It seems
it would be quite common kernel-wide. Is it really necessary that kfree() takes
a sleepable lock on PREEMPT_RT?

- Eric
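A minimal sketch of the CONFIG_PREEMPT_RT-only variant Eric asks about could use IS_ENABLED(); this is an illustration inferred from the discussion, not a patch posted in the thread:

```c
	/*
	 * Hypothetical sketch: only drop the FPU section around
	 * skcipher_walk_done() on PREEMPT_RT, so non-RT kernels keep the
	 * single kernel FPU section and its performance benefit.
	 */
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		kernel_fpu_end();
	err = skcipher_walk_done(&walk, 0);
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		kernel_fpu_begin();
```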
On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
> Hi Sebastian,
>
> On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> > kernel_fpu_begin() disables preemption. gcm_crypt() has a
> > skcipher_walk_done() invocation within a preempt disabled section.
> > skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> > PREEMPT_RT and must not be invoked with disabled preemption.
> >
> > Keep FPU access enabled while skcipher_walk_done() is invoked.
> >
> > Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  arch/x86/crypto/aesni-intel_glue.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
> >  			/*
> >  			 * The low word of the counter isn't used by the
> >  			 * finalize, so there's no need to increment it here.
>
> Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> performance for everyone else?
>
> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep. Have you checked for other instances of this same problem? It seems
> it would be quite common kernel-wide. Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?
>

This would work too, I think:

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec4046..2d6bcf7fc7c51 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1401,11 +1401,12 @@ gcm_crypt(struct aead_request *req, int flags)
 		} else {
 			/* Last segment: process all remaining data. */
 			aes_gcm_update(key, le_ctr, ghash_acc,
 				       walk.src.virt.addr, walk.dst.virt.addr,
 				       nbytes, flags);
-			err = skcipher_walk_done(&walk, 0);
+			err = 0;
+			break;
 			/*
 			 * The low word of the counter isn't used by the
 			 * finalize, so there's no need to increment it here.
 			 */
 		}
@@ -1439,10 +1440,12 @@ gcm_crypt(struct aead_request *req, int flags)
 					  datalen, tag, taglen, flags))
 			err = -EBADMSG;
 	}
 out:
 	kernel_fpu_end();
+	if (nbytes)
+		skcipher_walk_done(&walk, 0);
 	return err;
 }
On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
>
> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep. Have you checked for other instances of this same problem? It seems
> it would be quite common kernel-wide. Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?

Agreed. kfree() gets called in all sorts of places under softirq
context in the network stack which would presumably have the same
issue.

Please give a bit of context of when kfree started doing this.

Cheers,
On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> kernel_fpu_begin() disables preemption. gcm_crypt() has a
> skcipher_walk_done() invocation within a preempt disabled section.
> skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> PREEMPT_RT and must not be invoked with disabled preemption.
>
> Keep FPU access enabled while skcipher_walk_done() is invoked.
>
> Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/crypto/aesni-intel_glue.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> index cd37de5ec4046..be92e4c3f9c7f 100644
> --- a/arch/x86/crypto/aesni-intel_glue.c
> +++ b/arch/x86/crypto/aesni-intel_glue.c
> @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
>  			aes_gcm_update(key, le_ctr, ghash_acc,
>  				       walk.src.virt.addr, walk.dst.virt.addr,
>  				       nbytes, flags);
> +			kernel_fpu_end();
>  			err = skcipher_walk_done(&walk, 0);
> +			kernel_fpu_begin();

What if the user already did a preempt_disable()? This would still
be buggy, right?

The Crypto API allows this to be called with preemption disabled.

Cheers,
On 2024-08-03 08:34:05 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 09:28:32AM -0700, Eric Biggers wrote:
> >
> > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > can sleep. Have you checked for other instances of this same problem? It seems
> > it would be quite common kernel-wide. Is it really necessary that kfree() takes
> > a sleepable lock on PREEMPT_RT?
>
> Agreed. kfree() gets called in all sorts of places under softirq
> context in the network stack which would presumably have the same
> issue.
>
> Please give a bit of context of when kfree started doing this.

I added "under PREEMPT_RT" in the patch description. The softirq under
PREEMPT_RT does not disable preemption so it is fully preemptible.
The other accelerated versions drop exclusive FPU access/ enable
preemption during these operations even on ARM/ ARM64.

> Cheers,

Sebastian
On 2024-08-02 09:28:32 [-0700], Eric Biggers wrote:
> Hi Sebastian,
Hi Eric,

> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
> >  			/*
> >  			 * The low word of the counter isn't used by the
> >  			 * finalize, so there's no need to increment it here.
>
> Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> performance for everyone else?

Every other instance in this file had a kernel_fpu_end/ begin() before
skcipher_walk_done() so I thought it was just missed by chance.

> Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> can sleep. Have you checked for other instances of this same problem? It seems
> it would be quite common kernel-wide.

kfree() can't have a might_sleep() because it does not qualify for this
since you can use it in softirq context for instance with an acquired
spinlock_t on !RT which would trigger it.
On PREEMPT_RT interrupts are threaded, softirq is preemptible,
spinlock_t is a sleeping lock so all these things where a kfree()
would have been invoked in preempt-disable context on !PREEMPT_RT are
actually preemptible on PREEMPT_RT.
This is of course not true in cases where preemption is explicitly
disabled like in this case.

> Is it really necessary that kfree() takes
> a sleepable lock on PREEMPT_RT?

Yes. The locking in SLUB and the page allocator is spinlock_t.

> - Eric

Sebastian
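To make the !RT vs. RT distinction above concrete, here is a small hypothetical example (my_obj, obj_lock and free_obj() are made up for illustration): on !PREEMPT_RT the kfree() below runs with preemption disabled and is perfectly legal, which is why kfree() cannot carry a might_sleep(); on PREEMPT_RT the same spinlock_t keeps the section preemptible, so kfree() may take the allocator's sleeping locks.

```c
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

/* Hypothetical object and lock, only to illustrate the calling context. */
struct my_obj {
	struct list_head node;
};

static DEFINE_SPINLOCK(obj_lock);

/* May be called from softirq context, e.g. a timer or NAPI callback. */
static void free_obj(struct my_obj *obj)
{
	spin_lock(&obj_lock);	/* !RT: preemption off; RT: sleeping lock */
	list_del(&obj->node);
	kfree(obj);		/* legal here on !RT, so no might_sleep() */
	spin_unlock(&obj_lock);
}
```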
On Fri, Aug 02, 2024 at 09:49:04AM -0700, Eric Biggers wrote:
>
> This would work too, I think:

Yes, and we can go a bit further like this:

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec404..149bc6beae51 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1381,8 +1381,9 @@ gcm_crypt(struct aead_request *req, int flags)
 	gcm_process_assoc(key, ghash_acc, req->src, assoclen, flags);
 
 	/* En/decrypt the data and pass the ciphertext through GHASH. */
-	while ((nbytes = walk.nbytes) != 0) {
-		if (unlikely(nbytes < walk.total)) {
+	nbytes = walk.nbytes;
+	if (nbytes) {
+		while (unlikely(nbytes < walk.total)) {
 			/*
 			 * Non-last segment. In this case, the assembly
 			 * function requires that the length be a multiple of 16
@@ -1397,21 +1398,24 @@ gcm_crypt(struct aead_request *req, int flags)
 			le_ctr[0] += nbytes / AES_BLOCK_SIZE;
 			kernel_fpu_end();
 			err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+			if (err)
+				return err;
+			nbytes = walk.nbytes;
 			kernel_fpu_begin();
-		} else {
-			/* Last segment: process all remaining data. */
-			aes_gcm_update(key, le_ctr, ghash_acc,
-				       walk.src.virt.addr, walk.dst.virt.addr,
-				       nbytes, flags);
-			err = skcipher_walk_done(&walk, 0);
-			/*
-			 * The low word of the counter isn't used by the
-			 * finalize, so there's no need to increment it here.
-			 */
 		}
+
+		/* Last segment: process all remaining data. */
+		aes_gcm_update(key, le_ctr, ghash_acc,
+			       walk.src.virt.addr, walk.dst.virt.addr,
+			       nbytes, flags);
+		/*
+		 * The low word of the counter isn't used by the
+		 * finalize, so there's no need to increment it here.
+		 */
+	} else if (err) {
+		kernel_fpu_end();
+		return err;
 	}
-	if (err)
-		goto out;
 
 	/* Finalize */
 	taglen = crypto_aead_authsize(tfm);
@@ -1439,9 +1443,8 @@ gcm_crypt(struct aead_request *req, int flags)
 					  datalen, tag, taglen, flags))
 			err = -EBADMSG;
 	}
-out:
 	kernel_fpu_end();
-	return err;
+	return skcipher_walk_done(&walk, 0);
 }
 
 #define DEFINE_GCM_ALGS(suffix, flags, generic_driver_name, rfc_driver_name, \
On 2024-08-03 08:37:05 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 12:23:33PM +0200, Sebastian Andrzej Siewior wrote:
> > kernel_fpu_begin() disables preemption. gcm_crypt() has a
> > skcipher_walk_done() invocation within a preempt disabled section.
> > skcipher_walk_done() can invoke kfree() which requires sleeping locks on
> > PREEMPT_RT and must not be invoked with disabled preemption.
> >
> > Keep FPU access enabled while skcipher_walk_done() is invoked.
> >
> > Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  arch/x86/crypto/aesni-intel_glue.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > index cd37de5ec4046..be92e4c3f9c7f 100644
> > --- a/arch/x86/crypto/aesni-intel_glue.c
> > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> >  			aes_gcm_update(key, le_ctr, ghash_acc,
> >  				       walk.src.virt.addr, walk.dst.virt.addr,
> >  				       nbytes, flags);
> > +			kernel_fpu_end();
> >  			err = skcipher_walk_done(&walk, 0);
> > +			kernel_fpu_begin();
>
> What if the user already did a preempt_disable()? This would still
> be buggy, right?

Yes, if it has been done explicitly by preempt_disable(). And I am
looking into explicit cases of disabling preemption, trying to get rid
of them when I stumble upon one. This one just popped up on one of my
boxes.

> The Crypto API allows this to be called with preemption disabled.
>
> Cheers,

Sebastian
On 2024-08-05 17:02:35 [+0800], Herbert Xu wrote:
> On Fri, Aug 02, 2024 at 09:49:04AM -0700, Eric Biggers wrote:
> >
> > This would work too, I think:
>
> Yes, and we can go a bit further like this:

Yes,
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian
On Mon, Aug 05, 2024 at 10:41:21AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-08-02 09:28:32 [-0700], Eric Biggers wrote:
> > Hi Sebastian,
> Hi Eric,
>
> > > diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
> > > index cd37de5ec4046..be92e4c3f9c7f 100644
> > > --- a/arch/x86/crypto/aesni-intel_glue.c
> > > +++ b/arch/x86/crypto/aesni-intel_glue.c
> > > @@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
> > >  			aes_gcm_update(key, le_ctr, ghash_acc,
> > >  				       walk.src.virt.addr, walk.dst.virt.addr,
> > >  				       nbytes, flags);
> > > +			kernel_fpu_end();
> > >  			err = skcipher_walk_done(&walk, 0);
> > > +			kernel_fpu_begin();
> > >  			/*
> > >  			 * The low word of the counter isn't used by the
> > >  			 * finalize, so there's no need to increment it here.
> >
> > Can you make this conditional on CONFIG_PREEMPT_RT so that it doesn't hurt
> > performance for everyone else?
>
> Every other instance in this file had a kernel_fpu_end/ begin() before
> skcipher_walk_done() so I thought it was just missed by chance.

No, it was intentional. See the comment above the first kernel_fpu_begin() in
gcm_crypt():

	/*
	 * Since the AES-GCM assembly code requires that at least three assembly
	 * functions be called to process any message (this is needed to support
	 * incremental updates cleanly), to reduce overhead we try to do all
	 * three calls in the same kernel FPU section if possible. We close the
	 * section and start a new one if there are multiple data segments or if
	 * rescheduling is needed while processing the associated data.
	 */
	kernel_fpu_begin();

> > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > can sleep. Have you checked for other instances of this same problem? It seems
> > it would be quite common kernel-wide.
>
> kfree() can't have a might_sleep() because it does not qualify for this
> since you can use it in softirq context for instance with an acquired
> spinlock_t on !RT which would trigger it.
> On PREEMPT_RT interrupts are threaded, softirq is preemptible,
> spinlock_t is a sleeping lock so all these things where a kfree()
> would have been invoked in preempt-disable context on !PREEMPT_RT are
> actually preemptible on PREEMPT_RT.
> This is of course not true in cases where preemption is explicitly
> disabled like in this case.

WARN_ON(!preemptible()) then?

If I add that to kfree(), it triggers from lots of other places. Are those
problems on PREEMPT_RT too?

What I am trying to get at is what debugging options do I need to detect issues
like this. Is there really no option other than actually running a PREEMPT_RT
kernel?

I had tested this code with lots of debug options pre-merge and nothing came up.

If there was something in CONFIG_SLUB_DEBUG, for example, I would have seen
that, and you would never have had to deal with this issue at all as it would
never have been introduced.

- Eric
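A sketch of the kind of debug check Eric is suggesting (purely hypothetical, not an existing kernel option) would have to be gated on PREEMPT_RT, since on !RT kernels kfree() is legitimately called with preemption disabled (softirq, under spinlock_t), as Sebastian explains:

```c
/*
 * Hypothetical debug aid, not an existing config option: warn when a
 * slab free runs from non-preemptible context.  Only meaningful on
 * PREEMPT_RT; on !RT this would fire for perfectly valid callers.
 */
static inline void debug_check_kfree_context(void)
{
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		WARN_ON_ONCE(!preemptible());
}
```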
On 2024-08-05 10:38:26 [-0700], Eric Biggers wrote:
> No, it was intentional. See the comment above the first kernel_fpu_begin() in
> gcm_crypt():
>
> 	/*
> 	 * Since the AES-GCM assembly code requires that at least three assembly
> 	 * functions be called to process any message (this is needed to support
> 	 * incremental updates cleanly), to reduce overhead we try to do all
> 	 * three calls in the same kernel FPU section if possible. We close the
> 	 * section and start a new one if there are multiple data segments or if
> 	 * rescheduling is needed while processing the associated data.
> 	 */
> 	kernel_fpu_begin();

Okay.

> > > Note that kfree() lacks a might_sleep(), and its kerneldoc does not say that it
> > > can sleep. Have you checked for other instances of this same problem? It seems
> > > it would be quite common kernel-wide.
> >
> > kfree() can't have a might_sleep() because it does not qualify for this
> > since you can use it in softirq context for instance with an acquired
> > spinlock_t on !RT which would trigger it.
> > On PREEMPT_RT interrupts are threaded, softirq is preemptible,
> > spinlock_t is a sleeping lock so all these things where a kfree()
> > would have been invoked in preempt-disable context on !PREEMPT_RT are
> > actually preemptible on PREEMPT_RT.
> > This is of course not true in cases where preemption is explicitly
> > disabled like in this case.
>
> WARN_ON(!preemptible()) then?
>
> If I add that to kfree(), it triggers from lots of other places. Are those
> problems on PREEMPT_RT too?

Nope, they shouldn't be. On PREEMPT_RT you can only invoke kfree() or
kmalloc() from preemptible context. This is not a problem since
interrupts are threaded, softirqs are preemptible, …

> What I am trying to get at is what debugging options do I need to detect issues
> like this. Is there really no option other than actually running a PREEMPT_RT
> kernel?

There is PROVE_RAW_LOCK_NESTING but this only catches something like
raw_spinlock_t -> spinlock_t or using a spinlock_t in an interrupt
handler that won't be threaded. It won't find anything where you
disable interrupts or preemption on purpose.

> I had tested this code with lots of debug options pre-merge and nothing came up.
>
> If there was something in CONFIG_SLUB_DEBUG, for example, I would have seen
> that, and you would never have had to deal with this issue at all as it would
> never have been introduced.

Don't worry. I didn't think for a second that there was a lack of
testing on your side.

> - Eric

Sebastian
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index cd37de5ec4046..be92e4c3f9c7f 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1403,7 +1403,9 @@ gcm_crypt(struct aead_request *req, int flags)
 			aes_gcm_update(key, le_ctr, ghash_acc,
 				       walk.src.virt.addr, walk.dst.virt.addr,
 				       nbytes, flags);
+			kernel_fpu_end();
 			err = skcipher_walk_done(&walk, 0);
+			kernel_fpu_begin();
 			/*
 			 * The low word of the counter isn't used by the
 			 * finalize, so there's no need to increment it here.
kernel_fpu_begin() disables preemption. gcm_crypt() has a
skcipher_walk_done() invocation within a preempt disabled section.
skcipher_walk_done() can invoke kfree() which requires sleeping locks on
PREEMPT_RT and must not be invoked with disabled preemption.

Keep FPU access enabled while skcipher_walk_done() is invoked.

Fixes: b06affb1cb580 ("crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/crypto/aesni-intel_glue.c | 2 ++
 1 file changed, 2 insertions(+)