Message ID | 20221219220223.3982176-8-elliott@hpe.com (mailing list archive)
---|---
State | Changes Requested
Delegated to | Herbert Xu
Series | crypto: x86 - yield FPU context during long loops
On Mon, Dec 19, 2022 at 04:02:17PM -0600, Robert Elliott wrote:
> Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul
> in its own set of kernel_fpu_begin and kernel_fpu_end calls, preparing
> to limit the amount of data processed by each _update call to avoid
> RCU stalls.
>
> This is more like how polyval-clmulni_glue is structured.
>
> Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
> Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Robert Elliott <elliott@hpe.com>
> ---
>  arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
> index beac4b2eddf6..1bfde099de0f 100644
> --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
> +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
> @@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
>  	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
>  	u8 *dst = dctx->buffer;
>
> -	kernel_fpu_begin();
>  	if (dctx->bytes) {
>  		int n = min(srclen, dctx->bytes);
>  		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
> @@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
>  		while (n--)
>  			*pos++ ^= *src++;
>
> -		if (!dctx->bytes)
> +		if (!dctx->bytes) {
> +			kernel_fpu_begin();
>  			clmul_ghash_mul(dst, &ctx->shash);
> +			kernel_fpu_end();
> +		}
>  	}
>
> +	kernel_fpu_begin();
>  	clmul_ghash_update(dst, src, srclen, &ctx->shash);
>  	kernel_fpu_end();

Why is this necessary? Couldn't you just add the kernel_fpu_yield
calls even without this patch? This just seems to be adding some
unnecessary begin/end calls.

Cheers,
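For context, the alternative Herbert is pointing at is to keep the single
kernel_fpu_begin()/kernel_fpu_end() pair and yield from inside it. The sketch
below assumes the kernel_fpu_yield() helper proposed earlier in this series
(it is not an existing mainline API), and the 4 KiB step and loop shape are
illustrative only, not the posted patch:

/*
 * Minimal sketch, assuming the kernel_fpu_yield() helper proposed earlier
 * in this series: keep one begin/end pair around the whole update and
 * yield between bounded chunks of bulk data.
 */
static int ghash_update_yield_sketch(struct shash_desc *desc,
				     const u8 *src, unsigned int srclen)
{
	struct ghash_desc_ctx *dctx = shash_desc_ctx(desc);
	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
	u8 *dst = dctx->buffer;

	kernel_fpu_begin();

	/* ... partial-block handling as in the existing ghash_update() ... */

	while (srclen >= GHASH_BLOCK_SIZE) {
		/* Illustrative 4 KiB chunk, rounded down to whole blocks. */
		unsigned int n = min(srclen, 4096U) & ~(GHASH_BLOCK_SIZE - 1);

		clmul_ghash_update(dst, src, n, &ctx->shash);
		src += n;
		srclen -= n;

		/* Proposed helper: briefly drop the FPU if a resched is due. */
		kernel_fpu_yield();
	}

	kernel_fpu_end();

	/* ... buffer any trailing partial block, as the original code does ... */
	return 0;
}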
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index beac4b2eddf6..1bfde099de0f 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
 
-	kernel_fpu_begin();
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
 		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
 		while (n--)
 			*pos++ ^= *src++;
 
-		if (!dctx->bytes)
+		if (!dctx->bytes) {
+			kernel_fpu_begin();
 			clmul_ghash_mul(dst, &ctx->shash);
+			kernel_fpu_end();
+		}
 	}
 
+	kernel_fpu_begin();
 	clmul_ghash_update(dst, src, srclen, &ctx->shash);
 	kernel_fpu_end();
Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul in its
own set of kernel_fpu_begin and kernel_fpu_end calls, preparing to limit
the amount of data processed by each _update call to avoid RCU stalls.

This is more like how polyval-clmulni_glue is structured.

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Robert Elliott <elliott@hpe.com>
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
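As a rough illustration of the follow-up this patch prepares for, limiting
the amount of data processed per FPU section in the style of
polyval-clmulni_glue, the sketch below chunks the bulk update. The helper
name, the 4 KiB limit, and the exact loop shape are assumptions for
illustration and are not part of the posted series:

/*
 * Sketch of a chunked bulk update, loosely modeled on how
 * polyval-clmulni_glue bounds its FPU sections.  The helper name and the
 * 4 KiB limit are illustrative assumptions, not posted code.
 * Partial-block buffering stays in ghash_update() exactly as before.
 */
#define GHASH_FPU_CHUNK 4096U

static void ghash_update_bulk_chunked(struct ghash_ctx *ctx, u8 *dst,
				      const u8 *src, unsigned int srclen)
{
	while (srclen >= GHASH_BLOCK_SIZE) {
		/* Round each chunk down to whole 16-byte GHASH blocks. */
		unsigned int n = min(srclen, GHASH_FPU_CHUNK) &
				 ~(GHASH_BLOCK_SIZE - 1);

		kernel_fpu_begin();
		clmul_ghash_update(dst, src, n, &ctx->shash);
		kernel_fpu_end();

		src += n;
		srclen -= n;
	}
}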