[v3,00/17] crypt: x86 - fix RCU stalls

Message ID: 20221103042740.6556-1-elliott@hpe.com

Elliott, Robert (Servers) Nov. 3, 2022, 4:27 a.m. UTC
This series fixes the RCU stalls triggered by the x86 crypto
modules discussed in
https://lore.kernel.org/all/MW5PR84MB18426EBBA3303770A8BC0BDFAB759@MW5PR84MB1842.NAMPRD84.PROD.OUTLOOK.COM/

The two root causes were:
- too much data processed between kernel_fpu_begin and
  kernel_fpu_end calls (which are heavily used by the x86
  optimized drivers)
- tcrypt not calling cond_resched during speed test loops
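
As a rough illustration of the first root cause, the sketch below
shows the general shape of bounding the work done inside one FPU
section. It is a userspace model, not code from the patches:
fake_kernel_fpu_begin(), fake_kernel_fpu_end(), fake_simd_update()
and chunked_update() are hypothetical stand-ins, and the limit value
is simply borrowed from the bytes_per_fpu example in this cover
letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for kernel_fpu_begin() and
 * kernel_fpu_end(). They only count how many FPU sections were opened.
 */
static unsigned int fpu_sections;
static void fake_kernel_fpu_begin(void) { fpu_sections++; }
static void fake_kernel_fpu_end(void) { }

/* Assumed limit, taken from the example value in the cover letter. */
static const unsigned int bytes_per_fpu = 655 * 1024;

/* Hypothetical stand-in for a SIMD-accelerated update routine. */
static uint32_t fake_simd_update(uint32_t state, const uint8_t *data,
				 size_t len)
{
	for (size_t i = 0; i < len; i++)
		state = state * 31 + data[i];
	return state;
}

/*
 * Process the buffer in pieces of at most bytes_per_fpu bytes, leaving
 * the FPU section between pieces so each section stays bounded.
 */
static uint32_t chunked_update(uint32_t state, const uint8_t *data,
			       size_t len)
{
	assert(bytes_per_fpu > 0);

	while (len) {
		size_t chunk = len < bytes_per_fpu ? len : bytes_per_fpu;

		fake_kernel_fpu_begin();
		state = fake_simd_update(state, data, chunk);
		fake_kernel_fpu_end();

		data += chunk;
		len -= chunk;
	}
	return state;
}
```

Because the state is carried from one piece to the next, the chunked
loop produces the same result as a single pass over the whole buffer;
only the length of each FPU section changes.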

These problems have always been lurking, but improving the
loading of the x86/sha512 module made the stalls occur frequently
during boot when SHA-512 is used for module signature checking.

Fixing these problems makes it safer to improve the loading of
the rest of the x86 modules in the same way as the sha512 module.

This series only handles the x86 modules.

Except for the tcrypt change, v3 only tackles the hash functions
as discussed in
https://lore.kernel.org/lkml/MW5PR84MB184284FBED63E2D043C93A6FAB369@MW5PR84MB1842.NAMPRD84.PROD.OUTLOOK.COM/

The limits are implemented as static const unsigned ints at the
module level, which makes them easy to expose as module parameters
for testing like this:
   -static const unsigned int bytes_per_fpu = 655 * 1024;
   +static unsigned int bytes_per_fpu = 655 * 1024;
   +module_param(bytes_per_fpu, uint, 0644);
   +MODULE_PARM_DESC(bytes_per_fpu, "Bytes per FPU context");


Robert Elliott (17):
  crypto: tcrypt - test crc32
  crypto: tcrypt - test nhpoly1305
  crypto: tcrypt - reschedule during cycles speed tests
  crypto: x86/sha - limit FPU preemption
  crypto: x86/crc - limit FPU preemption
  crypto: x86/sm3 - limit FPU preemption
  crypto: x86/ghash - use u8 rather than char
  crypto: x86/ghash - restructure FPU context saving
  crypto: x86/ghash - limit FPU preemption
  crypto: x86/*poly* - limit FPU preemption
  crypto: x86/sha - register all variations
  crypto: x86/sha - minimize time in FPU context
  crypto: x86/sha1, sha256 - load based on CPU features
  crypto: x86/crc - load based on CPU features
  crypto: x86/sm3 - load based on CPU features
  crypto: x86/ghash,polyval - load based on CPU features
  crypto: x86/nhpoly1305, poly1305 - load based on CPU features

 arch/x86/crypto/crc32-pclmul_asm.S         |   6 +-
 arch/x86/crypto/crc32-pclmul_glue.c        |  36 ++-
 arch/x86/crypto/crc32c-intel_glue.c        |  58 +++--
 arch/x86/crypto/crct10dif-pclmul_glue.c    |  54 ++--
 arch/x86/crypto/ghash-clmulni-intel_asm.S  |   4 +-
 arch/x86/crypto/ghash-clmulni-intel_glue.c |  43 ++--
 arch/x86/crypto/nhpoly1305-avx2-glue.c     |  21 +-
 arch/x86/crypto/nhpoly1305-sse2-glue.c     |  21 +-
 arch/x86/crypto/poly1305_glue.c            |  49 +++-
 arch/x86/crypto/polyval-clmulni_glue.c     |  14 +-
 arch/x86/crypto/sha1_ssse3_glue.c          | 276 +++++++++++++--------
 arch/x86/crypto/sha256_ssse3_glue.c        | 268 +++++++++++++-------
 arch/x86/crypto/sha512_ssse3_glue.c        | 191 ++++++++------
 arch/x86/crypto/sm3_avx_glue.c             |  45 +++-
 crypto/tcrypt.c                            |  56 +++--
 15 files changed, 764 insertions(+), 378 deletions(-)