[v3,00/11] Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()

Message ID 20230515075659.118447996@infradead.org (mailing list archive)
Message

Peter Zijlstra May 15, 2023, 7:56 a.m. UTC
Hi!

I seem to have forgotten to post this series last release; so here goes. I'm
really hoping to merge it and forget about it.


Since Linus hated on cmpxchg_double(), a few patches to get rid of it, as
proposed here:

  https://lkml.kernel.org/r/Y2U3WdU61FvYlpUh@hirez.programming.kicks-ass.net


These patches are based on 6.4.0-rc2.

Available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git core/wip-u128

Since v2:

 - reworked this_cpu_cmpxchg() to not implicitly do u128 but provide explicit
   this_cpu_cmpxchg128() (arnd)
 - added try_cmpxchg128_local() (per the addition of the try_cmpxchg*_local()
   family of functions)
 - slight cleanup of the SLUB conversion (due to rebase and having to touch it)
 - added a 'cleanup' patch for SLUB, since I was staring at that anyway

Since v1:

 - rebased on Eric's ghash cleanups (hence the cryptodev-2.6 dependency)
 - rebased on Heiko's s390/cpum_sf CDSG patch
 - fixed up a bunch of arch code
 - fixed up the inline asm to use 'u128 *' mem argument so the compiler knows
   how wide the modification is.
 - reworked the percpu thing to use union based type-punning instead of
   _Generic() based casts.
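
As an illustration of the last two items, here is a minimal userspace sketch
of union based type-punning around a 'u128 *' operand. It is not code from
the series: struct pair, union pair_u128, model_cmpxchg128() and
pair_update() are hypothetical names, and the compare-and-exchange is a plain
C model that is NOT atomic. It only shows how two 64-bit fields can be viewed
as one 128-bit object, and why handing the operation a 'u128 *' makes the
full 16-byte width of the access visible in the type.

```c
#include <assert.h>	/* static_assert */
#include <stdint.h>

typedef unsigned __int128 u128;	/* GCC/Clang extension on 64-bit targets */

/*
 * Hypothetical pair of 64-bit fields that must change together, in the
 * spirit of a pointer plus a generation counter.
 */
struct pair {
	uint64_t ptr;
	uint64_t gen;
};

/* The union lets the same 16 bytes be read as two fields or as one u128. */
union pair_u128 {
	struct pair fields;
	u128 full;
};

static_assert(sizeof(union pair_u128) == sizeof(u128),
	      "pair must be exactly 128 bits");

/*
 * Non-atomic model of a 128-bit compare-and-exchange: store new_val only if
 * *ptr still equals old, and return the value that was found at *ptr.
 * The u128 * parameter describes the whole 16-byte object being modified.
 */
static u128 model_cmpxchg128(u128 *ptr, u128 old, u128 new_val)
{
	u128 cur = *ptr;

	if (cur == old)
		*ptr = new_val;
	return cur;
}

/* Replace (old_ptr, old_gen) with (new_ptr, old_gen + 1); 1 on success. */
static int pair_update(union pair_u128 *p, uint64_t old_ptr,
		       uint64_t old_gen, uint64_t new_ptr)
{
	union pair_u128 old = { .fields = { old_ptr, old_gen } };
	union pair_u128 new_val = { .fields = { new_ptr, old_gen + 1 } };

	return model_cmpxchg128(&p->full, old.full, new_val.full) == old.full;
}
```

A caller builds the expected and the new value through the .fields view and
passes the .full view to the operation, so no pointer casts are needed.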

---
 Documentation/core-api/this_cpu_ops.rst     |   2 -
 arch/arm64/include/asm/atomic_ll_sc.h       |  56 ++++----
 arch/arm64/include/asm/atomic_lse.h         |  39 +++---
 arch/arm64/include/asm/cmpxchg.h            |  48 ++-----
 arch/arm64/include/asm/percpu.h             |  30 +++--
 arch/s390/include/asm/cmpxchg.h             |  32 +----
 arch/s390/include/asm/cpu_mf.h              |   2 +-
 arch/s390/include/asm/percpu.h              |  34 +++--
 arch/s390/kernel/perf_cpum_sf.c             |  16 +--
 arch/x86/include/asm/cmpxchg.h              |  25 ----
 arch/x86/include/asm/cmpxchg_32.h           |   2 +-
 arch/x86/include/asm/cmpxchg_64.h           |  63 ++++++++-
 arch/x86/include/asm/percpu.h               | 100 +++++++++------
 drivers/iommu/amd/amd_iommu_types.h         |   9 +-
 drivers/iommu/amd/iommu.c                   |  10 +-
 drivers/iommu/intel/irq_remapping.c         |   8 +-
 include/asm-generic/percpu.h                |  66 ++--------
 include/crypto/b128ops.h                    |  14 +-
 include/linux/atomic/atomic-arch-fallback.h |  95 +++++++++++++-
 include/linux/atomic/atomic-instrumented.h  |  93 ++++++++++++--
 include/linux/dmar.h                        | 125 +++++++++---------
 include/linux/percpu-defs.h                 |  38 ------
 include/linux/slub_def.h                    |  12 +-
 include/linux/types.h                       |   5 +
 include/uapi/linux/types.h                  |   4 +
 lib/crypto/curve25519-hacl64.c              |   2 -
 lib/crypto/poly1305-donna64.c               |   2 -
 mm/slab.h                                   |  49 ++++++-
 mm/slub.c                                   | 191 ++++++++++++++--------------
 scripts/atomic/gen-atomic-fallback.sh       |   4 +-
 scripts/atomic/gen-atomic-instrumented.sh   |  19 +--
 31 files changed, 667 insertions(+), 528 deletions(-)

Comments

Arnd Bergmann May 15, 2023, 9:42 a.m. UTC | #1
On Mon, May 15, 2023, at 09:56, Peter Zijlstra wrote:
>
> Since v2:
>
>  - reworked this_cpu_cmpxchg() to not implicitly do u128 but provide explicit
>    this_cpu_cmpxchg128() (arnd)
>  - added try_cmpxchg128_local() (per the addition of the try_cmpxchg*_local()
>    family of functions)
>  - slight cleanup of the SLUB conversion (due to rebase and having to touch it)
>  - added a 'cleanup' patch for SLUB, since I was staring at that anyway
>

This is clearly an improvement over the previous state, so I'm
happy with that, and the explicit this_cpu_cmpxchg128() interface
addresses most of my previous concerns.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

The need for runtime feature checking in the callers on x86-64 is still
a bit awkward, but this is no worse than before. I understand that
turning this into a compile-time choice would require first settling
a larger debate about raising the default target for distros beyond
the current CONFIG_GENERIC_CPU.

    Arnd

Peter Zijlstra May 24, 2023, 9:39 a.m. UTC | #2
On Mon, May 15, 2023 at 11:42:23AM +0200, Arnd Bergmann wrote:

> The need for runtime feature checking in the callers on x86-64 is still
> a bit awkward, but this is no worse than before. I understand that
> turning this into a compile-time choice would require first settling
> a larger debate about raising the default target for distros beyond
> the current CONFIG_GENERIC_CPU.

Looks like Power is going to be in the same boat, they can do
cmpxchg128, but only for Power8+.
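
For readers unfamiliar with the pattern, the caller-side runtime check
discussed above can be sketched roughly as follows. This is a userspace
model, not code from the series: have_cmpxchg128, fast_path_update(),
slow_path_update() and slot_update() are hypothetical names, the flag stands
in for a CPU feature query made once at boot, and neither path is actually
atomic or locked here.

```c
#include <assert.h>	/* static_assert */
#include <stdbool.h>
#include <stdint.h>

typedef unsigned __int128 u128;	/* GCC/Clang extension on 64-bit targets */

static_assert(sizeof(u128) == 16, "u128 must be 128 bits wide");

/* Hypothetical flag; stands in for a CPU feature query made at boot. */
static bool have_cmpxchg128;

/* Illustration only: how many attempts went down each path. */
static unsigned int fast_attempts, slow_attempts;

/* Placeholder for the 128-bit compare-and-exchange; NOT atomic here. */
static bool fast_path_update(u128 *slot, u128 old, u128 new_val)
{
	fast_attempts++;
	if (*slot != old)
		return false;
	*slot = new_val;
	return true;
}

/* Placeholder for the fallback used when the instruction is absent. */
static bool slow_path_update(u128 *slot, u128 old, u128 new_val)
{
	bool ok;

	slow_attempts++;
	/* A real fallback would take a lock here ... */
	ok = (*slot == old);
	if (ok)
		*slot = new_val;
	/* ... and release it here. */
	return ok;
}

/* The caller has to choose a path at runtime, which is the awkward part. */
static bool slot_update(u128 *slot, u128 old, u128 new_val)
{
	if (have_cmpxchg128)
		return fast_path_update(slot, old, new_val);
	return slow_path_update(slot, old, new_val);
}
```

With a compile-time choice the branch in slot_update() would disappear, at
the cost of the kernel no longer running on CPUs without the instruction.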