mbox series

[RFC,0/4] Fix file lock cache accounting, again

Message ID cover.1705507931.git.jpoimboe@kernel.org (mailing list archive)
Headers show
Series Fix file lock cache accounting, again | expand

Message

Josh Poimboeuf Jan. 17, 2024, 4:14 p.m. UTC
This is an attempt to fix file lock cache accounting (again).  The bug
was originally reported 2+ years ago [1] but was quickly reverted [2]
for performance reasons.

A few years ago some ideas [3] were floated about how to improve the
performance.  Did any of those ever get implemented?

Testing shows "mm: improve performance of accounted kernel memory
allocations" [4] helping some.  But even with those patches, much of the
original performance regression still remains, at least according to
microbenchmarks.

Despite that regression, this being a security and correctness issue, it
really needs to be fixed by default.  Those who want to live on the edge
(or have trusted user space) can disable it.

Patch 1 enables the fix by default, but allows disabling it at boot
time.

Patch 2 allows disabling it at build time.

Patches 3 and 4 allow disabling it (along with all the CPU mitigations)
using mitigations=off.

[1] 0f12156dff28 ("memcg: enable accounting for file lock caches")
[2] 3754707bcc3e ("Revert "memcg: enable accounting for file lock caches"")
[3] https://lore.kernel.org/lkml/dbc9955d-6c28-1dd5-b842-ef39a762aa3b@kernel.dk/
[4] https://lore.kernel.org/lkml/20231019225346.1822282-1-roman.gushchin@linux.dev/

Josh Poimboeuf (4):
  fs/locks: Fix file lock cache accounting, again
  fs/locks: Add CONFIG_FLOCK_ACCOUNTING
  mitigations: Expand 'mitigations=off' to include optional software
    mitigations
  mitigations: Add flock cache accounting to 'mitigations=off'

 .../admin-guide/kernel-parameters.txt         | 48 ++++++++++++++----
 arch/arm64/kernel/cpufeature.c                |  2 +-
 arch/arm64/kernel/proton-pack.c               |  6 +--
 arch/powerpc/kernel/security.c                | 14 +++---
 arch/s390/kernel/nospec-branch.c              |  2 +-
 arch/x86/kernel/cpu/bugs.c                    | 35 ++++++-------
 arch/x86/kvm/mmu/mmu.c                        |  2 +-
 arch/x86/mm/pti.c                             |  3 +-
 fs/Kconfig                                    | 15 ++++++
 fs/locks.c                                    | 31 +++++++++++-
 include/linux/bpf.h                           |  5 +-
 include/linux/cpu.h                           |  3 --
 include/linux/mitigations.h                   |  4 ++
 kernel/Makefile                               |  3 +-
 kernel/cpu.c                                  | 43 ----------------
 kernel/mitigations.c                          | 50 +++++++++++++++++++
 16 files changed, 174 insertions(+), 92 deletions(-)
 create mode 100644 include/linux/mitigations.h
 create mode 100644 kernel/mitigations.c

Comments

Michal Koutný Jan. 18, 2024, 9:04 a.m. UTC | #1
On Wed, Jan 17, 2024 at 08:14:46AM -0800, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Allow flock cache accounting to be disabled with 'mitigations=off', as
> it fits the profile for that option: trusted user space combined with a
> performance-impacting mitigation.

Note that some other kernel objects that don't have any other tight
limit are already charged too (but their charging likely did not stand
out in any performance regression tests).
In the situation you describe, users can already pass
`cgroup.memory=nokmem` and get rid of charging overhead in general.

IOW, if flock objects are charged, there already is a boot option to
turn off such behavior.

Regards,
Michal