[0/7] zram: preemptible writes and occasionally preemptible reads

Message ID 20250122055831.3341175-1-senozhatsky@chromium.org (mailing list archive)

Sergey Senozhatsky Jan. 22, 2025, 5:57 a.m. UTC
This is Part I (aka Episode IV), which only changes zram and seems
like a good start.  More work needs to be done outside of zram in
order to make reads preemptible unconditionally rather than only
occasionally.  That work will be carried out independently.

There are more details in the commit messages, but in short:

Currently zram runs compression and decompression in non-preemptible
sections, e.g.

    zcomp_stream_get()     // grabs CPU local lock
    zcomp_compress()

or

    zram_slot_lock()       // grabs entry spin-lock
    zcomp_stream_get()     // grabs CPU local lock
    zs_map_object()        // grabs rwlock and CPU local lock
    zcomp_decompress()

This is potentially troublesome for a number of reasons.

For instance, this makes it impossible to use async and/or H/W
compression algorithms, which can wait for operation completion or
resource availability.  This also restricts what compression
algorithms can do internally.  For example, zstd can allocate
internal state memory for C/D dictionaries:

do_fsync()
 do_writepages()
  zram_bio_write()
   zram_write_page()                          // becomes non-preemptible
    zcomp_compress()
     zstd_compress()
      ZSTD_compress_usingCDict()
       ZSTD_compressBegin_usingCDict_internal()
        ZSTD_resetCCtx_usingCDict()
         ZSTD_resetCCtx_internal()
          zstd_custom_alloc()                 // memory allocation
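
To make the problem in the call chain above concrete, here is a minimal
userspace model, not kernel code.  Every model_* name is hypothetical and
exists only for this illustration.  model_might_sleep() loosely mirrors
the idea behind the kernel's might_sleep() check: it counts a violation
whenever something that may sleep (a memory allocation, waiting for H/W)
is reached while the caller is in an atomic section.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these are kernel APIs. */
static int atomic_depth;      /* > 0 means "non-preemptible" in this model */
static int sleep_violations;  /* may-sleep points reached while atomic */

static void model_enter_atomic(void) { atomic_depth++; }
static void model_leave_atomic(void) { atomic_depth--; }

/* Record a violation if we may sleep while in an atomic section. */
static void model_might_sleep(void)
{
	if (atomic_depth > 0)
		sleep_violations++;
}

/*
 * Stand-in for a compression backend that may sleep internally,
 * e.g. to allocate state memory or to wait for a H/W operation.
 */
static void model_backend_compress(void)
{
	model_might_sleep();
}

/* Returns the number of violations seen while "writing" one page. */
static int model_write_page(bool atomic_section)
{
	sleep_violations = 0;

	if (atomic_section)
		model_enter_atomic();
	model_backend_compress();
	if (atomic_section)
		model_leave_atomic();

	return sleep_violations;
}
```

In this model, model_write_page(true) reports one violation while
model_write_page(false) reports none, which is the difference a
preemptible write path is meant to make.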

Not to mention that the system can be configured to maximize
compression ratio at the cost of CPU/HW time (e.g. lz4hc or deflate
with a very high compression level), so zram can stay in a
non-preemptible section (even under a spin-lock and/or rwlock) for an
extended period of time.  Aside from compression algorithms, this also
restricts what zram itself can do.  One particular example is the
zsmalloc handle allocation in zram_write_page(), which has an
optimistic allocation (disallowing direct reclaim) and a pessimistic
fallback path, which then forces zram to compress the page one more
time.
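
The cost of the two-staged handle allocation can be sketched with a small
userspace model.  This is an illustration under stated assumptions, not
the driver code: struct model and all model_* / write_page_* names are
hypothetical, and the assumption that the compressed data cannot be kept
across the slow-path allocation is a simplification of why the page gets
compressed again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	bool fast_alloc_fails;  /* simulate pressure on the no-reclaim attempt */
	int compress_calls;     /* how many times the page was compressed */
};

/* Stand-in for compressing one page; returns an arbitrary length. */
static size_t model_compress(struct model *m)
{
	m->compress_calls++;
	return 100;
}

/*
 * Stand-in for handle allocation.  may_reclaim == false models the
 * optimistic attempt that disallows direct reclaim.
 */
static bool model_alloc_handle(const struct model *m, bool may_reclaim)
{
	if (!may_reclaim && m->fast_alloc_fails)
		return false;
	return true;
}

/* Two-staged flow: returns compression count, or -1 on failure. */
static int write_page_two_staged(struct model *m)
{
	model_compress(m);              /* done in a non-preemptible section */
	if (model_alloc_handle(m, false))
		return m->compress_calls;

	/*
	 * Slow path (assumption): the atomic section is left so that the
	 * allocation may reclaim, and the compressed data is not kept.
	 */
	if (!model_alloc_handle(m, true))
		return -1;
	model_compress(m);              /* compress the same page again */
	return m->compress_calls;
}

/* Single-stage flow that becomes possible once the write path may sleep. */
static int write_page_preemptible(struct model *m)
{
	model_compress(m);
	if (!model_alloc_handle(m, true))
		return -1;
	return m->compress_calls;
}
```

When the optimistic attempt fails, the two-staged flow compresses the
page twice, while the single-stage flow compresses it once regardless.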

This series changes zram to not directly impose atomicity restrictions
on compression algorithms (and on itself), which makes zram write()
fully preemptible; zram read(), sadly, is not always preemptible.  There
are still indirect atomicity restrictions imposed by zsmalloc.  Changing
zsmalloc to permit preemption under zs_map_object() is a separate effort
(Part II) which will require some coordination.

Sergey Senozhatsky (7):
  zram: switch to non-atomic entry locking
  zram: do not use per-CPU compression streams
  zram: remove two-staged handle allocation
  zram: permit reclaim in zstd custom allocator
  zram: permit reclaim in recompression handle allocation
  zram: remove writestall zram_stats member
  zram: unlock slot bucket during recompression

 drivers/block/zram/backend_zstd.c |  11 +-
 drivers/block/zram/zcomp.c        | 162 +++++++++--------
 drivers/block/zram/zcomp.h        |  17 +-
 drivers/block/zram/zram_drv.c     | 277 ++++++++++++++++--------------
 drivers/block/zram/zram_drv.h     |   9 +-
 include/linux/cpuhotplug.h        |   1 -
 6 files changed, 252 insertions(+), 225 deletions(-)