From patchwork Fri Feb 21 09:37:53 2025
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13985076
From: Sergey Senozhatsky
To: Andrew Morton
Cc: Yosry Ahmed, Hillf Danton, Kairui Song, Sebastian Andrzej Siewior,
 Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Sergey Senozhatsky
Subject: [PATCH v7 00/17] zsmalloc/zram: there be preemption
Date: Fri, 21 Feb 2025 18:37:53 +0900
Message-ID: <20250221093832.1949691-1-senozhatsky@chromium.org>
X-Mailer: git-send-email 2.48.1.601.g30ceb7b040-goog

Currently zram runs compression and decompression in non-preemptible
sections, e.g.

    zcomp_stream_get()     // grabs CPU local lock
    zcomp_compress()

or

    zram_slot_lock()       // grabs entry spin-lock
    zcomp_stream_get()     // grabs CPU local lock
    zs_map_object()        // grabs rwlock and CPU local lock
    zcomp_decompress()

This is potentially troublesome for a number of reasons.

For instance, this makes it impossible to use async compression
algorithms and/or H/W compression algorithms, which can wait for OP
completion or resource availability.  This also restricts what
compression algorithms can do internally, for example, zstd can
allocate internal state memory for C/D dictionaries:

do_fsync()
 do_writepages()
  zram_bio_write()
   zram_write_page()                          // becomes non-preemptible
    zcomp_compress()
     zstd_compress()
      ZSTD_compress_usingCDict()
       ZSTD_compressBegin_usingCDict_internal()
        ZSTD_resetCCtx_usingCDict()
         ZSTD_resetCCtx_internal()
          zstd_custom_alloc()                 // memory allocation

Not to mention that the system can be configured to maximize
compression ratio at a cost of CPU/HW time (e.g. lz4hc or deflate
with very high compression level) so zram can stay in a
non-preemptible section (even under spin-lock and/or rwlock) for an
extended period of time.

Aside from compression algorithms, this also restricts what zram can
do.  One particular example is the zsmalloc handle allocation in
zram_write_page(), which has an optimistic allocation (disallowing
direct reclaim) and a pessimistic fallback path, which then forces
zram to compress the page one more time.

This series changes zram to not directly impose atomicity restrictions
on compression algorithms (and on itself), which makes zram write()
fully preemptible.  With the zram changes alone, zram read(), sadly,
is not always preemptible yet: there are still indirect atomicity
restrictions imposed by zsmalloc.  One notable example is the object
mapping API, which returns with:
a) local CPU lock held
b) zspage rwlock held

First, zsmalloc's zspage lock is converted from rwlock to a special
type of RW-lookalike lock with some extra guarantees/features.
Second, a new handle mapping is introduced which doesn't use per-CPU
buffers (and hence no local CPU lock), does fewer memcpy() calls, but
requires users to provide a pointer to a temp buffer for object
copy-in (when needed).
Third, zram is converted to the new zsmalloc mapping API and thus zram
read() becomes preemptible.

v6 -> v7:
-- provide dep_map access wrappers to avoid code duplication between
   !lockdep and lockdep builds (Yosry)

Sergey Senozhatsky (17):
  zram: sleepable entry locking
  zram: permit preemption with active compression stream
  zram: remove unused crypto include
  zram: remove max_comp_streams device attr
  zram: remove two-staged handle allocation
  zram: remove writestall zram_stats member
  zram: limit max recompress prio to num_active_comps
  zram: filter out recomp targets based on priority
  zram: rework recompression loop
  zsmalloc: rename pool lock
  zsmalloc: make zspage lock preemptible
  zsmalloc: introduce new object mapping API
  zram: switch to new zsmalloc object mapping API
  zram: permit reclaim in zstd custom allocator
  zram: do not leak page on recompress_store error path
  zram: do not leak page on writeback_store error path
  zram: add might_sleep to zcomp API

 Documentation/ABI/testing/sysfs-block-zram  |   8 -
 Documentation/admin-guide/blockdev/zram.rst |  36 +-
 drivers/block/zram/backend_zstd.c           |  11 +-
 drivers/block/zram/zcomp.c                  |  48 ++-
 drivers/block/zram/zcomp.h                  |   8 +-
 drivers/block/zram/zram_drv.c               | 289 ++++++++--------
 drivers/block/zram/zram_drv.h               |  22 +-
 include/linux/zsmalloc.h                    |   8 +
 mm/zsmalloc.c                               | 356 ++++++++++++++++----
 9 files changed, 498 insertions(+), 288 deletions(-)

---
2.48.1.601.g30ceb7b040-goog