From patchwork Fri Jan 31 09:05:59 2025
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13955126
From: Sergey Senozhatsky
To: Andrew Morton
Cc: Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Sergey Senozhatsky
Subject: [PATCHv4 00/17] zsmalloc/zram: there be preemption
Date: Fri, 31 Jan 2025 18:05:59 +0900
Message-ID: <20250131090658.3386285-1-senozhatsky@chromium.org>

Posting [1] and [2] combined into one series for the first time, so
don't get confused by v4.

Currently zram runs compression and decompression in non-preemptible
sections, e.g.

    zcomp_stream_get()     // grabs CPU local lock
    zcomp_compress()

or

    zram_slot_lock()       // grabs entry spin-lock
    zcomp_stream_get()     // grabs CPU local lock
    zs_map_object()        // grabs rwlock and CPU local lock
    zcomp_decompress()

Potentially a little troublesome for a number of reasons.

For instance, this makes it impossible to use async compression
algorithms and/or H/W compression algorithms, which can wait for OP
completion or resource availability. This also restricts what
compression algorithms can do internally, for example, zstd can
allocate internal state memory for C/D dictionaries:

    do_fsync()
     do_writepages()
      zram_bio_write()
       zram_write_page()                      // becomes non-preemptible
        zcomp_compress()
         zstd_compress()
          ZSTD_compress_usingCDict()
           ZSTD_compressBegin_usingCDict_internal()
            ZSTD_resetCCtx_usingCDict()
             ZSTD_resetCCtx_internal()
              zstd_custom_alloc()             // memory allocation

Not to mention that the system can be configured to maximize
compression ratio at the cost of CPU/HW time (e.g. lz4hc or deflate
with very high compression level) so zram can stay in a non-preemptible
section (even under spin-lock and/or rwlock) for an extended period of
time.

Aside from compression algorithms, this also restricts what zram can
do. One particular example is zram_write_page() zsmalloc handle
allocation, which has an optimistic allocation (disallowing direct
reclaim) and a pessimistic fallback path, which then forces zram to
compress the page one more time.

This series changes zram to not directly impose atomicity restrictions
on compression algorithms (and on itself), which makes zram write()
fully preemptible; zram read(), sadly, is not always preemptible yet.
There are still indirect atomicity restrictions imposed by zsmalloc.
One notable example is the object mapping API, which returns with:
a) local CPU lock held
b) zspage rwlock held

First, zsmalloc is converted to use a sleepable RW-"lock" (it's
atomic_t in fact) for zspage migration protection.

Second, a new handle mapping is introduced which doesn't use per-CPU
buffers (and hence no local CPU lock), does fewer memcpy() calls, but
requires users to provide a pointer to a temp buffer for object copy-in
(when needed).

Third, zram is converted to the new zsmalloc mapping API and thus zram
read() becomes preemptible.
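
To illustrate the idea of a reader/writer "lock" that is really just an
atomic counter, here is a minimal userspace sketch in C11. It is not the
code from this series: the sketch_* names, the state encoding (0 for
free, a positive reader count, -1 for a writer) and the trylock-only
interface are all assumptions made for illustration. The point it tries
to show is that taking such a lock does not disable preemption, so a
holder is free to sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical encoding, chosen for this sketch only. */
#define SKETCH_UNLOCKED   0
#define SKETCH_WRLOCKED (-1)

struct sketch_rwlock {
	atomic_int state;	/* 0: free, >0: number of readers, -1: writer */
};

static void sketch_rwlock_init(struct sketch_rwlock *l)
{
	atomic_init(&l->state, SKETCH_UNLOCKED);
}

/* Take a read reference unless a writer currently owns the lock. */
static bool sketch_read_trylock(struct sketch_rwlock *l)
{
	int old = atomic_load_explicit(&l->state, memory_order_relaxed);

	while (old != SKETCH_WRLOCKED) {
		/* On failure 'old' is reloaded and the writer check repeats. */
		if (atomic_compare_exchange_weak_explicit(&l->state, &old,
							  old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

static void sketch_read_unlock(struct sketch_rwlock *l)
{
	atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
}

/* A writer succeeds only when there are no readers and no writer. */
static bool sketch_write_trylock(struct sketch_rwlock *l)
{
	int expected = SKETCH_UNLOCKED;

	return atomic_compare_exchange_strong_explicit(&l->state, &expected,
						       SKETCH_WRLOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void sketch_write_unlock(struct sketch_rwlock *l)
{
	atomic_store_explicit(&l->state, SKETCH_UNLOCKED,
			      memory_order_release);
}
```

In this sketch a reader that fails sketch_read_trylock() would have to
retry or back off, and a writer (think migration) simply gives up when
readers are present. How the actual patches handle contention may well
differ.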

[1] https://lore.kernel.org/linux-mm/20250130111105.2861324-1-senozhatsky@chromium.org
[2] https://lore.kernel.org/linux-mm/20250130044455.2642465-1-senozhatsky@chromium.org

v4:
-- merged the series
-- renamed zs_pool migrate_lock (Yosry)
-- dropped zs_pool members re-shuffle patch (Yosry hated it with passion)
-- some minor cleanups

Sergey Senozhatsky (17):
  zram: switch to non-atomic entry locking
  zram: do not use per-CPU compression streams
  zram: remove crypto include
  zram: remove max_comp_streams device attr
  zram: remove two-staged handle allocation
  zram: permit reclaim in zstd custom allocator
  zram: permit reclaim in recompression handle allocation
  zram: remove writestall zram_stats member
  zram: limit max recompress prio to num_active_comps
  zram: filter out recomp targets based on priority
  zram: unlock slot during recompression
  zsmalloc: factor out pool locking helpers
  zsmalloc: factor out size-class locking helpers
  zsmalloc: make zspage lock preemptible
  zsmalloc: introduce new object mapping API
  zram: switch to new zsmalloc object mapping API
  zram: add might_sleep to zcomp API

 Documentation/ABI/testing/sysfs-block-zram  |   8 -
 Documentation/admin-guide/blockdev/zram.rst |  36 +-
 drivers/block/zram/backend_zstd.c           |  11 +-
 drivers/block/zram/zcomp.c                  | 169 +++++----
 drivers/block/zram/zcomp.h                  |  19 +-
 drivers/block/zram/zram_drv.c               | 357 +++++++++----------
 drivers/block/zram/zram_drv.h               |   8 +-
 include/linux/cpuhotplug.h                  |   1 -
 include/linux/zsmalloc.h                    |   8 +
 mm/zsmalloc.c                               | 372 +++++++++++++++-----
 10 files changed, 577 insertions(+), 412 deletions(-)