From patchwork Tue Jan 7 22:22:34 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13929631
Date: Tue, 7 Jan 2025 22:22:34 +0000
Message-ID: <20250107222236.2715883-1-yosryahmed@google.com>
Subject: [PATCH v2 1/2] Revert "mm: zswap: fix race between [de]compression and CPU hotunplug"
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, Vitaly Wool, Barry Song, Sam Sun, Kanchana P Sridhar, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed, syzbot

This reverts commit eaebeb93922ca6ab0dd92027b73d0112701706ef.
Commit eaebeb93922c ("mm: zswap: fix race between [de]compression and
CPU hotunplug") used the CPU hotplug lock in zswap compress/decompress
operations to protect against a race with CPU hotunplug making some
per-CPU resources go away.

However, zswap compress/decompress can be reached through reclaim while
the lock is held, resulting in a potential deadlock as reported by
syzbot:

======================================================
WARNING: possible circular locking dependency detected
6.13.0-rc6-syzkaller-00006-g5428dc1906dd #0 Not tainted
------------------------------------------------------
kswapd0/89 is trying to acquire lock:
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: acomp_ctx_get_cpu mm/zswap.c:886 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_compress mm/zswap.c:908 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_store_page mm/zswap.c:1439 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_store+0xa74/0x1ba0 mm/zswap.c:1546

but task is already holding lock:
 ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6871 [inline]
 ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xb58/0x2f30 mm/vmscan.c:7253

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
       __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
       fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3867
       might_alloc include/linux/sched/mm.h:318 [inline]
       slab_pre_alloc_hook mm/slub.c:4070 [inline]
       slab_alloc_node mm/slub.c:4148 [inline]
       __kmalloc_cache_node_noprof+0x40/0x3a0 mm/slub.c:4337
       kmalloc_node_noprof include/linux/slab.h:924 [inline]
       alloc_worker kernel/workqueue.c:2638 [inline]
       create_worker+0x11b/0x720 kernel/workqueue.c:2781
       workqueue_prepare_cpu+0xe3/0x170 kernel/workqueue.c:6628
       cpuhp_invoke_callback+0x48d/0x830 kernel/cpu.c:194
       __cpuhp_invoke_callback_range kernel/cpu.c:965 [inline]
       cpuhp_invoke_callback_range kernel/cpu.c:989 [inline]
       cpuhp_up_callbacks kernel/cpu.c:1020 [inline]
       _cpu_up+0x2b3/0x580 kernel/cpu.c:1690
       cpu_up+0x184/0x230 kernel/cpu.c:1722
       cpuhp_bringup_mask+0xdf/0x260 kernel/cpu.c:1788
       cpuhp_bringup_cpus_parallel+0xf9/0x160 kernel/cpu.c:1878
       bringup_nonboot_cpus+0x2b/0x50 kernel/cpu.c:1892
       smp_init+0x34/0x150 kernel/smp.c:1009
       kernel_init_freeable+0x417/0x5d0 init/main.c:1569
       kernel_init+0x1d/0x2b0 init/main.c:1466
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3161 [inline]
       check_prevs_add kernel/locking/lockdep.c:3280 [inline]
       validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
       __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
       percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
       cpus_read_lock+0x42/0x150 kernel/cpu.c:490
       acomp_ctx_get_cpu mm/zswap.c:886 [inline]
       zswap_compress mm/zswap.c:908 [inline]
       zswap_store_page mm/zswap.c:1439 [inline]
       zswap_store+0xa74/0x1ba0 mm/zswap.c:1546
       swap_writepage+0x647/0xce0 mm/page_io.c:279
       shmem_writepage+0x1248/0x1610 mm/shmem.c:1579
       pageout mm/vmscan.c:696 [inline]
       shrink_folio_list+0x35ee/0x57e0 mm/vmscan.c:1374
       shrink_inactive_list mm/vmscan.c:1967 [inline]
       shrink_list mm/vmscan.c:2205 [inline]
       shrink_lruvec+0x16db/0x2f30 mm/vmscan.c:5734
       mem_cgroup_shrink_node+0x385/0x8e0 mm/vmscan.c:6575
       mem_cgroup_soft_reclaim mm/memcontrol-v1.c:312 [inline]
       memcg1_soft_limit_reclaim+0x346/0x810 mm/memcontrol-v1.c:362
       balance_pgdat mm/vmscan.c:6975 [inline]
       kswapd+0x17b3/0x2f30 mm/vmscan.c:7253
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(cpu_hotplug_lock);
                               lock(fs_reclaim);
  rlock(cpu_hotplug_lock);

 *** DEADLOCK ***

1 lock held by kswapd0/89:
 #0: ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6871 [inline]
 #0: ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xb58/0x2f30 mm/vmscan.c:7253

stack backtrace:
CPU: 0 UID: 0 PID: 89 Comm: kswapd0 Not tainted 6.13.0-rc6-syzkaller-00006-g5428dc1906dd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
 check_prev_add kernel/locking/lockdep.c:3161 [inline]
 check_prevs_add kernel/locking/lockdep.c:3280 [inline]
 validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
 __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
 percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
 cpus_read_lock+0x42/0x150 kernel/cpu.c:490
 acomp_ctx_get_cpu mm/zswap.c:886 [inline]
 zswap_compress mm/zswap.c:908 [inline]
 zswap_store_page mm/zswap.c:1439 [inline]
 zswap_store+0xa74/0x1ba0 mm/zswap.c:1546
 swap_writepage+0x647/0xce0 mm/page_io.c:279
 shmem_writepage+0x1248/0x1610 mm/shmem.c:1579
 pageout mm/vmscan.c:696 [inline]
 shrink_folio_list+0x35ee/0x57e0 mm/vmscan.c:1374
 shrink_inactive_list mm/vmscan.c:1967 [inline]
 shrink_list mm/vmscan.c:2205 [inline]
 shrink_lruvec+0x16db/0x2f30 mm/vmscan.c:5734
 mem_cgroup_shrink_node+0x385/0x8e0 mm/vmscan.c:6575
 mem_cgroup_soft_reclaim mm/memcontrol-v1.c:312 [inline]
 memcg1_soft_limit_reclaim+0x346/0x810 mm/memcontrol-v1.c:362
 balance_pgdat mm/vmscan.c:6975 [inline]
 kswapd+0x17b3/0x2f30 mm/vmscan.c:7253
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Revert the change. A different fix for the race with CPU hotunplug will
follow.

Reported-by: syzbot
Signed-off-by: Yosry Ahmed
---

The patches apply on top of mm-hotfixes-unstable and are meant for
v6.13.

Andrew, I am not sure what the best way to handle this is. This fix is
already merged into Linus's tree and had CC:stable, so I thought it best
to revert it and replace it with a separate fix that would be easy to
backport instead of the revert patch, especially since the new fix is
functionally different anyway.

v1 -> v2:
- Disable migration as an alternative fix instead of SRCU, and explain
  why SRCU and cpus_read_lock() cannot be used in the commit log of
  patch 2.
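To see the inversion in isolation: the report above records cpu_hotplug_lock -> fs_reclaim (CPU bringup allocating a worker) and then fs_reclaim -> cpu_hotplug_lock (kswapd reaching zswap_compress()). What follows is a userspace sketch only. It is not kernel code and not how lockdep is implemented; it checks direct two-lock inversions and nothing more. The names lock_id, order_seen, note_acquire() and replay_report() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the two lock classes named in the report. */
enum lock_id { FS_RECLAIM, CPU_HOTPLUG_LOCK, NR_LOCKS };

static const char *const lock_name[NR_LOCKS] = {
	"fs_reclaim", "cpu_hotplug_lock",
};

/* order_seen[a][b] is true once b has been acquired while a was held. */
static bool order_seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that 'next' is acquired while 'held' is held. Returns false if
 * the opposite order was recorded earlier (a direct ABBA inversion).
 */
static bool note_acquire(enum lock_id held, enum lock_id next)
{
	assert(held != next && held < NR_LOCKS && next < NR_LOCKS);

	if (order_seen[next][held]) {
		printf("possible circular locking dependency: %s -> %s\n",
		       lock_name[held], lock_name[next]);
		return false;
	}
	order_seen[held][next] = true;
	return true;
}

/* Replay the two chains from the report; returns the inversions found. */
static int replay_report(void)
{
	int inversions = 0;

	memset(order_seen, 0, sizeof(order_seen));

	/* Chain #1: CPU bringup holds cpu_hotplug_lock, then allocates. */
	if (!note_acquire(CPU_HOTPLUG_LOCK, FS_RECLAIM))
		inversions++;

	/* Chain #0: kswapd holds fs_reclaim, then zswap took the hotplug lock. */
	if (!note_acquire(FS_RECLAIM, CPU_HOTPLUG_LOCK))
		inversions++;

	return inversions;
}
```

Calling replay_report() from a small main() should report a single inversion, raised on the second acquisition, which corresponds to chain #0 in the report.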
---
 mm/zswap.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 5a27af8d86ea9..f6316b66fb236 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -880,18 +880,6 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
-/* Prevent CPU hotplug from freeing up the per-CPU acomp_ctx resources */
-static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct crypto_acomp_ctx __percpu *acomp_ctx)
-{
-	cpus_read_lock();
-	return raw_cpu_ptr(acomp_ctx);
-}
-
-static void acomp_ctx_put_cpu(void)
-{
-	cpus_read_unlock();
-}
-
 static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 			   struct zswap_pool *pool)
 {
@@ -905,7 +893,8 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	gfp_t gfp;
 	u8 *dst;
 
-	acomp_ctx = acomp_ctx_get_cpu(pool->acomp_ctx);
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+
 	mutex_lock(&acomp_ctx->mutex);
 
 	dst = acomp_ctx->buffer;
@@ -961,7 +950,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 		zswap_reject_alloc_fail++;
 
 	mutex_unlock(&acomp_ctx->mutex);
-	acomp_ctx_put_cpu();
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
@@ -972,7 +960,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
 
-	acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
+	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
@@ -1002,7 +990,6 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 
 	if (src != acomp_ctx->buffer)
 		zpool_unmap_handle(zpool, entry->handle);
-	acomp_ctx_put_cpu();
 }
 
 /*********************************

From patchwork Tue Jan 7 22:22:35 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13929632
Date: Tue, 7 Jan 2025 22:22:35 +0000
In-Reply-To: <20250107222236.2715883-1-yosryahmed@google.com>
References: <20250107222236.2715883-1-yosryahmed@google.com>
Message-ID: <20250107222236.2715883-2-yosryahmed@google.com>
Subject: [PATCH v2 2/2] mm: zswap: disable migration while using per-CPU acomp_ctx
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, Vitaly Wool, Barry Song, Sam Sun, Kanchana P Sridhar, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed, stable@vger.kernel.org

In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of the
current CPU at the beginning of the operation is retrieved and used
throughout. However, since neither preemption nor migration is disabled,
it is possible that the operation continues on a different CPU.

If the original CPU is hotunplugged while the acomp_ctx is still in use,
we run into a UAF bug as the resources attached to the acomp_ctx are
freed during hotunplug in zswap_cpu_comp_dead().

The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move to
use crypto_acomp API for hardware acceleration") when the switch to the
crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was
retrieved using get_cpu_ptr() which disables preemption and makes sure
the CPU cannot go away from under us. Preemption cannot be disabled with
the crypto_acomp API as a sleepable context is needed.

Commit 8ba2f844f050 ("mm/zswap: change per-cpu mutex and buffer to
per-acomp_ctx") increased the UAF surface area by making the per-CPU
buffers dynamic, adding yet another resource that can be freed from
under zswap compression/decompression by CPU hotunplug.

This cannot be fixed by holding cpus_read_lock(), as it is possible for
code already holding the lock to fall into reclaim and enter zswap
(causing a deadlock). It also cannot be fixed by wrapping the usage of
acomp_ctx in an SRCU critical section and using synchronize_srcu() in
zswap_cpu_comp_dead(), because synchronize_srcu() is not allowed in
CPU-hotplug notifiers (see
Documentation/RCU/Design/Requirements/Requirements.rst).

This can be fixed by refcounting the acomp_ctx, but it involves
complexity in handling the race between the refcount dropping to zero in
zswap_[de]compress() and the refcount being re-initialized when the CPU
is onlined.

Keep things simple for now and just disable migration while using the
per-CPU acomp_ctx to block CPU hotunplug until the usage is over.
Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for hardware acceleration")
Cc:
Signed-off-by: Yosry Ahmed
Reported-by: Johannes Weiner
Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/
Reported-by: Sam Sun
Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4OcuruL4tPg6OaQ@mail.gmail.com/
---
 mm/zswap.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f6316b66fb236..ecd86153e8a32 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -880,6 +880,18 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
+/* Remain on the CPU while using its acomp_ctx to stop it from going offline */
+static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct crypto_acomp_ctx __percpu *acomp_ctx)
+{
+	migrate_disable();
+	return raw_cpu_ptr(acomp_ctx);
+}
+
+static void acomp_ctx_put_cpu(void)
+{
+	migrate_enable();
+}
+
 static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 			   struct zswap_pool *pool)
 {
@@ -893,8 +905,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	gfp_t gfp;
 	u8 *dst;
 
-	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
-
+	acomp_ctx = acomp_ctx_get_cpu(pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	dst = acomp_ctx->buffer;
@@ -950,6 +961,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 		zswap_reject_alloc_fail++;
 
 	mutex_unlock(&acomp_ctx->mutex);
+	acomp_ctx_put_cpu();
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
@@ -960,7 +972,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
 
-	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
+	acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
@@ -990,6 +1002,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 
 	if (src != acomp_ctx->buffer)
 		zpool_unmap_handle(zpool, entry->handle);
+	acomp_ctx_put_cpu();
 }
 
 /*********************************