From patchwork Tue Oct 22 19:24:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13846081
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, Barry Song, Ryan Roberts, Hugh Dickins,
 Yosry Ahmed, "Huang, Ying", Tim Chen, Nhat Pham,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH 07/13] mm, swap: hold a reference of si during scan and clean up flags
Date: Wed, 23 Oct 2024 03:24:45 +0800
Message-ID: <20241022192451.38138-8-ryncsn@gmail.com>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241022192451.38138-1-ryncsn@gmail.com>
References: <20241022192451.38138-1-ryncsn@gmail.com>
Reply-To: Kairui Song

From: Kairui Song

The flag SWP_SCANNING was used as an indicator of whether a device is
being scanned, in order to prevent swapoff. But it is no longer used.
The only thing that protects the scanning now is the si lock.
However, the allocation path may drop the si lock, which in theory could
lead to UAF.

So clean this up: just hold a reference for the whole allocation path, so
that per-CPU counter killing will wait for existing scans and other
usage. The flag SWP_SCANNING can also be dropped.

Signed-off-by: Kairui Song
---
 include/linux/swap.h |  1 -
 mm/swapfile.c        | 62 +++++++++++++++++++++++---------------------
 2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 16dcf8bd1a4e..1651174959c8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -219,7 +219,6 @@ enum {
 	SWP_STABLE_WRITES = (1 << 11),	/* no overwrite PG_writeback pages */
 	SWP_SYNCHRONOUS_IO = (1 << 12),	/* synchronous IO is efficient */
 					/* add others here before... */
-	SWP_SCANNING	= (1 << 14),	/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4e629536a07c..d6b6e71ccc19 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1088,6 +1088,21 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 	return cluster_alloc_swap(si, usage, nr, slots, order);
 }
 
+static bool get_swap_device_info(struct swap_info_struct *si)
+{
+	if (!percpu_ref_tryget_live(&si->users))
+		return false;
+	/*
+	 * Guarantee the si->users are checked before accessing other
+	 * fields of swap_info_struct.
+	 *
+	 * Paired with the spin_unlock() after setup_swap_info() in
+	 * enable_swap_info().
+	 */
+	smp_rmb();
+	return true;
+}
+
 int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_order)
 {
 	int order = swap_entry_order(entry_order);
@@ -1115,13 +1130,16 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_order)
 		/* requeue si to after same-priority siblings */
 		plist_requeue(&si->avail_lists[node], &swap_avail_heads[node]);
 		spin_unlock(&swap_avail_lock);
-		spin_lock(&si->lock);
-		n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
-					    n_goal, swp_entries, order);
-		spin_unlock(&si->lock);
-		if (n_ret || size > 1)
-			goto check_out;
-		cond_resched();
+		if (get_swap_device_info(si)) {
+			spin_lock(&si->lock);
+			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
+						    n_goal, swp_entries, order);
+			spin_unlock(&si->lock);
+			put_swap_device(si);
+			if (n_ret || size > 1)
+				goto check_out;
+			cond_resched();
+		}
 
 		spin_lock(&swap_avail_lock);
 		/*
@@ -1272,16 +1290,8 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	si = swp_swap_info(entry);
 	if (!si)
 		goto bad_nofile;
-	if (!percpu_ref_tryget_live(&si->users))
+	if (!get_swap_device_info(si))
 		goto out;
-	/*
-	 * Guarantee the si->users are checked before accessing other
-	 * fields of swap_info_struct.
-	 *
-	 * Paired with the spin_unlock() after setup_swap_info() in
-	 * enable_swap_info().
-	 */
-	smp_rmb();
 	offset = swp_offset(entry);
 	if (offset >= si->max)
 		goto put_out;
@@ -1761,10 +1771,13 @@ swp_entry_t get_swap_page_of_type(int type)
 		goto fail;
 
 	/* This is called for allocating swap entry, not cache */
-	spin_lock(&si->lock);
-	if ((si->flags & SWP_WRITEOK) && scan_swap_map_slots(si, 1, 1, &entry, 0))
-		atomic_long_dec(&nr_swap_pages);
-	spin_unlock(&si->lock);
+	if (get_swap_device_info(si)) {
+		spin_lock(&si->lock);
+		if ((si->flags & SWP_WRITEOK) && scan_swap_map_slots(si, 1, 1, &entry, 0))
+			atomic_long_dec(&nr_swap_pages);
+		spin_unlock(&si->lock);
+		put_swap_device(si);
+	}
 fail:
 	return entry;
 }
@@ -2650,15 +2663,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	spin_lock(&p->lock);
 	drain_mmlist();
 
-	/* wait for anyone still in scan_swap_map_slots */
-	while (p->flags >= SWP_SCANNING) {
-		spin_unlock(&p->lock);
-		spin_unlock(&swap_lock);
-		schedule_timeout_uninterruptible(1);
-		spin_lock(&swap_lock);
-		spin_lock(&p->lock);
-	}
-
 	swap_file = p->swap_file;
 	p->swap_file = NULL;
 	p->max = 0;
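
Note for readers following the lifetime argument: the pattern the patch
relies on (take a "live" reference before scanning, drop it afterwards,
and let teardown wait for the count to drain) can be modelled in
userspace roughly as below. This is only a sketch under stated
assumptions. Every toy_* name is hypothetical, and plain C11 atomics
stand in for the kernel's percpu_ref, so it does not reflect how
percpu_ref is actually implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the percpu_ref embedded in swap_info_struct. */
struct toy_ref {
	atomic_long count;	/* references currently held */
	atomic_bool dead;	/* set once teardown (think: swapoff) has begun */
};

/* Hypothetical stand-in for a swap device. */
struct toy_swap_dev {
	struct toy_ref users;
	int free_slots;
};

void toy_dev_init(struct toy_swap_dev *dev, int slots)
{
	atomic_init(&dev->users.count, 0);
	atomic_init(&dev->users.dead, false);
	dev->free_slots = slots;
}

/* Succeeds only while the device has not been killed. */
bool toy_tryget_live(struct toy_ref *ref)
{
	if (atomic_load(&ref->dead))
		return false;
	atomic_fetch_add(&ref->count, 1);
	/* Teardown may have started between the check and the increment. */
	if (atomic_load(&ref->dead)) {
		atomic_fetch_sub(&ref->count, 1);
		return false;
	}
	return true;
}

void toy_put(struct toy_ref *ref)
{
	long prev = atomic_fetch_sub(&ref->count, 1);

	assert(prev > 0);	/* a put must always pair with a successful get */
	(void)prev;
}

/* Teardown side: refuse new users, then wait until toy_drained() is true. */
void toy_kill(struct toy_ref *ref)
{
	atomic_store(&ref->dead, true);
}

bool toy_drained(struct toy_ref *ref)
{
	return atomic_load(&ref->count) == 0;
}

/*
 * Allocation path: the reference is held across the whole scan, so the
 * device cannot be torn down even if a lock were dropped in the middle.
 * Returns the number of slots allocated (0 or 1).
 */
int toy_alloc_slot(struct toy_swap_dev *dev)
{
	int got = 0;

	if (!toy_tryget_live(&dev->users))
		return 0;
	if (dev->free_slots > 0) {
		dev->free_slots--;
		got = 1;
	}
	toy_put(&dev->users);
	return got;
}
```

In this model, a caller that already holds a reference when toy_kill()
runs keeps toy_drained() false until it calls toy_put(), which mirrors
the commit message's point that counter killing waits for existing
scans. Allocation attempts made after the kill simply fail.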