From patchwork Wed Oct 25 14:45:45 2023
X-Patchwork-Submitter: Ryan Roberts <ryan.roberts@arm.com>
X-Patchwork-Id: 13436308
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Huang Ying, Gao Xiang, Yu Zhao, Yang Shi, Michal Hocko, Kefeng Wang
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 3/4] mm: swap: Simplify ssd behavior when scanner steals entry
Date: Wed, 25 Oct 2023 15:45:45 +0100
Message-Id: <20231025144546.577640-4-ryan.roberts@arm.com>
In-Reply-To: <20231025144546.577640-1-ryan.roberts@arm.com>
References: <20231025144546.577640-1-ryan.roberts@arm.com>
MIME-Version: 1.0
When a CPU fails to reserve a cluster (due to free list exhaustion), we revert to the scanner to find a free entry somewhere in the swap file. This might cause an entry to be stolen from another CPU's reserved cluster. Upon noticing this, the CPU with the stolen entry would previously scan forward to the end of the cluster, trying to find a free entry to use. If there were none, it would try to reserve a new per-cpu cluster and allocate from that. This scanning behavior does not scale well to high-order allocations, which will be introduced in a future patch, since the scan would need to find a contiguous area that is also naturally aligned.
Given that stealing is a rare occurrence, let's remove the scanning behavior from the ssd allocator and simply drop the cluster and try to allocate a new one. Since the purpose of the per-cpu cluster is to keep a given task's pages sequential on disk to aid readahead, allocating a new cluster at this point makes the most sense.

Furthermore, si->max will always be greater than or equal to the end of the last cluster, because a partial cluster is never put on the free cluster list. Therefore we can simplify this logic too.

These changes make it simpler to generalize scan_swap_map_try_ssd_cluster() to handle any allocation order.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/swapfile.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 617e34b8cdbe..94f7cc225eb9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -639,27 +639,24 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 
 	/*
 	 * Other CPUs can use our cluster if they can't find a free cluster,
-	 * check if there is still free entry in the cluster
+	 * check if the expected entry is still free. If not, drop it and
+	 * reserve a new cluster.
 	 */
-	max = min_t(unsigned long, si->max,
-		    ALIGN_DOWN(tmp, SWAPFILE_CLUSTER) + SWAPFILE_CLUSTER);
-	if (tmp < max) {
-		ci = lock_cluster(si, tmp);
-		while (tmp < max) {
-			if (!si->swap_map[tmp])
-				break;
-			tmp++;
-		}
+	ci = lock_cluster(si, tmp);
+	if (si->swap_map[tmp]) {
 		unlock_cluster(ci);
-	}
-	if (tmp >= max) {
 		*cpu_next = SWAP_NEXT_NULL;
 		goto new_cluster;
 	}
+	unlock_cluster(ci);
+
 	*offset = tmp;
 	*scan_base = tmp;
+
+	max = ALIGN_DOWN(tmp, SWAPFILE_CLUSTER) + SWAPFILE_CLUSTER;
 	tmp += 1;
 	*cpu_next = tmp < max ? tmp : SWAP_NEXT_NULL;
+
 	return true;
 }