From patchwork Tue Mar 26 18:50:27 2024
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13604898
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: "Huang, Ying", Chris Li, Minchan Kim, Barry Song, Ryan Roberts,
 Yu Zhao, SeongJae Park, David Hildenbrand, Yosry Ahmed,
 Johannes Weiner, Matthew Wilcox, Nhat Pham, Chengming Zhou,
 Andrew Morton, linux-kernel@vger.kernel.org, Kairui Song
Subject: [RFC PATCH 05/10] mm/swap: clean shadow only in unmap path
Date: Wed, 27 Mar 2024 02:50:27 +0800
Message-ID: <20240326185032.72159-6-ryncsn@gmail.com>
In-Reply-To: <20240326185032.72159-1-ryncsn@gmail.com>
References: <20240326185032.72159-1-ryncsn@gmail.com>
MIME-Version: 1.0
From: Kairui Song

After removing the cache bypass swapin, the first thing that can go
is all the clear_shadow_from_swap_cache() calls on the swapin side.
clear_shadow_from_swap_cache() is currently called from many paths.
It is invoked by swap_range_free(), which has two direct callers:

- swap_free_cluster, only called by put_swap_folio, to free the
  shadows of a slot cluster.
- swap_entry_free, only called by swapcache_free_entries, to free
  the shadow of a single slot.

And these two are used widely throughout the swap code. Note that
the shadow is only written by __delete_from_swap_cache() after a
successful swapout, so we only want to clear a shadow after swapin
(when the shadow has been consumed and is no longer needed) or on
unmap/MADV_FREE.

Now that all swapin goes through the cached swapin path,
clear_shadow_from_swap_cache() is no longer needed on the swapin
side: we have to insert the folio into the swap cache first, and
that insertion already removes the shadow. So we only need to clear
the shadow for unmap/MADV_FREE.

All direct/indirect callers of swap_free_cluster and
swap_entry_free are listed below:

- swap_free_cluster
  -> put_swap_folio (clears the cache flag and tries to delete the
     shadow, after removing the cache or on error handling)
     -> delete_from_swap_cache
     -> __remove_mapping
     -> shmem_writepage
     -> folio_alloc_swap
     -> add_to_swap
     -> __read_swap_cache_async
- swap_entry_free
  -> swapcache_free_entries
     -> drain_slots_cache_cpu
     -> free_swap_slot
        -> put_swap_folio (already covered above)
        -> __swap_entry_free / swap_free
           -> free_swap_and_cache (called by unmap/zap/MADV_FREE)
              -> madvise_free_single_vma
              -> unmap_page_range
              -> shmem_undo_range
           -> swap_free (called by the swapin path)
              -> do_swap_page (swapin path)
              -> alloc_swapdev_block/free_all_swap_pages ()
              -> try_to_unmap_one (error handling, no shadow)
              -> shmem_set_folio_swapin_error (shadow already gone)
              -> shmem_swapin_folio (shmem's do_swap_page)
              -> unuse_pte (swapoff, which always uses the swap cache)

So now we only need to call clear_shadow_from_swap_cache() in
free_swap_and_cache(), because all swapin/swapout now goes through
the swap cache.
Previously, all the functions above could invoke
clear_shadow_from_swap_cache() in case a cache bypass swapin had
left an entry with an uncleared shadow.

Also make clear_shadow_from_swap_cache() clear only one entry, for
simplicity.

Test result of sequential swapin/out:

               Before (us)  After (us)
Swapout:       33624641     33648529
Swapin:        41614858     40667696 (+2.3%)
Swapout (THP):  7795530      7658664
Swapin (THP):  41708471     40602278 (+2.7%)

Signed-off-by: Kairui Song
---
 mm/swap.h       |  6 ++----
 mm/swap_state.c | 33 ++++++++-------------------------
 mm/swapfile.c   |  6 ++++--
 3 files changed, 14 insertions(+), 31 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index ac9573b03432..7721ddb3bdbc 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -39,8 +39,7 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
 void __delete_from_swap_cache(struct folio *folio,
			      swp_entry_t entry, void *shadow);
 void delete_from_swap_cache(struct folio *folio);
-void clear_shadow_from_swap_cache(int type, unsigned long begin,
-				  unsigned long end);
+void clear_shadow_from_swap_cache(swp_entry_t entry);
 struct folio *swap_cache_get_folio(swp_entry_t entry,
		struct vm_area_struct *vma, unsigned long addr);
 struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -148,8 +147,7 @@ static inline void delete_from_swap_cache(struct folio *folio)
 {
 }
 
-static inline void clear_shadow_from_swap_cache(int type, unsigned long begin,
-						unsigned long end)
+static inline void clear_shadow_from_swap_cache(swp_entry_t entry)
 {
 }
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 49ef6250f676..b84e7b0ea4a5 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -245,34 +245,17 @@ void delete_from_swap_cache(struct folio *folio)
	folio_ref_sub(folio, folio_nr_pages(folio));
 }
 
-void clear_shadow_from_swap_cache(int type, unsigned long begin,
-				  unsigned long end)
+void clear_shadow_from_swap_cache(swp_entry_t entry)
 {
-	unsigned long curr = begin;
-	void *old;
-
-	for (;;) {
-		swp_entry_t entry = swp_entry(type, curr);
-		struct address_space *address_space = swap_address_space(entry);
-		XA_STATE(xas, &address_space->i_pages, curr);
-
-		xas_set_update(&xas, workingset_update_node);
+	struct address_space *address_space = swap_address_space(entry);
+	XA_STATE(xas, &address_space->i_pages, swp_offset(entry));
 
-		xa_lock_irq(&address_space->i_pages);
-		xas_for_each(&xas, old, end) {
-			if (!xa_is_value(old))
-				continue;
-			xas_store(&xas, NULL);
-		}
-		xa_unlock_irq(&address_space->i_pages);
+	xas_set_update(&xas, workingset_update_node);
 
-		/* search the next swapcache until we meet end */
-		curr >>= SWAP_ADDRESS_SPACE_SHIFT;
-		curr++;
-		curr <<= SWAP_ADDRESS_SPACE_SHIFT;
-		if (curr > end)
-			break;
-	}
+	xa_lock_irq(&address_space->i_pages);
+	if (xa_is_value(xas_load(&xas)))
+		xas_store(&xas, NULL);
+	xa_unlock_irq(&address_space->i_pages);
 }
 
 /*
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ae8d3aa05df7..bafae23c0f26 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -724,7 +724,6 @@ static void add_to_avail_list(struct swap_info_struct *p)
 static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
			    unsigned int nr_entries)
 {
-	unsigned long begin = offset;
	unsigned long end = offset + nr_entries - 1;
	void (*swap_slot_free_notify)(struct block_device *, unsigned long);
 
@@ -748,7 +747,6 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
		swap_slot_free_notify(si->bdev, offset);
		offset++;
	}
-	clear_shadow_from_swap_cache(si->type, begin, end);
 
	/*
	 * Make sure that try_to_unuse() observes si->inuse_pages reaching 0
@@ -1605,6 +1603,8 @@ bool folio_free_swap(struct folio *folio)
 /*
  * Free the swap entry like above, but also try to
  * free the page cache entry if it is the last user.
+ * Useful when clearing the swap map and swap cache
+ * without reading swap content (eg. unmap, MADV_FREE)
  */
 int free_swap_and_cache(swp_entry_t entry)
 {
@@ -1626,6 +1626,8 @@ int free_swap_and_cache(swp_entry_t entry)
		    !swap_page_trans_huge_swapped(p, entry))
			__try_to_reclaim_swap(p, swp_offset(entry),
					      TTRS_UNMAPPED | TTRS_FULL);
+		if (!count)
+			clear_shadow_from_swap_cache(entry);
		put_swap_device(p);
	}
	return p != NULL;