From patchwork Fri Oct 18 10:48:41 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13841575
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, david@redhat.com, willy@infradead.org,
	kanchana.p.sridhar@intel.com, yosryahmed@google.com, nphamcs@gmail.com,
	chengming.zhou@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com,
	21cnbao@gmail.com, riel@surriel.com, shakeel.butt@linux.dev,
	kernel-team@meta.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org
Subject: [RFC 3/4] mm/zswap: add support for large folio zswapin
Date: Fri, 18 Oct 2024 11:48:41 +0100
Message-ID: <20241018105026.2521366-4-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20241018105026.2521366-1-usamaarif642@gmail.com>
References: <20241018105026.2521366-1-usamaarif642@gmail.com>
MIME-Version: 1.0
At the time of folio allocation, alloc_swap_folio checks whether the
entire folio is in zswap to determine the folio order. During
swap_read_folio, zswap_load checks whether the entire folio is in
zswap and, if it is, iterates through the pages in the folio and
decompresses them.
This means the benefits of large folios (fewer page faults, batched PTE
and rmap manipulation, reduced LRU list, TLB coalescing (for arm64 and
amd)) are not lost at swapin when using zswap.

This patch does not add support for hybrid backends (i.e. folios partly
present in swap and partly in zswap).

Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 mm/memory.c | 13 +++-------
 mm/zswap.c  | 68 ++++++++++++++++++++++++-----------------------------
 2 files changed, 34 insertions(+), 47 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 49d243131169..75f7b9f5fb32 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4077,13 +4077,14 @@ static bool can_swapin_thp(struct vm_fault *vmf, pte_t *ptep, int nr_pages)
 
 	/*
 	 * swap_read_folio() can't handle the case a large folio is hybridly
-	 * from different backends. And they are likely corner cases. Similar
-	 * things might be added once zswap support large folios.
+	 * from different backends. And they are likely corner cases.
 	 */
 	if (unlikely(swap_zeromap_batch(entry, nr_pages, NULL) != nr_pages))
 		return false;
 	if (unlikely(non_swapcache_batch(entry, nr_pages) != nr_pages))
 		return false;
+	if (unlikely(!zswap_present_test(entry, nr_pages)))
+		return false;
 
 	return true;
 }
@@ -4130,14 +4131,6 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
 	if (unlikely(userfaultfd_armed(vma)))
 		goto fallback;
 
-	/*
-	 * A large swapped out folio could be partially or fully in zswap. We
-	 * lack handling for such cases, so fallback to swapping in order-0
-	 * folio.
-	 */
-	if (!zswap_never_enabled())
-		goto fallback;
-
 	entry = pte_to_swp_entry(vmf->orig_pte);
 	/*
 	 * Get a list of all the (large) orders below PMD_ORDER that are enabled
diff --git a/mm/zswap.c b/mm/zswap.c
index 9cc91ae31116..a5aa86c24060 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1624,59 +1624,53 @@ bool zswap_present_test(swp_entry_t swp, int nr_pages)
 
 bool zswap_load(struct folio *folio)
 {
+	int nr_pages = folio_nr_pages(folio);
 	swp_entry_t swp = folio->swap;
+	unsigned int type = swp_type(swp);
 	pgoff_t offset = swp_offset(swp);
 	bool swapcache = folio_test_swapcache(folio);
-	struct xarray *tree = swap_zswap_tree(swp);
+	struct xarray *tree;
 	struct zswap_entry *entry;
+	int i;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 
 	if (zswap_never_enabled())
 		return false;
 
-	/*
-	 * Large folios should not be swapped in while zswap is being used, as
-	 * they are not properly handled. Zswap does not properly load large
-	 * folios, and a large folio may only be partially in zswap.
-	 *
-	 * Return true without marking the folio uptodate so that an IO error is
-	 * emitted (e.g. do_swap_page() will sigbus).
-	 */
-	if (WARN_ON_ONCE(folio_test_large(folio)))
-		return true;
-
-	/*
-	 * When reading into the swapcache, invalidate our entry. The
-	 * swapcache can be the authoritative owner of the page and
-	 * its mappings, and the pressure that results from having two
-	 * in-memory copies outweighs any benefits of caching the
-	 * compression work.
-	 *
-	 * (Most swapins go through the swapcache. The notable
-	 * exception is the singleton fault on SWP_SYNCHRONOUS_IO
-	 * files, which reads into a private page and may free it if
-	 * the fault fails. We remain the primary owner of the entry.)
-	 */
-	if (swapcache)
-		entry = xa_erase(tree, offset);
-	else
-		entry = xa_load(tree, offset);
-
-	if (!entry)
+	if (!zswap_present_test(folio->swap, nr_pages))
 		return false;
 
-	zswap_decompress(entry, &folio->page);
+	for (i = 0; i < nr_pages; ++i) {
+		tree = swap_zswap_tree(swp_entry(type, offset + i));
+		/*
+		 * When reading into the swapcache, invalidate our entry. The
+		 * swapcache can be the authoritative owner of the page and
+		 * its mappings, and the pressure that results from having two
+		 * in-memory copies outweighs any benefits of caching the
+		 * compression work.
+		 *
+		 * (Swapins with swap count > 1 go through the swapcache.
+		 * For swap count == 1, the swapcache is skipped and we
+		 * remain the primary owner of the entry.)
+		 */
+		if (swapcache)
+			entry = xa_erase(tree, offset + i);
+		else
+			entry = xa_load(tree, offset + i);
 
-	count_vm_event(ZSWPIN);
-	if (entry->objcg)
-		count_objcg_events(entry->objcg, ZSWPIN, 1);
+		zswap_decompress(entry, folio_page(folio, i));
 
-	if (swapcache) {
-		zswap_entry_free(entry);
-		folio_mark_dirty(folio);
+		if (entry->objcg)
+			count_objcg_events(entry->objcg, ZSWPIN, 1);
+		if (swapcache)
+			zswap_entry_free(entry);
 	}
+	count_vm_events(ZSWPIN, nr_pages);
+	if (swapcache)
+		folio_mark_dirty(folio);
+
 	folio_mark_uptodate(folio);
 
 	return true;
 }