From patchwork Mon Jan 29 17:54:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13536153
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, "Huang, Ying", Hugh Dickins, Johannes Weiner,
 Matthew Wilcox, Michal Hocko, Yosry Ahmed, David Hildenbrand,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v3 0/7] swapin refactor for optimization and unified readahead
Date: Tue, 30 Jan 2024 01:54:15 +0800
Message-ID: <20240129175423.1987-1-ryncsn@gmail.com>
X-Mailer: git-send-email 2.43.0
Reply-To: Kairui Song
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Kairui Song

This series tries to unify and clean up the swapin path,
introduce minor optimizations, and make both shmem and swapoff use the
SWP_SYNCHRONOUS_IO flag to skip readahead and the swap cache for better
performance.

Test results:

- Swap out 10G of zero-filled data to ZRAM, then read it back in:

  Before: 11143285 us
  After:  10692644 us (+4.1%)

- Swapping off a 10G ZRAM (lzo-rle) after the same workload:

  Before:
  time swapoff /dev/zram0
  real    0m12.337s
  user    0m0.001s
  sys     0m12.329s

  After:
  time swapoff /dev/zram0
  real    0m9.728s
  user    0m0.001s
  sys     0m9.719s

- shmem FIO test 1 on a Ryzen 5900HX:

  fio -name=tmpfs --numjobs=16 --directory=/tmpfs --size=960m \
    --ioengine=mmap --rw=randread --random_distribution=zipf:0.5 \
    --time_based --ramp_time=1m --runtime=5m --group_reporting
  (using brd as swap, 2G memcg limit)

  Before:
    bw ( MiB/s): min= 1167, max= 1732, per=100.00%, avg=1460.82, stdev= 4.38, samples=9536
    iops       : min=298938, max=443557, avg=373964.41, stdev=1121.27, samples=9536
  After (+3.5%):
    bw ( MiB/s): min= 1285, max= 1738, per=100.00%, avg=1512.88, stdev= 4.34, samples=9456
    iops       : min=328957, max=445105, avg=387294.21, stdev=1111.15, samples=9456

- shmem FIO test 2 on a Ryzen 5900HX:

  fio -name=tmpfs --numjobs=16 --directory=/tmpfs --size=960m \
    --ioengine=mmap --rw=randread --random_distribution=zipf:1.2 \
    --time_based --ramp_time=1m --runtime=5m --group_reporting
  (using brd as swap, 2G memcg limit)

  Before:
    bw ( MiB/s): min= 5296, max= 7112, per=100.00%, avg=6131.93, stdev=17.09, samples=9536
    iops       : min=1355934, max=1820833, avg=1569769.11, stdev=4375.93, samples=9536
  After (+3.1%):
    bw ( MiB/s): min= 5466, max= 7173, per=100.00%, avg=6324.51, stdev=16.66, samples=9521
    iops       : min=1399355, max=1836435, avg=1619068.90, stdev=4263.94, samples=9521

- Some built objects are very slightly smaller (gcc 13.2.1):

  ./scripts/bloat-o-meter ./vmlinux ./vmlinux.new
  add/remove: 4/2 grow/shrink: 1/10 up/down: 818/-983 (-165)
  Function                                     old     new   delta
  swapin_entry                                   -     482    +482
  mm_counter                                     -     248    +248
  shmem_swapin_folio                          1412    1468     +56
  __pfx_swapin_entry                             -      16     +16
  __pfx_mm_counter                               -      16     +16
  __read_swap_cache_async                      738     736      -2
  copy_present_pte                            1258    1249      -9
  mem_cgroup_swapin_charge_folio               297     285     -12
  __pfx_swapin_readahead                        16       -     -16
  swap_cache_get_folio                         364     345     -19
  do_anonymous_page                           1488    1458     -30
  unuse_pte_range                              889     833     -56
  free_p4d_range                               524     446     -78
  restore_exclusive_pte                        937     822    -115
  do_swap_page                                2969    2817    -152
  swapin_readahead                             239       -    -239
  copy_nonpresent_pte                         1478    1223    -255
  Total: Before=26056243, After=26056078, chg -0.00%

V2: https://lore.kernel.org/linux-mm/20240102175338.62012-1-ryncsn@gmail.com/
Update from V2:
  - Many code path cleanups (merge swapin_entry with swapin_entry_mpol,
    drop the second param of mem_cgroup_swapin_charge_folio, swapin_entry
    takes a pointer to folio as return value instead of a pointer to
    boolean to reduce LOC and logic), thanks to Huang, Ying.
  - Don't use cluster readahead for swapoff; the performance is worse
    than VMA readahead for NVMe.
  - Add a refactor patch for swap_cache_get_folio.

V1: https://lore.kernel.org/linux-mm/20231119194740.94101-1-ryncsn@gmail.com/T/
Update from V1:
  - Rebased on mm-unstable.
  - Remove behaviour-changing patches; they will be submitted in a
    separate series later.
  - Code style, naming and comment updates.
  - Thanks to Chris Li for a very detailed and helpful review of V1.
    Thanks to Matthew Wilcox and Huang Ying for helpful suggestions.
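
For readers new to the series, the idea of a single entry point choosing
the swapin strategy can be pictured with the userspace toy model below.
It is only a sketch and not the kernel code from these patches: every
identifier prefixed with model_ or MODEL_ is hypothetical, the flag value
is a stand-in, and the exact conditions checked in the kernel (and their
order) are an assumption here. It only illustrates the behaviour the
cover text describes: devices flagged as synchronous IO skip readahead
and the swap cache, and everything else falls back to a readahead
strategy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's SWP_SYNCHRONOUS_IO; the bit value is illustrative. */
#define MODEL_SWP_SYNCHRONOUS_IO (1u << 0)

/* Hypothetical: the strategies a unified swapin helper could pick from. */
enum model_swapin_policy {
	MODEL_SWAPIN_DIRECT,     /* no readahead, bypass the swap cache */
	MODEL_SWAPIN_VMA_RA,     /* VMA-based readahead */
	MODEL_SWAPIN_CLUSTER_RA, /* cluster (physical neighbour) readahead */
};

/* Hypothetical, heavily reduced view of a swap device. */
struct model_swap_info {
	unsigned int flags;
};

/*
 * Pick one policy for every caller (page fault, shmem, swapoff) so that
 * the decision lives in a single place. Assumed order: synchronous
 * devices go direct; otherwise prefer VMA readahead when it is enabled.
 */
static enum model_swapin_policy
model_pick_swapin_policy(const struct model_swap_info *si, bool vma_ra_enabled)
{
	assert(si != NULL);

	if (si->flags & MODEL_SWP_SYNCHRONOUS_IO)
		return MODEL_SWAPIN_DIRECT;
	if (vma_ra_enabled)
		return MODEL_SWAPIN_VMA_RA;
	return MODEL_SWAPIN_CLUSTER_RA;
}
```

In this model a ZRAM-like device (flag set) always gets
MODEL_SWAPIN_DIRECT, while a device without the flag gets VMA or cluster
readahead depending on the vma_ra_enabled argument.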

Kairui Song (7):
  mm/swapfile.c: add back some comment
  mm/swap: move no readahead swapin code to a stand-alone helper
  mm/swap: always account swapped in page into current memcg
  mm/swap: introduce swapin_entry for unified readahead policy
  mm/swap: avoid a duplicated swap cache lookup for SWP_SYNCHRONOUS_IO
  mm/swap, shmem: use unified swapin helper for shmem
  mm/swap: refactor swap_cache_get_folio

 include/linux/memcontrol.h |   4 +-
 mm/memcontrol.c            |   5 +-
 mm/memory.c                |  45 ++--------
 mm/shmem.c                 |  50 +++++++----
 mm/swap.h                  |  23 ++---
 mm/swap_state.c            | 176 ++++++++++++++++++++++++++-----------
 mm/swapfile.c              |  20 +++--
 7 files changed, 190 insertions(+), 133 deletions(-)