From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com,
 hanchuanhua@oppo.com, hannes@cmpxchg.org, hch@infradead.org,
 hughd@google.com, kaleshsingh@google.com, kasong@tencent.com,
 linux-kernel@vger.kernel.org, mhocko@suse.com, minchan@kernel.org,
 nphamcs@gmail.com, ryan.roberts@arm.com, ryncsn@gmail.com,
 senozhatsky@chromium.org, shakeel.butt@linux.dev, shy828301@gmail.com,
 surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org,
 xiang@kernel.org, ying.huang@intel.com, yosryahmed@google.com
Subject: [PATCH v9 0/3] mm: enable large folios swap-in support
Date: Mon, 9 Sep 2024 11:21:16 +1200
Message-Id: <20240908232119.2157-1-21cnbao@gmail.com>
X-Patchwork-Id: 13795688

From: Barry Song

Currently, we support mTHP swapout but not swapin.
This means that once mTHP is swapped out, it will come back as small
folios when swapped in. This is particularly detrimental for devices
such as Android phones, where more than half of the memory is in swap.

The lack of mTHP swapin functionality makes mTHP a showstopper in
scenarios that heavily rely on swap. This patchset introduces mTHP
swap-in support. It starts with synchronous devices similar to zRAM,
aiming to benefit as many users as possible with minimal changes.

-v9:
 * cleanup (rename, drop local variable) for patch 1 according to Yosry,
   thanks!
 * collect Yosry's reviewed-by tags, thanks!

-v8:
 https://lore.kernel.org/linux-mm/20240906001047.1245-1-21cnbao@gmail.com/
 * fix the conflicts with zeromap (this is also a hotfix to zeromap with
   a Fixes tag), reported by Kairui, thanks! Usama, Yosry, thanks for all
   your comments during the discussion!
 * refine the changelog to add the case Kanchana reported: using Intel
   IAA, mTHP swap-in can improve zRAM read latency by 7X. thanks!
 * some other code cleanup

-v7:
 https://lore.kernel.org/linux-mm/20240821074541.516249-1-hanchuanhua@oppo.com/
 * collect Chris's ack tags, thanks!
 * adjust the comment and subject, as pointed out by Christoph.
 * make alloc_swap_folio() always charge the folio to fix the problem of
   charge failure in memcg when the memory limit is reached (reported
   and pointed out by Kairui), as pointed out by Kefeng, Matthew.

-v6:
 https://lore.kernel.org/linux-mm/20240802122031.117548-1-21cnbao@gmail.com/
 * remove the swapin control added in v5, per Willy, Christoph;
   The original reason for adding the swapin_enabled control was
   primarily to address concerns for slower devices. Currently, since we
   only support fast sync devices, swap-in size is less of a concern.
   We'll gain a clearer understanding of the next steps as more devices
   begin to support mTHP swap-in.
 * add nr argument in mem_cgroup_swapin_uncharge_swap() instead of adding
   new API, Willy;
 * swapcache_prepare() and swapcache_clear() large folios support is also
   removed from this series, as it has been separated per Baolin's
   request and is now in mm-unstable.
 * provide more data in changelog.

-v5:
 https://lore.kernel.org/linux-mm/20240726094618.401593-1-21cnbao@gmail.com/
 * Add swap-in control policy according to Ying's proposal. Right now
   only "always" and "never" are supported, later we can extend to
   "auto";
 * Fix the comment regarding zswap_never_enabled() according to Yosry;
 * Filter out unaligned swp entries earlier;
 * add mem_cgroup_swapin_uncharge_swap_nr() helper

-v4:
 https://lore.kernel.org/linux-mm/20240629111010.230484-1-21cnbao@gmail.com/
 Many parts of v3 have been merged into the mm tree with the help of
 reviews from Ryan, David, Ying, Chris and others. Thank you very much!
 This is the final part to allocate large folios and map them.

 * Use Yosry's zswap_never_enabled(); note that there is a bug. I put the
   bug fix in this v4 RFC though it should be fixed in Yosry's patch
 * lots of code improvement (drop large stack, hold ptl etc) according
   to Yosry's and Ryan's feedback
 * rebased on top of the latest mm-unstable and utilized some new helpers
   introduced recently.

-v3:
 https://lore.kernel.org/linux-mm/20240304081348.197341-1-21cnbao@gmail.com/
 * avoid over-writing err in __swap_duplicate_nr, pointed out by Yosry,
   thanks!
 * fix the issue that the folio is charged twice for do_swap_page,
   separating alloc_anon_folio and alloc_swap_folio as they now differ
   considerably on
   * memcg charging
   * clearing allocated folio or not

-v2:
 https://lore.kernel.org/linux-mm/20240229003753.134193-1-21cnbao@gmail.com/
 * lots of code cleanup according to Chris's comments, thanks!
 * collect Chris's ack tags, thanks!
 * address David's comment on moving to use folio_add_new_anon_rmap for
   !folio_test_anon in do_swap_page, thanks!
 * remove the MADV_PAGEOUT patch from this series as Ryan will integrate
   it into the swap-out series
 * Apply Kairui's work of "mm/swap: fix race when skipping swapcache" on
   large folios swap-in as well
 * fixed data corruption (zero-filled data) in two races: one with zswap,
   and one where some of the entries are in swapcache while others are
   not, by checking SWAP_HAS_CACHE while swapping in a large folio

-v1:
 https://lore.kernel.org/all/20240118111036.72641-1-21cnbao@gmail.com/#t

Barry Song (2):
  mm: Fix swap_read_folio_zeromap() for large folios with partial
    zeromap
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to
    support large folios

Chuanhua Han (1):
  mm: support large folios swap-in for sync io devices

 include/linux/memcontrol.h |   5 +-
 mm/memcontrol.c            |   7 +-
 mm/memory.c                | 261 +++++++++++++++++++++++++++++++++----
 mm/page_io.c               |  32 +----
 mm/swap.h                  |  33 +++++
 mm/swap_state.c            |   2 +-
 6 files changed, 282 insertions(+), 58 deletions(-)
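
For readers new to the series, the changelog above mentions three
conditions on a range of swap entries: unaligned entries are filtered
out (v5), entries that already have SWAP_HAS_CACHE set are checked for
(v2), and a partial zeromap is handled (patch 1). The following is only
a userspace sketch of how such a check could be modelled. It is not the
kernel code from this series: struct slot_state and
can_swapin_as_large_folio() are hypothetical names made up for
illustration, and the policy of simply rejecting a mixed range is an
assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-slot state for illustration only. This is a toy
 * model, not the kernel's own data layout.
 */
struct slot_state {
	bool zero_filled;	/* slot was recorded as an all-zero page */
	bool has_cache;		/* slot already has a swapcache entry */
};

/*
 * Return true if slots [start, start + nr) could be read back as one
 * large folio under the three conditions modelled here:
 *   - nr is a power of two and start is aligned to nr,
 *   - no slot in the range already has a swapcache entry,
 *   - the zero-filled status is the same for every slot in the range.
 * A false result means "fall back to smaller folios" in this model.
 */
static bool can_swapin_as_large_folio(const struct slot_state *slots,
				      size_t nslots, size_t start, size_t nr)
{
	if (nr == 0 || (nr & (nr - 1)) != 0)
		return false;	/* not a power of two */
	if (start % nr != 0)
		return false;	/* unaligned range */
	if (start > nslots || nr > nslots - start)
		return false;	/* range exceeds the slot array */

	for (size_t i = 0; i < nr; i++) {
		if (slots[start + i].has_cache)
			return false;	/* partly in swapcache */
		if (slots[start + i].zero_filled != slots[start].zero_filled)
			return false;	/* partial zeromap */
	}
	return true;
}
```

In this model a caller would try the largest candidate size first and
retry with a smaller nr whenever the function returns false, ending at a
single slot.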