From patchwork Wed Jan 15 03:38:05 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org,
    david@redhat.com, ioworker0@gmail.com, kasong@tencent.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com,
    ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org,
    ying.huang@intel.com, zhengtangquan@oppo.com,
    Mauricio Faria de Oliveira
Subject: [PATCH v3 1/4] mm: Set folio swapbacked iff folios are dirty in try_to_unmap_one
Date: Wed, 15 Jan 2025 16:38:05 +1300
Message-Id: <20250115033808.40641-2-21cnbao@gmail.com>
In-Reply-To: <20250115033808.40641-1-21cnbao@gmail.com>
References: <20250115033808.40641-1-21cnbao@gmail.com>
From: Barry Song

The refcount may be temporarily or long-term increased, but this does
not change the fundamental nature of the folio already being lazy-freed.
Therefore, we only reset 'swapbacked' when we are certain the folio is
dirty and not droppable.

Fixes: 6c8e2a256915 ("mm: fix race between MADV_FREE reclaim and blkdev direct IO read")
Suggested-by: David Hildenbrand
Signed-off-by: Barry Song
Cc: Mauricio Faria de Oliveira
Acked-by: David Hildenbrand
Reviewed-by: Baolin Wang
Reviewed-by: Lance Yang
---
 mm/rmap.c | 49 ++++++++++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4ea29a7..de6b8c34e98c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1868,34 +1868,29 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			smp_rmb();
 
-			/*
-			 * The only page refs must be one from isolation
-			 * plus the rmap(s) (dropped by discard:).
-			 */
-			if (ref_count == 1 + map_count &&
-			    (!folio_test_dirty(folio) ||
-			     /*
-			      * Unlike MADV_FREE mappings, VM_DROPPABLE
-			      * ones can be dropped even if they've
-			      * been dirtied.
-			      */
-			     (vma->vm_flags & VM_DROPPABLE))) {
-				dec_mm_counter(mm, MM_ANONPAGES);
-				goto discard;
-			}
-
-			/*
-			 * If the folio was redirtied, it cannot be
-			 * discarded. Remap the page to page table.
-			 */
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			/*
-			 * Unlike MADV_FREE mappings, VM_DROPPABLE ones
-			 * never get swap backed on failure to drop.
-			 */
-			if (!(vma->vm_flags & VM_DROPPABLE))
+			if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+				/*
+				 * redirtied either using the page table or a previously
+				 * obtained GUP reference.
+				 */
+				set_pte_at(mm, address, pvmw.pte, pteval);
 				folio_set_swapbacked(folio);
-			goto walk_abort;
+				goto walk_abort;
+			} else if (ref_count != 1 + map_count) {
+				/*
+				 * Additional reference. Could be a GUP reference or any
+				 * speculative reference. GUP users must mark the folio
+				 * dirty if there was a modification. This folio cannot be
+				 * reclaimed right now either way, so act just like nothing
+				 * happened.
+				 * We'll come back here later and detect if the folio was
+				 * dirtied when the additional reference is gone.
+				 */
+				set_pte_at(mm, address, pvmw.pte, pteval);
+				goto walk_abort;
+			}
+			dec_mm_counter(mm, MM_ANONPAGES);
+			goto discard;
 		}
 
 		if (swap_duplicate(entry) < 0) {
From patchwork Wed Jan 15 03:38:06 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org,
    david@redhat.com, ioworker0@gmail.com, kasong@tencent.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com,
    ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org,
    ying.huang@intel.com, zhengtangquan@oppo.com,
    Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, "H. Peter Anvin", Anshuman Khandual, Shaoqin Huang,
    Gavin Shan, Kefeng Wang, Mark Rutland, "Kirill A. Shutemov",
    Yosry Ahmed, Paul Walmsley, Palmer Dabbelt, Albert Ou,
    Yicong Yang, Will Deacon
Subject: [PATCH v3 2/4] mm: Support tlbbatch flush for a range of PTEs
Date: Wed, 15 Jan 2025 16:38:06 +1300
Message-Id: <20250115033808.40641-3-21cnbao@gmail.com>
In-Reply-To: <20250115033808.40641-1-21cnbao@gmail.com>
References: <20250115033808.40641-1-21cnbao@gmail.com>
From: Barry Song

This patch lays the groundwork for supporting batch PTE unmapping in
try_to_unmap_one(). It introduces range handling for TLB batch flushing,
with the range currently set to the size of PAGE_SIZE.

The function __flush_tlb_range_nosync() is architecture-specific and is
only used within arch/arm64. This function requires the mm structure
instead of the vma structure. To allow its reuse by
arch_tlbbatch_add_pending(), which operates with mm but not vma, this
patch modifies the argument of __flush_tlb_range_nosync() to take mm as
its parameter.

Cc: Catalin Marinas
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Anshuman Khandual
Cc: Ryan Roberts
Cc: Shaoqin Huang
Cc: Gavin Shan
Cc: Kefeng Wang
Cc: Mark Rutland
Cc: David Hildenbrand
Cc: Lance Yang
Cc: "Kirill A. Shutemov"
Cc: Yosry Ahmed
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Albert Ou
Cc: Yicong Yang
Signed-off-by: Barry Song
Acked-by: Will Deacon
---
 arch/arm64/include/asm/tlbflush.h | 25 +++++++++++++------------
 arch/arm64/mm/contpte.c           |  2 +-
 arch/riscv/include/asm/tlbflush.h |  5 +++--
 arch/riscv/mm/tlbflush.c          |  5 +++--
 arch/x86/include/asm/tlbflush.h   |  5 +++--
 mm/rmap.c                         | 12 +++++++-----
 6 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc94e036a26b..98fbc8df7cf3 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -322,13 +322,6 @@ static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
 	return true;
 }
 
-static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-					     struct mm_struct *mm,
-					     unsigned long uaddr)
-{
-	__flush_tlb_page_nosync(mm, uaddr);
-}
-
 /*
  * If mprotect/munmap/etc occurs during TLB batched flushing, we need to
  * synchronise all the TLBI issued with a DSB to avoid the race mentioned in
@@ -448,7 +441,7 @@ static inline bool __flush_tlb_range_limit_excess(unsigned long start,
 	return false;
 }
 
-static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
+static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
 					    unsigned long start, unsigned long end,
 					    unsigned long stride, bool last_level,
 					    int tlb_level)
@@ -460,12 +453,12 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
 	pages = (end - start) >> PAGE_SHIFT;
 
 	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
-		flush_tlb_mm(vma->vm_mm);
+		flush_tlb_mm(mm);
 		return;
 	}
 
 	dsb(ishst);
-	asid = ASID(vma->vm_mm);
+	asid = ASID(mm);
 
 	if (last_level)
 		__flush_tlb_range_op(vale1is, start, pages, stride, asid,
@@ -474,7 +467,7 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
 		__flush_tlb_range_op(vae1is, start, pages, stride, asid,
 				     tlb_level, true, lpa2_is_enabled());
 
-	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
+	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
@@ -482,7 +475,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 			     unsigned long stride, bool last_level,
 			     int tlb_level)
 {
-	__flush_tlb_range_nosync(vma, start, end, stride,
+	__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
 				 last_level, tlb_level);
 	dsb(ish);
 }
@@ -533,6 +526,14 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
 	dsb(ish);
 	isb();
 }
+
+static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+					     struct mm_struct *mm,
+					     unsigned long start,
+					     unsigned long end)
+{
+	__flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
+}
 #endif
 
 #endif
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..bcac4f55f9c1 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -335,7 +335,7 @@ int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
 		 * eliding the trailing DSB applies here.
 		 */
 		addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
-		__flush_tlb_range_nosync(vma, addr, addr + CONT_PTE_SIZE,
+		__flush_tlb_range_nosync(vma->vm_mm, addr, addr + CONT_PTE_SIZE,
 					 PAGE_SIZE, true, 3);
 	}
 
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 72e559934952..e4c533691a7d 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -60,8 +60,9 @@ void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 
 bool arch_tlbbatch_should_defer(struct mm_struct *mm);
 void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-			       struct mm_struct *mm,
-			       unsigned long uaddr);
+			       struct mm_struct *mm,
+			       unsigned long start,
+			       unsigned long end);
 void arch_flush_tlb_batched_pending(struct mm_struct *mm);
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 9b6e86ce3867..6d6e8e7cc576 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -186,8 +186,9 @@ bool arch_tlbbatch_should_defer(struct mm_struct *mm)
 }
 
 void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-			       struct mm_struct *mm,
-			       unsigned long uaddr)
+			       struct mm_struct *mm,
+			       unsigned long start,
+			       unsigned long end)
 {
 	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
 }
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 69e79fff41b8..2b511972d008 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -278,8 +278,9 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 }
 
 static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-					     struct mm_struct *mm,
-					     unsigned long uaddr)
+					     struct mm_struct *mm,
+					     unsigned long start,
+					     unsigned long end)
 {
 	inc_mm_tlb_gen(mm);
 	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
diff --git a/mm/rmap.c b/mm/rmap.c
index de6b8c34e98c..abeb9fcec384 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -672,7 +672,8 @@ void try_to_unmap_flush_dirty(void)
 	(TLB_FLUSH_BATCH_PENDING_MASK / 2)
 
 static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
-				      unsigned long uaddr)
+				      unsigned long start,
+				      unsigned long end)
 {
 	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
 	int batch;
@@ -681,7 +682,7 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
 	if (!pte_accessible(mm, pteval))
 		return;
 
-	arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr);
+	arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, start, end);
 	tlb_ubc->flush_required = true;
 
 	/*
@@ -757,7 +758,8 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
 }
 #else
 static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
-				      unsigned long uaddr)
+				      unsigned long start,
+				      unsigned long end)
 {
 }
 
@@ -1792,7 +1794,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-			set_tlb_ubc_flush_pending(mm, pteval, address);
+			set_tlb_ubc_flush_pending(mm, pteval, address, address + PAGE_SIZE);
 		} else {
 			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}
@@ -2164,7 +2166,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-			set_tlb_ubc_flush_pending(mm, pteval, address);
+			set_tlb_ubc_flush_pending(mm, pteval, address, address + PAGE_SIZE);
 		} else {
 			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}
From patchwork Wed Jan 15 03:38:07 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org,
    david@redhat.com, ioworker0@gmail.com, kasong@tencent.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com,
    ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org,
    ying.huang@intel.com, zhengtangquan@oppo.com
Subject: [PATCH v3 3/4] mm: Support batched unmap for lazyfree large folios during reclamation
Date: Wed, 15 Jan 2025 16:38:07 +1300
Message-Id: <20250115033808.40641-4-21cnbao@gmail.com>
In-Reply-To: <20250115033808.40641-1-21cnbao@gmail.com>
References: <20250115033808.40641-1-21cnbao@gmail.com>
From: Barry Song

Currently, the PTEs and rmap of a large folio are removed one at a time. This is not only slow but also causes the large folio to be unnecessarily added to deferred_split, which can lead to races between the deferred_split shrinker callback and memory reclamation.
This patch releases all PTEs and rmap entries in a batch. Currently, it only handles lazyfree large folios.

The microbenchmark below reclaims 128MB of lazyfree large folios whose sizes are 64KiB:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define SIZE 128*1024*1024 // 128 MB

unsigned long read_split_deferred()
{
	FILE *file = fopen("/sys/kernel/mm/transparent_hugepage"
			"/hugepages-64kB/stats/split_deferred", "r");
	if (!file) {
		perror("Error opening file");
		return 0;
	}

	unsigned long value;
	if (fscanf(file, "%lu", &value) != 1) {
		perror("Error reading value");
		fclose(file);
		return 0;
	}

	fclose(file);
	return value;
}

int main(int argc, char *argv[])
{
	while(1) {
		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		memset((void *)p, 1, SIZE);
		madvise((void *)p, SIZE, MADV_FREE);

		clock_t start_time = clock();
		unsigned long start_split = read_split_deferred();
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		clock_t end_time = clock();
		unsigned long end_split = read_split_deferred();

		double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
		printf("Time taken by reclamation: %f seconds, split_deferred: %ld\n",
			elapsed_time, end_split - start_split);

		munmap((void *)p, SIZE);
	}

	return 0;
}

w/o patch:
~ # ./a.out
Time taken by reclamation: 0.177418 seconds, split_deferred: 2048
Time taken by reclamation: 0.178348 seconds, split_deferred: 2048
Time taken by reclamation: 0.174525 seconds, split_deferred: 2048
Time taken by reclamation: 0.171620 seconds, split_deferred: 2048
Time taken by reclamation: 0.172241 seconds, split_deferred: 2048
Time taken by reclamation: 0.174003 seconds, split_deferred: 2048
Time taken by reclamation: 0.171058 seconds, split_deferred: 2048
Time taken by reclamation: 0.171993 seconds, split_deferred: 2048
Time taken by reclamation: 0.169829 seconds, split_deferred: 2048
Time taken by reclamation: 0.172895 seconds, split_deferred: 2048
Time taken by reclamation: 0.176063 seconds, split_deferred: 2048
Time taken by reclamation: 0.172568 seconds, split_deferred: 2048
Time taken by reclamation: 0.171185 seconds, split_deferred: 2048
Time taken by reclamation: 0.170632 seconds, split_deferred: 2048
Time taken by reclamation: 0.170208 seconds, split_deferred: 2048
Time taken by reclamation: 0.174192 seconds, split_deferred: 2048
...

w/ patch:
~ # ./a.out
Time taken by reclamation: 0.074231 seconds, split_deferred: 0
Time taken by reclamation: 0.071026 seconds, split_deferred: 0
Time taken by reclamation: 0.072029 seconds, split_deferred: 0
Time taken by reclamation: 0.071873 seconds, split_deferred: 0
Time taken by reclamation: 0.073573 seconds, split_deferred: 0
Time taken by reclamation: 0.071906 seconds, split_deferred: 0
Time taken by reclamation: 0.073604 seconds, split_deferred: 0
Time taken by reclamation: 0.075903 seconds, split_deferred: 0
Time taken by reclamation: 0.073191 seconds, split_deferred: 0
Time taken by reclamation: 0.071228 seconds, split_deferred: 0
Time taken by reclamation: 0.071391 seconds, split_deferred: 0
Time taken by reclamation: 0.071468 seconds, split_deferred: 0
Time taken by reclamation: 0.071896 seconds, split_deferred: 0
Time taken by reclamation: 0.072508 seconds, split_deferred: 0
Time taken by reclamation: 0.071884 seconds, split_deferred: 0
Time taken by reclamation: 0.072433 seconds, split_deferred: 0
Time taken by reclamation: 0.071939 seconds, split_deferred: 0
...
Signed-off-by: Barry Song
---
 mm/rmap.c | 47 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index abeb9fcec384..be1978d2712d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1642,6 +1642,25 @@ void folio_remove_rmap_pmd(struct folio *folio, struct page *page,
 #endif
 }
 
+/* We support batch unmapping of PTEs for lazyfree large folios */
+static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
+			struct folio *folio, pte_t *ptep)
+{
+	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
+	int max_nr = folio_nr_pages(folio);
+	pte_t pte = ptep_get(ptep);
+
+	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
+		return false;
+	if (pte_none(pte) || pte_unused(pte) || !pte_present(pte))
+		return false;
+	if (pte_pfn(pte) != folio_pfn(folio))
+		return false;
+
+	return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
+			       NULL, NULL) == max_nr;
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1655,6 +1674,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	bool anon_exclusive, ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	int nr_pages = 1;
 	unsigned long pfn;
 	unsigned long hsz = 0;
 
@@ -1780,6 +1800,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				hugetlb_vma_unlock_write(vma);
 			}
 			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
+		} else if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
+			   can_batch_unmap_folio_ptes(address, folio, pvmw.pte)) {
+			nr_pages = folio_nr_pages(folio);
+			flush_cache_range(vma, range.start, range.end);
+			pteval = get_and_clear_full_ptes(mm, address, pvmw.pte, nr_pages, 0);
+			if (should_defer_flush(mm, flags))
+				set_tlb_ubc_flush_pending(mm, pteval, address,
+						address + folio_size(folio));
+			else
+				flush_tlb_range(vma, range.start, range.end);
 		} else {
 			flush_cache_page(vma, address, pfn);
 			/* Nuke the page table entry. */
@@ -1875,7 +1905,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 * redirtied either using the page table or a previously
 			 * obtained GUP reference.
 			 */
-			set_pte_at(mm, address, pvmw.pte, pteval);
+			set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
 			folio_set_swapbacked(folio);
 			goto walk_abort;
 		} else if (ref_count != 1 + map_count) {
@@ -1888,10 +1918,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 * We'll come back here later and detect if the folio was
 			 * dirtied when the additional reference is gone.
 			 */
-			set_pte_at(mm, address, pvmw.pte, pteval);
+			set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
 			goto walk_abort;
 		}
-		dec_mm_counter(mm, MM_ANONPAGES);
+		add_mm_counter(mm, MM_ANONPAGES, -nr_pages);
 		goto discard;
 	}
 
@@ -1943,13 +1973,18 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			dec_mm_counter(mm, mm_counter_file(folio));
 		}
 discard:
-		if (unlikely(folio_test_hugetlb(folio)))
+		if (unlikely(folio_test_hugetlb(folio))) {
 			hugetlb_remove_rmap(folio);
-		else
-			folio_remove_rmap_pte(folio, subpage, vma);
+		} else {
+			folio_remove_rmap_ptes(folio, subpage, nr_pages, vma);
+			folio_ref_sub(folio, nr_pages - 1);
+		}
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);
+		/* We have already batched the entire folio */
+		if (nr_pages > 1)
+			goto walk_done;
 		continue;
 walk_abort:
 		ret = false;

From patchwork Wed Jan 15 03:38:08 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com
Subject: [PATCH v3 4/4] mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap
Date: Wed, 15 Jan 2025 16:38:08 +1300
Message-Id: <20250115033808.40641-5-21cnbao@gmail.com>
In-Reply-To: <20250115033808.40641-1-21cnbao@gmail.com>
References: <20250115033808.40641-1-21cnbao@gmail.com>
MIME-Version: 1.0
From: Barry Song

The try_to_unmap_one() function currently handles PMD-mapped THPs inefficiently.
It first splits the PMD into PTEs, copies the dirty state from the PMD to the PTEs, iterates over the PTEs to locate the dirty state, and then marks the THP as swap-backed. This process involves unnecessary PMD splitting and redundant iteration. Instead, this work can be handled directly in __discard_anon_folio_pmd_locked(), avoiding the extra steps and improving performance.

The following microbenchmark redirties folios after invoking MADV_FREE, then measures the time taken by memory reclamation (which now has to mark those folios swap-backed again) on the redirtied folios.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define SIZE 128*1024*1024 // 128 MB

int main(int argc, char *argv[])
{
	while(1) {
		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		memset((void *)p, 1, SIZE);
		madvise((void *)p, SIZE, MADV_FREE);

		/* redirty after MADV_FREE */
		memset((void *)p, 1, SIZE);

		clock_t start_time = clock();
		madvise((void *)p, SIZE, MADV_PAGEOUT);
		clock_t end_time = clock();

		double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
		printf("Time taken by reclamation: %f seconds\n", elapsed_time);

		munmap((void *)p, SIZE);
	}

	return 0;
}

Testing results are as below,

w/o patch:
~ # ./a.out
Time taken by reclamation: 0.007300 seconds
Time taken by reclamation: 0.007226 seconds
Time taken by reclamation: 0.007295 seconds
Time taken by reclamation: 0.007731 seconds
Time taken by reclamation: 0.007134 seconds
Time taken by reclamation: 0.007285 seconds
Time taken by reclamation: 0.007720 seconds
Time taken by reclamation: 0.007128 seconds
Time taken by reclamation: 0.007710 seconds
Time taken by reclamation: 0.007712 seconds
Time taken by reclamation: 0.007236 seconds
Time taken by reclamation: 0.007690 seconds
Time taken by reclamation: 0.007174 seconds
Time taken by reclamation: 0.007670 seconds
Time taken by reclamation: 0.007169 seconds
Time taken by reclamation: 0.007305 seconds
Time taken by reclamation: 0.007432 seconds
Time taken by reclamation: 0.007158 seconds
Time taken by reclamation: 0.007133 seconds
…

w/ patch:
~ # ./a.out
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002116 seconds
Time taken by reclamation: 0.002150 seconds
Time taken by reclamation: 0.002261 seconds
Time taken by reclamation: 0.002137 seconds
Time taken by reclamation: 0.002173 seconds
Time taken by reclamation: 0.002063 seconds
Time taken by reclamation: 0.002088 seconds
Time taken by reclamation: 0.002169 seconds
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002111 seconds
Time taken by reclamation: 0.002224 seconds
Time taken by reclamation: 0.002297 seconds
Time taken by reclamation: 0.002260 seconds
Time taken by reclamation: 0.002246 seconds
Time taken by reclamation: 0.002272 seconds
Time taken by reclamation: 0.002277 seconds
Time taken by reclamation: 0.002462 seconds
…

This patch significantly speeds up try_to_unmap_one() by allowing it to skip redirtied THPs without splitting the PMD.

Suggested-by: Baolin Wang
Suggested-by: Lance Yang
Signed-off-by: Barry Song
---
 mm/huge_memory.c | 24 +++++++++++++++++-------
 mm/rmap.c        | 13 ++++++++++---
 2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..47cc8c3f8f80 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3070,8 +3070,12 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
 	int ref_count, map_count;
 	pmd_t orig_pmd = *pmdp;
 
-	if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
+	if (pmd_dirty(orig_pmd))
+		folio_set_dirty(folio);
+	if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+		folio_set_swapbacked(folio);
 		return false;
+	}
 
 	orig_pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
 
@@ -3098,8 +3102,15 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
 	 *
 	 * The only folio refs must be one from isolation plus the rmap(s).
	 */
-	if (folio_test_dirty(folio) || pmd_dirty(orig_pmd) ||
-	    ref_count != map_count + 1) {
+	if (pmd_dirty(orig_pmd))
+		folio_set_dirty(folio);
+	if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+		folio_set_swapbacked(folio);
+		set_pmd_at(mm, addr, pmdp, orig_pmd);
+		return false;
+	}
+
+	if (ref_count != map_count + 1) {
 		set_pmd_at(mm, addr, pmdp, orig_pmd);
 		return false;
 	}
@@ -3119,12 +3130,11 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 {
 	VM_WARN_ON_FOLIO(!folio_test_pmd_mappable(folio), folio);
 	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
+	VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
+	VM_WARN_ON_FOLIO(folio_test_swapbacked(folio), folio);
 	VM_WARN_ON_ONCE(!IS_ALIGNED(addr, HPAGE_PMD_SIZE));
 
-	if (folio_test_anon(folio) && !folio_test_swapbacked(folio))
-		return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
-
-	return false;
+	return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
 }
 
 static void remap_page(struct folio *folio, unsigned long nr, int flags)
diff --git a/mm/rmap.c b/mm/rmap.c
index be1978d2712d..a859c399ec7c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1724,9 +1724,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		}
 
 		if (!pvmw.pte) {
-			if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
-						  folio))
-				goto walk_done;
+			if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
+				if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd, folio))
+					goto walk_done;
+				/*
+				 * unmap_huge_pmd_locked has either already marked
+				 * the folio as swap-backed or decided to retain it
+				 * due to GUP or speculative references.
+				 */
+				goto walk_abort;
+			}
 
 			if (flags & TTU_SPLIT_HUGE_PMD) {
 				/*