From patchwork Fri Feb 14 09:30:12 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com, Mauricio Faria de Oliveira
Subject: [PATCH v4 1/4] mm: Set folio swapbacked iff folios are dirty in try_to_unmap_one
Date: Fri, 14 Feb 2025 22:30:12 +1300
Message-Id: <20250214093015.51024-2-21cnbao@gmail.com>
In-Reply-To: <20250214093015.51024-1-21cnbao@gmail.com>
References: <20250214093015.51024-1-21cnbao@gmail.com>
From: Barry Song

The refcount may be temporarily or long-term increased, but this does not
change the fundamental nature of the folio already being lazy-freed.
Therefore, we only reset 'swapbacked' when we are certain the folio is
dirty and not droppable.

Fixes: 6c8e2a256915 ("mm: fix race between MADV_FREE reclaim and blkdev direct IO read")
Suggested-by: David Hildenbrand
Signed-off-by: Barry Song
Cc: Mauricio Faria de Oliveira
Acked-by: David Hildenbrand
Reviewed-by: Baolin Wang
Reviewed-by: Lance Yang
---
 mm/rmap.c | 49 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 25a8a127f689..1320527e90cd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2022,34 +2022,29 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
                  */
                 smp_rmb();
 
-                /*
-                 * The only page refs must be one from isolation
-                 * plus the rmap(s) (dropped by discard:).
-                 */
-                if (ref_count == 1 + map_count &&
-                    (!folio_test_dirty(folio) ||
-                     /*
-                      * Unlike MADV_FREE mappings, VM_DROPPABLE
-                      * ones can be dropped even if they've
-                      * been dirtied.
-                      */
-                     (vma->vm_flags & VM_DROPPABLE))) {
-                        dec_mm_counter(mm, MM_ANONPAGES);
-                        goto discard;
-                }
-
-                /*
-                 * If the folio was redirtied, it cannot be
-                 * discarded. Remap the page to page table.
-                 */
-                set_pte_at(mm, address, pvmw.pte, pteval);
-                /*
-                 * Unlike MADV_FREE mappings, VM_DROPPABLE ones
-                 * never get swap backed on failure to drop.
-                 */
-                if (!(vma->vm_flags & VM_DROPPABLE))
+                if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+                        /*
+                         * redirtied either using the page table or a previously
+                         * obtained GUP reference.
+                         */
+                        set_pte_at(mm, address, pvmw.pte, pteval);
                         folio_set_swapbacked(folio);
-                goto walk_abort;
+                        goto walk_abort;
+                } else if (ref_count != 1 + map_count) {
+                        /*
+                         * Additional reference. Could be a GUP reference or any
+                         * speculative reference. GUP users must mark the folio
+                         * dirty if there was a modification. This folio cannot be
+                         * reclaimed right now either way, so act just like nothing
+                         * happened.
+                         * We'll come back here later and detect if the folio was
+                         * dirtied when the additional reference is gone.
+                         */
+                        set_pte_at(mm, address, pvmw.pte, pteval);
+                        goto walk_abort;
+                }
+                dec_mm_counter(mm, MM_ANONPAGES);
+                goto discard;
         }
 
         if (swap_duplicate(entry) < 0) {
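As a rough illustration of the control flow introduced above, the following
minimal userspace sketch models the three possible outcomes for a lazyfree
folio: redirtied (remap and set swapbacked), extra references (remap and
retry on a later pass), or discardable. It is only a sketch; every type and
name in it is invented for the example and none of it is kernel code.

/* Illustrative sketch only: models the decision taken for a lazyfree
 * (MADV_FREE) folio in try_to_unmap_one() after this patch. All types
 * and names here are invented; they are not kernel API.
 */
#include <stdbool.h>
#include <stdio.h>

enum action { DISCARD, REMAP_AND_SET_SWAPBACKED, REMAP_AND_RETRY_LATER };

struct lazyfree_folio_state {
    bool dirty;      /* folio_test_dirty()           */
    bool droppable;  /* vma->vm_flags & VM_DROPPABLE */
    int ref_count;   /* folio references held        */
    int map_count;   /* rmap references              */
};

static enum action lazyfree_decision(const struct lazyfree_folio_state *f)
{
    /* Dirty and not droppable: truly redirtied, make it swap-backed again. */
    if (f->dirty && !f->droppable)
        return REMAP_AND_SET_SWAPBACKED;
    /* Extra (e.g. GUP or speculative) references: keep it mapped and
     * re-evaluate on a later reclaim pass, without touching swapbacked. */
    if (f->ref_count != 1 + f->map_count)
        return REMAP_AND_RETRY_LATER;
    /* Clean (or droppable) with only isolation + rmap refs: discard. */
    return DISCARD;
}

int main(void)
{
    struct lazyfree_folio_state pinned_clean = { false, false, 3, 1 };
    struct lazyfree_folio_state redirtied    = { true,  false, 2, 1 };

    printf("pinned clean folio -> %d (remap, retry later)\n",
           lazyfree_decision(&pinned_clean));
    printf("redirtied folio    -> %d (remap, set swapbacked)\n",
           lazyfree_decision(&redirtied));
    return 0;
}

The point of ordering the checks this way, as in the hunk above, is that a
dirty-and-not-droppable folio is marked swap-backed regardless of how many
extra references it currently has.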
From patchwork Fri Feb 14 09:30:13 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com, Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Anshuman Khandual, Shaoqin Huang, Gavin Shan, Mark Rutland, "Kirill A. Shutemov", Yosry Ahmed, Paul Walmsley, Palmer Dabbelt, Albert Ou, Yicong Yang, Will Deacon, Kefeng Wang
Subject: [PATCH v4 2/4] mm: Support tlbbatch flush for a range of PTEs
Date: Fri, 14 Feb 2025 22:30:13 +1300
Message-Id: <20250214093015.51024-3-21cnbao@gmail.com>
In-Reply-To: <20250214093015.51024-1-21cnbao@gmail.com>
References: <20250214093015.51024-1-21cnbao@gmail.com>

From: Barry Song

This patch lays the groundwork for supporting batch PTE unmapping in
try_to_unmap_one(). It introduces range handling for TLB batch flushing,
with the range currently set to the size of PAGE_SIZE.

The function __flush_tlb_range_nosync() is architecture-specific and is
only used within arch/arm64. It requires only the mm structure, not the
vma structure. To allow its reuse by arch_tlbbatch_add_pending(), which
operates with mm but not vma, this patch modifies the argument of
__flush_tlb_range_nosync() to take mm as its parameter.

Cc: Catalin Marinas
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Anshuman Khandual
Cc: Ryan Roberts
Cc: Shaoqin Huang
Cc: Gavin Shan
Cc: Mark Rutland
Cc: David Hildenbrand
Cc: Lance Yang
Cc: "Kirill A. Shutemov"
Cc: Yosry Ahmed
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Albert Ou
Cc: Yicong Yang
Signed-off-by: Barry Song
Acked-by: Will Deacon
Reviewed-by: Kefeng Wang
---
 arch/arm64/include/asm/tlbflush.h | 23 +++++++++++------------
 arch/arm64/mm/contpte.c           |  2 +-
 arch/riscv/include/asm/tlbflush.h |  3 +--
 arch/riscv/mm/tlbflush.c          |  3 +--
 arch/x86/include/asm/tlbflush.h   |  3 +--
 mm/rmap.c                         | 10 +++++-----
 6 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc94e036a26b..b7e1920570bd 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -322,13 +322,6 @@ static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
     return true;
 }
 
-static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-                                             struct mm_struct *mm,
-                                             unsigned long uaddr)
-{
-    __flush_tlb_page_nosync(mm, uaddr);
-}
-
 /*
  * If mprotect/munmap/etc occurs during TLB batched flushing, we need to
  * synchronise all the TLBI issued with a DSB to avoid the race mentioned in
@@ -448,7 +441,7 @@ static inline bool __flush_tlb_range_limit_excess(unsigned long start,
     return false;
 }
 
-static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
+static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
                                             unsigned long start, unsigned long end,
                                             unsigned long stride, bool last_level,
                                             int tlb_level)
@@ -460,12 +453,12 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
     pages = (end - start) >> PAGE_SHIFT;
 
     if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
-        flush_tlb_mm(vma->vm_mm);
+        flush_tlb_mm(mm);
         return;
     }
 
     dsb(ishst);
-    asid = ASID(vma->vm_mm);
+    asid = ASID(mm);
 
     if (last_level)
         __flush_tlb_range_op(vale1is, start, pages, stride, asid,
@@ -474,7 +467,7 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
         __flush_tlb_range_op(vae1is, start, pages, stride, asid,
                              tlb_level, true, lpa2_is_enabled());
 
-    mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
+    mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
@@ -482,7 +475,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
                                      unsigned long stride, bool last_level,
                                      int tlb_level)
 {
-    __flush_tlb_range_nosync(vma, start, end, stride,
+    __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
                              last_level, tlb_level);
     dsb(ish);
 }
@@ -533,6 +526,12 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
     dsb(ish);
     isb();
 }
+
+static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+        struct mm_struct *mm, unsigned long start, unsigned long end)
+{
+    __flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
+}
 #endif
 
 #endif
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..bcac4f55f9c1 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -335,7 +335,7 @@ int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
          * eliding the trailing DSB applies here.
          */
         addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
-        __flush_tlb_range_nosync(vma, addr, addr + CONT_PTE_SIZE,
+        __flush_tlb_range_nosync(vma->vm_mm, addr, addr + CONT_PTE_SIZE,
                                  PAGE_SIZE, true, 3);
     }
 
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 72e559934952..ce0dd0fed764 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -60,8 +60,7 @@ void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 
 bool arch_tlbbatch_should_defer(struct mm_struct *mm);
 void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-                               struct mm_struct *mm,
-                               unsigned long uaddr);
+        struct mm_struct *mm, unsigned long start, unsigned long end);
 void arch_flush_tlb_batched_pending(struct mm_struct *mm);
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 9b6e86ce3867..74dd9307fbf1 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -186,8 +186,7 @@ bool arch_tlbbatch_should_defer(struct mm_struct *mm)
 }
 
 void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-                               struct mm_struct *mm,
-                               unsigned long uaddr)
+        struct mm_struct *mm, unsigned long start, unsigned long end)
 {
     cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
 }
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 02fc2aa06e9e..29373da7b00a 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -279,8 +279,7 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 }
 
 static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
-                                             struct mm_struct *mm,
-                                             unsigned long uaddr)
+        struct mm_struct *mm, unsigned long start, unsigned long end)
 {
     inc_mm_tlb_gen(mm);
     cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
diff --git a/mm/rmap.c b/mm/rmap.c
index 1320527e90cd..89e51a7a9509 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -672,7 +672,7 @@ void try_to_unmap_flush_dirty(void)
     (TLB_FLUSH_BATCH_PENDING_MASK / 2)
 
 static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
-                                      unsigned long uaddr)
+                                      unsigned long start, unsigned long end)
 {
     struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
     int batch;
@@ -681,7 +681,7 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
     if (!pte_accessible(mm, pteval))
         return;
 
-    arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr);
+    arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, start, end);
     tlb_ubc->flush_required = true;
 
     /*
@@ -757,7 +757,7 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
 }
 #else
 static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
-                                      unsigned long uaddr)
+                                      unsigned long start, unsigned long end)
 {
 }
 
@@ -1946,7 +1946,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
              */
             pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-            set_tlb_ubc_flush_pending(mm, pteval, address);
+            set_tlb_ubc_flush_pending(mm, pteval, address, address + PAGE_SIZE);
         } else {
             pteval = ptep_clear_flush(vma, address, pvmw.pte);
         }
@@ -2329,7 +2329,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
              */
             pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-            set_tlb_ubc_flush_pending(mm, pteval, address);
+            set_tlb_ubc_flush_pending(mm, pteval, address, address + PAGE_SIZE);
         } else {
             pteval = ptep_clear_flush(vma, address, pvmw.pte);
         }
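As a rough illustration of what the interface change above enables, the
following minimal userspace sketch models a deferred-flush batch that
records [start, end) ranges rather than individual addresses. It is only a
toy model under invented names (toy_flush_batch, toy_batch_add_pending); it
is not the kernel implementation.

/* Illustrative sketch only: deferred "TLB flush" work queued as ranges. */
#include <stdio.h>

#define TOY_PAGE_SIZE 4096UL
#define MAX_PENDING   16

struct toy_flush_batch {
    unsigned long start[MAX_PENDING];
    unsigned long end[MAX_PENDING];
    int nr;
};

/* Loosely analogous to arch_tlbbatch_add_pending(batch, mm, start, end). */
static void toy_batch_add_pending(struct toy_flush_batch *b,
                                  unsigned long start, unsigned long end)
{
    if (b->nr < MAX_PENDING) {
        b->start[b->nr] = start;
        b->end[b->nr] = end;
        b->nr++;
    }
}

/* Loosely analogous to issuing the deferred flush later on. */
static void toy_batch_flush(struct toy_flush_batch *b)
{
    for (int i = 0; i < b->nr; i++)
        printf("flush [%#lx, %#lx) = %lu pages\n", b->start[i], b->end[i],
               (b->end[i] - b->start[i]) / TOY_PAGE_SIZE);
    b->nr = 0;
}

int main(void)
{
    struct toy_flush_batch batch = { .nr = 0 };

    /* Old behaviour: one page per call, i.e. a range of exactly PAGE_SIZE. */
    toy_batch_add_pending(&batch, 0x1000, 0x1000 + TOY_PAGE_SIZE);
    /* New possibility: a whole 64KiB folio queued as a single range. */
    toy_batch_add_pending(&batch, 0x20000, 0x20000 + 16 * TOY_PAGE_SIZE);

    toy_batch_flush(&batch);
    return 0;
}

With this patch alone the callers still pass address + PAGE_SIZE, so
behaviour is unchanged; the wider ranges only appear in the next patch.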
From patchwork Fri Feb 14 09:30:14 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com
Subject: [PATCH v4 3/4] mm: Support batched unmap for lazyfree large folios during reclamation
Date: Fri, 14 Feb 2025 22:30:14 +1300
Message-Id: <20250214093015.51024-4-21cnbao@gmail.com>
In-Reply-To: <20250214093015.51024-1-21cnbao@gmail.com>
References: <20250214093015.51024-1-21cnbao@gmail.com>

From: Barry Song

Currently, the PTEs and rmap of a large folio are removed one at a time.
This is not only slow but also causes the large folio to be unnecessarily
added to deferred_split, which can lead to races between the deferred_split
shrinker callback and memory reclamation.

This patch releases all PTEs and rmap entries in a batch. Currently, it
only handles lazyfree large folios.
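To illustrate the eligibility check behind this batching (done in the
kernel by can_batch_unmap_folio_ptes()/folio_pte_batch() in the diff
below), here is a minimal userspace sketch; the PTE array, PFN values and
every name in it are invented for the example and are not kernel code.

/* Illustrative sketch only: "can we unmap the whole folio in one go?"
 * PTEs are modelled as an array of PFNs, 0 meaning "not present". */
#include <stdbool.h>
#include <stdio.h>

static bool toy_can_batch_unmap(const unsigned long *pte_pfn, int max_nr,
                                unsigned long folio_first_pfn)
{
    /* The first PTE must point at the first page of the folio... */
    if (pte_pfn[0] != folio_first_pfn)
        return false;
    /* ...and every following PTE must map the next page of that folio. */
    for (int i = 1; i < max_nr; i++)
        if (pte_pfn[i] != folio_first_pfn + i)
            return false;
    return true;
}

int main(void)
{
    /* A 4-page "folio" starting at PFN 100, fully and contiguously mapped. */
    unsigned long full[4] = { 100, 101, 102, 103 };
    /* Same folio with one entry already gone: fall back to the existing
     * one-PTE-at-a-time path. */
    unsigned long partial[4] = { 100, 0, 102, 103 };

    printf("fully mapped folio   : batch=%d\n", toy_can_batch_unmap(full, 4, 100));
    printf("partially mapped one : batch=%d\n", toy_can_batch_unmap(partial, 4, 100));
    return 0;
}

Only when the whole folio is mapped contiguously can all of its PTEs,
rmap entries and the mm counter be updated in a single step, which is what
avoids the deferred_split churn measured below.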
The below microbench tries to reclaim 128MB lazyfree large folios whose
sizes are 64KiB:

 #include <stdio.h>
 #include <sys/mman.h>
 #include <string.h>
 #include <time.h>

 #define SIZE 128*1024*1024 // 128 MB

 unsigned long read_split_deferred()
 {
     FILE *file = fopen("/sys/kernel/mm/transparent_hugepage"
             "/hugepages-64kB/stats/split_deferred", "r");
     if (!file) {
         perror("Error opening file");
         return 0;
     }

     unsigned long value;
     if (fscanf(file, "%lu", &value) != 1) {
         perror("Error reading value");
         fclose(file);
         return 0;
     }

     fclose(file);
     return value;
 }

 int main(int argc, char *argv[])
 {
     while(1) {
         volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

         memset((void *)p, 1, SIZE);
         madvise((void *)p, SIZE, MADV_FREE);

         clock_t start_time = clock();
         unsigned long start_split = read_split_deferred();
         madvise((void *)p, SIZE, MADV_PAGEOUT);
         clock_t end_time = clock();
         unsigned long end_split = read_split_deferred();

         double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
         printf("Time taken by reclamation: %f seconds, split_deferred: %ld\n",
                 elapsed_time, end_split - start_split);

         munmap((void *)p, SIZE);
     }
     return 0;
 }

w/o patch:
~ # ./a.out
Time taken by reclamation: 0.177418 seconds, split_deferred: 2048
Time taken by reclamation: 0.178348 seconds, split_deferred: 2048
Time taken by reclamation: 0.174525 seconds, split_deferred: 2048
Time taken by reclamation: 0.171620 seconds, split_deferred: 2048
Time taken by reclamation: 0.172241 seconds, split_deferred: 2048
Time taken by reclamation: 0.174003 seconds, split_deferred: 2048
Time taken by reclamation: 0.171058 seconds, split_deferred: 2048
Time taken by reclamation: 0.171993 seconds, split_deferred: 2048
Time taken by reclamation: 0.169829 seconds, split_deferred: 2048
Time taken by reclamation: 0.172895 seconds, split_deferred: 2048
Time taken by reclamation: 0.176063 seconds, split_deferred: 2048
Time taken by reclamation: 0.172568 seconds, split_deferred: 2048
Time taken by reclamation: 0.171185 seconds, split_deferred: 2048
Time taken by reclamation: 0.170632 seconds, split_deferred: 2048
Time taken by reclamation: 0.170208 seconds, split_deferred: 2048
Time taken by reclamation: 0.174192 seconds, split_deferred: 2048
...

w/ patch:
~ # ./a.out
Time taken by reclamation: 0.074231 seconds, split_deferred: 0
Time taken by reclamation: 0.071026 seconds, split_deferred: 0
Time taken by reclamation: 0.072029 seconds, split_deferred: 0
Time taken by reclamation: 0.071873 seconds, split_deferred: 0
Time taken by reclamation: 0.073573 seconds, split_deferred: 0
Time taken by reclamation: 0.071906 seconds, split_deferred: 0
Time taken by reclamation: 0.073604 seconds, split_deferred: 0
Time taken by reclamation: 0.075903 seconds, split_deferred: 0
Time taken by reclamation: 0.073191 seconds, split_deferred: 0
Time taken by reclamation: 0.071228 seconds, split_deferred: 0
Time taken by reclamation: 0.071391 seconds, split_deferred: 0
Time taken by reclamation: 0.071468 seconds, split_deferred: 0
Time taken by reclamation: 0.071896 seconds, split_deferred: 0
Time taken by reclamation: 0.072508 seconds, split_deferred: 0
Time taken by reclamation: 0.071884 seconds, split_deferred: 0
Time taken by reclamation: 0.072433 seconds, split_deferred: 0
Time taken by reclamation: 0.071939 seconds, split_deferred: 0
...
Signed-off-by: Barry Song
---
 mm/rmap.c | 72 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 50 insertions(+), 22 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 89e51a7a9509..8786704bd466 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1781,6 +1781,25 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
 #endif
 }
 
+/* We support batch unmapping of PTEs for lazyfree large folios */
+static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
+            struct folio *folio, pte_t *ptep)
+{
+    const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
+    int max_nr = folio_nr_pages(folio);
+    pte_t pte = ptep_get(ptep);
+
+    if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
+        return false;
+    if (pte_unused(pte))
+        return false;
+    if (pte_pfn(pte) != folio_pfn(folio))
+        return false;
+
+    return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
+                           NULL, NULL) == max_nr;
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1794,6 +1813,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
     struct page *subpage;
     struct mmu_notifier_range range;
     enum ttu_flags flags = (enum ttu_flags)(long)arg;
+    unsigned long nr_pages = 1, end_addr;
     unsigned long pfn;
     unsigned long hsz = 0;
 
@@ -1933,23 +1953,26 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
             if (pte_dirty(pteval))
                 folio_mark_dirty(folio);
         } else if (likely(pte_present(pteval))) {
-            flush_cache_page(vma, address, pfn);
-            /* Nuke the page table entry. */
-            if (should_defer_flush(mm, flags)) {
-                /*
-                 * We clear the PTE but do not flush so potentially
-                 * a remote CPU could still be writing to the folio.
-                 * If the entry was previously clean then the
-                 * architecture must guarantee that a clear->dirty
-                 * transition on a cached TLB entry is written through
-                 * and traps if the PTE is unmapped.
-                 */
-                pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+            if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
+                can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
+                nr_pages = folio_nr_pages(folio);
+            end_addr = address + nr_pages * PAGE_SIZE;
+            flush_cache_range(vma, address, end_addr);
 
-                set_tlb_ubc_flush_pending(mm, pteval, address, address + PAGE_SIZE);
-            } else {
-                pteval = ptep_clear_flush(vma, address, pvmw.pte);
-            }
+            /* Nuke the page table entry. */
+            pteval = get_and_clear_full_ptes(mm, address, pvmw.pte, nr_pages, 0);
+            /*
+             * We clear the PTE but do not flush so potentially
+             * a remote CPU could still be writing to the folio.
+             * If the entry was previously clean then the
+             * architecture must guarantee that a clear->dirty
+             * transition on a cached TLB entry is written through
+             * and traps if the PTE is unmapped.
+             */
+            if (should_defer_flush(mm, flags))
+                set_tlb_ubc_flush_pending(mm, pteval, address, end_addr);
+            else
+                flush_tlb_range(vma, address, end_addr);
             if (pte_dirty(pteval))
                 folio_mark_dirty(folio);
         } else {
@@ -2027,7 +2050,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
                  * redirtied either using the page table or a previously
                  * obtained GUP reference.
                  */
-                set_pte_at(mm, address, pvmw.pte, pteval);
+                set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
                 folio_set_swapbacked(folio);
                 goto walk_abort;
             } else if (ref_count != 1 + map_count) {
@@ -2040,10 +2063,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
                  * We'll come back here later and detect if the folio was
                  * dirtied when the additional reference is gone.
                  */
-                set_pte_at(mm, address, pvmw.pte, pteval);
+                set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
                 goto walk_abort;
             }
-            dec_mm_counter(mm, MM_ANONPAGES);
+            add_mm_counter(mm, MM_ANONPAGES, -nr_pages);
             goto discard;
         }
 
@@ -2108,13 +2131,18 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
             dec_mm_counter(mm, mm_counter_file(folio));
         }
 discard:
-        if (unlikely(folio_test_hugetlb(folio)))
+        if (unlikely(folio_test_hugetlb(folio))) {
             hugetlb_remove_rmap(folio);
-        else
-            folio_remove_rmap_pte(folio, subpage, vma);
+        } else {
+            folio_remove_rmap_ptes(folio, subpage, nr_pages, vma);
+            folio_ref_sub(folio, nr_pages - 1);
+        }
         if (vma->vm_flags & VM_LOCKED)
             mlock_drain_local();
         folio_put(folio);
+        /* We have already batched the entire folio */
+        if (nr_pages > 1)
+            goto walk_done;
         continue;
 walk_abort:
         ret = false;
From patchwork Fri Feb 14 09:30:15 2025
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com
Subject: [PATCH v4 4/4] mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap
Date: Fri, 14 Feb 2025 22:30:15 +1300
Message-Id: <20250214093015.51024-5-21cnbao@gmail.com>
In-Reply-To: <20250214093015.51024-1-21cnbao@gmail.com>
References: <20250214093015.51024-1-21cnbao@gmail.com>

From: Barry Song

The try_to_unmap_one() function currently handles PMD-mapped THPs
inefficiently. It first splits the PMD into PTEs, copies the dirty state
from the PMD to the PTEs, iterates over the PTEs to locate the dirty state,
and then marks the THP as swap-backed. This process involves unnecessary
PMD splitting and redundant iteration. Instead, this functionality can be
efficiently managed in __discard_anon_folio_pmd_locked(), avoiding the
extra steps and improving performance.
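As a rough sketch of the PMD-level decision this moves into
__discard_anon_folio_pmd_locked(), the following minimal userspace model
(invented names, simplified to a single dirty check) shows how a redirtied
THP is marked swap-backed and a clean one discarded, without any PMD split.

/* Illustrative sketch only: toy model of the PMD-level lazyfree decision. */
#include <stdbool.h>
#include <stdio.h>

struct toy_pmd_thp {
    bool pmd_dirty;    /* dirty bit in the PMD entry   */
    bool folio_dirty;  /* folio-level dirty flag       */
    bool droppable;    /* vma->vm_flags & VM_DROPPABLE */
    bool swapbacked;   /* folio swap-backed flag       */
    int ref_count, map_count;
};

/* Returns true if the lazyfree THP can be discarded whole; false if it
 * must be kept (and, when redirtied, flagged swap-backed again). */
static bool toy_discard_pmd_mapped_thp(struct toy_pmd_thp *t)
{
    if (t->pmd_dirty)
        t->folio_dirty = true;              /* propagate the PMD dirty bit */
    if (t->folio_dirty && !t->droppable) {
        t->swapbacked = true;               /* redirtied: back to swap-backed */
        return false;
    }
    if (t->ref_count != t->map_count + 1)   /* GUP/speculative reference */
        return false;
    return true;                            /* clean: discard the whole THP */
}

int main(void)
{
    struct toy_pmd_thp redirtied = { .pmd_dirty = true, .ref_count = 2, .map_count = 1 };
    struct toy_pmd_thp clean     = { .ref_count = 2, .map_count = 1 };

    printf("redirtied THP discarded? %d (swapbacked=%d)\n",
           toy_discard_pmd_mapped_thp(&redirtied), redirtied.swapbacked);
    printf("clean THP discarded?     %d\n", toy_discard_pmd_mapped_thp(&clean));
    return 0;
}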
The following microbenchmark redirties folios after invoking MADV_FREE,
then measures the time taken to perform memory reclamation (actually set
those folios swapbacked again) on the redirtied folios.

 #include <stdio.h>
 #include <sys/mman.h>
 #include <string.h>
 #include <time.h>

 #define SIZE 128*1024*1024 // 128 MB

 int main(int argc, char *argv[])
 {
     while(1) {
         volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

         memset((void *)p, 1, SIZE);
         madvise((void *)p, SIZE, MADV_FREE);

         /* redirty after MADV_FREE */
         memset((void *)p, 1, SIZE);

         clock_t start_time = clock();
         madvise((void *)p, SIZE, MADV_PAGEOUT);
         clock_t end_time = clock();

         double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
         printf("Time taken by reclamation: %f seconds\n", elapsed_time);

         munmap((void *)p, SIZE);
     }
     return 0;
 }

Testing results are as below,

w/o patch:
~ # ./a.out
Time taken by reclamation: 0.007300 seconds
Time taken by reclamation: 0.007226 seconds
Time taken by reclamation: 0.007295 seconds
Time taken by reclamation: 0.007731 seconds
Time taken by reclamation: 0.007134 seconds
Time taken by reclamation: 0.007285 seconds
Time taken by reclamation: 0.007720 seconds
Time taken by reclamation: 0.007128 seconds
Time taken by reclamation: 0.007710 seconds
Time taken by reclamation: 0.007712 seconds
Time taken by reclamation: 0.007236 seconds
Time taken by reclamation: 0.007690 seconds
Time taken by reclamation: 0.007174 seconds
Time taken by reclamation: 0.007670 seconds
Time taken by reclamation: 0.007169 seconds
Time taken by reclamation: 0.007305 seconds
Time taken by reclamation: 0.007432 seconds
Time taken by reclamation: 0.007158 seconds
Time taken by reclamation: 0.007133 seconds
…

w/ patch
~ # ./a.out
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002116 seconds
Time taken by reclamation: 0.002150 seconds
Time taken by reclamation: 0.002261 seconds
Time taken by reclamation: 0.002137 seconds
Time taken by reclamation: 0.002173 seconds
Time taken by reclamation: 0.002063 seconds
Time taken by reclamation: 0.002088 seconds
Time taken by reclamation: 0.002169 seconds
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002111 seconds
Time taken by reclamation: 0.002224 seconds
Time taken by reclamation: 0.002297 seconds
Time taken by reclamation: 0.002260 seconds
Time taken by reclamation: 0.002246 seconds
Time taken by reclamation: 0.002272 seconds
Time taken by reclamation: 0.002277 seconds
Time taken by reclamation: 0.002462 seconds
…

This patch significantly speeds up try_to_unmap_one() by allowing it to
skip redirtied THPs without splitting the PMD.

Suggested-by: Baolin Wang
Suggested-by: Lance Yang
Signed-off-by: Barry Song
Reviewed-by: Baolin Wang
Reviewed-by: Lance Yang
---
 mm/huge_memory.c | 24 +++++++++++++++++-------
 mm/rmap.c        | 13 ++++++++++---
 2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2eda2a9ec8fc..ab80348f33dd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3176,8 +3176,12 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
     int ref_count, map_count;
     pmd_t orig_pmd = *pmdp;
 
-    if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
+    if (pmd_dirty(orig_pmd))
+        folio_set_dirty(folio);
+    if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+        folio_set_swapbacked(folio);
         return false;
+    }
 
     orig_pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
 
@@ -3204,8 +3208,15 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
      *
      * The only folio refs must be one from isolation plus the rmap(s).
      */
-    if (folio_test_dirty(folio) || pmd_dirty(orig_pmd) ||
-        ref_count != map_count + 1) {
+    if (pmd_dirty(orig_pmd))
+        folio_set_dirty(folio);
+    if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
+        folio_set_swapbacked(folio);
+        set_pmd_at(mm, addr, pmdp, orig_pmd);
+        return false;
+    }
+
+    if (ref_count != map_count + 1) {
         set_pmd_at(mm, addr, pmdp, orig_pmd);
         return false;
     }
@@ -3225,12 +3236,11 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 {
     VM_WARN_ON_FOLIO(!folio_test_pmd_mappable(folio), folio);
     VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
+    VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
+    VM_WARN_ON_FOLIO(folio_test_swapbacked(folio), folio);
     VM_WARN_ON_ONCE(!IS_ALIGNED(addr, HPAGE_PMD_SIZE));
 
-    if (folio_test_anon(folio) && !folio_test_swapbacked(folio))
-        return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
-
-    return false;
+    return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
 }
 
 static void remap_page(struct folio *folio, unsigned long nr, int flags)
diff --git a/mm/rmap.c b/mm/rmap.c
index 8786704bd466..bcec8677f68d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1863,9 +1863,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
         }
 
         if (!pvmw.pte) {
-            if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
-                                      folio))
-                goto walk_done;
+            if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
+                if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd, folio))
+                    goto walk_done;
+                /*
+                 * unmap_huge_pmd_locked has either already marked
+                 * the folio as swap-backed or decided to retain it
+                 * due to GUP or speculative references.
+                 */
+                goto walk_abort;
+            }
 
             if (flags & TTU_SPLIT_HUGE_PMD) {
                 /*