From patchwork Thu Oct 17 09:47:20 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13839728
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v1 1/7] mm: khugepaged: retract_page_tables() use pte_offset_map_lock()
Date: Thu, 17 Oct 2024 17:47:20 +0800
Message-Id: <258de4356bdcc01bce0ff1f6c29b2b64a4211494.1729157502.git.zhengqi.arch@bytedance.com>

In retract_page_tables(), we may modify the pmd entry after acquiring the
pml and ptl, so we should also check whether the pmd entry is stable. Use
pte_offset_map_lock() to do so, which also lets us drop the call to
pte_lockptr().

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/khugepaged.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 94feb85ce996c..b4f49d323c8d9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1721,6 +1721,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		spinlock_t *pml;
 		spinlock_t *ptl;
 		bool skipped_uffd = false;
+		pte_t *pte;
 
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1757,9 +1758,15 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		mmu_notifier_invalidate_range_start(&range);
 
 		pml = pmd_lock(mm, pmd);
-		ptl = pte_lockptr(mm, pmd);
+		pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+		if (!pte) {
+			spin_unlock(pml);
+			mmu_notifier_invalidate_range_end(&range);
+			continue;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+		pte_unmap(pte);
 
 		/*
 		 * Huge page lock is still held, so normally the page table

From patchwork Thu Oct 17 09:47:21 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13839729
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v1 2/7] mm: make zap_pte_range() handle full within-PMD range
Date: Thu, 17 Oct 2024 17:47:21 +0800

In preparation for reclaiming empty PTE pages, this commit first makes
zap_pte_range() handle the full within-PMD range, so that subsequent
commits can more easily detect and free PTE pages in this function.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/memory.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index caa6ed0a7fe5b..fd57c0f49fce2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1602,6 +1602,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	swp_entry_t entry;
 	int nr;
 
+retry:
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	init_rss_vec(rss);
 	start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
@@ -1706,6 +1707,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	if (force_flush)
 		tlb_flush_mmu(tlb);
 
+	if (addr != end) {
+		cond_resched();
+		goto retry;
+	}
+
 	return addr;
 }
 
@@ -1744,8 +1750,6 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
 			continue;
 		}
 		addr = zap_pte_range(tlb, vma, pmd, addr, next, details);
-		if (addr != next)
-			pmd--;
 	} while (pmd++, cond_resched(), addr != end);
 
 	return addr;
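
Note on the 2/7 control flow: the change moves the "did not finish, try
again" handling from the caller into zap_pte_range() itself. The following
is only a user-space sketch of that shape, not kernel code. zap_batch(),
zap_range(), BATCH_MAX and the slots array are hypothetical names invented
for illustration; BATCH_MAX stands in for whatever makes the real loop stop
early.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit standing in for the conditions that make the real
 * loop break out before reaching the end of the range. */
#define BATCH_MAX 4

/* Clear at most BATCH_MAX slots in [addr, end); return where we stopped. */
static size_t zap_batch(int *slots, size_t addr, size_t end)
{
	size_t done = 0;

	while (addr != end && done < BATCH_MAX) {
		slots[addr++] = 0;
		done++;
	}
	return addr;
}

/* Model of the reworked function: retry internally until the whole range
 * has been handled, so the value returned always equals end. */
static size_t zap_range(int *slots, size_t addr, size_t end,
			unsigned int *passes)
{
	assert(addr <= end);
retry:
	addr = zap_batch(slots, addr, end);
	(*passes)++;
	if (addr != end)
		goto retry;	/* the kernel patch calls cond_resched() first */
	return addr;
}
```

With ten slots and a batch limit of four, zap_range(slots, 0, 10, &passes)
makes three passes and returns 10, so a caller no longer has to compare the
result against the end of the range and step back.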

From patchwork Thu Oct 17 09:47:22 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13839730
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v1 3/7] mm: zap_install_uffd_wp_if_needed: return whether uffd-wp pte has been re-installed
Date: Thu, 17 Oct 2024 17:47:22 +0800
Message-Id: <37983544cb4fbd27c244b68d3e37d170ea51f444.1729157502.git.zhengqi.arch@bytedance.com>

In some cases, we'll replace the none pte with an uffd-wp swap special pte
marker when necessary. Let's expose this information to the caller through
the return value, so that subsequent commits can use this information to
detect whether the PTE page is empty.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/mm_inline.h | 11 ++++++++---
 mm/memory.c               | 15 +++++++++++----
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 6f801c7b36e2f..c3ba1da418caf 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -552,8 +552,10 @@ static inline pte_marker copy_pte_marker(
  * pte, and if they see it, they'll fault and serialize at the pgtable lock.
  *
  * This function is a no-op if PTE_MARKER_UFFD_WP is not enabled.
+ *
+ * Returns true if an uffd-wp pte was installed, false otherwise.
  */
-static inline void
+static inline bool
 pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 			      pte_t *pte, pte_t pteval)
 {
@@ -570,7 +572,7 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 	 * with a swap pte. There's no way of leaking the bit.
 	 */
 	if (vma_is_anonymous(vma) || !userfaultfd_wp(vma))
-		return;
+		return false;
 
 	/* A uffd-wp wr-protected normal pte */
 	if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
@@ -583,10 +585,13 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 	if (unlikely(pte_swp_uffd_wp_any(pteval)))
 		arm_uffd_pte = true;
 
-	if (unlikely(arm_uffd_pte))
+	if (unlikely(arm_uffd_pte)) {
 		set_pte_at(vma->vm_mm, addr, pte,
 			   make_pte_marker(PTE_MARKER_UFFD_WP));
+		return true;
+	}
 #endif
+	return false;
 }
 
 static inline bool vma_has_recency(struct vm_area_struct *vma)
diff --git a/mm/memory.c b/mm/memory.c
index fd57c0f49fce2..534d9d52b5ebe 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1467,27 +1467,34 @@ static inline bool zap_drop_file_uffd_wp(struct zap_details *details)
 /*
  * This function makes sure that we'll replace the none pte with an uffd-wp
  * swap special pte marker when necessary. Must be with the pgtable lock held.
+ *
+ * Returns true if uffd-wp ptes were installed, false otherwise.
  */
-static inline void
+static inline bool
 zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 			      unsigned long addr, pte_t *pte, int nr,
 			      struct zap_details *details, pte_t pteval)
 {
+	bool was_installed = false;
+
 	/* Zap on anonymous always means dropping everything */
 	if (vma_is_anonymous(vma))
-		return;
+		return false;
 
 	if (zap_drop_file_uffd_wp(details))
-		return;
+		return false;
 
 	for (;;) {
 		/* the PFN in the PTE is irrelevant. */
-		pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
+		if (pte_install_uffd_wp_if_needed(vma, addr, pte, pteval))
+			was_installed = true;
 		if (--nr == 0)
 			break;
 		pte++;
 		addr += PAGE_SIZE;
 	}
+
+	return was_installed;
 }
 
 static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
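
Note on the 3/7 return value: the caller needs to know whether any marker
was written back, because a table that holds a marker is not empty. The
following is only a user-space sketch of that idea, not kernel code. enum
slot, install_marker_if_needed(), install_markers() and table_is_empty()
are hypothetical names invented for illustration; as in the patch, one old
value is applied to the whole batch of nr entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry states: empty, write-protected, or holding a marker. */
enum slot { SLOT_NONE, SLOT_WP, SLOT_MARKER };

/* Per-entry helper: write a marker back only if the old value was
 * write-protected, and report whether it did so. */
static bool install_marker_if_needed(enum slot *slot, enum slot old)
{
	if (old != SLOT_WP)
		return false;
	*slot = SLOT_MARKER;
	return true;
}

/* Batch helper: remember whether any entry got a marker. */
static bool install_markers(enum slot *slots, size_t nr, enum slot old)
{
	bool was_installed = false;

	assert(nr > 0);
	for (;;) {
		if (install_marker_if_needed(slots, old))
			was_installed = true;
		if (--nr == 0)
			break;
		slots++;
	}
	return was_installed;
}

/* A table is empty only if every entry is SLOT_NONE. */
static bool table_is_empty(const enum slot *slots, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (slots[i] != SLOT_NONE)
			return false;
	return true;
}
```

When install_markers() returns true, table_is_empty() on the same entries
returns false, which is the information the later patches in this series
want to have without rescanning the table.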

From patchwork Thu Oct 17 09:47:23 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13839731
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v1 4/7] mm: zap_present_ptes: return whether the PTE page is unreclaimable
Date: Thu, 17 Oct 2024 17:47:23 +0800
Message-Id: <84a9fddde9993e4a5108f188193fd9c8ff1c5c31.1729157502.git.zhengqi.arch@bytedance.com>
CPnP4oYkV/Y3jsL7ZYVHmuaI8dtrVlf+9fdv4VUXac8PMuWqK11oYIuBYodLo7spUJLRqFGbP/OIcFB3VwgSE6nfaf9J4sk1GCdTrSGcoXePK6dr7VMK8hDpvG8IL/HcVdi51NDUlodpM7NWQs1mdBt9bvaZFwiVo2AOdDUHx+KMhk+3lU5F1uoTKB/KjJxsU0A/njmPzGGLN5aXwkwlP/PSYlwW68ufEuzbtlGZTdWm7F8wLa9M8o1iVsZHJvhT0vw0o9IUh4mvoFY5DdHSimB3Xyaw5DlRPdStI0unpqVvxro89iY/TP3pd9acF6lUs/tG5vMa546M9MhDxOYKg1fqa34hyKEmwsedaR4Bg7TuUWLw= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In the following two cases, the PTE page cannot be empty and cannot be reclaimed: 1. an uffd-wp pte was re-installed 2. should_zap_folio() return false Let's expose this information to the caller through is_pt_unreclaimable, so that subsequent commits can use this information in zap_pte_range() to detect whether the PTE page can be reclaimed. Signed-off-by: Qi Zheng --- mm/memory.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 534d9d52b5ebe..cc89ede8ce2ab 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1501,7 +1501,7 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, struct folio *folio, struct page *page, pte_t *pte, pte_t ptent, unsigned int nr, unsigned long addr, struct zap_details *details, int *rss, - bool *force_flush, bool *force_break) + bool *force_flush, bool *force_break, bool *is_pt_unreclaimable) { struct mm_struct *mm = tlb->mm; bool delay_rmap = false; @@ -1527,8 +1527,8 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb, arch_check_zapped_pte(vma, ptent); tlb_remove_tlb_entries(tlb, pte, nr, addr); if (unlikely(userfaultfd_pte_wp(vma, ptent))) - zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, - ptent); + *is_pt_unreclaimable = + zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent); if (!delay_rmap) { 
folio_remove_rmap_ptes(folio, page, nr, vma); @@ -1552,7 +1552,7 @@ static inline int zap_present_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, pte_t *pte, pte_t ptent, unsigned int max_nr, unsigned long addr, struct zap_details *details, int *rss, bool *force_flush, - bool *force_break) + bool *force_break, bool *is_pt_unreclaimable) { const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY; struct mm_struct *mm = tlb->mm; @@ -1567,15 +1567,18 @@ static inline int zap_present_ptes(struct mmu_gather *tlb, arch_check_zapped_pte(vma, ptent); tlb_remove_tlb_entry(tlb, pte, addr); if (userfaultfd_pte_wp(vma, ptent)) - zap_install_uffd_wp_if_needed(vma, addr, pte, 1, - details, ptent); + *is_pt_unreclaimable = + zap_install_uffd_wp_if_needed(vma, addr, pte, 1, + details, ptent); ksm_might_unmap_zero_page(mm, ptent); return 1; } folio = page_folio(page); - if (unlikely(!should_zap_folio(details, folio))) + if (unlikely(!should_zap_folio(details, folio))) { + *is_pt_unreclaimable = true; return 1; + } /* * Make sure that the common "small folio" case is as fast as possible @@ -1587,11 +1590,12 @@ static inline int zap_present_ptes(struct mmu_gather *tlb, zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr, addr, details, rss, force_flush, - force_break); + force_break, is_pt_unreclaimable); return nr; } zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, 1, addr, - details, rss, force_flush, force_break); + details, rss, force_flush, force_break, + is_pt_unreclaimable); return 1; } @@ -1622,6 +1626,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t ptent = ptep_get(pte); struct folio *folio; struct page *page; + bool is_pt_unreclaimable = false; int max_nr; nr = 1; @@ -1635,7 +1640,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, max_nr = (end - addr) / PAGE_SIZE; nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr, addr, details, rss, &force_flush, - &force_break); + &force_break, &is_pt_unreclaimable); 
if (unlikely(force_break)) { addr += nr * PAGE_SIZE; break; From patchwork Thu Oct 17 09:47:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13839732 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80B0ED21269 for ; Thu, 17 Oct 2024 09:48:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1FC5E6B009D; Thu, 17 Oct 2024 05:48:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1D44C6B009E; Thu, 17 Oct 2024 05:48:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04D6F6B009F; Thu, 17 Oct 2024 05:48:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id DC0956B009D for ; Thu, 17 Oct 2024 05:48:22 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id BFF07A0F4F for ; Thu, 17 Oct 2024 09:48:02 +0000 (UTC) X-FDA: 82682618256.20.7C12DED Received: from mail-pg1-f180.google.com (mail-pg1-f180.google.com [209.85.215.180]) by imf30.hostedemail.com (Postfix) with ESMTP id 58E9880003 for ; Thu, 17 Oct 2024 09:48:02 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=NzGU5a4H; spf=pass (imf30.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.215.180 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729158355; 
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng
Subject: [PATCH v1 5/7] mm: pgtable: try to reclaim empty PTE page in
 madvise(MADV_DONTNEED)
Date: Thu, 17 Oct 2024 17:47:24 +0800
Message-Id: <6c7fe15b0434a08a287c400869f9ba434e1a8fa3.1729157502.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
In-Reply-To:
References:
MIME-Version: 1.0

To pursue high performance, applications now mostly use high-performance
user-mode memory allocators, such as jemalloc or tcmalloc. These memory
allocators use madvise(MADV_DONTNEED or MADV_FREE) to release physical
memory, but neither MADV_DONTNEED nor MADV_FREE releases page table
memory, which may cause huge page table memory usage.

The following is a memory usage snapshot of one process, taken from one of
our servers:

        VIRT:  55t
        RES:   590g
        VmPTE: 110g

In this case, most of the page table entries are empty. For such a PTE
page where all entries are empty, we can actually free it back to the
system for others to use.

As a first step, this commit aims to synchronously free the empty PTE
pages in the madvise(MADV_DONTNEED) case. We will detect and free empty
PTE pages in zap_pte_range(), and will add zap_details.reclaim_pt to
exclude cases other than madvise(MADV_DONTNEED).

Once an empty PTE page is detected, we first try to hold the pmd lock
within the pte lock. If successful, we clear the pmd entry directly (fast
path). Otherwise, we wait until the pte lock is released, then re-hold the
pmd and pte locks and loop PTRS_PER_PTE times to check pte_none() to
re-detect whether the PTE page is empty and free it (slow path).

For other cases such as madvise(MADV_FREE), consider scanning and freeing
empty PTE pages asynchronously in the future.

The following code snippet can show the effect of optimization:

        mmap 50G
        while (1) {
                for (; i < 1024 * 25; i++) {
                        touch 2M memory
                        madvise MADV_DONTNEED 2M
                }
        }

As we can see, the memory usage of VmPTE is reduced:

                        before                          after
VIRT                   50.0 GB                        50.0 GB
RES                     3.1 MB                         3.1 MB
VmPTE                102640 KB                         240 KB

Signed-off-by: Qi Zheng
---
 include/linux/mm.h |  1 +
 mm/Kconfig         | 14 ++++++++++
 mm/Makefile        |  1 +
 mm/internal.h      | 29 ++++++++++++++++++++
 mm/madvise.c       |  4 ++-
 mm/memory.c        | 47 +++++++++++++++++++++++++++-----
 mm/pt_reclaim.c    | 68 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 156 insertions(+), 8 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index df0a5eac66b78..667a466bb4649 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2321,6 +2321,7 @@ extern void pagefault_out_of_memory(void);
 struct zap_details {
 	struct folio *single_folio;	/* Locked folio to be unmapped */
 	bool even_cows;			/* Zap COWed private pages too? */
+	bool reclaim_pt;
 	zap_flags_t zap_flags;		/* Extra flags for zapping */
 };
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 4b2a1ef9a161c..f5993b9cc2a9f 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1302,6 +1302,20 @@ config ARCH_HAS_USER_SHADOW_STACK
 	  The architecture has hardware support for userspace shadow call
 	  stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
 
+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other that munmap
+	  and exit_mmap path.
+
+	  Note: now only empty user PTE page table pages will be reclaimed.
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index d5639b0361663..9d816323d247a 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -145,3 +145,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/internal.h b/mm/internal.h
index 906da6280c2df..4adaaea0917c0 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1445,4 +1445,33 @@ static inline void accept_page(struct page *page)
 }
 #endif /* CONFIG_UNACCEPTED_MEMORY */
 
+#ifdef CONFIG_PT_RECLAIM
+static inline void set_pt_unreclaimable(bool *can_reclaim_pt)
+{
+	*can_reclaim_pt = false;
+}
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval);
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval);
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb);
+#else
+static inline void set_pt_unreclaimable(bool *can_reclaim_pt)
+{
+}
+static inline bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd,
+					 pmd_t *pmdval)
+{
+	return false;
+}
+static inline void free_pte(struct mm_struct *mm, unsigned long addr,
+			    struct mmu_gather *tlb, pmd_t pmdval)
+{
+}
+static inline void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd,
+				   unsigned long addr, struct mmu_gather *tlb)
+{
+}
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index e871a72a6c329..82a6d15429da7 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -843,7 +843,9 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 					unsigned long start, unsigned long end)
 {
-	zap_page_range_single(vma, start, end - start, NULL);
+	struct zap_details details = {.reclaim_pt = true,};
+
+	zap_page_range_single(vma, start, end - start, &details);
 	return 0;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index cc89ede8ce2ab..77774b34f2cde 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1437,7 +1437,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 static inline bool should_zap_cows(struct zap_details *details)
 {
 	/* By default, zap all pages */
-	if (!details)
+	if (!details || details->reclaim_pt)
 		return true;
 
 	/* Or, we zap COWed pages only if the caller wants to */
@@ -1611,8 +1611,18 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	pte_t *start_pte;
 	pte_t *pte;
 	swp_entry_t entry;
+	pmd_t pmdval;
+	bool can_reclaim_pt = false;
+	bool direct_reclaim;
+	unsigned long start = addr;
 	int nr;
 
+	if (details && details->reclaim_pt)
+		can_reclaim_pt = true;
+
+	if ((ALIGN_DOWN(end, PMD_SIZE)) - (ALIGN(start, PMD_SIZE)) < PMD_SIZE)
+		can_reclaim_pt = false;
+
 retry:
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	init_rss_vec(rss);
@@ -1641,6 +1651,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
 					      addr, details, rss, &force_flush,
 					      &force_break, &is_pt_unreclaimable);
+			if (is_pt_unreclaimable)
+				set_pt_unreclaimable(&can_reclaim_pt);
 			if (unlikely(force_break)) {
 				addr += nr * PAGE_SIZE;
 				break;
@@ -1653,8 +1665,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		    is_device_exclusive_entry(entry)) {
 			page = pfn_swap_entry_to_page(entry);
 			folio = page_folio(page);
-			if (unlikely(!should_zap_folio(details, folio)))
+			if (unlikely(!should_zap_folio(details, folio))) {
+				set_pt_unreclaimable(&can_reclaim_pt);
 				continue;
+			}
 			/*
 			 * Both device private/exclusive mappings should only
 			 * work with anonymous page so far, so we don't need to
@@ -1670,14 +1684,18 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			max_nr = (end - addr) / PAGE_SIZE;
 			nr = swap_pte_batch(pte, max_nr, ptent);
 			/* Genuine swap entries, hence a private anon pages */
-			if (!should_zap_cows(details))
+			if (!should_zap_cows(details)) {
+				set_pt_unreclaimable(&can_reclaim_pt);
 				continue;
+			}
 			rss[MM_SWAPENTS] -= nr;
 			free_swap_and_cache_nr(entry, nr);
 		} else if (is_migration_entry(entry)) {
 			folio = pfn_swap_entry_folio(entry);
-			if (!should_zap_folio(details, folio))
+			if (!should_zap_folio(details, folio)) {
+				set_pt_unreclaimable(&can_reclaim_pt);
 				continue;
+			}
 			rss[mm_counter(folio)]--;
 		} else if (pte_marker_entry_uffd_wp(entry)) {
 			/*
@@ -1685,21 +1703,29 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			 * drop the marker if explicitly requested.
 			 */
 			if (!vma_is_anonymous(vma) &&
-			    !zap_drop_file_uffd_wp(details))
+			    !zap_drop_file_uffd_wp(details)) {
+				set_pt_unreclaimable(&can_reclaim_pt);
 				continue;
+			}
 		} else if (is_hwpoison_entry(entry) ||
 			   is_poisoned_swp_entry(entry)) {
-			if (!should_zap_cows(details))
+			if (!should_zap_cows(details)) {
+				set_pt_unreclaimable(&can_reclaim_pt);
 				continue;
+			}
 		} else {
 			/* We should have covered all the swap entry types */
 			pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
 			WARN_ON_ONCE(1);
 		}
 		clear_not_present_full_ptes(mm, addr, pte, nr, tlb->fullmm);
-		zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
+		if (zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent))
+			set_pt_unreclaimable(&can_reclaim_pt);
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
+	if (addr == end && can_reclaim_pt)
+		direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval);
+
 	add_mm_rss_vec(mm, rss);
 	arch_leave_lazy_mmu_mode();
 
@@ -1724,6 +1750,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		goto retry;
 	}
 
+	if (can_reclaim_pt) {
+		if (direct_reclaim)
+			free_pte(mm, start, tlb, pmdval);
+		else
+			try_to_free_pte(mm, pmd, start, tlb);
+	}
+
 	return addr;
 }
 
diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 0000000000000..fc055da40b615
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+
+#include "internal.h"
+
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval)
+{
+	spinlock_t *pml = pmd_lockptr(mm, pmd);
+
+	if (!spin_trylock(pml))
+		return false;
+
+	*pmdval = pmdp_get_lockless(pmd);
+	pmd_clear(pmd);
+	spin_unlock(pml);
+
+	return true;
+}
+
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval)
+{
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+}
+
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb)
+{
+	pmd_t pmdval;
+	spinlock_t *pml, *ptl;
+	pte_t *start_pte, *pte;
+	int i;
+
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pmdval, &ptl);
+	if (!start_pte)
+		return;
+
+	pml = pmd_lock(mm, pmd);
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd))))
+		goto out_ptl;
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	free_pte(mm, addr, tlb, pmdval);
+
+	return;
+out_ptl:
+	pte_unmap_unlock(start_pte, ptl);
+	if (pml != ptl)
+		spin_unlock(pml);
+}

From patchwork Thu Oct 17 09:47:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13839733
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 Qi Zheng
Subject: [PATCH v1 6/7] x86: mm: free page table pages by RCU instead of semi
 RCU
Date: Thu, 17 Oct 2024 17:47:25 +0800
Message-Id:
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
In-Reply-To:
References:
MIME-Version: 1.0

Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages
will be freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, the page table can be locklessly traversed by disabling IRQ
in paths such as fast GUP. But this is not enough to free the empty PTE
page table pages in paths other than the munmap and exit_mmap paths,
because IPI cannot be synchronized with rcu_read_lock() in
pte_offset_map{_lock}().

In preparation for supporting reclamation of empty PTE page table pages,
let single table also be freed by RCU like batch table freeing. Then we
can also use pte_offset_map() etc to prevent the PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free
the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately
   tlb_remove_table() will not be used to free PGD pages, so it is safe
   to use pt_rcu_head.

 - For pmd_huge_pte, we will do zap_deposited_table() before freeing the
   PMD page, so it is also safe.

Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/tlb.h | 19 +++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 mm/mmu_gather.c            |  9 ++++++++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257b..e223b53a8b190 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,23 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }
 
+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	free_page_and_swap_cache(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#define __tlb_remove_table_one __tlb_remove_table_one
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index fec3815335558..89688921ea62e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,10 +59,17 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241c..69a357b15974a 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask);
 #endif
 
 #ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PT_RECLAIM
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
-#endif
+#else
+static inline
+void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif /* !CONFIG_PT_RECLAIM */
+#endif /* !CONFIG_PARAVIRT */
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0f..d948479ca09e6 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,10 +311,17 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
+#ifndef __tlb_remove_table_one
+static inline void __tlb_remove_table_one(void *table)
+{
+	__tlb_remove_table(table);
+}
+#endif
+
 static void tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
-	__tlb_remove_table(table);
+	__tlb_remove_table_one(table);
 }
 
 static void tlb_table_flush(struct mmu_gather *tlb)

From patchwork Thu Oct 17 09:47:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13839734
2P68XOBMkqVumIXGHMSOh5XEmg7tvDE8n60hP4hoWkQEOTyztEGKfbop+9JSf1LRnmsXtC e+lwfSQTFS4ORxvBqmsW8yAMNv2spLQ= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b="J/ZzEFMx"; spf=pass (imf14.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.210.181 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com Received: by mail-pf1-f181.google.com with SMTP id d2e1a72fcca58-71e49ad46b1so477760b3a.1 for ; Thu, 17 Oct 2024 02:48:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1729158511; x=1729763311; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=rfwXxKs8PIaYIos0iWI11FlNBR0N0b0qINIDJ43u9VQ=; b=J/ZzEFMxH1XO7sNvCRb1/5Oyrto/Iy/vXQ/VbIWLOXyUurPY0ZARM5VRDwbssugqra scY93BywnkLAZlk4GG73cK1yh8z8MNkUjYjMc50uPN7u9N4Gb9jOaHp9tCgca/PzaoaP SXEvGyFtOuJio5XWA/uwYbWPyGT9XoPaJMON4AAMQv0GOv+XsmAODYS8zP7n1EE2Muc4 BFXc5KSaUd92zdYhUwizqKSTScpbF1rjUjGAMdzzQdt78WCDVntVoohK4WAAcNBqZapD bI8AYf/UKzI+ZmFem2YnBl2ySKIhB2y1D1HNaTlYYDshxQ/NGBCsB38hgkP3xaRxC9qE Y4GQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729158511; x=1729763311; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rfwXxKs8PIaYIos0iWI11FlNBR0N0b0qINIDJ43u9VQ=; b=NBoTlHAekpzjHBgxN+pv95KA+EDQ8TZqOjLQDn/6fJ7v5ToexPZ/JZtOQpfOYGtotN 7yXwZSp+Xk5S0jaqhFU5BcG8U3NpEpJbiOuXUZZnephlZf/+qNcWo/2ls2/KQLEjFktq iU+S/GNCsgy0XuN6sJcBHf7iixwKA8jPFPpGC1TjVf69Zo//hVHdT1gdMmNLd9Xc4BhO pGjxtBk6+tcGty4C6T922Gnnd+xtYWZ8eBH3LOM8Rz4myRH7dGBOqxuL1OZJTWacqJ5C Y8RAbRM+VUwmjHWqMnSAFsDN9Kj3tl0lYdcVF/FKQxqfI73ahoLBethY2xfPNi+Ya9y2 agKg== X-Gm-Message-State: 
AOJu0YwUv6y/q5CeN/Z/vYfbaIeRjxE4+sEGymmkrBkjXpM9QOGvCsc8 eM2IxtAEwprcMHRExQW3kWkUQCcDnyAj7mDjWM77WOdNkixX5PgW9KC06v2Z8Ec= X-Google-Smtp-Source: AGHT+IGzAl2OFA6S9xDWJuD2m0DVBLlwiMdwmaXRwCcTvTu0dK+Cc+T0fsktUtSx/wfralK9s8v26A== X-Received: by 2002:a05:6a20:6f91:b0:1cf:3677:1c4a with SMTP id adf61e73a8af0-1d905ecb8e9mr9815945637.16.1729158511599; Thu, 17 Oct 2024 02:48:31 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([63.216.146.178]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71e774a4218sm4385365b3a.120.2024.10.17.02.48.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 17 Oct 2024 02:48:30 -0700 (PDT) From: Qi Zheng To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, jannh@google.com, peterx@redhat.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v1 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64 Date: Thu, 17 Oct 2024 17:47:26 +0800 Message-Id: <0f6e7fb7fb21431710f28df60738f8be98fe9dd9.1729157502.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C6A9510000C X-Stat-Signature: xaoifcrrybpdz95j6o4kggg1g6q81jhe X-HE-Tag: 1729158501-640474 X-HE-Meta: 
U2FsdGVkX1/DUQV3p1LxkYAHzTKY7OqSUDH5vF+mSXRRX4C6kG7O+OLb//R2nGY43trvEkQR0eP/ulFrCpDZfDqva6IhQLlclBIOcmzD03idvT+dYUxxLpi1WvcQ1zi+30BpvCE5ie4LjecZpnsYVRxO84iveVxPNGS7ZIXpLD4axpjgSY0xdRR0WzDt427POLqht5t1nY1Ue95YSaIecoP0dWdj5mCk/DHxx8IrGtIXgw9bTer8StChBJP+KcYWeP8UuhhFMGMIN45Kp26XuStLTcicizdq6xlOhw+S8jipq3vVcX2eLvb2npoTlC3VRJWUeFVuKZOim6mBCZ6NbBE3N18SugVm6Sg0JmeefkeuIB/WCt4Sz6OseR2HBf8l6mpZxyUGLqaPXDvirE/bSmhWsQRpYqLXqWsowYitWs7yuzCf2ydfIUKd3KIJJPrC4IrmEsdPo2jI3uW17tYb/1dLT3Fqi2iDFN6m9/d2JNkKSNI6zvCefT74nr2cMak9D6sXjHYvA1H69MfwhuSwxhry8tUp87JwRgZdn8r3M4epPyrbiHdaZ5VfOayu7tP0YO/NrYX3/B6pWVkMyE+59ENTf8dQIKdUgYNbmVKSg+7gdiayp8Zytpgcgnfz12kM+JmZcZtJ6sggV2/KHPTYWjtigqvabPGVkk3ErIYQxNcB6hcEcGa9kgf/5Ft5nfOMb1CzvlXjYyMyChaho1eISWNdTNfTue2Cxq5nTpqTOPYvwa5JgdYKMJT7h57veI7fWsYLDUU9/T6JWI9avI5o7vBU3U2RDYzGw2HzKCCmMEQcrg5ZEKH1ftaJ+LkaOjxZgwox7aqFXjnrcQLuEUwe/ZkCm588UhBobXqDh+B8hjVvbjbmjoonN4zInCdPxLFx1CpJUzjx//pIWGizDuH5NAyhLR2nJlhsgM9u8E2Y4R0l5yyCSiUiz2hb+j6jzViPAEgxMsT6xsvYoqzATGv HaTMoRKQ XhkQJvnqQj8Sxb8gwBAWeM9VufPI32HiJdXbW+qbuXMcKfuAkS3Qv/uST7s2M5aoQ15Pnv4tm1ikf+g4OzyaCziegNKkI0do26xrwvV31oiBNUxHSVmBDSxc+CpgQDjPlc83TXALf/ceXneSD0cKAJsbtHGNSYzKO3RJ2BOK7gCXFTcKDteUagTmMUlxk0v0EmkzuJNRNb2/ojz0aSVMsJUEbW2r/+w/eter6DJUH9J6JtEbbi4VSsZq24U3NrS3gL7aIhcWWSgdBfRdc7OOyw5FAdo0PvfkBgNZEw+9uxHhP7rMhvx8Xl3gF/njhdvThg4pIY6+t6Wg7yrXU5i1DwELnLDvVI0b1WY2A6lq4tk6XZJE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000489, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now, x86 has fully supported the CONFIG_PT_RECLAIM feature, and reclaiming PTE pages is profitable only on 64-bit systems, so select ARCH_SUPPORTS_PT_RECLAIM if X86_64. 
Signed-off-by: Qi Zheng --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 1ea18662942c9..69a20cb9ddd81 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -319,6 +319,7 @@ config X86 select FUNCTION_ALIGNMENT_4B imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE + select ARCH_SUPPORTS_PT_RECLAIM if X86_64 config INSTRUCTION_DECODER def_bool y
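
For readers unfamiliar with the pattern, the RCU-deferred single-table free that the x86 tlb.h change in this series introduces can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: `struct fake_table`, `fake_call_rcu()`, `fake_grace_period_end()`, `table_free_rcu()`, `table_free_deferred()` and `demo_deferred_free()` are all hypothetical stand-ins, and the "grace period" is a simple list drain rather than real RCU. What it shows is the two steps from the patch: queue a callback through a head embedded in the object, then recover the object with container_of() inside the callback and free it there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (assumption). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Same idea as the kernel macro: map an embedded member back to its owner. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical page table page with an embedded callback head. */
struct fake_table {
	int id;
	struct rcu_head rcu_head;
};

static struct rcu_head *pending;	/* callbacks waiting for the grace period */
static int freed_count;

/* Toy call_rcu(): only queues the callback, nothing is freed yet. */
static void fake_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Toy end of grace period: run every queued callback. */
static void fake_grace_period_end(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Callback: recover the table from its embedded head, then free it. */
static void table_free_rcu(struct rcu_head *head)
{
	struct fake_table *t = container_of(head, struct fake_table, rcu_head);

	free(t);
	freed_count++;
}

/* Deferred free of one table, in the shape of __tlb_remove_table_one(). */
static void table_free_deferred(void *table)
{
	struct fake_table *t = table;

	fake_call_rcu(&t->rcu_head, table_free_rcu);
}

/* Returns the number of tables freed, or -1 if one was freed too early. */
static int demo_deferred_free(void)
{
	struct fake_table *a = malloc(sizeof(*a));
	struct fake_table *b = malloc(sizeof(*b));
	int freed_before;

	if (!a || !b) {
		free(a);
		free(b);
		return -1;
	}

	freed_count = 0;
	table_free_deferred(a);
	table_free_deferred(b);
	freed_before = freed_count;	/* callbacks have not run yet */

	fake_grace_period_end();
	return freed_before == 0 ? freed_count : -1;
}
```

Calling `demo_deferred_free()` queues two tables and frees them only once the toy grace period ends. In the kernel, the equivalent guarantee is what lets a lockless walker that holds the RCU read lock keep using a PTE page safely until it leaves its read-side critical section.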