From patchwork Fri Feb 9 22:15:00 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551940
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v2 01/10] mm/memory: factor out zapping of present pte into zap_present_pte()
Date: Fri, 9 Feb 2024 23:15:00 +0100
Message-ID: <20240209221509.585251-2-david@redhat.com>
In-Reply-To: <20240209221509.585251-1-david@redhat.com>
References: <20240209221509.585251-1-david@redhat.com>

Let's prepare for further changes by factoring out processing of
present PTEs.
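To make the split easier to follow: the new helper does the per-PTE work and
reports back through two bool out-parameters, and the caller only tests those
to decide whether to force a TLB flush or leave the loop early. A minimal
stand-alone sketch of that control-flow shape (names are illustrative, not
the kernel's code):

#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative stand-in for zap_present_pte(): do one item's work and
 * report back, through out-parameters, whether the caller must flush
 * or stop iterating.
 */
static void process_one(int i, bool *force_flush, bool *force_break)
{
	if (i == 3)
		*force_flush = true;	/* e.g. rmap removal must be delayed */
	if (i == 5)
		*force_break = true;	/* caller should leave the loop */
}

int main(void)
{
	bool force_flush = false, force_break = false;

	for (int i = 0; i < 10; i++) {
		process_one(i, &force_flush, &force_break);
		if (force_break)
			break;		/* mirrors the early break in the caller */
	}
	printf("flush=%d break=%d\n", force_flush, force_break);
	return 0;
}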
Signed-off-by: David Hildenbrand
Reviewed-by: Ryan Roberts
---
 mm/memory.c | 94 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 53 insertions(+), 41 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7c3ca41a7610..5b0dc33133a6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1532,13 +1532,61 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 	pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
 }
 
+static inline void zap_present_pte(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned long addr, struct zap_details *details,
+		int *rss, bool *force_flush, bool *force_break)
+{
+	struct mm_struct *mm = tlb->mm;
+	struct folio *folio = NULL;
+	bool delay_rmap = false;
+	struct page *page;
+
+	page = vm_normal_page(vma, addr, ptent);
+	if (page)
+		folio = page_folio(page);
+
+	if (unlikely(!should_zap_folio(details, folio)))
+		return;
+	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+	arch_check_zapped_pte(vma, ptent);
+	tlb_remove_tlb_entry(tlb, pte, addr);
+	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
+	if (unlikely(!page)) {
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
+
+	if (!folio_test_anon(folio)) {
+		if (pte_dirty(ptent)) {
+			folio_mark_dirty(folio);
+			if (tlb_delay_rmap(tlb)) {
+				delay_rmap = true;
+				*force_flush = true;
+			}
+		}
+		if (pte_young(ptent) && likely(vma_has_recency(vma)))
+			folio_mark_accessed(folio);
+	}
+	rss[mm_counter(folio)]--;
+	if (!delay_rmap) {
+		folio_remove_rmap_pte(folio, page, vma);
+		if (unlikely(page_mapcount(page) < 0))
+			print_bad_pte(vma, addr, ptent, page);
+	}
+	if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
+		*force_flush = true;
+		*force_break = true;
+	}
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
 				struct zap_details *details)
 {
+	bool force_flush = false, force_break = false;
 	struct mm_struct *mm = tlb->mm;
-	int force_flush = 0;
 	int rss[NR_MM_COUNTERS];
 	spinlock_t *ptl;
 	pte_t *start_pte;
@@ -1555,7 +1603,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	arch_enter_lazy_mmu_mode();
 	do {
 		pte_t ptent = ptep_get(pte);
-		struct folio *folio = NULL;
+		struct folio *folio;
 		struct page *page;
 
 		if (pte_none(ptent))
@@ -1565,45 +1613,9 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			break;
 
 		if (pte_present(ptent)) {
-			unsigned int delay_rmap;
-
-			page = vm_normal_page(vma, addr, ptent);
-			if (page)
-				folio = page_folio(page);
-
-			if (unlikely(!should_zap_folio(details, folio)))
-				continue;
-			ptent = ptep_get_and_clear_full(mm, addr, pte,
-							tlb->fullmm);
-			arch_check_zapped_pte(vma, ptent);
-			tlb_remove_tlb_entry(tlb, pte, addr);
-			zap_install_uffd_wp_if_needed(vma, addr, pte, details,
-						      ptent);
-			if (unlikely(!page)) {
-				ksm_might_unmap_zero_page(mm, ptent);
-				continue;
-			}
-
-			delay_rmap = 0;
-			if (!folio_test_anon(folio)) {
-				if (pte_dirty(ptent)) {
-					folio_mark_dirty(folio);
-					if (tlb_delay_rmap(tlb)) {
-						delay_rmap = 1;
-						force_flush = 1;
-					}
-				}
-				if (pte_young(ptent) && likely(vma_has_recency(vma)))
-					folio_mark_accessed(folio);
-			}
-			rss[mm_counter(folio)]--;
-			if (!delay_rmap) {
-				folio_remove_rmap_pte(folio, page, vma);
-				if (unlikely(page_mapcount(page) < 0))
-					print_bad_pte(vma, addr, ptent, page);
-			}
-			if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
-				force_flush = 1;
+			zap_present_pte(tlb, vma, pte, ptent, addr, details,
+					rss, &force_flush, &force_break);
+			if (unlikely(force_break)) {
+				addr += PAGE_SIZE;
 				break;
 			}
From patchwork Fri Feb 9 22:15:01 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551941
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v2 02/10] mm/memory: handle !page case in zap_present_pte() separately
Date: Fri, 9 Feb 2024 23:15:01 +0100
Message-ID: <20240209221509.585251-3-david@redhat.com>
In-Reply-To: <20240209221509.585251-1-david@redhat.com>
References: <20240209221509.585251-1-david@redhat.com>

We don't need up-to-date accessed/dirty bits, so in theory we could
replace ptep_get_and_clear_full() by an optimized ptep_clear_full()
function. Let's rely on the provided pte.
Further, there is no scenario where we would have to insert uffd-wp
markers when zapping something that is not a normal page (i.e., the
zeropage). Add a sanity check to make sure this remains true.

should_zap_folio() no longer has to handle NULL pointers. This change
replaces 2/3 of the "!page/!folio" checks by a single "!page" one.

Note that arch_check_zapped_pte() on x86-64 checks the HW-dirty bit to
detect shadow stack entries. But for shadow stack entries, the HW dirty
bit (in combination with non-writable PTEs) is set by software. So for
the arch_check_zapped_pte() check, we don't have to sync against HW
setting the HW dirty bit concurrently; it is always set.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5b0dc33133a6..4da6923709b2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1497,10 +1497,6 @@ static inline bool should_zap_folio(struct zap_details *details,
 	if (should_zap_cows(details))
 		return true;
 
-	/* E.g. the caller passes NULL for the case of a zero folio */
-	if (!folio)
-		return true;
-
 	/* Otherwise we should only zap non-anon folios */
 	return !folio_test_anon(folio);
 }
@@ -1538,24 +1534,28 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 		int *rss, bool *force_flush, bool *force_break)
 {
 	struct mm_struct *mm = tlb->mm;
-	struct folio *folio = NULL;
 	bool delay_rmap = false;
+	struct folio *folio;
 	struct page *page;
 
 	page = vm_normal_page(vma, addr, ptent);
-	if (page)
-		folio = page_folio(page);
+	if (!page) {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		arch_check_zapped_pte(vma, ptent);
+		tlb_remove_tlb_entry(tlb, pte, addr);
+		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
 
+	folio = page_folio(page);
 	if (unlikely(!should_zap_folio(details, folio)))
 		return;
 	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
 	arch_check_zapped_pte(vma, ptent);
 	tlb_remove_tlb_entry(tlb, pte, addr);
 	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
-	if (unlikely(!page)) {
-		ksm_might_unmap_zero_page(mm, ptent);
-		return;
-	}
 
 	if (!folio_test_anon(folio)) {
 		if (pte_dirty(ptent)) {
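The structural idea of this change is to take the special case out of the
main flow: return early when there is no struct page, so everything after
that point can assume a folio exists and a single "!page" test suffices. A
small stand-alone illustration of the pattern (user-space C, not the kernel
code):

#include <stdio.h>

struct page { int id; };

/* Early-return style: the special case is handled once, up front. */
static void zap_one(struct page *page)
{
	if (!page) {			/* e.g. zeropage: no folio to manage */
		printf("special-case teardown only\n");
		return;
	}

	/* From here on, "page" is known to be valid; no further NULL checks. */
	printf("folio bookkeeping and teardown for page %d\n", page->id);
}

int main(void)
{
	struct page p = { .id = 7 };

	zap_one(&p);
	zap_one(NULL);
	return 0;
}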
From patchwork Fri Feb 9 22:15:02 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551942
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v2 03/10] mm/memory: further separate anon and pagecache folio handling in zap_present_pte() Date: Fri, 9 Feb 2024 23:15:02 +0100 Message-ID: <20240209221509.585251-4-david@redhat.com> In-Reply-To: <20240209221509.585251-1-david@redhat.com> References: <20240209221509.585251-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspamd-Queue-Id: E4B4520010 X-Rspam-User: X-Stat-Signature: dtqiexhc38fgjyhiho394rmeoc8759ay X-Rspamd-Server: rspam01 X-HE-Tag: 1707516939-107529 X-HE-Meta: U2FsdGVkX199O6DMPFxEyDdmdosS+JxN2YgPy9VnLjON2YMNPOa38HtF0s4WM+KVgxGNMae7giEU2+ox4f9emEoo+RW4ngIto9i8qtupVTXuVZzm6A9CMsj6FSzJ+f/o8rQssV4j6be1tqmv4hat9Df/p7cOXBGobQ1pfsltRFFZ+/hUlZPhiYqeJsyNRzdHwgspZe4N9aJv5KEJlT+nbx2Dyk+k+pKhLIKHANw0ByFVzsGM9oHaJhrpJZWrwX9l6fxJmH1zMkaLbIbsrchBAxXqkhqhzYpLhU1CoJHqqZjbahWQaUujyV/uNrSbFcRCKonFZPc4FJphCUyo82qynL0pNtQPO7vhaxqoMythqsN20LTqdcQq9tFkiavgGR6fwsNKMaDKeC0Z8H4GOVPyY+xAaZ6WHypsSixBmLJD0mkPncgrT1VYSY7yMnnp7gj31KjAF3QYsKX31TJ/hxAJCC8vYc63tTEthRYX6KkihmvPmykaAjfzElOZMPKk/P+JrJ/PzrA055oiBKxpfxIpcfVstV8E/35K5kOqLd/+mYnNciPcFHxrhCbAuenQbPmP3IUb1UmJrojvWJmaft2x6KwGSn/nijy6P6ga+h0WfSmEhuL2wkKA1YtG8IAMig7KEbnxTpxIt0y7A/n9ayTH0jwwTouEym6dT3v8X+/kzMAvqp1lvQP7o3LQ4/mMe+1Hb5xRxlKPnXHQuJhfB0BaqRCekNS8WlCH0Say8QaRKjb5428m1MS6V/HlkWed0SdqpAyLzVkdp1wuUGH92LNeclkU73zvXet5VBSyfjjc6B1483EozwCXbPsmvZc9XeuZbNT1KefE+JHA6CCeHpDTgmVY/F1Fzv6PotnfpKBvZYdFH0SEVLZ81zBIwIfIFQscVUpMyzYv2ojq5/fokiL9X0oINRtwaN0qhAMltT57+XV7vjI3YjjURfwaSE9cEdsMK3RYoWmmOUL9t8AwpBZ ZwfVVKxF dV9uuQF04VYZTozj38moymr0hdeprhnAhrLrPGwcAE/8BABPUz1fugr1Pazxpf0oAh/6NIAtzU1/0TI9lMUsw2d6EcU53h7IKZgqrZO9z3yh2kYKfebmqHQMRy7jrFVdpZfdlNMbC9WGBf9xVef64TDspD7q+4Q9njBbuJHXUZ3/PjYvIKWeZ3aM1KDpFURGOfWk2qAXGtqTL78PRnzBB4LEv3tqPSmzJai6slNrLdamhBmym1OXy7xlVxP8fmNEtkM/RdSl9G6LeR90uiYE8hMxsSA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: We don't need up-to-date accessed-dirty information for anon folios and can simply work with the ptent we already have. Also, we know the RSS counter we want to update. We can safely move arch_check_zapped_pte() + tlb_remove_tlb_entry() + zap_install_uffd_wp_if_needed() after updating the folio and RSS. While at it, only call zap_install_uffd_wp_if_needed() if there is even any chance that pte_install_uffd_wp_if_needed() would do *something*. That is, just don't bother if uffd-wp does not apply. 
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4da6923709b2..7a3ebb6e5909 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1552,12 +1552,9 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 	folio = page_folio(page);
 	if (unlikely(!should_zap_folio(details, folio)))
 		return;
-	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
-	arch_check_zapped_pte(vma, ptent);
-	tlb_remove_tlb_entry(tlb, pte, addr);
-	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
 
 	if (!folio_test_anon(folio)) {
+		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
 		if (pte_dirty(ptent)) {
 			folio_mark_dirty(folio);
 			if (tlb_delay_rmap(tlb)) {
@@ -1567,8 +1564,17 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 		}
 		if (pte_young(ptent) && likely(vma_has_recency(vma)))
 			folio_mark_accessed(folio);
+		rss[mm_counter(folio)]--;
+	} else {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		rss[MM_ANONPAGES]--;
 	}
-	rss[mm_counter(folio)]--;
+	arch_check_zapped_pte(vma, ptent);
+	tlb_remove_tlb_entry(tlb, pte, addr);
+	if (unlikely(userfaultfd_pte_wp(vma, ptent)))
+		zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
+
 	if (!delay_rmap) {
 		folio_remove_rmap_pte(folio, page, vma);
 		if (unlikely(page_mapcount(page) < 0))
From patchwork Fri Feb 9 22:15:03 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551943
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v2 04/10] mm/memory: factor out zapping folio pte into zap_present_folio_pte() Date: Fri, 9 Feb 2024 23:15:03 +0100 Message-ID: <20240209221509.585251-5-david@redhat.com> In-Reply-To: <20240209221509.585251-1-david@redhat.com> References: <20240209221509.585251-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 443A640013 X-Stat-Signature: sbtanmezpz164eteijpxu6hcme8fewie X-Rspam-User: X-HE-Tag: 1707516940-710467 X-HE-Meta: U2FsdGVkX18ujEqLTGK8FSmErGk8tzT4lC5qROwCTydFGu34wA67jOWrpOfBGH/59c5LMribmcpeQwbnWKvNTzvR7eXpqrPmEmwxnOM6eu5H5wFVmx8eY21ZrD1xlVs/QqUxYL5uUjRjEG+oOcwHsqhJYi04i8ZISU7h/5qUs08AnmSigtzqW/se+ONCfvSQ5ob6lPX7sPrDf/hHpZXqPSEIIZXebPfHishl6CUob8J9lmGYICTKWyzlrNlNE4IcbwYJ/y9GU17/M+eSHBe5s8ZEc4amePp7y9AC3deVVvdI5fIJN142G7iLmtxWJtXIKHMqWWvmV3T1K8AZoV4/pamkgnxosJkb/esz7ZA36jBXJWqtWYSIn6QzwxjRVGXUVyqsEDqAsNzDOoOxcn1UnCWy9Vsj1oORygM93RbsxLF6gEsgVGUaBpjFzSU7QO6GekBrnkMaLJWs/K/LMnkyqPQ5FLmJn0xQ6CxWFemdbGscQvs72+aw1nK3zRXDG1ls5eZKerAJ1t+GdnkC4YFOvdI6EL32diCd3i0McdDDsk/ckIwCO0Usq2//5ij+UDLfACg/Yh4+82KwlGP4bSQ6OaLPnJS8kaJXcv78JUljs90irjtLEdAYXD61Vf8hPkgfuW2KtHaVVan+nNFlCo+kIFdoKnwv+63k18L45MZnXj+C+v172h9qRKh6bJqTkWgYYtFYnDn2zOHIW0iDqxKWdUl0IBFPysZxQ369YpEdmxUB8D2MKtVRv0hDIKO8HLSVvFoi29kLZ+EYVF96f86wJaMF+Mq4bFiZQ2FujY1n9rz6bsXFsHZ4Rg3H9YIiQIpspY9tziJezmUMkTeeOAW65pz19RtX6PDRb0vtRHTQFIiZU9HAjulqUwMQd7dXGgj7lLt5L/V0CTEwjA7hQS5VYqFZvCEnuIFPjOmLfQVZjqU1mgXf52GOQ5ysr9VFJAfHuaZqgGoW00SLz9NITcn bIxGeIW8 tHow5EK4pErSfYnb7ESj/Z2RpZBm7K5Lh01vLUlMuMNcFJfV5eWb+Uoh+MB8967b+8oNK7/8D7ZserssDCmyTHiz+RY4Nr0GjjfLlA439/peIyjuLHZnnbs+jGM0ozclr688vgL5YfmjnCAP7NcIPXn8/E1t1NLXvXf/89fKK4Nr4HWjGT+aTCOnyMl4CcACNE4Oxmt+Y1NlqwHZqCpzA5cpGcpgR0Tnyh5dRvE7z2eHJ6c4gKD6TLMDZ8qKw/no/fjAemCASJGKZqejvfCTgV9OGfg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Let's prepare for further changes by factoring it out into a separate function. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/memory.c | 53 ++++++++++++++++++++++++++++++++--------------------- 1 file changed, 32 insertions(+), 21 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 7a3ebb6e5909..a3efc4da258a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1528,30 +1528,14 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); } -static inline void zap_present_pte(struct mmu_gather *tlb, - struct vm_area_struct *vma, pte_t *pte, pte_t ptent, - unsigned long addr, struct zap_details *details, - int *rss, bool *force_flush, bool *force_break) +static inline void zap_present_folio_pte(struct mmu_gather *tlb, + struct vm_area_struct *vma, struct folio *folio, + struct page *page, pte_t *pte, pte_t ptent, unsigned long addr, + struct zap_details *details, int *rss, bool *force_flush, + bool *force_break) { struct mm_struct *mm = tlb->mm; bool delay_rmap = false; - struct folio *folio; - struct page *page; - - page = vm_normal_page(vma, addr, ptent); - if (!page) { - /* We don't need up-to-date accessed/dirty bits. 
-		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
-		arch_check_zapped_pte(vma, ptent);
-		tlb_remove_tlb_entry(tlb, pte, addr);
-		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
-		ksm_might_unmap_zero_page(mm, ptent);
-		return;
-	}
-
-	folio = page_folio(page);
-	if (unlikely(!should_zap_folio(details, folio)))
-		return;
 
 	if (!folio_test_anon(folio)) {
 		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
@@ -1586,6 +1570,33 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 	}
 }
 
+static inline void zap_present_pte(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned long addr, struct zap_details *details,
+		int *rss, bool *force_flush, bool *force_break)
+{
+	struct mm_struct *mm = tlb->mm;
+	struct folio *folio;
+	struct page *page;
+
+	page = vm_normal_page(vma, addr, ptent);
+	if (!page) {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		arch_check_zapped_pte(vma, ptent);
+		tlb_remove_tlb_entry(tlb, pte, addr);
+		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
+
+	folio = page_folio(page);
+	if (unlikely(!should_zap_folio(details, folio)))
+		return;
+	zap_present_folio_pte(tlb, vma, folio, page, pte, ptent, addr, details,
+			      rss, force_flush, force_break);
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
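After this patch the zap path is layered as zap_pte_range() ->
zap_present_pte() -> zap_present_folio_pte(). The outline below is a
simplified, stand-alone sketch of that layering only (empty stand-in bodies,
no kernel types):

#include <stdbool.h>

/* Works on a known folio/page: dirty/accessed handling, rmap, TLB entry. */
static void zap_present_folio_pte(void)
{
}

/* Resolves the page/folio, filters the special !page case, then delegates. */
static void zap_present_pte(void)
{
	bool have_page = true;		/* stand-in for vm_normal_page() */

	if (!have_page)
		return;			/* zeropage/special path */
	zap_present_folio_pte();
}

/* Iterates over the PTEs of a range. */
static void zap_pte_range(void)
{
	for (int i = 0; i < 4; i++)
		zap_present_pte();
}

int main(void)
{
	zap_pte_range();
	return 0;
}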
From patchwork Fri Feb 9 22:15:04 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551944
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v2 05/10] mm/mmu_gather: pass "delay_rmap" instead of encoded page to __tlb_remove_page_size() Date: Fri, 9 Feb 2024 23:15:04 +0100 Message-ID: <20240209221509.585251-6-david@redhat.com> In-Reply-To: <20240209221509.585251-1-david@redhat.com> References: <20240209221509.585251-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 5248A20020 X-Stat-Signature: rp147e6m4zaa9r6e3fb8i6wdiahy6ir3 X-HE-Tag: 1707516946-698180 X-HE-Meta: U2FsdGVkX1+9gkGsXQN65vA/JRZJcXHmscLEiFYtMHADSv00V+48gqrlGmu91T6e0IPRLTk8yyzkZrTNGeLIgHrgW1J18jWsc1Mz94DathW55gWUbN6asM906RaA5edTT7GJ/E+HSkw+QV2u1i4pXh/o7kB/voOCWdjm4G0WFfIWUffAKwMnp0uH4OaR6/petpqA85EqXre7U/pHpkhBxSmBhp14VqZHbrttfQMHlyDxgoWMh3tejO0BIkvtZCdmYdonqw07jxtTeObOSzg1ePM7jcpLi7VE8+XkS5ftiT7TiZD+0x8gNWmgcxVvqC9u0b8l8VCELbiP7CvJxdn84GYnD6Z12VYI2tWvsoSLVMw4rq0PiMPfAVuMk8cx/GdSH7pOkFEVo56kwNBTA1d+ogFDzhjjmNAGIH5pSveURbHrDnd3/YS8WUraOjjQXofPNXXL+vF7Qq7/n3e/3kC8/j208bw5aP8eQWfPBwUxnb9X61EyY+CvxdtB3F+o3r1rvhTQqIv0+VkxaaWX9VdBVkQOd8rtT2fOV/j1DYL3HA7yIitYsnfmxI/C8v8YRWYbJKP/skk96zId8T8e3xLLa+/BakvPL7mAKt8uJjVAvkW9Pz9cUVMk//Q1BeWUk/TbdmvbsDVTDY6C1JvDLVSZ4AkwP5q7yLbEjthHBMH9Y4MvbTABpmXjQq0tPZ9p20eUX9XJo3kOqythT+3fQJ+lg+xMR5bd2x81nAwqj0E1heKbxcMnpR3/KeiWuWqO74uOc32vZbT40ntEr9I6nbrubu2/D+mJnoZOu0QgxERn/xLdCBgKN80GURrCzybG78kx68qwMU1Lqe9xmXZfIz63Iyyft0nEug1uX/iaz0UgNXQVSljpR1YjhaXQxiyHLwZ86krzmK4ZKtGslnec9RTIdykuOTwFHSvFOMp9EXLt1HhssZneKMuAQk17XqCLW/IBsxwJ38i5lEtuC+Pr8h2 aHx7ablr a0TNPSff4mE/W/yJ9EEqpnTpmH67vVphd4a0fa2/zeXknEVEVDo9ZbHxIZNMRCOtRFVJy3RfWo+DTiPl2ZYcWhGQqOoW/KJupO/omZfjVvQAIFTnGbNB3cIb/sGvRFxSxr9NjYUW8r9MM5Q4inOBVyp2sSIgjueEiAZdiHZB7BdBJjbUCf+locIYSehwZsBoPfbMlPOr84R/93H1BA7Mjzuqtw+rBaXxFROX+4VffggCyY2hIkC0/4tZ7DlrwJB2xE1ps1sGr6gAF8ZA2ZjF19Gsagg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: We have two bits available in the encoded page pointer to store additional information. Currently, we use one bit to request delay of the rmap removal until after a TLB flush. We want to make use of the remaining bit internally for batching of multiple pages of the same folio, specifying that the next encoded page pointer in an array is actually "nr_pages". So pass page + delay_rmap flag instead of an encoded page, to handle the encoding internally. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- arch/s390/include/asm/tlb.h | 13 ++++++------- include/asm-generic/tlb.h | 12 ++++++------ mm/mmu_gather.c | 7 ++++--- 3 files changed, 16 insertions(+), 16 deletions(-) diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index d1455a601adc..48df896d5b79 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -25,8 +25,7 @@ void __tlb_remove_table(void *_table); static inline void tlb_flush(struct mmu_gather *tlb); static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, - struct encoded_page *page, - int page_size); + struct page *page, bool delay_rmap, int page_size); #define tlb_flush tlb_flush #define pte_free_tlb pte_free_tlb @@ -42,14 +41,14 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, * tlb_ptep_clear_flush. 
  * has already been freed, so just do free_page_and_swap_cache.
  *
- * s390 doesn't delay rmap removal, so there is nothing encoded in
- * the page pointer.
+ * s390 doesn't delay rmap removal.
  */
 static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-		struct encoded_page *page,
-		int page_size)
+		struct page *page, bool delay_rmap, int page_size)
 {
-	free_page_and_swap_cache(encoded_page_ptr(page));
+	VM_WARN_ON_ONCE(delay_rmap);
+
+	free_page_and_swap_cache(page);
 	return false;
 }
 
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 129a3a759976..2eb7b0d4f5d2 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -260,9 +260,8 @@ struct mmu_gather_batch {
  */
 #define MAX_GATHER_BATCH_COUNT	(10000UL/MAX_GATHER_BATCH)
 
-extern bool __tlb_remove_page_size(struct mmu_gather *tlb,
-				   struct encoded_page *page,
-				   int page_size);
+extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
+				   bool delay_rmap, int page_size);
 
 #ifdef CONFIG_SMP
 /*
@@ -462,13 +461,14 @@ static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 static inline void tlb_remove_page_size(struct mmu_gather *tlb,
 					struct page *page, int page_size)
 {
-	if (__tlb_remove_page_size(tlb, encode_page(page, 0), page_size))
+	if (__tlb_remove_page_size(tlb, page, false, page_size))
 		tlb_flush_mmu(tlb);
 }
 
-static __always_inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page, unsigned int flags)
+static __always_inline bool __tlb_remove_page(struct mmu_gather *tlb,
+		struct page *page, bool delay_rmap)
 {
-	return __tlb_remove_page_size(tlb, encode_page(page, flags), PAGE_SIZE);
+	return __tlb_remove_page_size(tlb, page, delay_rmap, PAGE_SIZE);
 }
 
 /* tlb_remove_page
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 604ddf08affe..ac733d81b112 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -116,7 +116,8 @@ static void tlb_batch_list_free(struct mmu_gather *tlb)
 	tlb->local.next = NULL;
 }
 
-bool __tlb_remove_page_size(struct mmu_gather *tlb, struct encoded_page *page, int page_size)
+bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
+		bool delay_rmap, int page_size)
 {
 	struct mmu_gather_batch *batch;
 
@@ -131,13 +132,13 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct encoded_page *page, i
 	 * Add the page and check if we are full. If so
 	 * force a flush.
	 */
-	batch->encoded_pages[batch->nr++] = page;
+	batch->encoded_pages[batch->nr++] = encode_page(page, delay_rmap);
 	if (batch->nr == batch->max) {
 		if (!tlb_next_batch(tlb))
 			return true;
 		batch = tlb->active;
 	}
-	VM_BUG_ON_PAGE(batch->nr > batch->max, encoded_page_ptr(page));
+	VM_BUG_ON_PAGE(batch->nr > batch->max, page);
 
 	return false;
 }
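The encoded_page machinery reworked here relies on struct page pointers being
aligned, so their low bits are free to carry flags. A small stand-alone model
of that low-bit tagging (helper and flag names here are illustrative; the
kernel's encode_page()/encoded_page_ptr() variants appear in the next diff):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define FLAG_MASK	3ul		/* two low bits usable for flags */
#define FLAG_DELAY	1ul		/* e.g. "delay rmap removal" */

struct item { int value; };

/* Pack a flag into the low bits of a sufficiently aligned pointer. */
static uintptr_t encode(struct item *p, unsigned long flags)
{
	assert(((uintptr_t)p & FLAG_MASK) == 0);	/* pointer must be aligned */
	assert(flags <= FLAG_MASK);
	return (uintptr_t)p | flags;
}

static unsigned long encoded_flags(uintptr_t e) { return e & FLAG_MASK; }
static struct item *encoded_ptr(uintptr_t e)    { return (struct item *)(e & ~FLAG_MASK); }

int main(void)
{
	struct item *it = malloc(sizeof(*it));	/* malloc() returns aligned memory */
	uintptr_t e;

	it->value = 42;
	e = encode(it, FLAG_DELAY);
	printf("value=%d delay=%lu\n", encoded_ptr(e)->value,
	       encoded_flags(e) & FLAG_DELAY);
	free(it);
	return 0;
}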
From patchwork Fri Feb 9 22:15:05 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13551946
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v2 06/10] mm/mmu_gather: define ENCODED_PAGE_FLAG_DELAY_RMAP
Date: Fri, 9 Feb 2024 23:15:05 +0100
Message-ID: <20240209221509.585251-7-david@redhat.com>
In-Reply-To: <20240209221509.585251-1-david@redhat.com>
References: <20240209221509.585251-1-david@redhat.com>

Nowadays, encoded pages are only used in mmu_gather handling. Let's
update the documentation, and define ENCODED_PAGE_BIT_DELAY_RMAP. While
at it, rename ENCODE_PAGE_BITS to ENCODED_PAGE_BITS.
If encoded page pointers would ever be used in other contexts again, we'd
likely want to change the defines to reflect their context (e.g.,
ENCODED_PAGE_FLAG_MMU_GATHER_DELAY_RMAP). For now, let's keep it simple.

This is a preparation for using the remaining spare bit to indicate that
the next item in an array of encoded pages is a "nr_pages" argument and
not an encoded page.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 include/linux/mm_types.h | 17 +++++++++++------
 mm/mmu_gather.c          |  5 +++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8b611e13153e..1b89eec0d6df 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -210,8 +210,8 @@ struct page {
  *
  * An 'encoded_page' pointer is a pointer to a regular 'struct page', but
  * with the low bits of the pointer indicating extra context-dependent
- * information. Not super-common, but happens in mmu_gather and mlock
- * handling, and this acts as a type system check on that use.
+ * information. Only used in mmu_gather handling, and this acts as a type
+ * system check on that use.
  *
  * We only really have two guaranteed bits in general, although you could
  * play with 'struct page' alignment (see CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
@@ -220,21 +220,26 @@ struct page {
  * Use the supplied helper functions to endcode/decode the pointer and bits.
  */
 struct encoded_page;
-#define ENCODE_PAGE_BITS 3ul
+
+#define ENCODED_PAGE_BITS 3ul
+
+/* Perform rmap removal after we have flushed the TLB. */
+#define ENCODED_PAGE_BIT_DELAY_RMAP 1ul
+
 static __always_inline struct encoded_page *encode_page(struct page *page, unsigned long flags)
 {
-	BUILD_BUG_ON(flags > ENCODE_PAGE_BITS);
+	BUILD_BUG_ON(flags > ENCODED_PAGE_BITS);
 	return (struct encoded_page *)(flags | (unsigned long)page);
 }
 
 static inline unsigned long encoded_page_flags(struct encoded_page *page)
 {
-	return ENCODE_PAGE_BITS & (unsigned long)page;
+	return ENCODED_PAGE_BITS & (unsigned long)page;
 }
 
 static inline struct page *encoded_page_ptr(struct encoded_page *page)
 {
-	return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page);
+	return (struct page *)(~ENCODED_PAGE_BITS & (unsigned long)page);
 }
 
 /*
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index ac733d81b112..6540c99c6758 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -53,7 +53,7 @@ static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_
 	for (int i = 0; i < batch->nr; i++) {
 		struct encoded_page *enc = batch->encoded_pages[i];
 
-		if (encoded_page_flags(enc)) {
+		if (encoded_page_flags(enc) & ENCODED_PAGE_BIT_DELAY_RMAP) {
 			struct page *page = encoded_page_ptr(enc);
 			folio_remove_rmap_pte(page_folio(page), page, vma);
 		}
@@ -119,6 +119,7 @@ static void tlb_batch_list_free(struct mmu_gather *tlb)
 bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
 		bool delay_rmap, int page_size)
 {
+	int flags = delay_rmap ? ENCODED_PAGE_BIT_DELAY_RMAP : 0;
 	struct mmu_gather_batch *batch;
 
 	VM_BUG_ON(!tlb->end);
@@ -132,7 +133,7 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
 	 * Add the page and check if we are full. If so
 	 * force a flush.
*/ - batch->encoded_pages[batch->nr++] = encode_page(page, delay_rmap); + batch->encoded_pages[batch->nr++] = encode_page(page, flags); if (batch->nr == batch->max) { if (!tlb_next_batch(tlb)) return true; From patchwork Fri Feb 9 22:15:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13551945 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65751C48297 for ; Fri, 9 Feb 2024 22:15:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D87AF6B0096; Fri, 9 Feb 2024 17:15:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CE93E6B0098; Fri, 9 Feb 2024 17:15:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B154C6B0099; Fri, 9 Feb 2024 17:15:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 9DB3F6B0096 for ; Fri, 9 Feb 2024 17:15:57 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 7D574A1627 for ; Fri, 9 Feb 2024 22:15:57 +0000 (UTC) X-FDA: 81773673954.09.D77473D Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf30.hostedemail.com (Postfix) with ESMTP id CE37E8000F for ; Fri, 9 Feb 2024 22:15:55 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Aq7XDoWl; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf30.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707516955; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=X4BN9bDZrrapc7Xd1FDfmQmuEZxXG5tNILRPdL9R2lg=; b=KAEwzcHhi8VvU0bebvkVZuMqwsxZnhBCa5hbcOCHl04Nl0jLs7ZLKx+vzNyCFM+KqrDfrT nVHXRsu5Cx2vKtQ9m3WHZVtt0kraJciYfekYQl6d/vYxi84qWExljdzMlXLpHboE4M6gH9 ApXxHtosLHX62N+WPtWuIFkWa7FT9UA= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Aq7XDoWl; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf30.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707516955; a=rsa-sha256; cv=none; b=wyg/oYtjFJ5BuCXXEAzS+bzRK/V+chp8djtm3+BZuIRd8ZRejV4fY9sYw/4CTMd5nlO0h4 X1dPm0G8mU71e7kNUThyULCVF+oMfyE9eCJrArfoGIA+YFbpBcWM+eQRVYtqjuoSK7wwHs knefH4Li17WO9VY8cDSiRaOetNYi2XI= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707516955; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=X4BN9bDZrrapc7Xd1FDfmQmuEZxXG5tNILRPdL9R2lg=; 
b=Aq7XDoWlzWdU0M1ZB24bcJKUdI9P1E/sY0MFR2XAEd2mZM9iujqDsPOMr7hZbeUQC7wjNQ cz4QgWCMJID2EUMrLR7mtYf47XkRDrA5BSUmq2ne8pkL942fg/5AjW+VkJBneaZHy5KMxP /J2PbveXfaTV6ejwzrOdwuzrMqJH+f8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-373-AMAEvJNUPyeeH3DHaeR2Bw-1; Fri, 09 Feb 2024 17:15:52 -0500 X-MC-Unique: AMAEvJNUPyeeH3DHaeR2Bw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9B6CF83722E; Fri, 9 Feb 2024 22:15:51 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.194.59]) by smtp.corp.redhat.com (Postfix) with ESMTP id 666531C14B04; Fri, 9 Feb 2024 22:15:47 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Matthew Wilcox , Ryan Roberts , Catalin Marinas , Yin Fengwei , Michal Hocko , Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , Peter Zijlstra , Michael Ellerman , Christophe Leroy , "Naveen N. Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v2 07/10] mm/mmu_gather: add tlb_remove_tlb_entries() Date: Fri, 9 Feb 2024 23:15:06 +0100 Message-ID: <20240209221509.585251-8-david@redhat.com> In-Reply-To: <20240209221509.585251-1-david@redhat.com> References: <20240209221509.585251-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: CE37E8000F X-Stat-Signature: tdutyt4pe4ec3dodhjy1oohfx5g4b19r X-Rspam-User: X-HE-Tag: 1707516955-694079 X-HE-Meta: U2FsdGVkX1+BbgxVN2coqGCFVAuWFhUzYW7mEvamcgDAdchmdqdVWBe5/M3HSC+0qGDvFRnhFXkM8fW8duzN+AaBoDqYE3IqKAoR1/uShioCFtaOJ5nLqbQSZ+b5ZFIj/7WDWthoBO+bjrlsdp2mXLkA/14WCcbkGIV11ER/VmhOHkmEgr21IkiSsIliCMkPrHs+ulzc/kycx4/i+BfzDXHoRrq5pDTQjBiaaDGspEKm2j+Ix5TMPYdywEIUhTC2TB5ho770dlIxFbdyP6SC9Lxq6nyLwzzXOr7a8l/TezLQeB3POMmb8/z2gqiRu0QSMgfyWP517k/1xT17/1G583MyStoYykqpdWwNfubqI6TocONfFIRUEcj3qCyn5v0fq4AE8b62MzE+Tt9Ej4BLq5TcMjAVMjUiLwOu09GrhUQ61+p0M8Js2itF7rqEx+/QXD3bxvAJ7GT4LycBViZ12ZMM719O3bdOmD9OSWZbW7YLmPSjvFmjgwIzKI4sYnnTmU3CtpQCDjW62o6v87l0XyG61K/mw9TLxhOhFt+FESKDCnp2x1m0FmTzJ9gvmiTfKJhv3x2Izp57XtIgfBlDsVha5F4pk/YOIunJHJpGetJCuvUiE2Uv9VHLpHilBFDSFhKtdGOfw5dygHedqUiOnvQKSuVe9N3YVOSv7YfH5NyCUG61CgPGKLSHbTuscNTFML+olAW2Z2jM1+TtL/OLRdrux6riphznZTvzBdzlj14Aoak3s7186V3aN99cpSeRCjE/CHDFLNsc34532faEhdy1x45WTAoMfTi9eA3oGaaAjmhg/58n7tEyprXaZxkRGdDn4XEBEXsmEQcw/bL/nSlL6HWAUh2iPNG38XcgGFeLJGvQ9hNK9XdUSxht8t3mg4oll51XcbR/CAuhagjnvVmrTJ04FQDe84hS8GsvYOaKkokKU9lU3Yp36wEaMi9XBLDBSg1YkpQ2Yl064Af x7oPsuvJ lY3l5ZEKYqsydI6mGlOaBrIApC9k3usV8amKAOOH1VoxmtozlyAr1TRxjTcKM4PN/6z1XJN4/CWLQ34trC47ZpRf8IGHXnIl87Mt3l+xq1FfIpk8/RRWWOYXsu944LvXgA+G7nK4OtF74mEGFZ261FtaUod/cE/mdp21NlpzLmwG/ZwWF/VMxy0lt8WqKTbEjXWC9KDZrPYAX6RYnEBX/BwNp7U7B4RmVJzOYYNdm6bYn8q2GTqnwxpv48RnCNqR1h/u+JA+e4bo1LLBl6r5s1CEhaQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Let's add a helper that lets us batch-process multiple consecutive PTEs. 
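As a rough picture of what the new helper gives a caller, the following stand-alone sketch (plain C, not kernel code) models the shape of tlb_remove_tlb_entries(): one range flush covering all nr pages, followed by one per-PTE bookkeeping step, advancing the PTE pointer and the address in lockstep. The mock callbacks and the plain unsigned long "PTE" array are made-up stand-ins, not kernel APIs.

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Stand-in for tlb_flush_pte_range(): one flush request for the whole range. */
static void mock_flush_pte_range(unsigned long addr, unsigned long size)
{
        printf("flush [%#lx, %#lx)\n", addr, addr + size);
}

/* Stand-in for __tlb_remove_tlb_entry(): per-PTE bookkeeping. */
static void mock_remove_tlb_entry(unsigned long *ptep, unsigned long addr)
{
        printf("record pte at %p for address %#lx\n", (void *)ptep, addr);
}

/* Models tlb_remove_tlb_entries(): batch nr consecutive PTEs of one folio. */
static void remove_tlb_entries(unsigned long *ptep, unsigned int nr, unsigned long addr)
{
        mock_flush_pte_range(addr, PAGE_SIZE * nr);
        for (;;) {
                mock_remove_tlb_entry(ptep, addr);
                if (--nr == 0)
                        break;
                ptep++;
                addr += PAGE_SIZE;
        }
}

int main(void)
{
        unsigned long ptes[4] = { 0, 0, 0, 0 };

        /* Equivalent to four back-to-back tlb_remove_tlb_entry() calls. */
        remove_tlb_entries(ptes, 4, 0x100000UL);
        return 0;
}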
Note that the loop will get optimized out on all architectures except on powerpc. We have to add an early define of __tlb_remove_tlb_entry() on ppc to make the compiler happy (and avoid making tlb_remove_tlb_entries() a macro). Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/tlb.h | 2 ++ include/asm-generic/tlb.h | 20 ++++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index b3de6102a907..1ca7d4c4b90d 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -19,6 +19,8 @@ #include +static inline void __tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, + unsigned long address); #define __tlb_remove_tlb_entry __tlb_remove_tlb_entry #define tlb_flush tlb_flush diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 2eb7b0d4f5d2..95d60a4f468a 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -608,6 +608,26 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) +/** + * tlb_remove_tlb_entries - remember unmapping of multiple consecutive ptes for + * later tlb invalidation. + * + * Similar to tlb_remove_tlb_entry(), but remember unmapping of multiple + * consecutive ptes instead of only a single one. + */ +static inline void tlb_remove_tlb_entries(struct mmu_gather *tlb, + pte_t *ptep, unsigned int nr, unsigned long address) +{ + tlb_flush_pte_range(tlb, address, PAGE_SIZE * nr); + for (;;) { + __tlb_remove_tlb_entry(tlb, ptep, address); + if (--nr == 0) + break; + ptep++; + address += PAGE_SIZE; + } +} + #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \ do { \ unsigned long _sz = huge_page_size(h); \ From patchwork Fri Feb 9 22:15:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13551947 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88DC5C4828F for ; Fri, 9 Feb 2024 22:16:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1CC1A6B009A; Fri, 9 Feb 2024 17:16:03 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 151346B009B; Fri, 9 Feb 2024 17:16:03 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EE5E36B009C; Fri, 9 Feb 2024 17:16:02 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id D7ACB6B009A for ; Fri, 9 Feb 2024 17:16:02 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id B4AC780387 for ; Fri, 9 Feb 2024 22:16:02 +0000 (UTC) X-FDA: 81773674164.24.12B5720 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf21.hostedemail.com (Postfix) with ESMTP id E7DD71C0007 for ; Fri, 9 Feb 2024 22:16:00 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=ZnstAOuD; spf=pass (imf21.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) 
header.from=redhat.com
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Matthew Wilcox , Ryan Roberts , Catalin Marinas , Yin Fengwei , Michal Hocko , Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , Peter Zijlstra , Michael Ellerman , Christophe Leroy , "Naveen N. Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v2 08/10] mm/mmu_gather: add __tlb_remove_folio_pages()
Date: Fri, 9 Feb 2024 23:15:07 +0100
Message-ID: <20240209221509.585251-9-david@redhat.com>
In-Reply-To: <20240209221509.585251-1-david@redhat.com>
References: <20240209221509.585251-1-david@redhat.com>
MIME-Version: 1.0

Add __tlb_remove_folio_pages(), which will remove multiple consecutive pages that belong to the same large folio, instead of only a single page. We'll be using this function when optimizing unmapping/zapping of large folios that are mapped by PTEs.

We're using the remaining spare bit in an encoded_page to indicate that the next encoded page in an array actually contains a shifted "nr_pages" value rather than a page pointer.

Teach swap/freeing code about putting multiple folio references, and delayed rmap handling to remove page ranges of a folio.

This extension still allows for gathering almost as many small folios as we used to (-1, because we have to prepare for a possibly bigger next entry), while also allowing us to gather consecutive pages that belong to the same large folio.

Note that we don't pass the folio pointer, because it is not required for now. Further, we don't support page_size != PAGE_SIZE; it won't be required for simple PTE batching.

We have to provide a separate s390 implementation, but it's fairly straightforward.

Another, more invasive and likely more expensive, approach would be to use folio+range or a PFN range instead of page+nr_pages. But we should do that consistently for the whole mmu_gather. For now, let's keep it simple and add "nr_pages" only.
Note that it is now possible to gather significantly more pages: in the past, we were able to gather ~10000 pages; now we can also gather ~5000 folio fragments that span multiple pages. A folio fragment on x86-64 can be up to 512 pages (2 MiB THP) and on arm64 with 64k base pages, in theory, 8192 pages (512 MiB THP).

Gathering more memory is not considered something we should worry about, especially because these are already corner cases. While we can gather more total memory, we won't free more folio fragments. As long as page freeing time primarily depends only on the number of involved folios, there is no effective change for !preempt configurations. However, we'll adjust tlb_batch_pages_flush() separately to handle corner cases where page freeing time grows proportionally with the actual memory size.

Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts --- arch/s390/include/asm/tlb.h | 17 +++++++++++ include/asm-generic/tlb.h | 8 +++++ include/linux/mm_types.h | 20 ++++++++++++ mm/mmu_gather.c | 61 +++++++++++++++++++++++++++++++------ mm/swap.c | 12 ++++++-- mm/swap_state.c | 15 +++++++-- 6 files changed, 119 insertions(+), 14 deletions(-) diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index 48df896d5b79..e95b2c8081eb 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -26,6 +26,8 @@ void __tlb_remove_table(void *_table); static inline void tlb_flush(struct mmu_gather *tlb); static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, bool delay_rmap, int page_size); +static inline bool __tlb_remove_folio_pages(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap); #define tlb_flush tlb_flush #define pte_free_tlb pte_free_tlb @@ -52,6 +54,21 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, return false; } +static inline bool __tlb_remove_folio_pages(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap) +{ + struct encoded_page *encoded_pages[] = { + encode_page(page, ENCODED_PAGE_BIT_NR_PAGES_NEXT), + encode_nr_pages(nr_pages), + }; + + VM_WARN_ON_ONCE(delay_rmap); + VM_WARN_ON_ONCE(page_folio(page) != page_folio(page + nr_pages - 1)); + + free_pages_and_swap_cache(encoded_pages, ARRAY_SIZE(encoded_pages)); + return false; +} + static inline void tlb_flush(struct mmu_gather *tlb) { __tlb_flush_mm_lazy(tlb->mm); diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 95d60a4f468a..bd00dd238b79 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -69,6 +69,7 @@ * * - tlb_remove_page() / __tlb_remove_page() * - tlb_remove_page_size() / __tlb_remove_page_size() + * - __tlb_remove_folio_pages() * * __tlb_remove_page_size() is the basic primitive that queues a page for * freeing. __tlb_remove_page() assumes PAGE_SIZE. Both will return a @@ -78,6 +79,11 @@ * tlb_remove_page() and tlb_remove_page_size() imply the call to * tlb_flush_mmu() when required and has no return value. * + * __tlb_remove_folio_pages() is similar to __tlb_remove_page(), however, + * instead of removing a single page, remove the given number of consecutive + * pages that are all part of the same (large) folio: just like calling + * __tlb_remove_page() on each page individually.
+ * * - tlb_change_page_size() * * call before __tlb_remove_page*() to set the current page-size; implies a @@ -262,6 +268,8 @@ struct mmu_gather_batch { extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, bool delay_rmap, int page_size); +bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page, + unsigned int nr_pages, bool delay_rmap); #ifdef CONFIG_SMP /* diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1b89eec0d6df..a7223ba3ea1e 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -226,6 +226,15 @@ struct encoded_page; /* Perform rmap removal after we have flushed the TLB. */ #define ENCODED_PAGE_BIT_DELAY_RMAP 1ul +/* + * The next item in an encoded_page array is the "nr_pages" argument, specifying + * the number of consecutive pages starting from this page, that all belong to + * the same folio. For example, "nr_pages" corresponds to the number of folio + * references that must be dropped. If this bit is not set, "nr_pages" is + * implicitly 1. + */ +#define ENCODED_PAGE_BIT_NR_PAGES_NEXT 2ul + static __always_inline struct encoded_page *encode_page(struct page *page, unsigned long flags) { BUILD_BUG_ON(flags > ENCODED_PAGE_BITS); @@ -242,6 +251,17 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page) return (struct page *)(~ENCODED_PAGE_BITS & (unsigned long)page); } +static __always_inline struct encoded_page *encode_nr_pages(unsigned long nr) +{ + VM_WARN_ON_ONCE((nr << 2) >> 2 != nr); + return (struct encoded_page *)(nr << 2); +} + +static __always_inline unsigned long encoded_nr_pages(struct encoded_page *page) +{ + return ((unsigned long)page) >> 2; +} + /* * A swap entry has to fit into a "unsigned long", as the entry is hidden * in the "index" field of the swapper address space. diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 6540c99c6758..d175c0f1e2c8 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -50,12 +50,21 @@ static bool tlb_next_batch(struct mmu_gather *tlb) #ifdef CONFIG_SMP static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_struct *vma) { + struct encoded_page **pages = batch->encoded_pages; + for (int i = 0; i < batch->nr; i++) { - struct encoded_page *enc = batch->encoded_pages[i]; + struct encoded_page *enc = pages[i]; if (encoded_page_flags(enc) & ENCODED_PAGE_BIT_DELAY_RMAP) { struct page *page = encoded_page_ptr(enc); - folio_remove_rmap_pte(page_folio(page), page, vma); + unsigned int nr_pages = 1; + + if (unlikely(encoded_page_flags(enc) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr_pages = encoded_nr_pages(pages[++i]); + + folio_remove_rmap_ptes(page_folio(page), page, nr_pages, + vma); } } } @@ -89,18 +98,26 @@ static void tlb_batch_pages_flush(struct mmu_gather *tlb) for (batch = &tlb->local; batch && batch->nr; batch = batch->next) { struct encoded_page **pages = batch->encoded_pages; - do { + while (batch->nr) { /* * limit free batch count when PAGE_SIZE > 4K */ unsigned int nr = min(512U, batch->nr); + /* + * Make sure we cover page + nr_pages, and don't leave + * nr_pages behind when capping the number of entries. 
+ */ + if (unlikely(encoded_page_flags(pages[nr - 1]) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr++; + free_pages_and_swap_cache(pages, nr); pages += nr; batch->nr -= nr; cond_resched(); - } while (batch->nr); + } } tlb->active = &tlb->local; } @@ -116,8 +133,9 @@ static void tlb_batch_list_free(struct mmu_gather *tlb) tlb->local.next = NULL; } -bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, - bool delay_rmap, int page_size) +static bool __tlb_remove_folio_pages_size(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap, + int page_size) { int flags = delay_rmap ? ENCODED_PAGE_BIT_DELAY_RMAP : 0; struct mmu_gather_batch *batch; @@ -126,6 +144,8 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, #ifdef CONFIG_MMU_GATHER_PAGE_SIZE VM_WARN_ON(tlb->page_size != page_size); + VM_WARN_ON_ONCE(nr_pages != 1 && page_size != PAGE_SIZE); + VM_WARN_ON_ONCE(page_folio(page) != page_folio(page + nr_pages - 1)); #endif batch = tlb->active; @@ -133,17 +153,40 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, * Add the page and check if we are full. If so * force a flush. */ - batch->encoded_pages[batch->nr++] = encode_page(page, flags); - if (batch->nr == batch->max) { + if (likely(nr_pages == 1)) { + batch->encoded_pages[batch->nr++] = encode_page(page, flags); + } else { + flags |= ENCODED_PAGE_BIT_NR_PAGES_NEXT; + batch->encoded_pages[batch->nr++] = encode_page(page, flags); + batch->encoded_pages[batch->nr++] = encode_nr_pages(nr_pages); + } + /* + * Make sure that we can always add another "page" + "nr_pages", + * requiring two entries instead of only a single one. + */ + if (batch->nr >= batch->max - 1) { if (!tlb_next_batch(tlb)) return true; batch = tlb->active; } - VM_BUG_ON_PAGE(batch->nr > batch->max, page); + VM_BUG_ON_PAGE(batch->nr > batch->max - 1, page); return false; } +bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page, + unsigned int nr_pages, bool delay_rmap) +{ + return __tlb_remove_folio_pages_size(tlb, page, nr_pages, delay_rmap, + PAGE_SIZE); +} + +bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, + bool delay_rmap, int page_size) +{ + return __tlb_remove_folio_pages_size(tlb, page, 1, delay_rmap, page_size); +} + #endif /* MMU_GATHER_NO_GATHER */ #ifdef CONFIG_MMU_GATHER_TABLE_FREE diff --git a/mm/swap.c b/mm/swap.c index cd8f0150ba3a..e5380d732c0d 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -967,11 +967,17 @@ void release_pages(release_pages_arg arg, int nr) unsigned int lock_batch; for (i = 0; i < nr; i++) { + unsigned int nr_refs = 1; struct folio *folio; /* Turn any of the argument types into a folio */ folio = page_folio(encoded_page_ptr(encoded[i])); + /* Is our next entry actually "nr_pages" -> "nr_refs" ? 
*/ + if (unlikely(encoded_page_flags(encoded[i]) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr_refs = encoded_nr_pages(encoded[++i]); + /* * Make sure the IRQ-safe lock-holding time does not get * excessive with a continuous string of pages from the @@ -990,14 +996,14 @@ void release_pages(release_pages_arg arg, int nr) unlock_page_lruvec_irqrestore(lruvec, flags); lruvec = NULL; } - if (put_devmap_managed_page(&folio->page)) + if (put_devmap_managed_page_refs(&folio->page, nr_refs)) continue; - if (folio_put_testzero(folio)) + if (folio_ref_sub_and_test(folio, nr_refs)) free_zone_device_page(&folio->page); continue; } - if (!folio_put_testzero(folio)) + if (!folio_ref_sub_and_test(folio, nr_refs)) continue; if (folio_test_large(folio)) { diff --git a/mm/swap_state.c b/mm/swap_state.c index 7255c01a1e4e..2f540748f7c0 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -311,8 +311,19 @@ void free_page_and_swap_cache(struct page *page) void free_pages_and_swap_cache(struct encoded_page **pages, int nr) { lru_add_drain(); - for (int i = 0; i < nr; i++) - free_swap_cache(encoded_page_ptr(pages[i])); + for (int i = 0; i < nr; i++) { + struct page *page = encoded_page_ptr(pages[i]); + + /* + * Skip over the "nr_pages" entry. It's sufficient to call + * free_swap_cache() only once per folio. + */ + if (unlikely(encoded_page_flags(pages[i]) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + i++; + + free_swap_cache(page); + } release_pages(pages, nr); } From patchwork Fri Feb 9 22:15:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13551948 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F409AC48297 for ; Fri, 9 Feb 2024 22:16:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 91A6C6B0075; Fri, 9 Feb 2024 17:16:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8A3AF6B009C; Fri, 9 Feb 2024 17:16:11 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6F6866B009D; Fri, 9 Feb 2024 17:16:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 595DE6B0075 for ; Fri, 9 Feb 2024 17:16:11 -0500 (EST) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 007FB1C178E for ; Fri, 9 Feb 2024 22:16:11 +0000 (UTC) X-FDA: 81773674500.01.8CF26B3 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf24.hostedemail.com (Postfix) with ESMTP id 518BE180026 for ; Fri, 9 Feb 2024 22:16:09 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=YutCGfZY; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf24.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707516969; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: 
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Matthew Wilcox , Ryan Roberts , Catalin Marinas , Yin Fengwei , Michal Hocko , Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , Peter Zijlstra , Michael Ellerman , Christophe Leroy , "Naveen N. Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v2 09/10] mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing
Date: Fri, 9 Feb 2024 23:15:08 +0100
Message-ID: <20240209221509.585251-10-david@redhat.com>
In-Reply-To: <20240209221509.585251-1-david@redhat.com>
References: <20240209221509.585251-1-david@redhat.com>
MIME-Version: 1.0

It's a pain that we have to handle cond_resched() in tlb_batch_pages_flush() manually and cannot simply handle it in release_pages() -- release_pages() can be called from atomic context. Well, in a perfect world we wouldn't have to make our code more complicated at all.

With page poisoning and init_on_free, we might now run into soft lockups when we free a lot of rather large folio fragments, because page freeing time then depends on the actual memory size we are freeing instead of on the number of folios that are involved.

In the absolute (unlikely) worst case, on arm64 with 64k base pages we will be able to free up to 256 folio fragments that each span 512 MiB: zeroing out 128 GiB does sound like it might take a while. But instead of ignoring this unlikely case, let's just handle it.

So, let's teach tlb_batch_pages_flush() that there are some configurations where page freeing is horribly slow, and let's reschedule more frequently -- similar to what we effectively did before we had large folio fragments in there.

Note that we might end up freeing only a single folio fragment at a time that might exceed the old 512 pages limit: but if we cannot even free a single MAX_ORDER page on a system without running into soft lockups, something else is already completely bogus.
In the future, we might want to detect if handling cond_resched() is required at all, and just not do any of that with full preemption enabled. Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts --- mm/mmu_gather.c | 50 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index d175c0f1e2c8..2774044b5790 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -91,18 +91,19 @@ void tlb_flush_rmaps(struct mmu_gather *tlb, struct vm_area_struct *vma) } #endif -static void tlb_batch_pages_flush(struct mmu_gather *tlb) +static void __tlb_batch_free_encoded_pages(struct mmu_gather_batch *batch) { - struct mmu_gather_batch *batch; - - for (batch = &tlb->local; batch && batch->nr; batch = batch->next) { - struct encoded_page **pages = batch->encoded_pages; + struct encoded_page **pages = batch->encoded_pages; + unsigned int nr, nr_pages; + /* + * We might end up freeing a lot of pages. Reschedule on a regular + * basis to avoid soft lockups in configurations without full + * preemption enabled. The magic number of 512 folios seems to work. + */ + if (!page_poisoning_enabled_static() && !want_init_on_free()) { while (batch->nr) { - /* - * limit free batch count when PAGE_SIZE > 4K - */ - unsigned int nr = min(512U, batch->nr); + nr = min(512, batch->nr); /* * Make sure we cover page + nr_pages, and don't leave @@ -119,6 +120,37 @@ static void tlb_batch_pages_flush(struct mmu_gather *tlb) cond_resched(); } } + + /* + * With page poisoning and init_on_free, the time it takes to free + * memory grows proportionally with the actual memory size. Therefore, + * limit based on the actual memory size and not the number of involved + * folios.
+ */ + while (batch->nr) { + for (nr = 0, nr_pages = 0; + nr < batch->nr && nr_pages < 512; nr++) { + if (unlikely(encoded_page_flags(pages[nr]) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr_pages += encoded_nr_pages(pages[++nr]); + else + nr_pages++; + } + + free_pages_and_swap_cache(pages, nr); + pages += nr; + batch->nr -= nr; + + cond_resched(); + } +} + +static void tlb_batch_pages_flush(struct mmu_gather *tlb) +{ + struct mmu_gather_batch *batch; + + for (batch = &tlb->local; batch && batch->nr; batch = batch->next) + __tlb_batch_free_encoded_pages(batch); tlb->active = &tlb->local; } From patchwork Fri Feb 9 22:15:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13551949 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEB9DC4828F for ; Fri, 9 Feb 2024 22:16:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 859776B009D; Fri, 9 Feb 2024 17:16:13 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7E13F6B009E; Fri, 9 Feb 2024 17:16:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 634276B009F; Fri, 9 Feb 2024 17:16:13 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 4ABEC6B009D for ; Fri, 9 Feb 2024 17:16:13 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 1BD781C178E for ; Fri, 9 Feb 2024 22:16:13 +0000 (UTC) X-FDA: 81773674626.15.1A865C9 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf29.hostedemail.com (Postfix) with ESMTP id 67572120008 for ; Fri, 9 Feb 2024 22:16:11 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=TdMaUP9S; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf29.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707516971; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=KrYYgTjhMX4rJEhHUeQn0YNRSL+kwvv2YOYlgfS0XZc=; b=toC4IBXMpmmm24VHBbMT7hg1KpCNNVQWuG0lDc+8JkZE8gTIlFBiRYG+u1Bb+AmTukPdJY JuNM3Lj1Fj9IV+cXsG8O1jIh4jzhJFb12p875tioL7Aq8dKbhjJIHlh/qSk84achU3DcRY A7QM9w8OLxWQLoTmZTsdzd8CTFkxPcE= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=TdMaUP9S; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf29.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707516971; a=rsa-sha256; cv=none; b=7zG3JKy/tQN0V1Dli6QImucIEwIiznS0+v+mrSiwO3Iob9AEb9xAUvMxlLcbkIO/qVxx8v gWJCc4lISSWrgUZqWhIv/KjYD8x02AeQiRVi+jpurYZAClze+y85+030B3I8Liks2pCohs km3k0wYz8hw12RegMegTjTaQMtCwtCo= 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707516970; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KrYYgTjhMX4rJEhHUeQn0YNRSL+kwvv2YOYlgfS0XZc=; b=TdMaUP9S0xF9ndi9o6cm2H0VDijeA/Dn1GS3bU98GKzcZtHBEKloyk/SeWjXH7aGID/s2J nhsj9bZ5TK+lRrrqtILv9bhMy0U6Z2V0Abqlcq2TaKRMsDKZTEdnJW3BVfha/e/70Ivny+ qzkfHgyRe+Bp6uNnMDUaLib8h4Vk1l0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-508-uldKMFUyNS60AI0e5y5IZw-1; Fri, 09 Feb 2024 17:16:08 -0500 X-MC-Unique: uldKMFUyNS60AI0e5y5IZw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 786F71025620; Fri, 9 Feb 2024 22:16:07 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.194.59]) by smtp.corp.redhat.com (Postfix) with ESMTP id 218D01C14B0D; Fri, 9 Feb 2024 22:16:03 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Matthew Wilcox , Ryan Roberts , Catalin Marinas , Yin Fengwei , Michal Hocko , Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , Peter Zijlstra , Michael Ellerman , Christophe Leroy , "Naveen N. Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v2 10/10] mm/memory: optimize unmap/zap with PTE-mapped THP Date: Fri, 9 Feb 2024 23:15:09 +0100 Message-ID: <20240209221509.585251-11-david@redhat.com> In-Reply-To: <20240209221509.585251-1-david@redhat.com> References: <20240209221509.585251-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspamd-Queue-Id: 67572120008 X-Rspam-User: X-Rspamd-Server: rspam02 X-Stat-Signature: fznur8jtxjhmpdk7sghshhzws9mwu1ja X-HE-Tag: 1707516971-829134 X-HE-Meta: U2FsdGVkX18LdUjamjtdx/zaLDkbuzG3taYuCuWYVyVbjIo/fEQMMdwM+DCzX2VmMhNY23xssFgrC1qjFzy0sgcva9UWvjs5co2UoPxT5I+51BJTJ0Vvt8hAsedH96yiI/cfpssQF0MLztfn9yzi7eubwtTq5jPixuOLJxsTWTvuXYtzf0Ni9zBkLhqiPdVulyfz1obGIau7k4yylce06ZABddmeHyrj4nOTHu+eIPFUmjo6qLKMPqCa1A9h1DpoNio56ZEsTLYCsQXz1aNpIjjzKj7aAdwlHPf+jWFYGBb+gZ886zlVdtDtQoKZCc2fammn6jO4R66AxEZO2t/kb+aN6IX4FeAs29sy/P0sbaYRoZdpkOdJKydfEPgdAS/XCZ4F+0b4C9hFPizPjJiaw8KrGdE9rDbxRUBhSekQnSzfGWd8zGwok+PoJMqMVfBYCG1bkIuUEPUtdBTkdNW+AJ2vit1auFOgJ0eNDLvs6bEZ3YrknMBHQTd7dB58LRBGa0QwuRZxlraFRlzDnk4o6+n4zKhxUOFAC1SNc87808SMjbhKG9I7sb8nQn+WUmgglMKa3WbhTh8DQiUESr+xm6y5EbczArwl8nqBn3p63lYjtoe2ZkD+FDRXvakiqUDBv0s4Td5c/6O8djgwEy5u9bkTDLltzDLXM6IIUSOHPSvC2PC06eh0215UJAY70GIuk5baIT5TAK9Awb0Dt0/5b1HN5PmExOP6AJSobxJmKs9irEqJ6+uaCfOX2Da5kZWd5UnNbkdXiBe8rqVEdJ4Rd+BanMgVFQre02fEvcPQD56iLbuUe1YFzeDXzPN/mjUxPUKVjPKR3nmDky0uukjKtP4BY7eAv9cTTOXY1JhP+kArr1bozd6LG+bM/q+xJ1VN/TQi057WrEQLV04rmasWttnR8iuOyxMd0ChJgX3J0ka+CAnAxGai9UxW4GGyyP7Jfz6sgn/870FbuqQ9/y6 Z3sdUUOg udvqHgZMsOvIEvj/tdJWqSDFh2QTHpBzK9HnmNKTqp5IRRsIlIp5bBJZ+2EyisPILxN1ZueRgD8DBDhOJ2D9KaiFIqmDAVIH2jxaDNWlLdGdAWAxIKcYVi5/9fMzvao7VY/HlfTfGT6YxzeuZf3LnG87pIT7y9Fp0XxSWcFRur7xnY+w= X-Bogosity: 
Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Similar to how we optimized fork(), let's implement PTE batching when consecutive (present) PTEs map consecutive pages of the same large folio. Most infrastructure we need for batching (mmu gather, rmap) is already there. We only have to add get_and_clear_full_ptes() and clear_full_ptes(). Similarly, extend zap_install_uffd_wp_if_needed() to process a PTE range. We won't bother sanity-checking the mapcount of all subpages, but only check the mapcount of the first subpage we process. If there is a real problem hiding somewhere, we can trigger it simply by using small folios, or when we zap single pages of a large folio. Ideally, we had that check in rmap code (including for delayed rmap), but then we cannot print the PTE. Let's keep it simple for now. If we ever have a cheap folio_mapcount(), we might just want to check for underflows there. To keep small folios as fast as possible force inlining of a specialized variant using __always_inline with nr=1. Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts --- include/linux/pgtable.h | 70 +++++++++++++++++++++++++++++++ mm/memory.c | 92 +++++++++++++++++++++++++++++------------ 2 files changed, 136 insertions(+), 26 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index aab227e12493..49ab1f73b5c2 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -580,6 +580,76 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, } #endif +#ifndef get_and_clear_full_ptes +/** + * get_and_clear_full_ptes - Clear present PTEs that map consecutive pages of + * the same folio, collecting dirty/accessed bits. + * @mm: Address space the pages are mapped into. + * @addr: Address the first page is mapped at. + * @ptep: Page table pointer for the first entry. + * @nr: Number of entries to clear. + * @full: Whether we are clearing a full mm. + * + * May be overridden by the architecture; otherwise, implemented as a simple + * loop over ptep_get_and_clear_full(), merging dirty/accessed bits into the + * returned PTE. + * + * Note that PTE bits in the PTE range besides the PFN can differ. For example, + * some PTEs might be write-protected. + * + * Context: The caller holds the page table lock. The PTEs map consecutive + * pages that belong to the same folio. The PTEs are all in the same PMD. + */ +static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, unsigned int nr, int full) +{ + pte_t pte, tmp_pte; + + pte = ptep_get_and_clear_full(mm, addr, ptep, full); + while (--nr) { + ptep++; + addr += PAGE_SIZE; + tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full); + if (pte_dirty(tmp_pte)) + pte = pte_mkdirty(pte); + if (pte_young(tmp_pte)) + pte = pte_mkyoung(pte); + } + return pte; +} +#endif + +#ifndef clear_full_ptes +/** + * clear_full_ptes - Clear present PTEs that map consecutive pages of the same + * folio. + * @mm: Address space the pages are mapped into. + * @addr: Address the first page is mapped at. + * @ptep: Page table pointer for the first entry. + * @nr: Number of entries to clear. + * @full: Whether we are clearing a full mm. + * + * May be overridden by the architecture; otherwise, implemented as a simple + * loop over ptep_get_and_clear_full(). + * + * Note that PTE bits in the PTE range besides the PFN can differ. 
For example, + * some PTEs might be write-protected. + * + * Context: The caller holds the page table lock. The PTEs map consecutive + * pages that belong to the same folio. The PTEs are all in the same PMD. + */ +static inline void clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + for (;;) { + ptep_get_and_clear_full(mm, addr, ptep, full); + if (--nr == 0) + break; + ptep++; + addr += PAGE_SIZE; + } +} +#endif /* * If two threads concurrently fault at the same page, the thread that diff --git a/mm/memory.c b/mm/memory.c index a3efc4da258a..3b8e56eb08a3 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1515,7 +1515,7 @@ static inline bool zap_drop_file_uffd_wp(struct zap_details *details) */ static inline void zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, - unsigned long addr, pte_t *pte, + unsigned long addr, pte_t *pte, int nr, struct zap_details *details, pte_t pteval) { /* Zap on anonymous always means dropping everything */ @@ -1525,20 +1525,27 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, if (zap_drop_file_uffd_wp(details)) return; - pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); + for (;;) { + /* the PFN in the PTE is irrelevant. */ + pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); + if (--nr == 0) + break; + pte++; + addr += PAGE_SIZE; + } } -static inline void zap_present_folio_pte(struct mmu_gather *tlb, +static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, struct folio *folio, - struct page *page, pte_t *pte, pte_t ptent, unsigned long addr, - struct zap_details *details, int *rss, bool *force_flush, - bool *force_break) + struct page *page, pte_t *pte, pte_t ptent, unsigned int nr, + unsigned long addr, struct zap_details *details, int *rss, + bool *force_flush, bool *force_break) { struct mm_struct *mm = tlb->mm; bool delay_rmap = false; if (!folio_test_anon(folio)) { - ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); + ptent = get_and_clear_full_ptes(mm, addr, pte, nr, tlb->fullmm); if (pte_dirty(ptent)) { folio_mark_dirty(folio); if (tlb_delay_rmap(tlb)) { @@ -1548,36 +1555,49 @@ static inline void zap_present_folio_pte(struct mmu_gather *tlb, } if (pte_young(ptent) && likely(vma_has_recency(vma))) folio_mark_accessed(folio); - rss[mm_counter(folio)]--; + rss[mm_counter(folio)] -= nr; } else { /* We don't need up-to-date accessed/dirty bits. */ - ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); - rss[MM_ANONPAGES]--; + clear_full_ptes(mm, addr, pte, nr, tlb->fullmm); + rss[MM_ANONPAGES] -= nr; } + /* Checking a single PTE in a batch is sufficient. */ arch_check_zapped_pte(vma, ptent); - tlb_remove_tlb_entry(tlb, pte, addr); + tlb_remove_tlb_entries(tlb, pte, nr, addr); if (unlikely(userfaultfd_pte_wp(vma, ptent))) - zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); + zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, + ptent); if (!delay_rmap) { - folio_remove_rmap_pte(folio, page, vma); + folio_remove_rmap_ptes(folio, page, nr, vma); + + /* Only sanity-check the first page in a batch. 
*/ if (unlikely(page_mapcount(page) < 0)) print_bad_pte(vma, addr, ptent, page); } - if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) { + if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) { *force_flush = true; *force_break = true; } } -static inline void zap_present_pte(struct mmu_gather *tlb, +/* + * Zap or skip at least one present PTE, trying to batch-process subsequent + * PTEs that map consecutive pages of the same folio. + * + * Returns the number of processed (skipped or zapped) PTEs (at least 1). + */ +static inline int zap_present_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, pte_t *pte, pte_t ptent, - unsigned long addr, struct zap_details *details, - int *rss, bool *force_flush, bool *force_break) + unsigned int max_nr, unsigned long addr, + struct zap_details *details, int *rss, bool *force_flush, + bool *force_break) { + const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY; struct mm_struct *mm = tlb->mm; struct folio *folio; struct page *page; + int nr; page = vm_normal_page(vma, addr, ptent); if (!page) { @@ -1587,14 +1607,29 @@ static inline void zap_present_pte(struct mmu_gather *tlb, tlb_remove_tlb_entry(tlb, pte, addr); VM_WARN_ON_ONCE(userfaultfd_wp(vma)); ksm_might_unmap_zero_page(mm, ptent); - return; + return 1; } folio = page_folio(page); if (unlikely(!should_zap_folio(details, folio))) - return; - zap_present_folio_pte(tlb, vma, folio, page, pte, ptent, addr, details, - rss, force_flush, force_break); + return 1; + + /* + * Make sure that the common "small folio" case is as fast as possible + * by keeping the batching logic separate. + */ + if (unlikely(folio_test_large(folio) && max_nr != 1)) { + nr = folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, + NULL); + + zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr, + addr, details, rss, force_flush, + force_break); + return nr; + } + zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, 1, addr, + details, rss, force_flush, force_break); + return 1; } static unsigned long zap_pte_range(struct mmu_gather *tlb, @@ -1609,6 +1644,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t *start_pte; pte_t *pte; swp_entry_t entry; + int nr; tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); @@ -1622,7 +1658,9 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t ptent = ptep_get(pte); struct folio *folio; struct page *page; + int max_nr; + nr = 1; if (pte_none(ptent)) continue; @@ -1630,10 +1668,12 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, break; if (pte_present(ptent)) { - zap_present_pte(tlb, vma, pte, ptent, addr, details, - rss, &force_flush, &force_break); + max_nr = (end - addr) / PAGE_SIZE; + nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr, + addr, details, rss, &force_flush, + &force_break); if (unlikely(force_break)) { - addr += PAGE_SIZE; + addr += nr * PAGE_SIZE; break; } continue; @@ -1687,8 +1727,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, WARN_ON_ONCE(1); } pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); - zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); - } while (pte++, addr += PAGE_SIZE, addr != end); + zap_install_uffd_wp_if_needed(vma, addr, pte, 1, details, ptent); + } while (pte += nr, addr += PAGE_SIZE * nr, addr != end); add_mm_rss_vec(mm, rss); arch_leave_lazy_mmu_mode();
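To round off the series, here is a stand-alone sketch (plain C, not kernel code) of the accumulation that get_and_clear_full_ptes() performs in the generic helper above: every PTE in the batch is cleared, and the dirty/accessed information is folded into the single PTE value returned for the whole folio range. The PTE_DIRTY/PTE_YOUNG flag values and the plain unsigned long "PTE" array are illustrative stand-ins.

#include <stdio.h>

#define PTE_DIRTY 0x1ul
#define PTE_YOUNG 0x2ul

/* Stand-in for ptep_get_and_clear_full(): return the old value, clear the slot. */
static unsigned long get_and_clear(unsigned long *ptep)
{
        unsigned long old = *ptep;

        *ptep = 0;
        return old;
}

/* Models get_and_clear_full_ptes(): clear nr PTEs, merge their dirty/young bits. */
static unsigned long get_and_clear_ptes(unsigned long *ptep, unsigned int nr)
{
        unsigned long pte = get_and_clear(ptep);

        while (--nr) {
                unsigned long tmp = get_and_clear(++ptep);

                if (tmp & PTE_DIRTY)
                        pte |= PTE_DIRTY;
                if (tmp & PTE_YOUNG)
                        pte |= PTE_YOUNG;
        }
        return pte;
}

int main(void)
{
        /* Four PTEs of one folio: only the third is dirty, only the last is young. */
        unsigned long ptes[4] = { 0, 0, PTE_DIRTY, PTE_YOUNG };
        unsigned long folded = get_and_clear_ptes(ptes, 4);

        printf("folded pte: dirty=%d young=%d\n",
               !!(folded & PTE_DIRTY), !!(folded & PTE_YOUNG));
        return 0;
}

This folding is why zap_present_folio_ptes() only needs the one returned value to decide whether to call folio_mark_dirty() or folio_mark_accessed() for the entire batch.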