From patchwork Wed Feb 14 20:44:26 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox,
    Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon,
    "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman,
    Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik,
    Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann,
    linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v3 01/10] mm/memory: factor out zapping of present pte into zap_present_pte()
Date: Wed, 14 Feb 2024 21:44:26 +0100
Message-ID: <20240214204435.167852-2-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

Let's prepare for further changes by factoring out processing of
present PTEs.
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 94 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 53 insertions(+), 41 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7c3ca41a7610..5b0dc33133a6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1532,13 +1532,61 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 		pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
 }
 
+static inline void zap_present_pte(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned long addr, struct zap_details *details,
+		int *rss, bool *force_flush, bool *force_break)
+{
+	struct mm_struct *mm = tlb->mm;
+	struct folio *folio = NULL;
+	bool delay_rmap = false;
+	struct page *page;
+
+	page = vm_normal_page(vma, addr, ptent);
+	if (page)
+		folio = page_folio(page);
+
+	if (unlikely(!should_zap_folio(details, folio)))
+		return;
+	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+	arch_check_zapped_pte(vma, ptent);
+	tlb_remove_tlb_entry(tlb, pte, addr);
+	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
+	if (unlikely(!page)) {
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
+
+	if (!folio_test_anon(folio)) {
+		if (pte_dirty(ptent)) {
+			folio_mark_dirty(folio);
+			if (tlb_delay_rmap(tlb)) {
+				delay_rmap = true;
+				*force_flush = true;
+			}
+		}
+		if (pte_young(ptent) && likely(vma_has_recency(vma)))
+			folio_mark_accessed(folio);
+	}
+	rss[mm_counter(folio)]--;
+	if (!delay_rmap) {
+		folio_remove_rmap_pte(folio, page, vma);
+		if (unlikely(page_mapcount(page) < 0))
+			print_bad_pte(vma, addr, ptent, page);
+	}
+	if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
+		*force_flush = true;
+		*force_break = true;
+	}
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
 				struct zap_details *details)
 {
+	bool force_flush = false, force_break = false;
 	struct mm_struct *mm = tlb->mm;
-	int force_flush = 0;
 	int rss[NR_MM_COUNTERS];
 	spinlock_t *ptl;
 	pte_t *start_pte;
@@ -1555,7 +1603,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	arch_enter_lazy_mmu_mode();
 	do {
 		pte_t ptent = ptep_get(pte);
-		struct folio *folio = NULL;
+		struct folio *folio;
 		struct page *page;
 
 		if (pte_none(ptent))
@@ -1565,45 +1613,9 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			break;
 
 		if (pte_present(ptent)) {
-			unsigned int delay_rmap;
-
-			page = vm_normal_page(vma, addr, ptent);
-			if (page)
-				folio = page_folio(page);
-
-			if (unlikely(!should_zap_folio(details, folio)))
-				continue;
-			ptent = ptep_get_and_clear_full(mm, addr, pte,
-							tlb->fullmm);
-			arch_check_zapped_pte(vma, ptent);
-			tlb_remove_tlb_entry(tlb, pte, addr);
-			zap_install_uffd_wp_if_needed(vma, addr, pte, details,
-						      ptent);
-			if (unlikely(!page)) {
-				ksm_might_unmap_zero_page(mm, ptent);
-				continue;
-			}
-
-			delay_rmap = 0;
-			if (!folio_test_anon(folio)) {
-				if (pte_dirty(ptent)) {
-					folio_mark_dirty(folio);
-					if (tlb_delay_rmap(tlb)) {
-						delay_rmap = 1;
-						force_flush = 1;
-					}
-				}
-				if (pte_young(ptent) && likely(vma_has_recency(vma)))
-					folio_mark_accessed(folio);
-			}
-			rss[mm_counter(folio)]--;
-			if (!delay_rmap) {
-				folio_remove_rmap_pte(folio, page, vma);
-				if (unlikely(page_mapcount(page) < 0))
-					print_bad_pte(vma, addr, ptent, page);
-			}
-			if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
-				force_flush = 1;
+			zap_present_pte(tlb, vma, pte, ptent, addr, details,
+					rss, &force_flush, &force_break);
+			if (unlikely(force_break)) {
 				addr += PAGE_SIZE;
 				break;
 			}
From patchwork Wed Feb 14 20:44:27 2024
From: David Hildenbrand <david@redhat.com>
Subject: [PATCH v3 02/10] mm/memory: handle !page case in zap_present_pte() separately
Date: Wed, 14 Feb 2024 21:44:27 +0100
Message-ID: <20240214204435.167852-3-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

We don't need up-to-date accessed/dirty bits, so in theory we could
replace ptep_get_and_clear_full() by an optimized ptep_clear_full()
function. Let's rely on the provided pte.

Further, there is no scenario where we would have to insert uffd-wp
markers when zapping something that is not a normal page (i.e., the
zeropage). Add a sanity check to make sure this remains true.

should_zap_folio() no longer has to handle NULL pointers. This change
replaces 2/3 "!page/!folio" checks by a single "!page" one.
Note that arch_check_zapped_pte() on x86-64 checks the HW-dirty bit to
detect shadow stack entries. But for shadow stack entries, the HW dirty
bit (in combination with non-writable PTEs) is set by software. So for
the arch_check_zapped_pte() check, we don't have to sync against HW
setting the HW dirty bit concurrently; it is always set.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5b0dc33133a6..4da6923709b2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1497,10 +1497,6 @@ static inline bool should_zap_folio(struct zap_details *details,
 	if (should_zap_cows(details))
 		return true;
 
-	/* E.g. the caller passes NULL for the case of a zero folio */
-	if (!folio)
-		return true;
-
 	/* Otherwise we should only zap non-anon folios */
 	return !folio_test_anon(folio);
 }
@@ -1538,24 +1534,28 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 		int *rss, bool *force_flush, bool *force_break)
 {
 	struct mm_struct *mm = tlb->mm;
-	struct folio *folio = NULL;
 	bool delay_rmap = false;
+	struct folio *folio;
 	struct page *page;
 
 	page = vm_normal_page(vma, addr, ptent);
-	if (page)
-		folio = page_folio(page);
+	if (!page) {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		arch_check_zapped_pte(vma, ptent);
+		tlb_remove_tlb_entry(tlb, pte, addr);
+		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
 
+	folio = page_folio(page);
 	if (unlikely(!should_zap_folio(details, folio)))
 		return;
 	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
 	arch_check_zapped_pte(vma, ptent);
 	tlb_remove_tlb_entry(tlb, pte, addr);
 	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
-	if (unlikely(!page)) {
-		ksm_might_unmap_zero_page(mm, ptent);
-		return;
-	}
 
 	if (!folio_test_anon(folio)) {
 		if (pte_dirty(ptent)) {
From patchwork Wed Feb 14 20:44:28 2024
From: David Hildenbrand <david@redhat.com>
Subject: [PATCH v3 03/10] mm/memory: further separate anon and pagecache folio handling in zap_present_pte()
Date: Wed, 14 Feb 2024 21:44:28 +0100
Message-ID: <20240214204435.167852-4-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

We don't need up-to-date accessed/dirty information for anon folios and
can simply work with the ptent we already have. Also, we know the RSS
counter we want to update.

We can safely move arch_check_zapped_pte() + tlb_remove_tlb_entry() +
zap_install_uffd_wp_if_needed() after updating the folio and RSS.

While at it, only call zap_install_uffd_wp_if_needed() if there is even
any chance that pte_install_uffd_wp_if_needed() would do *something*.
That is, just don't bother if uffd-wp does not apply.
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4da6923709b2..7a3ebb6e5909 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1552,12 +1552,9 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 	folio = page_folio(page);
 	if (unlikely(!should_zap_folio(details, folio)))
 		return;
-	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
-	arch_check_zapped_pte(vma, ptent);
-	tlb_remove_tlb_entry(tlb, pte, addr);
-	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
 
 	if (!folio_test_anon(folio)) {
+		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
 		if (pte_dirty(ptent)) {
 			folio_mark_dirty(folio);
 			if (tlb_delay_rmap(tlb)) {
@@ -1567,8 +1564,17 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 		}
 		if (pte_young(ptent) && likely(vma_has_recency(vma)))
 			folio_mark_accessed(folio);
+		rss[mm_counter(folio)]--;
+	} else {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		rss[MM_ANONPAGES]--;
 	}
-	rss[mm_counter(folio)]--;
+	arch_check_zapped_pte(vma, ptent);
+	tlb_remove_tlb_entry(tlb, pte, addr);
+	if (unlikely(userfaultfd_pte_wp(vma, ptent)))
+		zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
+
 	if (!delay_rmap) {
 		folio_remove_rmap_pte(folio, page, vma);
 		if (unlikely(page_mapcount(page) < 0))
From patchwork Wed Feb 14 20:44:29 2024
From: David Hildenbrand <david@redhat.com>
Subject: [PATCH v3 04/10] mm/memory: factor out zapping folio pte into zap_present_folio_pte()
Date: Wed, 14 Feb 2024 21:44:29 +0100
Message-ID: <20240214204435.167852-5-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

Let's prepare for further changes by factoring it out into a separate
function.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 mm/memory.c | 53 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7a3ebb6e5909..a3efc4da258a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1528,30 +1528,14 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 	pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
 }
 
-static inline void zap_present_pte(struct mmu_gather *tlb,
-		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
-		unsigned long addr, struct zap_details *details,
-		int *rss, bool *force_flush, bool *force_break)
+static inline void zap_present_folio_pte(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, struct folio *folio,
+		struct page *page, pte_t *pte, pte_t ptent, unsigned long addr,
+		struct zap_details *details, int *rss, bool *force_flush,
+		bool *force_break)
 {
 	struct mm_struct *mm = tlb->mm;
 	bool delay_rmap = false;
-	struct folio *folio;
-	struct page *page;
-
-	page = vm_normal_page(vma, addr, ptent);
-	if (!page) {
-		/* We don't need up-to-date accessed/dirty bits. */
-		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
-		arch_check_zapped_pte(vma, ptent);
-		tlb_remove_tlb_entry(tlb, pte, addr);
-		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
-		ksm_might_unmap_zero_page(mm, ptent);
-		return;
-	}
-
-	folio = page_folio(page);
-	if (unlikely(!should_zap_folio(details, folio)))
-		return;
 
 	if (!folio_test_anon(folio)) {
 		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
@@ -1586,6 +1570,33 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 	}
 }
 
+static inline void zap_present_pte(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned long addr, struct zap_details *details,
+		int *rss, bool *force_flush, bool *force_break)
+{
+	struct mm_struct *mm = tlb->mm;
+	struct folio *folio;
+	struct page *page;
+
+	page = vm_normal_page(vma, addr, ptent);
+	if (!page) {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		arch_check_zapped_pte(vma, ptent);
+		tlb_remove_tlb_entry(tlb, pte, addr);
+		VM_WARN_ON_ONCE(userfaultfd_wp(vma));
+		ksm_might_unmap_zero_page(mm, ptent);
+		return;
+	}
+
+	folio = page_folio(page);
+	if (unlikely(!should_zap_folio(details, folio)))
+		return;
+	zap_present_folio_pte(tlb, vma, folio, page, pte, ptent, addr, details,
+			      rss, force_flush, force_break);
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
From patchwork Wed Feb 14 20:44:30 2024
From: David Hildenbrand <david@redhat.com>
Subject: [PATCH v3 05/10] mm/mmu_gather: pass "delay_rmap" instead of encoded page to __tlb_remove_page_size()
Date: Wed, 14 Feb 2024 21:44:30 +0100
Message-ID: <20240214204435.167852-6-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

We have two bits available in the encoded page pointer to store
additional information. Currently, we use one bit to request delay of
the rmap removal until after a TLB flush.

We want to make use of the remaining bit internally for batching of
multiple pages of the same folio, specifying that the next encoded page
pointer in an array is actually "nr_pages". So pass page + delay_rmap
flag instead of an encoded page, to handle the encoding internally.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 arch/s390/include/asm/tlb.h | 13 ++++++-------
 include/asm-generic/tlb.h   | 12 ++++++------
 mm/mmu_gather.c             |  7 ++++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index d1455a601adc..48df896d5b79 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -25,8 +25,7 @@ void __tlb_remove_table(void *_table);
 static inline void tlb_flush(struct mmu_gather *tlb);
 static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-		struct encoded_page *page,
-		int page_size);
+		struct page *page, bool delay_rmap, int page_size);
 
 #define tlb_flush tlb_flush
 #define pte_free_tlb pte_free_tlb
@@ -42,14 +41,14 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
  * tlb_ptep_clear_flush. In both flush modes the tlb for a page cache page
  * has already been freed, so just do free_page_and_swap_cache.
  *
- * s390 doesn't delay rmap removal, so there is nothing encoded in
- * the page pointer.
+ * s390 doesn't delay rmap removal.
  */
 static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-		struct encoded_page *page,
-		int page_size)
+		struct page *page, bool delay_rmap, int page_size)
 {
-	free_page_and_swap_cache(encoded_page_ptr(page));
+	VM_WARN_ON_ONCE(delay_rmap);
+
+	free_page_and_swap_cache(page);
 	return false;
 }
 
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 129a3a759976..2eb7b0d4f5d2 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -260,9 +260,8 @@ struct mmu_gather_batch {
  */
 #define MAX_GATHER_BATCH_COUNT	(10000UL/MAX_GATHER_BATCH)
 
-extern bool __tlb_remove_page_size(struct mmu_gather *tlb,
-				   struct encoded_page *page,
-				   int page_size);
+extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
+				   bool delay_rmap, int page_size);
 
 #ifdef CONFIG_SMP
 /*
@@ -462,13 +461,14 @@ static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 static inline void tlb_remove_page_size(struct mmu_gather *tlb,
 					struct page *page, int page_size)
 {
-	if (__tlb_remove_page_size(tlb, encode_page(page, 0), page_size))
+	if (__tlb_remove_page_size(tlb, page, false, page_size))
 		tlb_flush_mmu(tlb);
 }
 
-static __always_inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page, unsigned int flags)
+static __always_inline bool __tlb_remove_page(struct mmu_gather *tlb,
+		struct page *page, bool delay_rmap)
 {
-	return __tlb_remove_page_size(tlb, encode_page(page, flags), PAGE_SIZE);
+	return __tlb_remove_page_size(tlb, page, delay_rmap, PAGE_SIZE);
 }
 
 /* tlb_remove_page
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 604ddf08affe..ac733d81b112 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -116,7 +116,8 @@ static void tlb_batch_list_free(struct mmu_gather *tlb)
 	tlb->local.next = NULL;
 }
 
-bool __tlb_remove_page_size(struct mmu_gather *tlb, struct encoded_page *page, int page_size)
+bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
+		bool delay_rmap, int page_size)
 {
 	struct mmu_gather_batch *batch;
 
@@ -131,13 +132,13 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct encoded_page *page, i
 	 * Add the page and check if we are full. If so
 	 * force a flush.
 	 */
-	batch->encoded_pages[batch->nr++] = page;
+	batch->encoded_pages[batch->nr++] = encode_page(page, delay_rmap);
 	if (batch->nr == batch->max) {
 		if (!tlb_next_batch(tlb))
 			return true;
 		batch = tlb->active;
 	}
-	VM_BUG_ON_PAGE(batch->nr > batch->max, encoded_page_ptr(page));
+	VM_BUG_ON_PAGE(batch->nr > batch->max, page);
 
 	return false;
 }
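For readers new to this trick: a "struct page" is always at least word-aligned,
so the low bits of a pointer to one are guaranteed zero and can carry flags, as
long as they are masked off before the pointer is used. The kernel's real
helpers are encode_page()/encoded_page_ptr()/encoded_page_flags() (touched by
the next patch); the following is only a minimal, self-contained userspace
sketch of the idea, with made-up names (fake_page, TAG_*):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TAG_MASK	0x3ul	/* two guaranteed-free low bits */
#define TAG_DELAY_RMAP	0x1ul	/* example flag, mirroring this patch */

struct fake_page { long dummy; } __attribute__((aligned(8)));

static uintptr_t encode(struct fake_page *p, unsigned long flags)
{
	/* Flags must fit in the alignment bits of the pointer. */
	assert((flags & ~TAG_MASK) == 0);
	return (uintptr_t)p | flags;
}

static struct fake_page *decode_ptr(uintptr_t enc)
{
	return (struct fake_page *)(enc & ~TAG_MASK);
}

static unsigned long decode_flags(uintptr_t enc)
{
	return enc & TAG_MASK;
}

int main(void)
{
	struct fake_page page;
	uintptr_t enc = encode(&page, TAG_DELAY_RMAP);

	/* The flag travels with the pointer and is stripped on decode. */
	printf("delay_rmap=%lu ptr_ok=%d\n",
	       decode_flags(enc) & TAG_DELAY_RMAP, decode_ptr(enc) == &page);
	return 0;
}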
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BfteBvWtn9YL0u+ZWWzLkAZQ9TqgPhMIROF4I2AA+84=; b=dWS7Tlj5hvo++IKBwEcAZX8N1kyu5MFiYG5LceV724qhzHlhlWnCbuVYOcjmSFNc3hjRAH NvInRkO6BO9Jkw7pznPvqg7Wc8TXzrt/R8/GGIlh/4WSOSJJ0LwueRiKvzE0v9HgAg0f5v X682/Nf1TSZAb4roDM0/bce9MOD6bOs= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-502-FFKMi1htPsyeY2_wWYIuqQ-1; Wed, 14 Feb 2024 15:45:06 -0500 X-MC-Unique: FFKMi1htPsyeY2_wWYIuqQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 768353C29A67; Wed, 14 Feb 2024 20:45:04 +0000 (UTC) Received: from t14s.fritz.box (unknown [10.39.194.174]) by smtp.corp.redhat.com (Postfix) with ESMTP id C376E1C060B1; Wed, 14 Feb 2024 20:45:00 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Matthew Wilcox , Ryan Roberts , Catalin Marinas , Yin Fengwei , Michal Hocko , Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , Peter Zijlstra , Michael Ellerman , Christophe Leroy , "Naveen N. Rao" , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Arnd Bergmann , linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org Subject: [PATCH v3 06/10] mm/mmu_gather: define ENCODED_PAGE_FLAG_DELAY_RMAP Date: Wed, 14 Feb 2024 21:44:31 +0100 Message-ID: <20240214204435.167852-7-david@redhat.com> In-Reply-To: <20240214204435.167852-1-david@redhat.com> References: <20240214204435.167852-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Rspamd-Queue-Id: 040D340013 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: jk13gg7ju1t38cg85jc67nrr7jbi54gw X-HE-Tag: 1707943509-590703 X-HE-Meta: U2FsdGVkX1/QUf4Hcg02Nk7LLN35GKpJT+D34hrJzGBO0idkKI3SEV605s7ORVAxOS8nP9HkAokj1c2H2UknSF/rIvhE7y7icxLULdq0twGjDF7fbJce4Y1ClQ34HkVwkyGiy25Dto1IqVEoxR62R3LcFxU+hjkC0U2w8n4hgMiUoPqDVGlBpb54bKkqUbpwaXfWzEssFEZueHtKjRMCB53Cv1GCGUD2msW3SdDXTgz79cVIg/3jyXiOQ8sD4FF0NJPgP/0EEShINPT29uO980B3uq6g1ojiajytQogqpV9Ao6WNl1tabM8QugxpbaYghWYCqZNYzdADbQmxAS6Q6+kh+jYL2NcciIX66m1QB57hY708prR9GUOB3iLMSmksM5ojsy6I88TDSymOcjM0kfBmQmgwxRerKNRd1TtAd3aw8RAQdUfelHeIrdN7cxzstLMpqjfT3sv1qzPpPhLOWSuKeUiKC+c2+v6B31ZOELkkZLJacz/aVmhqMwVmnUHPfQlYLA3GvLb5AfjF8vOXYKyRvfLZoFasIqiN4hIhh/n7B6HbiwxhcGqja8JILQLmtbynT+QPswrXZcIYQk0wgzz2yeW9cNIdacJ1Ih/us1JDbsqPpOupVPf1+mpY5Xwt5p2DeW3cJPQdp/ZszOBzwzgZDzpfFoTWfr7XBcFxqxCAE5eQ5ZOYTJZ42qZw2VUyIpRn5O7y17BmAinP4lRnFINYG/vugmFgDzynjHZo81AL6FzdQnewEdx6Tf/wuWiK2wE9p5ir5OQ4NikZ4XverbN+0VmDwXHmzYvRf0C8iytz/ujr3Qkw6J49fMJl89WNT+r/8OrzX903FRMzii4B+4epwjyx7160aJC4vzf+O+1rdrdSm2h//OLdzlvWmao1VWfkiejENF9z+mNHJW5Bf+dNxJYSvdfAIZfsGznpTnsL/23kdx9AI9KJOW4tkbGKCYFToilPxioZVcFhSJV rjCwi5eZ V75Yd+OPigW5uRRxmmDljqZiZvPXcyYkytHRnRH1c+p5irqfAYyUn72RgcghbsiBqg2EvgYtLTnDnxjd2flr5RuNcCRRWfCgjRyMArRVdo4l/h8ElM8SQ/NVnmzCmX4w9W8k+2rqVbb4c1JSOzQLew7ShT9vfl4c/SHcgoNQe4Rs7keV66LoDtsUt4a6CJlwxczLpa5Xvb+BYtMw7lsOQvhRUSJ5nMhhfI6/u8ZHVqq/23sta6NbDhRjwNEYAL/HYzVB+pzI2CPheLKAsq6RJfUtAMg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk 
X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Nowadays, encoded pages are only used in mmu_gather handling. Let's update the documentation, and define ENCODED_PAGE_BIT_DELAY_RMAP. While at it, rename ENCODE_PAGE_BITS to ENCODED_PAGE_BITS. If encoded page pointers would ever be used in other context again, we'd likely want to change the defines to reflect their context (e.g., ENCODED_PAGE_FLAG_MMU_GATHER_DELAY_RMAP). For now, let's keep it simple. This is a preparation for using the remaining spare bit to indicate that the next item in an array of encoded pages is a "nr_pages" argument and not an encoded page. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- include/linux/mm_types.h | 17 +++++++++++------ mm/mmu_gather.c | 5 +++-- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8b611e13153e..1b89eec0d6df 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -210,8 +210,8 @@ struct page { * * An 'encoded_page' pointer is a pointer to a regular 'struct page', but * with the low bits of the pointer indicating extra context-dependent - * information. Not super-common, but happens in mmu_gather and mlock - * handling, and this acts as a type system check on that use. + * information. Only used in mmu_gather handling, and this acts as a type + * system check on that use. * * We only really have two guaranteed bits in general, although you could * play with 'struct page' alignment (see CONFIG_HAVE_ALIGNED_STRUCT_PAGE) @@ -220,21 +220,26 @@ struct page { * Use the supplied helper functions to endcode/decode the pointer and bits. */ struct encoded_page; -#define ENCODE_PAGE_BITS 3ul + +#define ENCODED_PAGE_BITS 3ul + +/* Perform rmap removal after we have flushed the TLB. */ +#define ENCODED_PAGE_BIT_DELAY_RMAP 1ul + static __always_inline struct encoded_page *encode_page(struct page *page, unsigned long flags) { - BUILD_BUG_ON(flags > ENCODE_PAGE_BITS); + BUILD_BUG_ON(flags > ENCODED_PAGE_BITS); return (struct encoded_page *)(flags | (unsigned long)page); } static inline unsigned long encoded_page_flags(struct encoded_page *page) { - return ENCODE_PAGE_BITS & (unsigned long)page; + return ENCODED_PAGE_BITS & (unsigned long)page; } static inline struct page *encoded_page_ptr(struct encoded_page *page) { - return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page); + return (struct page *)(~ENCODED_PAGE_BITS & (unsigned long)page); } /* diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index ac733d81b112..6540c99c6758 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -53,7 +53,7 @@ static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_ for (int i = 0; i < batch->nr; i++) { struct encoded_page *enc = batch->encoded_pages[i]; - if (encoded_page_flags(enc)) { + if (encoded_page_flags(enc) & ENCODED_PAGE_BIT_DELAY_RMAP) { struct page *page = encoded_page_ptr(enc); folio_remove_rmap_pte(page_folio(page), page, vma); } @@ -119,6 +119,7 @@ static void tlb_batch_list_free(struct mmu_gather *tlb) bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, bool delay_rmap, int page_size) { + int flags = delay_rmap ? ENCODED_PAGE_BIT_DELAY_RMAP : 0; struct mmu_gather_batch *batch; VM_BUG_ON(!tlb->end); @@ -132,7 +133,7 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, * Add the page and check if we are full. If so * force a flush. 
 	 */
-	batch->encoded_pages[batch->nr++] = encode_page(page, delay_rmap);
+	batch->encoded_pages[batch->nr++] = encode_page(page, flags);
 	if (batch->nr == batch->max) {
 		if (!tlb_next_batch(tlb))
 			return true;
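As a standalone illustration of the tagging scheme this patch builds on, here is a minimal userspace sketch (not part of the patch; the demo_* names are invented) that stores a flag in the spare low bits of an aligned pointer, roughly mirroring what encode_page(), encoded_page_flags() and encoded_page_ptr() do:

    /* demo_encoded_page.c - userspace illustration only, not kernel code. */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define DEMO_PAGE_BITS      3ul  /* low pointer bits usable as flags */
    #define DEMO_BIT_DELAY_RMAP 1ul  /* stand-in for ENCODED_PAGE_BIT_DELAY_RMAP */

    struct demo_page { char payload[64]; };  /* stand-in for struct page */

    static uintptr_t demo_encode(struct demo_page *page, unsigned long flags)
    {
            assert(flags <= DEMO_PAGE_BITS);                 /* mirrors the BUILD_BUG_ON() */
            assert(((uintptr_t)page & DEMO_PAGE_BITS) == 0); /* alignment frees the low bits */
            return (uintptr_t)page | flags;
    }

    static unsigned long demo_flags(uintptr_t enc)
    {
            return enc & DEMO_PAGE_BITS;
    }

    static struct demo_page *demo_ptr(uintptr_t enc)
    {
            return (struct demo_page *)(enc & ~(uintptr_t)DEMO_PAGE_BITS);
    }

    int main(void)
    {
            struct demo_page *page = aligned_alloc(8, sizeof(struct demo_page));
            uintptr_t enc = demo_encode(page, DEMO_BIT_DELAY_RMAP);

            printf("delay-rmap flag set: %s\n",
                   (demo_flags(enc) & DEMO_BIT_DELAY_RMAP) ? "yes" : "no");
            printf("pointer round-trips: %s\n", demo_ptr(enc) == page ? "yes" : "no");
            free(page);
            return 0;
    }

As the mm_types.h comment above notes, only two low bits are guaranteed in general, which is why the flag values stay below ENCODED_PAGE_BITS.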
From patchwork Wed Feb 14 20:44:32 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v3 07/10] mm/mmu_gather: add tlb_remove_tlb_entries()
Date: Wed, 14 Feb 2024 21:44:32 +0100
Message-ID: <20240214204435.167852-8-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

Let's add a helper that lets us batch-process multiple consecutive PTEs.
Note that the loop will get optimized out on all architectures except on
powerpc. We have to add an early define of __tlb_remove_tlb_entry() on ppc
to make the compiler happy (and avoid making tlb_remove_tlb_entries() a
macro).

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 arch/powerpc/include/asm/tlb.h |  2 ++
 include/asm-generic/tlb.h      | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index b3de6102a907..1ca7d4c4b90d 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -19,6 +19,8 @@
 #include

+static inline void __tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep,
+					  unsigned long address);
 #define __tlb_remove_tlb_entry __tlb_remove_tlb_entry

 #define tlb_flush tlb_flush
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 2eb7b0d4f5d2..95d60a4f468a 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -608,6 +608,26 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 		__tlb_remove_tlb_entry(tlb, ptep, address);	\
 	} while (0)

+/**
+ * tlb_remove_tlb_entries - remember unmapping of multiple consecutive ptes for
+ *			    later tlb invalidation.
+ *
+ * Similar to tlb_remove_tlb_entry(), but remember unmapping of multiple
+ * consecutive ptes instead of only a single one.
+ */
+static inline void tlb_remove_tlb_entries(struct mmu_gather *tlb,
+		pte_t *ptep, unsigned int nr, unsigned long address)
+{
+	tlb_flush_pte_range(tlb, address, PAGE_SIZE * nr);
+	for (;;) {
+		__tlb_remove_tlb_entry(tlb, ptep, address);
+		if (--nr == 0)
+			break;
+		ptep++;
+		address += PAGE_SIZE;
+	}
+}
+
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	do {							\
 		unsigned long _sz = huge_page_size(h);		\
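To show the shape of the new helper outside of kernel context, here is a small, self-contained userspace model (illustrative only; the demo_* types and names are made up): extend the to-be-flushed range once, then run the per-entry architecture hook -- a no-op on most architectures -- for each of the nr entries, just as tlb_remove_tlb_entries() does:

    /* demo_tlb_entries.c - userspace illustration only, not kernel code. */
    #include <stdio.h>

    #define DEMO_PAGE_SIZE 4096ul

    struct demo_tlb {
            unsigned long start, end;   /* range to invalidate later */
    };

    static void demo_flush_pte_range(struct demo_tlb *tlb, unsigned long addr,
                                     unsigned long size)
    {
            if (addr < tlb->start)
                    tlb->start = addr;
            if (addr + size > tlb->end)
                    tlb->end = addr + size;
    }

    /* Per-entry arch hook; empty here, like on most architectures. */
    static void demo_remove_tlb_entry(struct demo_tlb *tlb, unsigned long addr)
    {
            (void)tlb;
            (void)addr;
    }

    /* Batched variant: one range update, nr per-entry hook invocations. */
    static void demo_remove_tlb_entries(struct demo_tlb *tlb, unsigned int nr,
                                        unsigned long addr)
    {
            demo_flush_pte_range(tlb, addr, DEMO_PAGE_SIZE * nr);
            for (;;) {
                    demo_remove_tlb_entry(tlb, addr);
                    if (--nr == 0)
                            break;
                    addr += DEMO_PAGE_SIZE;
            }
    }

    int main(void)
    {
            struct demo_tlb tlb = { .start = ~0ul, .end = 0 };

            demo_remove_tlb_entries(&tlb, 16, 0x7f0000000000ul);
            printf("range to flush: %#lx-%#lx\n", tlb.start, tlb.end);
            return 0;
    }

When the per-entry hook is empty, a compiler can drop the loop entirely, which is why only powerpc (where the hook does real work) is affected.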
From patchwork Wed Feb 14 20:44:33 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v3 08/10] mm/mmu_gather: add __tlb_remove_folio_pages()
Date: Wed, 14 Feb 2024 21:44:33 +0100
Message-ID: <20240214204435.167852-9-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

Add __tlb_remove_folio_pages(), which will remove multiple consecutive
pages that belong to the same large folio, instead of only a single page.
We'll be using this function when optimizing unmapping/zapping of large
folios that are mapped by PTEs.

We're using the remaining spare bit in an encoded_page to indicate that
the next encoded page in an array actually contains a shifted "nr_pages"
value; a concrete sketch of this layout follows below. Teach swap/freeing
code about putting multiple folio references, and teach delayed rmap
handling to remove page ranges of a folio.

This extension allows for still gathering almost as many small folios as
we used to (-1, because we have to prepare for a possibly bigger next
entry), but also allows for gathering consecutive pages that belong to the
same large folio.

Note that we don't pass the folio pointer, because it is not required for
now. Further, we don't support page_size != PAGE_SIZE; it won't be
required for simple PTE batching.

We have to provide a separate s390 implementation, but it's fairly
straightforward.

Another, more invasive and likely more expensive, approach would be to use
folio+range or a PFN range instead of page+nr_pages. But we should do that
consistently for the whole mmu_gather. For now, let's keep it simple and
add "nr_pages" only.
Note that it is now possible to gather significantly more pages: In the past, we were able to gather ~10000 pages, now we can also gather ~5000 folio fragments that span multiple pages. A folio fragment on x86-64 can span up to 512 pages (2 MiB THP) and on arm64 with 64k in theory 8192 pages (512 MiB THP). Gathering more memory is not considered something we should worry about, especially because these are already corner cases. While we can gather more total memory, we won't free more folio fragments. As long as page freeing time primarily only depends on the number of involved folios, there is no effective change for !preempt configurations. However, we'll adjust tlb_batch_pages_flush() separately to handle corner cases where page freeing time grows proportionally with the actual memory size. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- arch/s390/include/asm/tlb.h | 17 +++++++++++ include/asm-generic/tlb.h | 8 +++++ include/linux/mm_types.h | 20 ++++++++++++ mm/mmu_gather.c | 61 +++++++++++++++++++++++++++++++------ mm/swap.c | 12 ++++++-- mm/swap_state.c | 15 +++++++-- 6 files changed, 119 insertions(+), 14 deletions(-) diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index 48df896d5b79..e95b2c8081eb 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -26,6 +26,8 @@ void __tlb_remove_table(void *_table); static inline void tlb_flush(struct mmu_gather *tlb); static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, bool delay_rmap, int page_size); +static inline bool __tlb_remove_folio_pages(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap); #define tlb_flush tlb_flush #define pte_free_tlb pte_free_tlb @@ -52,6 +54,21 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, return false; } +static inline bool __tlb_remove_folio_pages(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap) +{ + struct encoded_page *encoded_pages[] = { + encode_page(page, ENCODED_PAGE_BIT_NR_PAGES_NEXT), + encode_nr_pages(nr_pages), + }; + + VM_WARN_ON_ONCE(delay_rmap); + VM_WARN_ON_ONCE(page_folio(page) != page_folio(page + nr_pages - 1)); + + free_pages_and_swap_cache(encoded_pages, ARRAY_SIZE(encoded_pages)); + return false; +} + static inline void tlb_flush(struct mmu_gather *tlb) { __tlb_flush_mm_lazy(tlb->mm); diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 95d60a4f468a..bd00dd238b79 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -69,6 +69,7 @@ * * - tlb_remove_page() / __tlb_remove_page() * - tlb_remove_page_size() / __tlb_remove_page_size() + * - __tlb_remove_folio_pages() * * __tlb_remove_page_size() is the basic primitive that queues a page for * freeing. __tlb_remove_page() assumes PAGE_SIZE. Both will return a @@ -78,6 +79,11 @@ * tlb_remove_page() and tlb_remove_page_size() imply the call to * tlb_flush_mmu() when required and has no return value. * + * __tlb_remove_folio_pages() is similar to __tlb_remove_page(), however, + * instead of removing a single page, remove the given number of consecutive + * pages that are all part of the same (large) folio: just like calling + * __tlb_remove_page() on each page individually. 
+ * * - tlb_change_page_size() * * call before __tlb_remove_page*() to set the current page-size; implies a @@ -262,6 +268,8 @@ struct mmu_gather_batch { extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, bool delay_rmap, int page_size); +bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page, + unsigned int nr_pages, bool delay_rmap); #ifdef CONFIG_SMP /* diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1b89eec0d6df..a7223ba3ea1e 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -226,6 +226,15 @@ struct encoded_page; /* Perform rmap removal after we have flushed the TLB. */ #define ENCODED_PAGE_BIT_DELAY_RMAP 1ul +/* + * The next item in an encoded_page array is the "nr_pages" argument, specifying + * the number of consecutive pages starting from this page, that all belong to + * the same folio. For example, "nr_pages" corresponds to the number of folio + * references that must be dropped. If this bit is not set, "nr_pages" is + * implicitly 1. + */ +#define ENCODED_PAGE_BIT_NR_PAGES_NEXT 2ul + static __always_inline struct encoded_page *encode_page(struct page *page, unsigned long flags) { BUILD_BUG_ON(flags > ENCODED_PAGE_BITS); @@ -242,6 +251,17 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page) return (struct page *)(~ENCODED_PAGE_BITS & (unsigned long)page); } +static __always_inline struct encoded_page *encode_nr_pages(unsigned long nr) +{ + VM_WARN_ON_ONCE((nr << 2) >> 2 != nr); + return (struct encoded_page *)(nr << 2); +} + +static __always_inline unsigned long encoded_nr_pages(struct encoded_page *page) +{ + return ((unsigned long)page) >> 2; +} + /* * A swap entry has to fit into a "unsigned long", as the entry is hidden * in the "index" field of the swapper address space. diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 6540c99c6758..d175c0f1e2c8 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -50,12 +50,21 @@ static bool tlb_next_batch(struct mmu_gather *tlb) #ifdef CONFIG_SMP static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_struct *vma) { + struct encoded_page **pages = batch->encoded_pages; + for (int i = 0; i < batch->nr; i++) { - struct encoded_page *enc = batch->encoded_pages[i]; + struct encoded_page *enc = pages[i]; if (encoded_page_flags(enc) & ENCODED_PAGE_BIT_DELAY_RMAP) { struct page *page = encoded_page_ptr(enc); - folio_remove_rmap_pte(page_folio(page), page, vma); + unsigned int nr_pages = 1; + + if (unlikely(encoded_page_flags(enc) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr_pages = encoded_nr_pages(pages[++i]); + + folio_remove_rmap_ptes(page_folio(page), page, nr_pages, + vma); } } } @@ -89,18 +98,26 @@ static void tlb_batch_pages_flush(struct mmu_gather *tlb) for (batch = &tlb->local; batch && batch->nr; batch = batch->next) { struct encoded_page **pages = batch->encoded_pages; - do { + while (batch->nr) { /* * limit free batch count when PAGE_SIZE > 4K */ unsigned int nr = min(512U, batch->nr); + /* + * Make sure we cover page + nr_pages, and don't leave + * nr_pages behind when capping the number of entries. 
+ */ + if (unlikely(encoded_page_flags(pages[nr - 1]) & + ENCODED_PAGE_BIT_NR_PAGES_NEXT)) + nr++; + free_pages_and_swap_cache(pages, nr); pages += nr; batch->nr -= nr; cond_resched(); - } while (batch->nr); + } } tlb->active = &tlb->local; } @@ -116,8 +133,9 @@ static void tlb_batch_list_free(struct mmu_gather *tlb) tlb->local.next = NULL; } -bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, - bool delay_rmap, int page_size) +static bool __tlb_remove_folio_pages_size(struct mmu_gather *tlb, + struct page *page, unsigned int nr_pages, bool delay_rmap, + int page_size) { int flags = delay_rmap ? ENCODED_PAGE_BIT_DELAY_RMAP : 0; struct mmu_gather_batch *batch; @@ -126,6 +144,8 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, #ifdef CONFIG_MMU_GATHER_PAGE_SIZE VM_WARN_ON(tlb->page_size != page_size); + VM_WARN_ON_ONCE(nr_pages != 1 && page_size != PAGE_SIZE); + VM_WARN_ON_ONCE(page_folio(page) != page_folio(page + nr_pages - 1)); #endif batch = tlb->active; @@ -133,17 +153,40 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, * Add the page and check if we are full. If so * force a flush. */ - batch->encoded_pages[batch->nr++] = encode_page(page, flags); - if (batch->nr == batch->max) { + if (likely(nr_pages == 1)) { + batch->encoded_pages[batch->nr++] = encode_page(page, flags); + } else { + flags |= ENCODED_PAGE_BIT_NR_PAGES_NEXT; + batch->encoded_pages[batch->nr++] = encode_page(page, flags); + batch->encoded_pages[batch->nr++] = encode_nr_pages(nr_pages); + } + /* + * Make sure that we can always add another "page" + "nr_pages", + * requiring two entries instead of only a single one. + */ + if (batch->nr >= batch->max - 1) { if (!tlb_next_batch(tlb)) return true; batch = tlb->active; } - VM_BUG_ON_PAGE(batch->nr > batch->max, page); + VM_BUG_ON_PAGE(batch->nr > batch->max - 1, page); return false; } +bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page, + unsigned int nr_pages, bool delay_rmap) +{ + return __tlb_remove_folio_pages_size(tlb, page, nr_pages, delay_rmap, + PAGE_SIZE); +} + +bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, + bool delay_rmap, int page_size) +{ + return __tlb_remove_folio_pages_size(tlb, page, 1, delay_rmap, page_size); +} + #endif /* MMU_GATHER_NO_GATHER */ #ifdef CONFIG_MMU_GATHER_TABLE_FREE diff --git a/mm/swap.c b/mm/swap.c index cd8f0150ba3a..e5380d732c0d 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -967,11 +967,17 @@ void release_pages(release_pages_arg arg, int nr) unsigned int lock_batch; for (i = 0; i < nr; i++) { + unsigned int nr_refs = 1; struct folio *folio; /* Turn any of the argument types into a folio */ folio = page_folio(encoded_page_ptr(encoded[i])); + /* Is our next entry actually "nr_pages" -> "nr_refs" ? 
+		 */
+		if (unlikely(encoded_page_flags(encoded[i]) &
+			     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
+			nr_refs = encoded_nr_pages(encoded[++i]);
+
 		/*
 		 * Make sure the IRQ-safe lock-holding time does not get
 		 * excessive with a continuous string of pages from the
@@ -990,14 +996,14 @@ void release_pages(release_pages_arg arg, int nr)
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			if (put_devmap_managed_page(&folio->page))
+			if (put_devmap_managed_page_refs(&folio->page, nr_refs))
 				continue;
-			if (folio_put_testzero(folio))
+			if (folio_ref_sub_and_test(folio, nr_refs))
 				free_zone_device_page(&folio->page);
 			continue;
 		}

-		if (!folio_put_testzero(folio))
+		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;

 		if (folio_test_large(folio)) {
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 7255c01a1e4e..2f540748f7c0 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -311,8 +311,19 @@ void free_page_and_swap_cache(struct page *page)
 void free_pages_and_swap_cache(struct encoded_page **pages, int nr)
 {
 	lru_add_drain();
-	for (int i = 0; i < nr; i++)
-		free_swap_cache(encoded_page_ptr(pages[i]));
+	for (int i = 0; i < nr; i++) {
+		struct page *page = encoded_page_ptr(pages[i]);
+
+		/*
+		 * Skip over the "nr_pages" entry. It's sufficient to call
+		 * free_swap_cache() only once per folio.
+		 */
+		if (unlikely(encoded_page_flags(pages[i]) &
+			     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
+			i++;
+
+		free_swap_cache(page);
+	}

 	release_pages(pages, nr);
 }
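The release_pages() change above boils down to dropping several folio references with one atomic operation instead of one put per subpage. A tiny userspace sketch of that semantic (illustrative only; the demo_* names are invented, and the real kernel helper being mimicked is folio_ref_sub_and_test()):

    /* demo_ref_sub_and_test.c - userspace illustration only, not kernel code. */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Returns true when the count drops to zero, like folio_ref_sub_and_test(). */
    static bool demo_ref_sub_and_test(atomic_int *refcount, int nr)
    {
            return atomic_fetch_sub(refcount, nr) == nr;
    }

    int main(void)
    {
            atomic_int refcount = 512;      /* e.g. one reference per mapped subpage */
            bool freed;

            /* one batched drop replaces 511 individual put operations */
            freed = demo_ref_sub_and_test(&refcount, 511);
            printf("after dropping 511 refs: %s\n", freed ? "would free" : "still referenced");

            freed = demo_ref_sub_and_test(&refcount, 1);
            printf("after dropping the last ref: %s\n", freed ? "would free" : "still referenced");
            return 0;
    }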
From patchwork Wed Feb 14 20:44:34 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v3 09/10] mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing
Date: Wed, 14 Feb 2024 21:44:34 +0100
Message-ID: <20240214204435.167852-10-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>

In tlb_batch_pages_flush(), we can end up freeing up to 512 pages, or now
up to 256 folio fragments that span more than one page, before we
conditionally reschedule.

It's a pain that we have to handle cond_resched() in
tlb_batch_pages_flush() manually and cannot simply handle it in
release_pages() -- release_pages() can be called from atomic context.
Well, in a perfect world we wouldn't have to make our code more
complicated at all.

With page poisoning and init_on_free, we might now run into soft lockups
when we free a lot of rather large folio fragments, because page freeing
time then depends on the actual memory size we are freeing instead of on
the number of folios that are involved.

In the absolute (unlikely) worst case, on arm64 with 64k pages we will be
able to free up to 256 folio fragments that each span 512 MiB: zeroing out
128 GiB does sound like it might take a while. But instead of ignoring
this unlikely case, let's just handle it.

So, let's teach tlb_batch_pages_flush() that there are some configurations
where page freeing is horribly slow, and let's reschedule more frequently
-- similar to what we did before we had large folio fragments in there.
Avoid yet another loop over all encoded pages in the common case by
handling that separately.
Note that with page poisoning/zeroing, we might now end up freeing only a single folio fragment at a time that might exceed the old 512 pages limit: but if we cannot even free a single MAX_ORDER page on a system without running into soft lockups, something else is already completely bogus. Freeing a PMD-mapped THP would similarly cause trouble. In theory, we might even free 511 order-0 pages + a single MAX_ORDER page, effectively having to zero out 8703 pages on arm64 with 64k, translating to ~544 MiB of memory: however, if 512 MiB doesn't result in soft lockups, 544 MiB is unlikely to result in soft lockups, so we won't care about that for the time being. In the future, we might want to detect if handling cond_resched() is required at all, and just not do any of that with full preemption enabled. Reviewed-by: Ryan Roberts Signed-off-by: David Hildenbrand --- mm/mmu_gather.c | 58 ++++++++++++++++++++++++++++++++++++------------- 1 file changed, 43 insertions(+), 15 deletions(-) diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index d175c0f1e2c8..99b3e9408aa0 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -91,18 +91,21 @@ void tlb_flush_rmaps(struct mmu_gather *tlb, struct vm_area_struct *vma) } #endif -static void tlb_batch_pages_flush(struct mmu_gather *tlb) -{ - struct mmu_gather_batch *batch; +/* + * We might end up freeing a lot of pages. Reschedule on a regular + * basis to avoid soft lockups in configurations without full + * preemption enabled. The magic number of 512 folios seems to work. + */ +#define MAX_NR_FOLIOS_PER_FREE 512 - for (batch = &tlb->local; batch && batch->nr; batch = batch->next) { - struct encoded_page **pages = batch->encoded_pages; +static void __tlb_batch_free_encoded_pages(struct mmu_gather_batch *batch) +{ + struct encoded_page **pages = batch->encoded_pages; + unsigned int nr, nr_pages; - while (batch->nr) { - /* - * limit free batch count when PAGE_SIZE > 4K - */ - unsigned int nr = min(512U, batch->nr); + while (batch->nr) { + if (!page_poisoning_enabled_static() && !want_init_on_free()) { + nr = min(MAX_NR_FOLIOS_PER_FREE, batch->nr); /* * Make sure we cover page + nr_pages, and don't leave @@ -111,14 +114,39 @@ static void tlb_batch_pages_flush(struct mmu_gather *tlb) if (unlikely(encoded_page_flags(pages[nr - 1]) & ENCODED_PAGE_BIT_NR_PAGES_NEXT)) nr++; + } else { + /* + * With page poisoning and init_on_free, the time it + * takes to free memory grows proportionally with the + * actual memory size. Therefore, limit based on the + * actual memory size and not the number of involved + * folios. 
+			 */
+			for (nr = 0, nr_pages = 0;
+			     nr < batch->nr && nr_pages < MAX_NR_FOLIOS_PER_FREE;
+			     nr++) {
+				if (unlikely(encoded_page_flags(pages[nr]) &
+					     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
+					nr_pages += encoded_nr_pages(pages[++nr]);
+				else
+					nr_pages++;
+			}
+		}

-			free_pages_and_swap_cache(pages, nr);
-			pages += nr;
-			batch->nr -= nr;
+		free_pages_and_swap_cache(pages, nr);
+		pages += nr;
+		batch->nr -= nr;

-			cond_resched();
-		}
+		cond_resched();
 	}
+}
+
+static void tlb_batch_pages_flush(struct mmu_gather *tlb)
+{
+	struct mmu_gather_batch *batch;
+
+	for (batch = &tlb->local; batch && batch->nr; batch = batch->next)
+		__tlb_batch_free_encoded_pages(batch);

 	tlb->active = &tlb->local;
 }
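To make the two batching policies concrete, here is a minimal userspace sketch (not kernel code; the demo_* names are invented) that picks how many batch entries to free per chunk: capped by entry count when freeing is cheap, capped by covered page count when every page has to be poisoned or zeroed:

    /* demo_free_chunking.c - userspace illustration only, not kernel code. */
    #include <stdbool.h>
    #include <stdio.h>

    #define DEMO_MAX_PER_FREE 512u

    struct demo_entry {
            unsigned int nr_pages;  /* pages covered by this folio fragment */
    };

    static unsigned int demo_chunk(const struct demo_entry *e, unsigned int n,
                                   bool expensive_free)
    {
            unsigned int i, pages = 0;

            if (!expensive_free)
                    return n < DEMO_MAX_PER_FREE ? n : DEMO_MAX_PER_FREE;

            /* freeing time scales with memory size, so cap the page total */
            for (i = 0; i < n && pages < DEMO_MAX_PER_FREE; i++)
                    pages += e[i].nr_pages;
            return i;
    }

    int main(void)
    {
            struct demo_entry batch[600];

            for (unsigned int i = 0; i < 600; i++)
                    batch[i].nr_pages = (i % 10) ? 1 : 512; /* sprinkle 2 MiB THP fragments */

            printf("cheap freeing:     %u entries per chunk\n",
                   demo_chunk(batch, 600, false));
            printf("expensive freeing: %u entries per chunk\n",
                   demo_chunk(batch, 600, true));
            return 0;
    }

With poisoning or init_on_free enabled, a single 512-page fragment already exhausts the per-chunk budget, so cond_resched() runs much more often -- which is exactly the behavior the patch is after.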
From patchwork Wed Feb 14 20:44:35 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox, Ryan Roberts, Catalin Marinas, Yin Fengwei, Michal Hocko, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra, Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v3 10/10] mm/memory: optimize unmap/zap with PTE-mapped THP
Date: Wed, 14 Feb 2024 21:44:35 +0100
Message-ID: <20240214204435.167852-11-david@redhat.com>
In-Reply-To: <20240214204435.167852-1-david@redhat.com>
References: <20240214204435.167852-1-david@redhat.com>
Similar to how we optimized fork(), let's implement PTE batching when
consecutive (present) PTEs map consecutive pages of the same large folio.

Most infrastructure we need for batching (mmu gather, rmap) is already
there. We only have to add get_and_clear_full_ptes() and
clear_full_ptes(). Similarly, extend zap_install_uffd_wp_if_needed() to
process a PTE range.

We won't bother sanity-checking the mapcount of all subpages, but only
check the mapcount of the first subpage we process. If there is a real
problem hiding somewhere, we can trigger it simply by using small folios,
or when we zap single pages of a large folio. Ideally, we'd have that
check in rmap code (including for delayed rmap), but then we cannot print
the PTE. Let's keep it simple for now. If we ever have a cheap
folio_mapcount(), we might just want to check for underflows there.

To keep small folios as fast as possible, force inlining of a specialized
variant using __always_inline with nr=1.

Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
 include/linux/pgtable.h | 70 +++++++++++++++++++++++++++++++
 mm/memory.c             | 92 +++++++++++++++++++++++++++++------------
 2 files changed, 136 insertions(+), 26 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index aab227e12493..49ab1f73b5c2 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -580,6 +580,76 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 }
 #endif

+#ifndef get_and_clear_full_ptes
+/**
+ * get_and_clear_full_ptes - Clear present PTEs that map consecutive pages of
+ *			     the same folio, collecting dirty/accessed bits.
+ * @mm: Address space the pages are mapped into.
+ * @addr: Address the first page is mapped at.
+ * @ptep: Page table pointer for the first entry.
+ * @nr: Number of entries to clear.
+ * @full: Whether we are clearing a full mm.
+ *
+ * May be overridden by the architecture; otherwise, implemented as a simple
+ * loop over ptep_get_and_clear_full(), merging dirty/accessed bits into the
+ * returned PTE.
+ *
+ * Note that PTE bits in the PTE range besides the PFN can differ. For example,
+ * some PTEs might be write-protected.
+ *
+ * Context: The caller holds the page table lock. The PTEs map consecutive
+ * pages that belong to the same folio. The PTEs are all in the same PMD.
+ */
+static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
+		unsigned long addr, pte_t *ptep, unsigned int nr, int full)
+{
+	pte_t pte, tmp_pte;
+
+	pte = ptep_get_and_clear_full(mm, addr, ptep, full);
+	while (--nr) {
+		ptep++;
+		addr += PAGE_SIZE;
+		tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full);
+		if (pte_dirty(tmp_pte))
+			pte = pte_mkdirty(pte);
+		if (pte_young(tmp_pte))
+			pte = pte_mkyoung(pte);
+	}
+	return pte;
+}
+#endif
+
+#ifndef clear_full_ptes
+/**
+ * clear_full_ptes - Clear present PTEs that map consecutive pages of the same
+ *		     folio.
+ * @mm: Address space the pages are mapped into.
+ * @addr: Address the first page is mapped at.
+ * @ptep: Page table pointer for the first entry.
+ * @nr: Number of entries to clear.
+ * @full: Whether we are clearing a full mm. + * + * May be overridden by the architecture; otherwise, implemented as a simple + * loop over ptep_get_and_clear_full(). + * + * Note that PTE bits in the PTE range besides the PFN can differ. For example, + * some PTEs might be write-protected. + * + * Context: The caller holds the page table lock. The PTEs map consecutive + * pages that belong to the same folio. The PTEs are all in the same PMD. + */ +static inline void clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + for (;;) { + ptep_get_and_clear_full(mm, addr, ptep, full); + if (--nr == 0) + break; + ptep++; + addr += PAGE_SIZE; + } +} +#endif /* * If two threads concurrently fault at the same page, the thread that diff --git a/mm/memory.c b/mm/memory.c index a3efc4da258a..3b8e56eb08a3 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1515,7 +1515,7 @@ static inline bool zap_drop_file_uffd_wp(struct zap_details *details) */ static inline void zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, - unsigned long addr, pte_t *pte, + unsigned long addr, pte_t *pte, int nr, struct zap_details *details, pte_t pteval) { /* Zap on anonymous always means dropping everything */ @@ -1525,20 +1525,27 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, if (zap_drop_file_uffd_wp(details)) return; - pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); + for (;;) { + /* the PFN in the PTE is irrelevant. */ + pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); + if (--nr == 0) + break; + pte++; + addr += PAGE_SIZE; + } } -static inline void zap_present_folio_pte(struct mmu_gather *tlb, +static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, struct folio *folio, - struct page *page, pte_t *pte, pte_t ptent, unsigned long addr, - struct zap_details *details, int *rss, bool *force_flush, - bool *force_break) + struct page *page, pte_t *pte, pte_t ptent, unsigned int nr, + unsigned long addr, struct zap_details *details, int *rss, + bool *force_flush, bool *force_break) { struct mm_struct *mm = tlb->mm; bool delay_rmap = false; if (!folio_test_anon(folio)) { - ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); + ptent = get_and_clear_full_ptes(mm, addr, pte, nr, tlb->fullmm); if (pte_dirty(ptent)) { folio_mark_dirty(folio); if (tlb_delay_rmap(tlb)) { @@ -1548,36 +1555,49 @@ static inline void zap_present_folio_pte(struct mmu_gather *tlb, } if (pte_young(ptent) && likely(vma_has_recency(vma))) folio_mark_accessed(folio); - rss[mm_counter(folio)]--; + rss[mm_counter(folio)] -= nr; } else { /* We don't need up-to-date accessed/dirty bits. */ - ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); - rss[MM_ANONPAGES]--; + clear_full_ptes(mm, addr, pte, nr, tlb->fullmm); + rss[MM_ANONPAGES] -= nr; } + /* Checking a single PTE in a batch is sufficient. */ arch_check_zapped_pte(vma, ptent); - tlb_remove_tlb_entry(tlb, pte, addr); + tlb_remove_tlb_entries(tlb, pte, nr, addr); if (unlikely(userfaultfd_pte_wp(vma, ptent))) - zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); + zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, + ptent); if (!delay_rmap) { - folio_remove_rmap_pte(folio, page, vma); + folio_remove_rmap_ptes(folio, page, nr, vma); + + /* Only sanity-check the first page in a batch. 
*/ if (unlikely(page_mapcount(page) < 0)) print_bad_pte(vma, addr, ptent, page); } - if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) { + if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) { *force_flush = true; *force_break = true; } } -static inline void zap_present_pte(struct mmu_gather *tlb, +/* + * Zap or skip at least one present PTE, trying to batch-process subsequent + * PTEs that map consecutive pages of the same folio. + * + * Returns the number of processed (skipped or zapped) PTEs (at least 1). + */ +static inline int zap_present_ptes(struct mmu_gather *tlb, struct vm_area_struct *vma, pte_t *pte, pte_t ptent, - unsigned long addr, struct zap_details *details, - int *rss, bool *force_flush, bool *force_break) + unsigned int max_nr, unsigned long addr, + struct zap_details *details, int *rss, bool *force_flush, + bool *force_break) { + const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY; struct mm_struct *mm = tlb->mm; struct folio *folio; struct page *page; + int nr; page = vm_normal_page(vma, addr, ptent); if (!page) { @@ -1587,14 +1607,29 @@ static inline void zap_present_pte(struct mmu_gather *tlb, tlb_remove_tlb_entry(tlb, pte, addr); VM_WARN_ON_ONCE(userfaultfd_wp(vma)); ksm_might_unmap_zero_page(mm, ptent); - return; + return 1; } folio = page_folio(page); if (unlikely(!should_zap_folio(details, folio))) - return; - zap_present_folio_pte(tlb, vma, folio, page, pte, ptent, addr, details, - rss, force_flush, force_break); + return 1; + + /* + * Make sure that the common "small folio" case is as fast as possible + * by keeping the batching logic separate. + */ + if (unlikely(folio_test_large(folio) && max_nr != 1)) { + nr = folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, + NULL); + + zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr, + addr, details, rss, force_flush, + force_break); + return nr; + } + zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, 1, addr, + details, rss, force_flush, force_break); + return 1; } static unsigned long zap_pte_range(struct mmu_gather *tlb, @@ -1609,6 +1644,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t *start_pte; pte_t *pte; swp_entry_t entry; + int nr; tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); @@ -1622,7 +1658,9 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t ptent = ptep_get(pte); struct folio *folio; struct page *page; + int max_nr; + nr = 1; if (pte_none(ptent)) continue; @@ -1630,10 +1668,12 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, break; if (pte_present(ptent)) { - zap_present_pte(tlb, vma, pte, ptent, addr, details, - rss, &force_flush, &force_break); + max_nr = (end - addr) / PAGE_SIZE; + nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr, + addr, details, rss, &force_flush, + &force_break); if (unlikely(force_break)) { - addr += PAGE_SIZE; + addr += nr * PAGE_SIZE; break; } continue; @@ -1687,8 +1727,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, WARN_ON_ONCE(1); } pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); - zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); - } while (pte++, addr += PAGE_SIZE, addr != end); + zap_install_uffd_wp_if_needed(vma, addr, pte, 1, details, ptent); + } while (pte += nr, addr += PAGE_SIZE * nr, addr != end); add_mm_rss_vec(mm, rss); arch_leave_lazy_mmu_mode();