From patchwork Fri Sep 13 09:19:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13803183
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, baohua@kernel.org,
 hughd@google.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 baolin.wang@linux.alibaba.com, gshan@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Dev Jain
Subject: [PATCH] mm: Compute mTHP order efficiently
Date: Fri, 13 Sep 2024 14:49:02 +0530
Message-Id: <20240913091902.1160520-1-dev.jain@arm.com>
X-Mailer: git-send-email 2.34.1

We use pte_range_none() to determine whether contiguous PTEs are empty
for an mTHP allocation. Instead of rescanning the whole range in the
while loop for every order, remember the first set PTE found in the
previous iteration and use it to skip scans whose result is already
known. The key to understanding the correctness of the patch is that the
ranges we want to examine form a strictly decreasing sequence of nested
intervals.

Suggested-by: Ryan Roberts
Signed-off-by: Dev Jain
---
 mm/memory.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 3c01d68065be..ffc24a48ef15 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4409,26 +4409,27 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	return ret;
 }
 
-static bool pte_range_none(pte_t *pte, int nr_pages)
+static int pte_range_none(pte_t *pte, int nr_pages)
 {
 	int i;
 
 	for (i = 0; i < nr_pages; i++) {
 		if (!pte_none(ptep_get_lockless(pte + i)))
-			return false;
+			return i;
 	}
 
-	return true;
+	return nr_pages;
 }
 
 static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pte_t *first_set_pte = NULL, *align_pte, *pte;
 	unsigned long orders;
 	struct folio *folio;
 	unsigned long addr;
-	pte_t *pte;
+	int max_empty;
 	gfp_t gfp;
 	int order;
 
@@ -4463,8 +4464,23 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	order = highest_order(orders);
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
-		if (pte_range_none(pte + pte_index(addr), 1 << order))
+		align_pte = pte + pte_index(addr);
+
+		/* Range to be scanned known to be empty */
+		if (align_pte + (1 << order) <= first_set_pte)
 			break;
+
+		/* Range to be scanned contains first_set_pte */
+		if (align_pte <= first_set_pte)
+			goto repeat;
+
+		/* align_pte > first_set_pte, so need to check properly */
+		max_empty = pte_range_none(align_pte, 1 << order);
+		if (max_empty == 1 << order)
+			break;
+
+		first_set_pte = align_pte + max_empty;
+repeat:
 		order = next_order(&orders, order);
 	}
 
@@ -4579,7 +4595,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (nr_pages == 1 && vmf_pte_changed(vmf)) {
 		update_mmu_tlb(vma, addr, vmf->pte);
 		goto release;
-	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
+	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
 		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
 		goto release;
 	}
@@ -4915,7 +4931,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 		update_mmu_tlb(vma, addr, vmf->pte);
 		ret = VM_FAULT_NOPAGE;
 		goto unlock;
-	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
+	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
 		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
 		ret = VM_FAULT_NOPAGE;
 		goto unlock;
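
For readers who want to convince themselves of the nested-interval
argument, here is a rough userspace sketch of the order-selection loop.
It is not kernel code: the PTE table is modelled as a bool array, table
indices stand in for pte_t pointers, a have_first_set flag replaces the
NULL sentinel, and the names pte_set, range_none(), pick_order_rescan(),
pick_order_cached() and reset_table() are made up for illustration. The
512-entry table size is an assumption as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_PTES 512			/* assumed size of one PTE table */

static bool pte_set[NR_PTES];		/* toy table: true = populated entry */
static unsigned long entries_scanned;	/* entries touched by range_none() */

static void reset_table(void)
{
	memset(pte_set, 0, sizeof(pte_set));
	entries_scanned = 0;
}

/* Like the patched pte_range_none(): offset of first set entry, else nr. */
static size_t range_none(size_t start, size_t nr)
{
	assert(start + nr <= NR_PTES);
	for (size_t i = 0; i < nr; i++) {
		entries_scanned++;
		if (pte_set[start + i])
			return i;
	}
	return nr;
}

/* Highest set bit of the orders bitmask, or -1 if the mask is empty. */
static int highest_order(unsigned long orders)
{
	int order = -1;

	while (orders) {
		order++;
		orders >>= 1;
	}
	return order;
}

/* Old behaviour: scan the aligned range from scratch for every order. */
static int pick_order_rescan(unsigned long orders, size_t fault_idx)
{
	int order = highest_order(orders);

	while (orders) {
		size_t nr = (size_t)1 << order;

		if (range_none(fault_idx & ~(nr - 1), nr) == nr)
			return order;
		orders &= ~(1UL << order);
		order = highest_order(orders);
	}
	return -1;	/* no enabled order has an empty range */
}

/* Patched behaviour: reuse the first set entry seen by an earlier scan. */
static int pick_order_cached(unsigned long orders, size_t fault_idx)
{
	bool have_first_set = false;
	size_t first_set = 0, start, nr, empty;
	int order = highest_order(orders);

	while (orders) {
		nr = (size_t)1 << order;
		start = fault_idx & ~(nr - 1);

		/* Range ends at or before the known set entry: it is empty. */
		if (have_first_set && start + nr <= first_set)
			return order;
		/* Range contains the known set entry: it cannot be empty. */
		if (have_first_set && start <= first_set)
			goto next;
		/* Range starts past the known set entry: scan it. */
		empty = range_none(start, nr);
		if (empty == nr)
			return order;
		first_set = start + empty;
		have_first_set = true;
next:
		orders &= ~(1UL << order);
		order = highest_order(orders);
	}
	return -1;
}
```

As an example, with only entry 40 populated, a fault at index 37 and
orders 4, 3 and 2 enabled (mask 0x1c), both functions pick order 3. The
rescanning variant touches 17 entries, while the cached variant touches
9, because the order-3 range [32, 40) ends exactly at the set entry
found by the order-4 scan and needs no second pass.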