From patchwork Tue Mar 4 15:04:31 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 01/11] arm64: hugetlb: Cleanup huge_pte size discovery mechanisms
Date: Tue, 4 Mar 2025 15:04:31 +0000
Message-ID: <20250304150444.3788920-2-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Not all huge_pte helper APIs explicitly provide the size of the
huge_pte, so the helpers have to use various methods to determine it.
Some of these methods are dubious. Let's clean up the code to use the
preferred methods and retire the dubious ones. The options, in order of
preference:

- If the size is provided as a parameter, use it together with
  num_contig_ptes(). This is explicit and works for both present and
  non-present ptes.

- If a vma is provided as a parameter, retrieve the size via
  huge_page_size(hstate_vma(vma)) and use it together with
  num_contig_ptes(). This is explicit and works for both present and
  non-present ptes.

- If the pte is present and contiguous, use find_num_contig() to walk
  the pgtable, find the level, and infer the number of ptes from that
  level. This only works for *present* ptes.

- If the pte is present and not contiguous, infer that only 1 pte needs
  to be operated on. This is ok if you don't care about the absolute
  size and just want to know the number of ptes.

- NEVER rely on resolving the PFN of a present pte to a folio and using
  the folio's size. This is fragile at best, because nothing stops the
  core-mm from allocating a folio twice as big as the huge_pte and then
  mapping it across 2 consecutive huge_ptes, or just partially mapping
  it.

Where we require that the pte is present, add warnings if it is not.
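For reference, both of the preferred options above funnel through
num_contig_ptes(). A simplified sketch of that helper, modelled on the
existing arm64 implementation in arch/arm64/mm/hugetlbpage.c
(illustrative only; details such as the pud_sect_supported() check are
omitted and may differ):

	static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
	{
		int contig_ptes = 0;

		*pgsize = size;

		switch (size) {
		case PUD_SIZE:		/* a single pud block mapping */
		case PMD_SIZE:		/* a single pmd block mapping */
			contig_ptes = 1;
			break;
		case CONT_PMD_SIZE:	/* CONT_PMDS pmd entries, each covering PMD_SIZE */
			*pgsize = PMD_SIZE;
			contig_ptes = CONT_PMDS;
			break;
		case CONT_PTE_SIZE:	/* CONT_PTES pte entries, each covering PAGE_SIZE */
			*pgsize = PAGE_SIZE;
			contig_ptes = CONT_PTES;
			break;
		}

		return contig_ptes;
	}

Given the size, it returns both the number of hardware entries to
operate on and the size each entry covers, and it works regardless of
whether the pte is present.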

Signed-off-by: Ryan Roberts
---
 arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index b3a7fafe8892..6a2af9fb2566 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -129,7 +129,7 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
 		return orig_pte;
 
-	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
+	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
 	for (i = 0; i < ncontig; i++, ptep++) {
 		pte_t pte = __ptep_get(ptep);
 
@@ -438,16 +438,19 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	pgprot_t hugeprot;
 	pte_t orig_pte;
 
+	VM_WARN_ON(!pte_present(pte));
+
 	if (!pte_cont(pte))
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);
 
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	dpfn = pgsize >> PAGE_SHIFT;
 
 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;
 
 	orig_pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
+	VM_WARN_ON(!pte_present(orig_pte));
 
 	/* Make sure we don't lose the dirty or young state */
 	if (pte_dirty(orig_pte))
@@ -472,7 +475,10 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	size_t pgsize;
 	pte_t pte;
 
-	if (!pte_cont(__ptep_get(ptep))) {
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
+
+	if (!pte_cont(pte)) {
 		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
@@ -496,11 +502,15 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	size_t pgsize;
 	int ncontig;
+	pte_t pte;
+
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
 
-	if (!pte_cont(__ptep_get(ptep)))
+	if (!pte_cont(pte))
 		return ptep_clear_flush(vma, addr, ptep);
 
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 }

From patchwork Tue Mar 4 15:04:32 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 02/11] arm64: hugetlb: Refine tlb maintenance scope
Date: Tue, 4 Mar 2025 15:04:32 +0000
Message-ID: <20250304150444.3788920-3-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

When operating on contiguous blocks of ptes (or pmds) for some hugetlb
sizes, we must honour break-before-make requirements: clear the block
down to invalid state in the pgtable and invalidate the relevant tlb
entries before making the pgtable entries valid again.

However, the tlb maintenance is currently always done assuming the
worst case stride (PAGE_SIZE), last_level (false) and tlb_level
(TLBI_TTL_UNKNOWN). We can do much better with hinting: in reality we
know the stride from the huge_pte pgsize, we are always operating only
on the last level, and we always know the tlb_level, again based on
pgsize. So let's start providing these hints.

Additionally, avoid tlb maintenance in set_huge_pte_at().
Break-before-make is only required if we are transitioning the
contiguous pte block from valid -> valid. So let's elide the
clear-and-flush ("break") if the pte range was previously invalid.
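As a rough sketch of the sequence being discussed (simplified and
illustrative only; the real code is in the helpers touched by the diff
below), updating an already-valid contiguous block looks like:

	/* "break": clear every entry in the contiguous block... */
	for (i = 0; i < ncontig; i++)
		__ptep_get_and_clear(mm, addr + i * pgsize, ptep + i);

	/* ...and invalidate the TLB, now with the known stride and level. */
	__flush_hugetlb_tlb_range(&vma, addr, addr + ncontig * pgsize,
				  pgsize, /* last_level = */ true);

	/* "make": write the new, valid entries. */
	for (i = 0; i < ncontig; i++, pfn += pgsize >> PAGE_SHIFT)
		__set_ptes(mm, addr + i * pgsize, ptep + i, pfn_pte(pfn, prot), 1);

If the old entries were invalid, nothing can be cached in the TLB for
them, so the "break" half can be skipped entirely, which is what the
set_huge_pte_at() change below does.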

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/hugetlb.h | 29 +++++++++++++++++++----------
 arch/arm64/mm/hugetlbpage.c      |  9 ++++++---
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 07fbf5bf85a7..2a8155c4a882 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -69,29 +69,38 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #include <asm-generic/hugetlb.h>
 
-#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
-static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
-					   unsigned long start,
-					   unsigned long end)
+static inline void __flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end,
+					     unsigned long stride,
+					     bool last_level)
 {
-	unsigned long stride = huge_page_size(hstate_vma(vma));
-
 	switch (stride) {
 #ifndef __PAGETABLE_PMD_FOLDED
 	case PUD_SIZE:
-		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		__flush_tlb_range(vma, start, end, PUD_SIZE, last_level, 1);
 		break;
 #endif
 	case CONT_PMD_SIZE:
 	case PMD_SIZE:
-		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		__flush_tlb_range(vma, start, end, PMD_SIZE, last_level, 2);
 		break;
 	case CONT_PTE_SIZE:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, 3);
 		break;
 	default:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, TLBI_TTL_UNKNOWN);
 	}
 }
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	__flush_hugetlb_tlb_range(vma, start, end, stride, false);
+}
+
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 6a2af9fb2566..065be8650aa5 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -183,8 +183,9 @@ static pte_t get_clear_contig_flush(struct mm_struct *mm,
 {
 	pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long end = addr + (pgsize * ncontig);
 
-	flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
+	__flush_hugetlb_tlb_range(&vma, addr, end, pgsize, true);
 	return orig_pte;
 }
 
@@ -209,7 +210,7 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		__ptep_get_and_clear(mm, addr, ptep);
 
-	flush_tlb_range(&vma, saddr, addr);
+	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
@@ -238,7 +239,9 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	dpfn = pgsize >> PAGE_SHIFT;
 	hugeprot = pte_pgprot(pte);
 
-	clear_flush(mm, addr, ptep, pgsize, ncontig);
+	/* Only need to "break" if transitioning valid -> valid. */
+	if (pte_valid(__ptep_get(ptep)))
+		clear_flush(mm, addr, ptep, pgsize, ncontig);
 
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
 		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

From patchwork Tue Mar 4 15:04:33 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 03/11] mm/page_table_check: Batch-check pmds/puds just like ptes
Date: Tue, 4 Mar 2025 15:04:33 +0000
Message-ID: <20250304150444.3788920-4-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Convert page_table_check_p[mu]d_set(...) to
page_table_check_p[mu]ds_set(..., nr) to allow checking a contiguous
set of pmds/puds in a single batch. We retain
page_table_check_p[mu]d_set(...) as macros that call the new batch
functions with nr=1 for compatibility.

arm64 is about to reorganise its pte/pmd/pud helpers to reuse more code
and to allow the implementation for huge_pte to more efficiently set
ptes/pmds/puds in batches. We need these batch-helpers to make the
refactoring possible.
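Illustrative use of the batched form at a hypothetical call site
(CONT_PMDS standing in for the number of pmds in an arm64 contiguous
block):

	/* New: validate nr consecutive pmd entries in one call. */
	page_table_check_pmds_set(mm, pmdp, pmd, CONT_PMDS);

Single-entry callers keep compiling because the old name is retained as
a macro wrapper (taken from the diff below):

	#define page_table_check_pmd_set(mm, pmdp, pmd) \
		page_table_check_pmds_set(mm, pmdp, pmd, 1)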
Reviewed-by: Anshuman Khandual
Signed-off-by: Ryan Roberts
---
 include/linux/page_table_check.h | 30 +++++++++++++++++-----------
 mm/page_table_check.c            | 34 +++++++++++++++++++-------------
 2 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h
index 6722941c7cb8..289620d4aad3 100644
--- a/include/linux/page_table_check.h
+++ b/include/linux/page_table_check.h
@@ -19,8 +19,10 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd);
 void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud);
 void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 				 unsigned int nr);
-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd);
-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud);
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr);
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr);
 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr,
 					pmd_t pmd);
@@ -74,22 +76,22 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm,
 	__page_table_check_ptes_set(mm, ptep, pte, nr);
 }
 
-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+		pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;
 
-	__page_table_check_pmd_set(mm, pmdp, pmd);
+	__page_table_check_pmds_set(mm, pmdp, pmd, nr);
 }
 
-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+		pud_t *pudp, pud_t pud, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;
 
-	__page_table_check_pud_set(mm, pudp, pud);
+	__page_table_check_puds_set(mm, pudp, pud, nr);
 }
 
 static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
@@ -129,13 +131,13 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm,
 {
 }
 
-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+		pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 }
 
-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+		pud_t *pudp, pud_t pud, unsigned int nr)
 {
 }
 
@@ -146,4 +148,8 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
 }
 
 #endif /* CONFIG_PAGE_TABLE_CHECK */
+
+#define page_table_check_pmd_set(mm, pmdp, pmd)	page_table_check_pmds_set(mm, pmdp, pmd, 1)
+#define page_table_check_pud_set(mm, pudp, pud)	page_table_check_puds_set(mm, pudp, pud, 1)
+
 #endif /* __LINUX_PAGE_TABLE_CHECK_H */
diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 509c6ef8de40..dae4a7d776b3 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -234,33 +234,39 @@ static inline void page_table_check_pmd_flags(pmd_t pmd)
 		WARN_ON_ONCE(swap_cached_writable(pmd_to_swp_entry(pmd)));
 }
 
-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd)
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr)
 {
+	unsigned int i;
+	unsigned long stride = PMD_SIZE >> PAGE_SHIFT;
+
 	if (&init_mm == mm)
 		return;
 
 	page_table_check_pmd_flags(pmd);
 
-	__page_table_check_pmd_clear(mm, *pmdp);
-	if (pmd_user_accessible_page(pmd)) {
-		page_table_check_set(pmd_pfn(pmd), PMD_SIZE >> PAGE_SHIFT,
-				     pmd_write(pmd));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pmd_clear(mm, *(pmdp + i));
+	if (pmd_user_accessible_page(pmd))
+		page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd));
 }
-EXPORT_SYMBOL(__page_table_check_pmd_set);
+EXPORT_SYMBOL(__page_table_check_pmds_set);
 
-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud)
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr)
 {
+	unsigned int i;
+	unsigned long stride = PUD_SIZE >> PAGE_SHIFT;
+
 	if (&init_mm == mm)
 		return;
 
-	__page_table_check_pud_clear(mm, *pudp);
-	if (pud_user_accessible_page(pud)) {
-		page_table_check_set(pud_pfn(pud), PUD_SIZE >> PAGE_SHIFT,
-				     pud_write(pud));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pud_clear(mm, *(pudp + i));
+	if (pud_user_accessible_page(pud))
+		page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud));
 }
-EXPORT_SYMBOL(__page_table_check_pud_set);
+EXPORT_SYMBOL(__page_table_check_puds_set);
 
 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr,

From patchwork Tue Mar 4 15:04:34 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 04/11] arm64/mm: Refactor __set_ptes() and __ptep_get_and_clear()
Date: Tue, 4 Mar 2025 15:04:34 +0000
Message-ID: <20250304150444.3788920-5-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Refactor __set_ptes(), set_pmd_at() and set_pud_at() so that they are
all thin wrappers around a new common set_ptes_anysz(), which takes a
pgsize parameter. Additionally, refactor __ptep_get_and_clear() and
pmdp_huge_get_and_clear() to use a new common ptep_get_and_clear_anysz()
which also takes a pgsize parameter.

These changes will permit the huge_pte API to efficiently batch-set
pgtable entries and take advantage of the future barrier optimizations.
Additionally, since the new *_anysz() helpers call the correct
page_table_check_*_set() API based on pgsize, huge_ptes will now get
proper coverage. Currently the huge_pte API always uses the pte API,
which assumes an entry only covers a single page.
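The following is illustrative only (not code from the patch): with the
new helpers, callers select the granularity with the pgsize argument,
which picks the matching page_table_check variant and the stride by
which the pfn advances between entries.

	/* One order-0 pte: equivalent to __set_ptes(mm, addr, ptep, pte, 1). */
	set_ptes_anysz(mm, ptep, pte, 1, PAGE_SIZE);

	/* One pmd block mapping: what set_pmd_at() now boils down to. */
	set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), 1, PMD_SIZE);

	/* A batch of 16 contiguous pmd entries, e.g. a contpmd hugetlb mapping. */
	set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), 16, PMD_SIZE);

	/* Clear a pmd block mapping with pmd-level page_table_check accounting. */
	pmd = pte_pmd(ptep_get_and_clear_anysz(mm, (pte_t *)pmdp, PMD_SIZE));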

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 108 +++++++++++++++++++------------
 1 file changed, 67 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b2a2ad1b9e8..e255a36380dc 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -420,23 +420,6 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
 	return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte));
 }
 
-static inline void __set_ptes(struct mm_struct *mm,
-			      unsigned long __always_unused addr,
-			      pte_t *ptep, pte_t pte, unsigned int nr)
-{
-	page_table_check_ptes_set(mm, ptep, pte, nr);
-	__sync_cache_and_tags(pte, nr);
-
-	for (;;) {
-		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
-		if (--nr == 0)
-			break;
-		ptep++;
-		pte = pte_advance_pfn(pte, 1);
-	}
-}
-
 /*
  * Hugetlb definitions.
  */
@@ -641,30 +624,59 @@ static inline pgprot_t pud_pgprot(pud_t pud)
 	return __pgprot(pud_val(pfn_pud(pfn, __pgprot(0))) ^ pud_val(pud));
 }
 
-static inline void __set_pte_at(struct mm_struct *mm,
-				unsigned long __always_unused addr,
-				pte_t *ptep, pte_t pte, unsigned int nr)
+static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+				  unsigned int nr, unsigned long pgsize)
 {
-	__sync_cache_and_tags(pte, nr);
-	__check_safe_pte_update(mm, ptep, pte);
-	__set_pte(ptep, pte);
+	unsigned long stride = pgsize >> PAGE_SHIFT;
+
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_ptes_set(mm, ptep, pte, nr);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmds_set(mm, (pmd_t *)ptep, pte_pmd(pte), nr);
+		break;
+	case PUD_SIZE:
+		page_table_check_puds_set(mm, (pud_t *)ptep, pte_pud(pte), nr);
+		break;
+	default:
+		VM_WARN_ON(1);
+	}
+
+	__sync_cache_and_tags(pte, nr * stride);
+
+	for (;;) {
+		__check_safe_pte_update(mm, ptep, pte);
+		__set_pte(ptep, pte);
+		if (--nr == 0)
+			break;
+		ptep++;
+		pte = pte_advance_pfn(pte, stride);
+	}
 }
 
-static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-			      pmd_t *pmdp, pmd_t pmd)
+static inline void __set_ptes(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pte_t *ptep, pte_t pte, unsigned int nr)
 {
-	page_table_check_pmd_set(mm, pmdp, pmd);
-	return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd),
-						PMD_SIZE >> PAGE_SHIFT);
+	set_ptes_anysz(mm, ptep, pte, nr, PAGE_SIZE);
 }
 
-static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
-			      pud_t *pudp, pud_t pud)
+static inline void __set_pmds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pmd_t *pmdp, pmd_t pmd, unsigned int nr)
+{
+	set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE);
+}
+#define set_pmd_at(mm, addr, pmdp, pmd) __set_pmds(mm, addr, pmdp, pmd, 1)
+
+static inline void __set_puds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pud_t *pudp, pud_t pud, unsigned int nr)
 {
-	page_table_check_pud_set(mm, pudp, pud);
-	return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud),
-						PUD_SIZE >> PAGE_SHIFT);
+	set_ptes_anysz(mm, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE);
 }
+#define set_pud_at(mm, addr, pudp, pud) __set_puds(mm, addr, pudp, pud, 1)
 
 #define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
 #define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
@@ -1276,16 +1288,34 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
 
-static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
-				       unsigned long address, pte_t *ptep)
+static inline pte_t ptep_get_and_clear_anysz(struct mm_struct *mm, pte_t *ptep,
+					     unsigned long pgsize)
 {
 	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
 
-	page_table_check_pte_clear(mm, pte);
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_pte_clear(mm, pte);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmd_clear(mm, pte_pmd(pte));
+		break;
+	case PUD_SIZE:
+		page_table_check_pud_clear(mm, pte_pud(pte));
+		break;
+	default:
+		VM_WARN_ON(1);
+	}
 
 	return pte;
 }
 
+static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
+					 unsigned long address, pte_t *ptep)
+{
+	return ptep_get_and_clear_anysz(mm, ptep, PAGE_SIZE);
+}
+
 static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, unsigned int nr, int full)
 {
@@ -1322,11 +1352,7 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
-
-	page_table_check_pmd_clear(mm, pmd);
-
-	return pmd;
+	return pte_pmd(ptep_get_and_clear_anysz(mm, (pte_t *)pmdp, PMD_SIZE));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

From patchwork Tue Mar 4 15:04:35 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 05/11] arm64: hugetlb: Use set_ptes_anysz() and ptep_get_and_clear_anysz()
Date: Tue, 4 Mar 2025 15:04:35 +0000
Message-ID: <20250304150444.3788920-6-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Refactor the huge_pte helpers to use the new common set_ptes_anysz()
and ptep_get_and_clear_anysz() APIs.

This provides two benefits. First, when page_table_check=on, hugetlb is
now properly and fully checked; previously only the first page of a
hugetlb folio was checked. Second, instead of having to call
__set_ptes(nr=1) for each pte in a loop, the whole contiguous batch can
now be set in one go, which enables some efficiencies and cleans up the
code.

One detail to note is that huge_ptep_clear_flush() was previously
calling ptep_clear_flush() for a non-contiguous pte (i.e. a pud or pmd
block mapping). This has a couple of disadvantages. First,
ptep_clear_flush() calls ptep_get_and_clear(), which transparently
handles contpte; given we only call it for non-contiguous ptes, it
would be safe, but a waste of effort, and it's preferable to go
straight to the layer below. More problematic, however, is that
ptep_get_and_clear() is for PAGE_SIZE entries, so it calls
page_table_check_pte_clear() and would not clear the whole hugetlb
folio. So let's stop special-casing the non-cont case and just rely on
get_clear_contig_flush() to do the right thing for non-cont entries.
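Reduced to a before/after sketch (simplified from the diff below), the
conversion replaces the per-pte loop with one batched call:

	/* Before: one __set_ptes() call per entry, advancing the pfn by hand. */
	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

	/* After: one batched call; page_table_check now sees the whole range. */
	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);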

Signed-off-by: Ryan Roberts
---
 arch/arm64/mm/hugetlbpage.c | 52 +++++++------------------------------
 1 file changed, 10 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 065be8650aa5..efd18bd1eae3 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -159,12 +159,12 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 	pte_t pte, tmp_pte;
 	bool present;
 
-	pte = __ptep_get_and_clear(mm, addr, ptep);
+	pte = ptep_get_and_clear_anysz(mm, ptep, pgsize);
 	present = pte_present(pte);
 	while (--ncontig) {
 		ptep++;
 		addr += pgsize;
-		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		tmp_pte = ptep_get_and_clear_anysz(mm, ptep, pgsize);
 		if (present) {
 			if (pte_dirty(tmp_pte))
 				pte = pte_mkdirty(pte);
@@ -208,7 +208,7 @@ static void clear_flush(struct mm_struct *mm,
 	unsigned long i, saddr = addr;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		__ptep_get_and_clear(mm, addr, ptep);
+		ptep_get_and_clear_anysz(mm, ptep, pgsize);
 
 	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
@@ -219,32 +219,20 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	size_t pgsize;
 	int i;
 	int ncontig;
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
 
 	ncontig = num_contig_ptes(sz, &pgsize);
 
 	if (!pte_present(pte)) {
 		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
-			__set_ptes(mm, addr, ptep, pte, 1);
+			set_ptes_anysz(mm, ptep, pte, 1, pgsize);
 		return;
 	}
 
-	if (!pte_cont(pte)) {
-		__set_ptes(mm, addr, ptep, pte, 1);
-		return;
-	}
-
-	pfn = pte_pfn(pte);
-	dpfn = pgsize >> PAGE_SHIFT;
-	hugeprot = pte_pgprot(pte);
-
 	/* Only need to "break" if transitioning valid -> valid. */
-	if (pte_valid(__ptep_get(ptep)))
+	if (pte_cont(pte) && pte_valid(__ptep_get(ptep)))
 		clear_flush(mm, addr, ptep, pgsize, ncontig);
 
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -434,11 +422,9 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 			       unsigned long addr, pte_t *ptep,
 			       pte_t pte, int dirty)
 {
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize = 0;
-	unsigned long pfn = pte_pfn(pte), dpfn;
 	struct mm_struct *mm = vma->vm_mm;
-	pgprot_t hugeprot;
 	pte_t orig_pte;
 
 	VM_WARN_ON(!pte_present(pte));
@@ -447,7 +433,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);
 
 	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
 
 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;
@@ -462,19 +447,14 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	if (pte_young(orig_pte))
 		pte = pte_mkyoung(pte);
 
-	hugeprot = pte_pgprot(pte);
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
-
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 	return 1;
 }
 
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
 			     unsigned long addr, pte_t *ptep)
 {
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize;
 	pte_t pte;
 
@@ -487,16 +467,11 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	}
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
 
 	pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 	pte = pte_wrprotect(pte);
 
-	hugeprot = pte_pgprot(pte);
-	pfn = pte_pfn(pte);
-
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
@@ -505,13 +480,6 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	size_t pgsize;
 	int ncontig;
-	pte_t pte;
-
-	pte = __ptep_get(ptep);
-	VM_WARN_ON(!pte_present(pte));
-
-	if (!pte_cont(pte))
-		return ptep_clear_flush(vma, addr, ptep);
 
 	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);

From patchwork Tue Mar 4 15:04:36 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 06/11] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
Date: Tue, 4 Mar 2025 15:04:36 +0000
Message-ID: <20250304150444.3788920-7-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

set_ptes_anysz() previously called __set_pte() for each PTE in the
range, which would conditionally issue a DSB and ISB to make the new
PTE value immediately visible to the table walker if the new PTE was
valid and for kernel space.

We can do better than this: hoist those barriers out of the loop so
that they are only issued once at the end of the loop. We then reduce
the cost by the number of PTEs in the range.
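Reduced to a sketch (simplified; the real change is in the diff below),
the write loop goes from one barrier pair per entry to a single pair
per batch:

	/* Before: __set_pte() issues dsb(ishst) + isb() for every valid kernel pte. */
	for (;;) {
		__set_pte(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte = pte_advance_pfn(pte, stride);
	}

	/* After: plain writes in the loop, then one barrier pair for the batch. */
	for (;;) {
		__set_pte_nosync(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte = pte_advance_pfn(pte, stride);
	}
	__set_pte_complete(pte);	/* dsb(ishst) + isb(), only if valid and kernel */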
*/ if (pte_valid_not_user(pte)) { dsb(ishst); @@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte) } } +static inline void __set_pte(pte_t *ptep, pte_t pte) +{ + __set_pte_nosync(ptep, pte); + __set_pte_complete(pte); +} + static inline pte_t __ptep_get(pte_t *ptep) { return READ_ONCE(*ptep); @@ -647,12 +651,14 @@ static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte, for (;;) { __check_safe_pte_update(mm, ptep, pte); - __set_pte(ptep, pte); + __set_pte_nosync(ptep, pte); if (--nr == 0) break; ptep++; pte = pte_advance_pfn(pte, stride); } + + __set_pte_complete(pte); } static inline void __set_ptes(struct mm_struct *mm, From patchwork Tue Mar 4 15:04:37 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 14000871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BA68C021B8 for ; Tue, 4 Mar 2025 15:05:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 121FD6B0096; Tue, 4 Mar 2025 10:05:13 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 082B56B0098; Tue, 4 Mar 2025 10:05:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB3706B0099; Tue, 4 Mar 2025 10:05:12 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id CE9B66B0096 for ; Tue, 4 Mar 2025 10:05:12 -0500 (EST) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 912778020E for ; Tue, 4 Mar 2025 15:05:12 +0000 (UTC) X-FDA: 83184191664.26.9F16001 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf09.hostedemail.com (Postfix) with ESMTP id BCD19140013 for ; Tue, 4 Mar 2025 15:05:10 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf09.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1741100710; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zq+cpplVIbdafYmKh04Zm3BYxIpGtlL4VcJXVKSI9kY=; b=lE/dAzFzDzLe54lphzFEW5+mQVmvuj0gseu9QIXxVdY9Ow9Koiu78oxxy6ktA9bnx0qLSm RRiAF9Bo5SEmfwQmbmdZu5GcN1YrSSFuEG/RZsj8lfLEVpbk/1cj1yIVFsEg07yOepZWOU WCbAYvchSEJu0DueQBTZso8kDByPhLo= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1741100710; a=rsa-sha256; cv=none; b=O7f2s8bcyo2uF/e9A0dwUMyxe+Pc+THoYOp1ePeclClelUnhAcWnFgr0Ddu6xkPD3be0IU z4mzDDDDxN3nrWtN7lmawX1/Ir98UhQ7QTuzX3kdbKRYmsbngBU4JlRLLtvxb5BKPzdAv5 Qo6dTySfy6rJHSaDX+76budve4SLzW4= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf09.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com 
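For clarity, the net effect on set_ptes_anysz() condensed into one place; this
is only an illustrative sketch of the hunks above, with the safety checks
trimmed:

	/* Before: each iteration could pay dsb(ishst) + isb(). */
	for (;;) {
		__set_pte(ptep, pte);		/* WRITE_ONCE, plus barriers for valid kernel ptes */
		if (--nr == 0)
			break;
		ptep++;
		pte = pte_advance_pfn(pte, stride);
	}

	/* After: plain stores in the loop, one barrier sequence at the end. */
	for (;;) {
		__set_pte_nosync(ptep, pte);	/* WRITE_ONCE only */
		if (--nr == 0)
			break;
		ptep++;
		pte = pte_advance_pfn(pte, stride);
	}
	__set_pte_complete(pte);		/* dsb(ishst) + isb(), only for valid kernel ptes */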
From patchwork Tue Mar 4 15:04:37 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 07/11] mm/vmalloc: Warn on improper use of vunmap_range()
Date: Tue, 4 Mar 2025 15:04:37 +0000
Message-ID: <20250304150444.3788920-8-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

A call to vmalloc_huge() may cause memory blocks to be mapped at pmd or pud
level. But it is possible to subsequently call vunmap_range() on a sub-range
of the mapped memory, which partially overlaps a pmd or pud. In this case,
vmalloc unmaps the entire pmd or pud, so the non-overlapping portion is also
unmapped. Clearly that would have a bad outcome, but it's not something that
any callers do today as far as I can tell. So I guess it's just expected
that callers will not do this.

However, it would be useful to know if this happens in future; let's add a
warning to cover the eventuality.
Reviewed-by: Anshuman Khandual
Reviewed-by: Catalin Marinas
Signed-off-by: Ryan Roberts
---
 mm/vmalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a6e7acebe9ad..fcdf67d5177a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -374,8 +374,10 @@ static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		if (cleared || pmd_bad(*pmd))
 			*mask |= PGTBL_PMD_MODIFIED;
 
-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PMD_SIZE);
 			continue;
+		}
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
 		vunmap_pte_range(pmd, addr, next, mask);
@@ -399,8 +401,10 @@ static void vunmap_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		if (cleared || pud_bad(*pud))
 			*mask |= PGTBL_PUD_MODIFIED;
 
-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PUD_SIZE);
 			continue;
+		}
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		vunmap_pmd_range(pud, addr, next, mask);
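As a concrete illustration of the case the warning targets, a hypothetical
caller (not taken from any in-tree user) might do something like this; the
unmap only partially covers a pmd block, so vunmap_pmd_range() clears the
whole pmd and now emits the warning:

	/* Hypothetical misuse, for illustration only. */
	void *p = vmalloc_huge(4 * PMD_SIZE, GFP_KERNEL);	/* may be pmd-mapped */

	/*
	 * Unmap only half of the first pmd block: vmalloc clears the whole
	 * pmd, so the second half is silently unmapped too, and
	 * vunmap_pmd_range() warns because next - addr < PMD_SIZE.
	 */
	vunmap_range((unsigned long)p, (unsigned long)p + PMD_SIZE / 2);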
From patchwork Tue Mar 4 15:04:38 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 08/11] mm/vmalloc: Gracefully unmap huge ptes
Date: Tue, 4 Mar 2025 15:04:38 +0000
Message-ID: <20250304150444.3788920-9-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Commit f7ee1f13d606 ("mm/vmalloc: enable mapping of huge pages at pte level
in vmap") added its support by reusing the set_huge_pte_at() API, which is
otherwise only used for user mappings. But when unmapping those huge ptes,
it continued to call ptep_get_and_clear(), which is a layering violation. To
date, the only arch to implement this support is powerpc and it all happens
to work ok for it.

But arm64's implementation of ptep_get_and_clear() cannot be safely used to
clear a previous set_huge_pte_at(). So let's introduce a new arch opt-in
function, arch_vmap_pte_range_unmap_size(), which can provide the size of a
(present) pte. Then we can call huge_ptep_get_and_clear() to tear it down
properly.
Note that if vunmap_range() is called with a range that starts in the middle
of a huge pte-mapped page, we must unmap the entire huge page so that the
behaviour is consistent with pmd and pud block mappings. In this case emit a
warning, just like we do for pmd/pud mappings.

Reviewed-by: Anshuman Khandual
Reviewed-by: Uladzislau Rezki (Sony)
Reviewed-by: Catalin Marinas
Signed-off-by: Ryan Roberts
---
 include/linux/vmalloc.h |  8 ++++++++
 mm/vmalloc.c            | 18 ++++++++++++++++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 31e9ffd936e3..16dd4cba64f2 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -113,6 +113,14 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, uns
 }
 #endif
 
+#ifndef arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							    pte_t *ptep)
+{
+	return PAGE_SIZE;
+}
+#endif
+
 #ifndef arch_vmap_pte_supported_shift
 static inline int arch_vmap_pte_supported_shift(unsigned long size)
 {
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index fcdf67d5177a..6111ce900ec4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -350,12 +350,26 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			     pgtbl_mod_mask *mask)
 {
 	pte_t *pte;
+	pte_t ptent;
+	unsigned long size = PAGE_SIZE;
 
 	pte = pte_offset_kernel(pmd, addr);
 	do {
-		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
+#ifdef CONFIG_HUGETLB_PAGE
+		size = arch_vmap_pte_range_unmap_size(addr, pte);
+		if (size != PAGE_SIZE) {
+			if (WARN_ON(!IS_ALIGNED(addr, size))) {
+				addr = ALIGN_DOWN(addr, size);
+				pte = PTR_ALIGN_DOWN(pte, sizeof(*pte) * (size >> PAGE_SHIFT));
+			}
+			ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size);
+			if (WARN_ON(end - addr < size))
+				size = end - addr;
+		} else
+#endif
+			ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+	} while (pte += (size >> PAGE_SHIFT), addr += size, addr != end);
 
 	*mask |= PGTBL_PTE_MODIFIED;
 }
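Stripped of the warning and alignment-fixup paths, the new unmap loop boils
down to the following; this is a condensed sketch of the hunk above, not the
literal code:

	do {
		size = arch_vmap_pte_range_unmap_size(addr, pte);
		if (size != PAGE_SIZE)		/* arch says this pte heads a huge mapping */
			ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size);
		else				/* default: one base page at a time */
			ptent = ptep_get_and_clear(&init_mm, addr, pte);
	} while (pte += (size >> PAGE_SHIFT), addr += size, addr != end);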
From patchwork Tue Mar 4 15:04:39 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 09/11] arm64/mm: Support huge pte-mapped pages in vmap
Date: Tue, 4 Mar 2025 15:04:39 +0000
Message-ID: <20250304150444.3788920-10-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>
Implement the required arch functions to enable use of contpte in the vmap
when VM_ALLOW_HUGE_VMAP is specified. This speeds up vmap operations due to
only having to issue a DSB and ISB per contpte block instead of per pte. It
also reduces TLB pressure, since only a single TLB entry is needed for the
whole contpte block.

Since vmap uses set_huge_pte_at() to set the contpte, that API is now used
for kernel mappings for the first time. Although in the vmap case we never
expect it to be called to modify a valid mapping, so clear_flush() should
never be called, it's still wise to make it robust for the kernel case, so
amend the tlb flush to use flush_tlb_kernel_range() when the mm is init_mm.

Tested with vmalloc performance selftests:

  # kself/mm/test_vmalloc.sh \
	run_test_mask=1 test_repeat_count=5 nr_pages=256 \
	test_loop_count=100000 use_huge=1

Duration reduced from 1274243 usec to 1083553 usec on Apple M2, a 15%
reduction in time taken.

Reviewed-by: Anshuman Khandual
Reviewed-by: Catalin Marinas
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/vmalloc.h | 45 ++++++++++++++++++++++++++++++++
 arch/arm64/mm/hugetlbpage.c      |  5 +++-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
index 38fafffe699f..12f534e8f3ed 100644
--- a/arch/arm64/include/asm/vmalloc.h
+++ b/arch/arm64/include/asm/vmalloc.h
@@ -23,6 +23,51 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)
 	return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
 }
 
+#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
+						unsigned long end, u64 pfn,
+						unsigned int max_page_shift)
+{
+	/*
+	 * If the block is at least CONT_PTE_SIZE in size, and is naturally
+	 * aligned in both virtual and physical space, then we can pte-map the
+	 * block using the PTE_CONT bit for more efficient use of the TLB.
+	 */
+	if (max_page_shift < CONT_PTE_SHIFT)
+		return PAGE_SIZE;
+
+	if (end - addr < CONT_PTE_SIZE)
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(addr, CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	return CONT_PTE_SIZE;
+}
+
+#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							    pte_t *ptep)
+{
+	/*
+	 * The caller handles alignment so it's sufficient just to check
+	 * PTE_CONT.
+	 */
+	return pte_valid_cont(__ptep_get(ptep)) ? CONT_PTE_SIZE : PAGE_SIZE;
+}
+
+#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
+static inline int arch_vmap_pte_supported_shift(unsigned long size)
+{
+	if (size >= CONT_PTE_SIZE)
+		return CONT_PTE_SHIFT;
+
+	return PAGE_SHIFT;
+}
+
 #endif
 
 #define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index efd18bd1eae3..c1cb13dd5e84 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -210,7 +210,10 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		ptep_get_and_clear_anysz(mm, ptep, pgsize);
 
-	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
+	if (mm == &init_mm)
+		flush_tlb_kernel_range(saddr, addr);
+	else
+		__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
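To make the sizing rules concrete, a worked example; the addresses and pfn
are made up, and it assumes 4K base pages, where CONT_PTE_SIZE is 64KiB
(16 ptes):

	/* Example values only; assumes 4K base pages (CONT_PTE_SIZE == SZ_64K). */
	unsigned long addr = 0xffff800081200000;	/* 64KiB-aligned VA */
	unsigned long end  = addr + SZ_128K;		/* >= CONT_PTE_SIZE remaining */
	u64 pfn = __phys_to_pfn(0x880010000);		/* 64KiB-aligned PA */

	arch_vmap_pte_range_map_size(addr, end, pfn, PMD_SHIFT);
		/* -> CONT_PTE_SIZE: all four checks pass, so the vmap pte loop
		 *    can install the next 16 ptes as one contpte block. */

	arch_vmap_pte_range_map_size(addr + SZ_4K, end, pfn + 1, PMD_SHIFT);
		/* -> PAGE_SIZE: the virtual (and physical) address is no longer
		 *    64KiB-aligned, so fall back to a single pte. */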
From patchwork Tue Mar 4 15:04:40 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 10/11] mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
Date: Tue, 4 Mar 2025 15:04:40 +0000
Message-ID: <20250304150444.3788920-11-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

Wrap vmalloc's pte table manipulation loops with arch_enter_lazy_mmu_mode()
/ arch_leave_lazy_mmu_mode(). This provides the arch code with the
opportunity to optimize the pte manipulations.

Note that vmap_pfn() already uses lazy mmu mode since it delegates to
apply_to_page_range(), which enters lazy mmu mode for both user and kernel
mappings.

These hooks will shortly be used by arm64 to improve vmalloc performance.
Signed-off-by: Ryan Roberts
---
 mm/vmalloc.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6111ce900ec4..b63ca0b7dd40 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -104,6 +104,9 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	pte = pte_alloc_kernel_track(pmd, addr, mask);
 	if (!pte)
 		return -ENOMEM;
+
+	arch_enter_lazy_mmu_mode();
+
 	do {
 		if (unlikely(!pte_none(ptep_get(pte)))) {
 			if (pfn_valid(pfn)) {
@@ -127,6 +130,8 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 		set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
 		pfn++;
 	} while (pte += PFN_DOWN(size), addr += size, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 	return 0;
 }
@@ -354,6 +359,8 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	unsigned long size = PAGE_SIZE;
 
 	pte = pte_offset_kernel(pmd, addr);
+	arch_enter_lazy_mmu_mode();
+
 	do {
 #ifdef CONFIG_HUGETLB_PAGE
 		size = arch_vmap_pte_range_unmap_size(addr, pte);
@@ -370,6 +377,8 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
 	} while (pte += (size >> PAGE_SHIFT), addr += size, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 }
 
@@ -515,6 +524,9 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
 	pte = pte_alloc_kernel_track(pmd, addr, mask);
 	if (!pte)
 		return -ENOMEM;
+
+	arch_enter_lazy_mmu_mode();
+
 	do {
 		struct page *page = pages[*nr];
 
@@ -528,6 +540,8 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
 		set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
 		(*nr)++;
 	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 	return 0;
 }
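For context (not part of this patch): architectures that do not opt in keep
the no-op fallbacks, so this change is neutral for them. At the time of
writing, the generic definitions in include/linux/pgtable.h look roughly like
this:

	#ifndef __HAVE_ARCH_ENTER_LAZY_MMU_MODE
	#define arch_enter_lazy_mmu_mode()	do {} while (0)
	#define arch_leave_lazy_mmu_mode()	do {} while (0)
	#define arch_flush_lazy_mmu_mode()	do {} while (0)
	#endif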
From patchwork Tue Mar 4 15:04:41 2025
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel mappings
Date: Tue, 4 Mar 2025 15:04:41 +0000
Message-ID: <20250304150444.3788920-12-ryan.roberts@arm.com>
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>
Because the kernel can't tolerate page faults for kernel mappings, when
setting a valid, kernel space pte (or pmd/pud/p4d/pgd), it emits a
dsb(ishst) to ensure that the store to the pgtable is observed by the table
walker immediately. Additionally it emits an isb() to ensure that any
already speculatively determined invalid mapping fault gets canceled.

We can improve the performance of vmalloc operations by batching these
barriers until the end of a set of entry updates.
arch_enter_lazy_mmu_mode() and arch_leave_lazy_mmu_mode() provide the
required hooks. vmalloc improves by up to 30% as a result.

Two new TIF_ flags are created: TIF_LAZY_MMU tells us that the task is in
the lazy mode and can therefore defer any barriers until exit from the lazy
mode. TIF_LAZY_MMU_PENDING is used to remember if any pte operation was
performed while in the lazy mode that required barriers. Then when leaving
lazy mode, if that flag is set, we emit the barriers. Since
arch_enter_lazy_mmu_mode() and arch_leave_lazy_mmu_mode() are used for both
user and kernel mappings, we need the second flag to avoid emitting barriers
unnecessarily if only user mappings were updated.

Signed-off-by: Ryan Roberts
---
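To illustrate how the two flags interact, a sketch of the intended behaviour
(ptep/pte/pte2 are placeholder variables, not code from the patch):

	/* Outside lazy mmu mode: each valid kernel pte store pays dsb + isb. */
	__set_pte(ptep, pte);		/* queue_pte_barriers() -> emit_pte_barriers() */

	arch_enter_lazy_mmu_mode();	/* sets TIF_LAZY_MMU */
	__set_pte(ptep, pte);		/* sets TIF_LAZY_MMU_PENDING, no barriers yet */
	__set_pte(ptep + 1, pte2);	/* pending flag already set, still no barriers */
	arch_leave_lazy_mmu_mode();	/* pending -> one dsb(ishst) + isb(), flags cleared */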
 arch/arm64/include/asm/pgtable.h     | 73 ++++++++++++++++++++++------
 arch/arm64/include/asm/thread_info.h |  2 +
 arch/arm64/kernel/process.c          |  9 ++--
 3 files changed, 64 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1898c3069c43..149df945c1ab 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -40,6 +40,55 @@
 #include
 #include
 
+static inline void emit_pte_barriers(void)
+{
+	/*
+	 * These barriers are emitted under certain conditions after a pte entry
+	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
+	 * visible to the table walker. The isb ensures that any previous
+	 * speculative "invalid translation" marker that is in the CPU's
+	 * pipeline gets cleared, so that any access to that address after
+	 * setting the pte to valid won't cause a spurious fault. If the thread
+	 * gets preempted after storing to the pgtable but before emitting these
+	 * barriers, __switch_to() emits a dsb which ensures the walker gets to
+	 * see the store. There is no guarantee of an isb being issued though.
+	 * This is safe because it will still get issued (albeit on a
+	 * potentially different CPU) when the thread starts running again,
+	 * before any access to the address.
+	 */
+	dsb(ishst);
+	isb();
+}
+
+static inline void queue_pte_barriers(void)
+{
+	if (test_thread_flag(TIF_LAZY_MMU))
+		set_thread_flag(TIF_LAZY_MMU_PENDING);
+	else
+		emit_pte_barriers();
+}
+
+#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
+static inline void arch_enter_lazy_mmu_mode(void)
+{
+	VM_WARN_ON(in_interrupt());
+	VM_WARN_ON(test_thread_flag(TIF_LAZY_MMU));
+
+	set_thread_flag(TIF_LAZY_MMU);
+}
+
+static inline void arch_flush_lazy_mmu_mode(void)
+{
+	if (test_and_clear_thread_flag(TIF_LAZY_MMU_PENDING))
+		emit_pte_barriers();
+}
+
+static inline void arch_leave_lazy_mmu_mode(void)
+{
+	arch_flush_lazy_mmu_mode();
+	clear_thread_flag(TIF_LAZY_MMU);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -323,10 +372,8 @@ static inline void __set_pte_complete(pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * has the necessary barriers.
 	 */
-	if (pte_valid_not_user(pte)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pte_valid_not_user(pte))
+		queue_pte_barriers();
 }
 
 static inline void __set_pte(pte_t *ptep, pte_t pte)
@@ -778,10 +825,8 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid(pmd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pmd_valid(pmd))
+		queue_pte_barriers();
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -845,10 +890,8 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid(pud)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pud_valid(pud))
+		queue_pte_barriers();
 }
 
 static inline void pud_clear(pud_t *pudp)
@@ -925,8 +968,7 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 	}
 
 	WRITE_ONCE(*p4dp, p4d);
-	dsb(ishst);
-	isb();
+	queue_pte_barriers();
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1052,8 +1094,7 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 	}
 
 	WRITE_ONCE(*pgdp, pgd);
-	dsb(ishst);
-	isb();
+	queue_pte_barriers();
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 1114c1c3300a..1fdd74b7b831 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -82,6 +82,8 @@ void arch_setup_new_exec(void);
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
 #define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 #define TIF_TSC_SIGSEGV		30	/* SIGSEGV on counter-timer access */
+#define TIF_LAZY_MMU		31	/* Task in lazy mmu mode */
+#define TIF_LAZY_MMU_PENDING	32	/* Ops pending for lazy mmu mode exit */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 42faebb7b712..45a55fe81788 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -680,10 +680,11 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	gcs_thread_switch(next);
 
 	/*
-	 * Complete any pending TLB or cache maintenance on this CPU in case
-	 * the thread migrates to a different CPU.
-	 * This full barrier is also required by the membarrier system
-	 * call.
+	 * Complete any pending TLB or cache maintenance on this CPU in case the
+	 * thread migrates to a different CPU. This full barrier is also
+	 * required by the membarrier system call. Additionally it makes any
+	 * in-progress pgtable writes visible to the table walker; see
+	 * emit_pte_barriers().
 	 */
 	dsb(ish);