From patchwork Wed Aug 2 15:13:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 13338358
From: "Matthew Wilcox (Oracle)"
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mike Rapoport
Subject: [PATCH v6 06/38] mm: Add default definition of set_ptes()
Date: Wed, 2 Aug 2023 16:13:34 +0100
Message-Id: <20230802151406.3735276-7-willy@infradead.org>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20230802151406.3735276-1-willy@infradead.org>
References: <20230802151406.3735276-1-willy@infradead.org>

Most architectures can just define set_pte() and PFN_PTE_SHIFT to use
this definition. It's also a handy spot to document the guarantees
provided by the MM.
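
For readers who want to see what the generic loop does to the PTE values,
here is a minimal user-space sketch of it. It is not kernel code and not
part of the patch: the model_* names are hypothetical stand-ins for pte_t,
__pte(), pte_val(), set_pte() and set_ptes(), and PFN_PTE_SHIFT is assumed
to be 12 purely for illustration (a real architecture supplies its own
value). The page table check, the lazy MMU calls and the mm/addr arguments
are left out.

```c
/* User-space model of the generic set_ptes() loop; not kernel code. */
#include <assert.h>

/* Assumption: the PFN field starts at bit 12 of the PTE. */
#define PFN_PTE_SHIFT 12

/* Hypothetical stand-in for the kernel's pte_t. */
typedef struct { unsigned long val; } model_pte_t;

/* Stand-in for __pte(): wrap a raw value. */
static model_pte_t model_pte(unsigned long val)
{
	model_pte_t pte = { val };
	return pte;
}

/* Stand-in for pte_val(): unwrap to the raw value. */
static unsigned long model_pte_val(model_pte_t pte)
{
	return pte.val;
}

/* Stand-in for the architecture's set_pte(): a plain store. */
static void model_set_pte(model_pte_t *ptep, model_pte_t pte)
{
	*ptep = pte;
}

/*
 * Same loop shape as the patch. nr must be at least 1, because the
 * count is only tested after the first store.
 */
static void model_set_ptes(model_pte_t *ptep, model_pte_t pte, unsigned int nr)
{
	for (;;) {
		model_set_pte(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		/* Advance the PFN by one; bits below the shift are unchanged. */
		pte = model_pte(model_pte_val(pte) + (1UL << PFN_PTE_SHIFT));
	}
}

/* Maps 4 entries starting at PFN 0x100 with hypothetical flag bits 0x67. */
static void model_self_check(void)
{
	model_pte_t table[4] = { { 0 } };
	unsigned int i;

	model_set_ptes(table, model_pte((0x100UL << PFN_PTE_SHIFT) | 0x67), 4);
	for (i = 0; i < 4; i++) {
		assert((model_pte_val(table[i]) >> PFN_PTE_SHIFT) == 0x100UL + i);
		assert((model_pte_val(table[i]) & 0xfffUL) == 0x67);
	}
}
```

Calling model_self_check() from a main() exercises the loop. The point of
the sketch is that only the PFN field advances from entry to entry, which
is why an architecture needs to provide nothing beyond set_pte() and
PFN_PTE_SHIFT to use the default.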
Suggested-by: Mike Rapoport (IBM)
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Mike Rapoport (IBM)
Tested-by: David Woodhouse
---
 include/linux/pgtable.h | 81 ++++++++++++++++++++++++++++++-----------
 1 file changed, 60 insertions(+), 21 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index f34e0f2cb4d8..3fde0d5d1c29 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -182,6 +182,66 @@ static inline int pmd_young(pmd_t pmd)
 }
 #endif
 
+/*
+ * A facility to provide lazy MMU batching. This allows PTE updates and
+ * page invalidations to be delayed until a call to leave lazy MMU mode
+ * is issued. Some architectures may benefit from doing this, and it is
+ * beneficial for both shadow and direct mode hypervisors, which may batch
+ * the PTE updates which happen during this window. Note that using this
+ * interface requires that read hazards be removed from the code. A read
+ * hazard could result in the direct mode hypervisor case, since the actual
+ * write to the page tables may not yet have taken place, so reads though
+ * a raw PTE pointer after it has been modified are not guaranteed to be
+ * up to date. This mode can only be entered and left under the protection of
+ * the page table locks for all page tables which may be modified. In the UP
+ * case, this is required so that preemption is disabled, and in the SMP case,
+ * it must synchronize the delayed page table writes properly on other CPUs.
+ */
+#ifndef __HAVE_ARCH_ENTER_LAZY_MMU_MODE
+#define arch_enter_lazy_mmu_mode()	do {} while (0)
+#define arch_leave_lazy_mmu_mode()	do {} while (0)
+#define arch_flush_lazy_mmu_mode()	do {} while (0)
+#endif
+
+#ifndef set_ptes
+#ifdef PFN_PTE_SHIFT
+/**
+ * set_ptes - Map consecutive pages to a contiguous range of addresses.
+ * @mm: Address space to map the pages into.
+ * @addr: Address to map the first page at.
+ * @ptep: Page table pointer for the first entry.
+ * @pte: Page table entry for the first page.
+ * @nr: Number of pages to map.
+ *
+ * May be overridden by the architecture, or the architecture can define
+ * set_pte() and PFN_PTE_SHIFT.
+ *
+ * Context: The caller holds the page table lock. The pages all belong
+ * to the same folio. The PTEs are all in the same PMD.
+ */
+static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
+		pte_t *ptep, pte_t pte, unsigned int nr)
+{
+	page_table_check_ptes_set(mm, ptep, pte, nr);
+
+	arch_enter_lazy_mmu_mode();
+	for (;;) {
+		set_pte(ptep, pte);
+		if (--nr == 0)
+			break;
+		ptep++;
+		pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
+	}
+	arch_leave_lazy_mmu_mode();
+}
+#ifndef set_pte_at
+#define set_pte_at(mm, addr, ptep, pte) set_ptes(mm, addr, ptep, pte, 1)
+#endif
+#endif
+#else
+#define set_pte_at(mm, addr, ptep, pte) set_ptes(mm, addr, ptep, pte, 1)
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 extern int ptep_set_access_flags(struct vm_area_struct *vma,
 				 unsigned long address, pte_t *ptep,
@@ -1051,27 +1111,6 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
 #define pgprot_decrypted(prot)	(prot)
 #endif
 
-/*
- * A facility to provide lazy MMU batching. This allows PTE updates and
- * page invalidations to be delayed until a call to leave lazy MMU mode
- * is issued. Some architectures may benefit from doing this, and it is
- * beneficial for both shadow and direct mode hypervisors, which may batch
- * the PTE updates which happen during this window. Note that using this
- * interface requires that read hazards be removed from the code. A read
- * hazard could result in the direct mode hypervisor case, since the actual
- * write to the page tables may not yet have taken place, so reads though
- * a raw PTE pointer after it has been modified are not guaranteed to be
- * up to date. This mode can only be entered and left under the protection of
- * the page table locks for all page tables which may be modified. In the UP
- * case, this is required so that preemption is disabled, and in the SMP case,
- * it must synchronize the delayed page table writes properly on other CPUs.
- */
-#ifndef __HAVE_ARCH_ENTER_LAZY_MMU_MODE
-#define arch_enter_lazy_mmu_mode()	do {} while (0)
-#define arch_leave_lazy_mmu_mode()	do {} while (0)
-#define arch_flush_lazy_mmu_mode()	do {} while (0)
-#endif
-
 /*
  * A facility to provide batching of the reload of page tables and
  * other process state with the actual context switch code for