From patchwork Thu Nov 7 20:20:28 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867063
Date: Thu, 7 Nov 2024 13:20:28 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-2-yuzhao@google.com>
Subject: [PATCH v2 1/6] mm/hugetlb_vmemmap: batch-update PTEs
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

Convert vmemmap_remap_walk->remap_pte to ->remap_pte_range so that
vmemmap remap walks can batch-update PTEs.
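The per-entry-to-per-range callback shape can be illustrated with a small standalone C sketch (userspace only; the `toy_*` names are hypothetical stand-ins, not the kernel's `vmemmap_remap_walk` API): the walker accounts for a whole batch of entries at once and invokes a single range callback instead of one callback per entry.

```c
#include <assert.h>

/* Toy model of a batched remap walk. One hook call covers a whole
 * [start, end) range rather than a single "PTE". */

#define TOY_PAGE_SIZE 4096UL

struct toy_walk {
	void (*remap_range)(unsigned long *entries, unsigned long start,
			    unsigned long end, struct toy_walk *walk);
	unsigned long nr_walked;
};

/* Batched callback: updates every entry covered by [start, end). */
static void toy_remap_range(unsigned long *entries, unsigned long start,
			    unsigned long end, struct toy_walk *walk)
{
	unsigned long addr;

	(void)walk;
	for (addr = start; addr < end; addr += TOY_PAGE_SIZE)
		entries[(addr - start) / TOY_PAGE_SIZE]++;
}

/* Walker: accounts for the whole batch, then invokes the hook once. */
static unsigned long toy_walk_range(unsigned long *entries,
				    unsigned long start, unsigned long end)
{
	struct toy_walk walk = { .remap_range = toy_remap_range };

	walk.nr_walked += (end - start) / TOY_PAGE_SIZE;
	walk.remap_range(entries, start, end, &walk);
	return walk.nr_walked;
}
```

The point of the batched shape is that any per-invocation overhead (for arm64, stopping remote CPUs) is paid once per range instead of once per entry.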
The goal of this conversion is to allow architectures to implement their
own optimizations if possible, e.g., only to stop remote CPUs once for
each batch when updating vmemmap on arm64. It is not intended to change
the remap workflow nor should it by itself have any side effects on
performance.

Signed-off-by: Yu Zhao
---
 mm/hugetlb_vmemmap.c | 163 ++++++++++++++++++++++++-------------------
 1 file changed, 91 insertions(+), 72 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 57b7f591eee8..46befab48d41 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -22,7 +22,7 @@
 /**
  * struct vmemmap_remap_walk - walk vmemmap page table
  *
- * @remap_pte:		called for each lowest-level entry (PTE).
+ * @remap_pte_range:	called on a range of PTEs.
  * @nr_walked:		the number of walked pte.
  * @reuse_page:		the page which is reused for the tail vmemmap pages.
  * @reuse_addr:		the virtual address of the @reuse_page page.
@@ -32,8 +32,8 @@
  *			operations.
  */
 struct vmemmap_remap_walk {
-	void			(*remap_pte)(pte_t *pte, unsigned long addr,
-					     struct vmemmap_remap_walk *walk);
+	void			(*remap_pte_range)(pte_t *pte, unsigned long start,
+						   unsigned long end, struct vmemmap_remap_walk *walk);
 	unsigned long		nr_walked;
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
@@ -101,10 +101,6 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct page *head;
 	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
-	/* Only splitting, not remapping the vmemmap pages. */
-	if (!vmemmap_walk->remap_pte)
-		walk->action = ACTION_CONTINUE;
-
 	spin_lock(&init_mm.page_table_lock);
 	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
 	/*
@@ -129,33 +125,36 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 		ret = -ENOTSUPP;
 	}
 	spin_unlock(&init_mm.page_table_lock);
-	if (!head || ret)
+	if (ret)
 		return ret;
 
-	return vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
-}
+	if (head) {
+		ret = vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
+		if (ret)
+			return ret;
+	}
 
-static int vmemmap_pte_entry(pte_t *pte, unsigned long addr,
-			     unsigned long next, struct mm_walk *walk)
-{
-	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
+	if (vmemmap_walk->remap_pte_range) {
+		pte_t *pte = pte_offset_kernel(pmd, addr);
 
-	/*
-	 * The reuse_page is found 'first' in page table walking before
-	 * starting remapping.
-	 */
-	if (!vmemmap_walk->reuse_page)
-		vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
-	else
-		vmemmap_walk->remap_pte(pte, addr, vmemmap_walk);
-	vmemmap_walk->nr_walked++;
+		vmemmap_walk->nr_walked += (next - addr) / PAGE_SIZE;
+		/*
+		 * The reuse_page is found 'first' in page table walking before
+		 * starting remapping.
+		 */
+		if (!vmemmap_walk->reuse_page) {
+			vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
+			pte++;
+			addr += PAGE_SIZE;
+		}
+		vmemmap_walk->remap_pte_range(pte, addr, next, vmemmap_walk);
+	}
 
 	return 0;
 }
 
 static const struct mm_walk_ops vmemmap_remap_ops = {
 	.pmd_entry	= vmemmap_pmd_entry,
-	.pte_entry	= vmemmap_pte_entry,
 };
 
 static int vmemmap_remap_range(unsigned long start, unsigned long end,
@@ -172,7 +171,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 	if (ret)
 		return ret;
 
-	if (walk->remap_pte && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
+	if (walk->remap_pte_range && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_kernel_range(start, end);
 
 	return 0;
@@ -204,33 +203,45 @@ static void free_vmemmap_page_list(struct list_head *list)
 		free_vmemmap_page(page);
 }
 
-static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
-			      struct vmemmap_remap_walk *walk)
+static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				    struct vmemmap_remap_walk *walk)
 {
-	/*
-	 * Remap the tail pages as read-only to catch illegal write operation
-	 * to the tail pages.
-	 */
-	pgprot_t pgprot = PAGE_KERNEL_RO;
-	struct page *page = pte_page(ptep_get(pte));
-	pte_t entry;
-
-	/* Remapping the head page requires r/w */
-	if (unlikely(addr == walk->reuse_addr)) {
-		pgprot = PAGE_KERNEL;
-		list_del(&walk->reuse_page->lru);
+	int i;
+	struct page *page;
+	int nr_pages = (end - start) / PAGE_SIZE;
+
+	for (i = 0; i < nr_pages; i++) {
+		page = pte_page(ptep_get(pte + i));
+
+		list_add(&page->lru, walk->vmemmap_pages);
+	}
+
+	page = walk->reuse_page;
+
+	if (start == walk->reuse_addr) {
+		list_del(&page->lru);
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
 		/*
-		 * Makes sure that preceding stores to the page contents from
-		 * vmemmap_remap_free() become visible before the set_pte_at()
-		 * write.
+		 * Makes sure that preceding stores to the page contents become
+		 * visible before set_pte_at().
 		 */
 		smp_wmb();
 	}
 
-	entry = mk_pte(walk->reuse_page, pgprot);
-	list_add(&page->lru, walk->vmemmap_pages);
-	set_pte_at(&init_mm, addr, pte, entry);
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		/*
+		 * The head page must be mapped read-write; the tail pages are
+		 * mapped read-only to catch illegal modifications.
+		 */
+		if (!i && start == walk->reuse_addr)
+			val = mk_pte(page, PAGE_KERNEL);
+		else
+			val = mk_pte(page, PAGE_KERNEL_RO);
+
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /*
@@ -252,27 +263,39 @@ static inline void reset_struct_pages(struct page *start)
 	memcpy(start, from, sizeof(*from) * NR_RESET_STRUCT_PAGE);
 }
 
-static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
-				struct vmemmap_remap_walk *walk)
+static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				      struct vmemmap_remap_walk *walk)
 {
-	pgprot_t pgprot = PAGE_KERNEL;
+	int i;
 	struct page *page;
-	void *to;
-
-	BUG_ON(pte_page(ptep_get(pte)) != walk->reuse_page);
+	int nr_pages = (end - start) / PAGE_SIZE;
 
 	page = list_first_entry(walk->vmemmap_pages, struct page, lru);
-	list_del(&page->lru);
-	to = page_to_virt(page);
-	copy_page(to, (void *)walk->reuse_addr);
-	reset_struct_pages(to);
+
+	for (i = 0; i < nr_pages; i++) {
+		BUG_ON(pte_page(ptep_get(pte + i)) != walk->reuse_page);
+
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
+		reset_struct_pages(page_to_virt(page));
+
+		page = list_next_entry(page, lru);
+	}
 
 	/*
 	 * Makes sure that preceding stores to the page contents become visible
-	 * before the set_pte_at() write.
+	 * before set_pte_at().
 	 */
 	smp_wmb();
-	set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot));
+
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		page = list_first_entry(walk->vmemmap_pages, struct page, lru);
+		list_del(&page->lru);
+
+		val = mk_pte(page, PAGE_KERNEL);
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /**
@@ -290,7 +313,6 @@ static int vmemmap_remap_split(unsigned long start, unsigned long end,
 			       unsigned long reuse)
 {
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= NULL,
 		.flags		= VMEMMAP_SPLIT_NO_TLB_FLUSH,
 	};
 
@@ -322,10 +344,10 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 {
 	int ret;
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_remap_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range	= vmemmap_remap_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= vmemmap_pages,
+		.flags			= flags,
 	};
 	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
@@ -340,8 +362,6 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 	 */
 	walk.reuse_page = alloc_pages_node(nid, gfp_mask, 0);
 	if (walk.reuse_page) {
-		copy_page(page_to_virt(walk.reuse_page),
-			  (void *)walk.reuse_addr);
 		list_add(&walk.reuse_page->lru, vmemmap_pages);
 		memmap_pages_add(1);
 	}
@@ -371,10 +391,9 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
	 * They will be restored in the following call.
	 */
	walk = (struct vmemmap_remap_walk) {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= 0,
+		.remap_pte_range	= vmemmap_restore_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= vmemmap_pages,
 	};
 
 	vmemmap_remap_range(reuse, end, &walk);
@@ -425,10 +444,10 @@ static int vmemmap_remap_alloc(unsigned long start, unsigned long end,
 {
 	LIST_HEAD(vmemmap_pages);
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= &vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range	= vmemmap_restore_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= &vmemmap_pages,
+		.flags			= flags,
 	};
 
 	/* See the comment in the vmemmap_remap_free(). */

From patchwork Thu Nov 7 20:20:29 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867064
Date: Thu, 7 Nov 2024 13:20:29 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-3-yuzhao@google.com>
Subject: [PATCH v2 2/6] mm/hugetlb_vmemmap: add arch-independent helpers
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

Add architecture-independent helpers to allow individual architectures
to work around their own limitations when updating vmemmap.
Specifically, the current remap workflow requires break-before-make
(BBM) on arm64. By overriding the default helpers later in this series,
arm64 will be able to support the current HVO implementation.
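The override mechanism used here can be illustrated with a small standalone C sketch (userspace only; the `my_*` names are hypothetical stand-ins for the `vmemmap_update_*` helpers): the generic file provides no-op defaults under `#ifndef`, and an architecture replaces one by `#define`-ing the symbol and supplying its own implementation before that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the #ifndef override pattern. An arch that needs special
 * handling would, before this point, do something like:
 *
 *   #define my_update_supported my_update_supported
 *   static bool my_update_supported(void) { return probe_hw(); }
 */

#ifndef my_update_supported
static bool my_update_supported(void)
{
	return true;	/* generic default: updates always supported */
}
#endif

#ifndef my_update_lock
static void my_update_lock(void)
{
	/* generic default: no extra serialization required */
}
#endif

#ifndef my_update_unlock
static void my_update_unlock(void)
{
}
#endif

/* Callers see one uniform API regardless of which variant is compiled in. */
static bool my_do_update(void)
{
	if (!my_update_supported())
		return false;

	my_update_lock();
	/* ... the actual update would happen here ... */
	my_update_unlock();
	return true;
}
```

Because the defaults are empty static functions rather than runtime indirection, an architecture that does not override them pays no cost.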
Signed-off-by: Yu Zhao
---
 include/linux/mm_types.h |  7 +++
 mm/hugetlb_vmemmap.c     | 99 ++++++++++++++++++++++++++++++++++------
 2 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..0f3ae6e173f6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1499,4 +1499,11 @@ enum {
 	/* See also internal only FOLL flags in mm/internal.h */
 };
 
+/* Skip the TLB flush when we split the PMD */
+#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
+/* Skip the TLB flush when we remap the PTE */
+#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
+/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
+#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
+
 #endif /* _LINUX_MM_TYPES_H */

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 46befab48d41..e50a196399f5 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -38,16 +38,56 @@ struct vmemmap_remap_walk {
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
 	struct list_head	*vmemmap_pages;
-
-/* Skip the TLB flush when we split the PMD */
-#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
-/* Skip the TLB flush when we remap the PTE */
-#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
-/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
-#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
 	unsigned long		flags;
 };
 
+#ifndef VMEMMAP_ARCH_TLB_FLUSH_FLAGS
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	0
+#endif
+
+#ifndef vmemmap_update_supported
+static bool vmemmap_update_supported(void)
+{
+	return true;
+}
+#endif
+
+#ifndef vmemmap_update_lock
+static void vmemmap_update_lock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_unlock
+static void vmemmap_update_unlock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_start
+static void vmemmap_update_pte_range_start(pte_t *pte, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_end
+static void vmemmap_update_pte_range_end(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_start
+static void vmemmap_update_pmd_range_start(pmd_t *pmd, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_end
+static void vmemmap_update_pmd_range_end(void)
+{
+}
+#endif
+
 static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 			     struct vmemmap_remap_walk *walk)
 {
@@ -83,7 +123,9 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
+		vmemmap_update_pmd_range_start(pmd, start, start + PMD_SIZE);
 		pmd_populate_kernel(&init_mm, pmd, pgtable);
+		vmemmap_update_pmd_range_end();
 		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
 			flush_tlb_kernel_range(start, start + PMD_SIZE);
 	} else {
@@ -164,10 +206,12 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 
 	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
+	vmemmap_update_lock();
 	mmap_read_lock(&init_mm);
 	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
 				    NULL, walk);
 	mmap_read_unlock(&init_mm);
+	vmemmap_update_unlock();
 	if (ret)
 		return ret;
 
@@ -228,6 +272,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned lo
 		smp_wmb();
 	}
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -242,6 +288,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned lo
 
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /*
@@ -287,6 +335,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned
 	 */
 	smp_wmb();
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -296,6 +346,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned
 		val = mk_pte(page, PAGE_KERNEL);
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /**
@@ -513,7 +565,8 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
  */
 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 {
-	return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU);
+	return __hugetlb_vmemmap_restore_folio(h, folio,
+			VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 }
 
 /**
@@ -553,7 +606,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			list_move(&folio->lru, non_hvo_folios);
 	}
 
-	if (restored)
+	if (restored && !(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_all();
 	if (!ret)
 		ret = restored;
@@ -641,7 +694,8 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 {
 	LIST_HEAD(vmemmap_pages);
 
-	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, VMEMMAP_SYNCHRONIZE_RCU);
+	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+			VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 
 	free_vmemmap_page_list(&vmemmap_pages);
 }
@@ -683,7 +737,8 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_SPLIT_NO_TLB_FLUSH))
+		flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
@@ -701,24 +756,35 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		 * allowing more vmemmap remaps to occur.
 		 */
 		if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
-			flush_tlb_all();
+			if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+				flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
 			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
 		}
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+		flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+static int hugetlb_vmemmap_sysctl(const struct ctl_table *ctl, int write,
+				  void *buffer, size_t *lenp, loff_t *ppos)
+{
+	if (!vmemmap_update_supported())
+		return -ENODEV;
+
+	return proc_dobool(ctl, write, buffer, lenp, ppos);
+}
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
 		.data		= &vmemmap_optimize_enabled,
 		.maxlen		= sizeof(vmemmap_optimize_enabled),
 		.mode		= 0644,
-		.proc_handler	= proc_dobool,
+		.proc_handler	= hugetlb_vmemmap_sysctl,
 	},
 };
 
@@ -729,6 +795,11 @@ static int __init hugetlb_vmemmap_init(void)
 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);
 
+	if (READ_ONCE(vmemmap_optimize_enabled) && !vmemmap_update_supported()) {
+		pr_warn("HugeTLB: disabling HVO due to missing support.\n");
+		WRITE_ONCE(vmemmap_optimize_enabled, false);
+	}
+
 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
 			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);

From patchwork Thu Nov 7 20:20:30 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867065
Date: Thu, 7 Nov 2024 13:20:30 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-4-yuzhao@google.com>
Subject: [PATCH v2 3/6] irqchip/gic-v3: support SGI broadcast
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

GIC v3 and later support SGI broadcast, i.e., the mode that routes
interrupts to all PEs in the system excluding the local CPU.
Supporting this mode can avoid looping through all the remote CPUs when
broadcasting SGIs, especially for systems with 200+ CPUs. The
performance improvement can be measured with the rest of this series
booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":

  cd /sys/kernel/mm/hugepages/
  echo 600 >hugepages-1048576kB/nr_hugepages
  echo 2048kB >hugepages-1048576kB/demote_size
  perf record -g -- time echo 600 >hugepages-1048576kB/demote

With 80 CPUs:

          gic_ipi_send_mask()   bash sys time
  Before:      38.14%             0m10.513s
  After:        0.20%             0m5.132s

Signed-off-by: Yu Zhao
---
 drivers/irqchip/irq-gic-v3.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index ce87205e3e82..7ebe870e4608 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1322,6 +1322,7 @@ static void gic_cpu_init(void)
 #define MPIDR_TO_SGI_RS(mpidr)		(MPIDR_RS(mpidr) << ICC_SGI1R_RS_SHIFT)
 #define MPIDR_TO_SGI_CLUSTER_ID(mpidr)	((mpidr) & ~0xFUL)
+#define MPIDR_TO_SGI_TARGET_LIST(mpidr)	(1 << ((mpidr) & 0xf))

 /*
  * gic_starting_cpu() is called after the last point where cpuhp is allowed
@@ -1356,7 +1357,7 @@ static u16 gic_compute_target_list(int *base_cpu, const struct cpumask *mask,
	mpidr = gic_cpu_to_affinity(cpu);

	while (cpu < nr_cpu_ids) {
-		tlist |= 1 << (mpidr & 0xf);
+		tlist |= MPIDR_TO_SGI_TARGET_LIST(mpidr);

		next_cpu = cpumask_next(cpu, mask);
		if (next_cpu >= nr_cpu_ids)
@@ -1394,9 +1395,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
	gic_write_sgi1r(val);
 }

+static void gic_broadcast_sgi(unsigned int irq)
+{
+	u64 val;
+
+	val = BIT_ULL(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);
+
+	pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
+	gic_write_sgi1r(val);
+}
+
 static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 {
-	int cpu;
+	int cpu = smp_processor_id();
+	bool self = cpumask_test_cpu(cpu, mask);

	if (WARN_ON(d->hwirq >= 16))
		return;
@@ -1407,6 +1419,19 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
	 */
	dsb(ishst);

+	if (cpumask_weight(mask) + !self == num_online_cpus()) {
+		/* Broadcast to all but self */
+		gic_broadcast_sgi(d->hwirq);
+		if (self) {
+			unsigned long mpidr = gic_cpu_to_affinity(cpu);
+
+			/* Send to self */
+			gic_send_sgi(MPIDR_TO_SGI_CLUSTER_ID(mpidr),
+				     MPIDR_TO_SGI_TARGET_LIST(mpidr), d->hwirq);
+		}
+		goto done;
+	}
+
	for_each_cpu(cpu, mask) {
		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(gic_cpu_to_affinity(cpu));
		u16 tlist;
@@ -1414,7 +1439,7 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
		tlist = gic_compute_target_list(&cpu, mask, cluster_id);
		gic_send_sgi(cluster_id, tlist, d->hwirq);
	}
-
+done:
	/* Force the above writes to ICC_SGI1R_EL1 to be executed */
	isb();
 }

From patchwork Thu Nov 7 20:20:31 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867066
Date: Thu, 7 Nov 2024 13:20:31 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-5-yuzhao@google.com>
Subject: [PATCH v2 4/6] arm64: broadcast IPIs to pause remote CPUs
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

Broadcast pseudo-NMI IPIs to pause remote CPUs for a short period of
time, and then reliably resume them when the local CPU exits critical
sections that preclude the execution of remote CPUs. A typical example
of such critical sections is BBM on kernel PTEs.

HugeTLB Vmemmap Optimization (HVO) on arm64 was disabled by commit
060a2c92d1b6 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
due to the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Supporting BBM on kernel PTEs is one of the approaches that can make
HVO safe on arm64.
Signed-off-by: Yu Zhao
---
 arch/arm64/include/asm/smp.h |  3 ++
 arch/arm64/kernel/smp.c      | 85 +++++++++++++++++++++++++++++++++---
 2 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2510eec026f7..cffb0cfed961 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -133,6 +133,9 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);

+void pause_remote_cpus(void);
+void resume_remote_cpus(void);
+
 #endif /* ifndef __ASSEMBLY__ */

 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..54e9f6374aa3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -85,7 +85,12 @@ static int ipi_irq_base __ro_after_init;
 static int nr_ipi __ro_after_init = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;

-static bool crash_stop;
+enum {
+	SEND_STOP,
+	CRASH_STOP,
+};
+
+static unsigned long stop_in_progress;

 static void ipi_setup(int cpu);

@@ -917,6 +922,72 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs
 #endif
 }

+static DEFINE_RAW_SPINLOCK(cpu_pause_lock);
+static bool __cacheline_aligned_in_smp cpu_paused;
+static atomic_t __cacheline_aligned_in_smp nr_cpus_paused;
+
+static void pause_local_cpu(void)
+{
+	atomic_inc(&nr_cpus_paused);
+
+	while (READ_ONCE(cpu_paused))
+		cpu_relax();
+
+	atomic_dec(&nr_cpus_paused);
+
+	/*
+	 * The caller of resume_remote_cpus() should make sure that clearing
+	 * cpu_paused is ordered after other changes that can have any impact on
+	 * this CPU. The isb() below makes sure this CPU doesn't speculatively
+	 * execute the next instruction before it sees all those changes.
+	 */
+	isb();
+}
+
+void pause_remote_cpus(void)
+{
+	cpumask_t cpus_to_pause;
+	int nr_cpus_to_pause = num_online_cpus() - 1;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	if (!nr_cpus_to_pause)
+		return;
+
+	cpumask_copy(&cpus_to_pause, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_pause);
+
+	raw_spin_lock(&cpu_pause_lock);
+
+	WARN_ON_ONCE(cpu_paused);
+	WARN_ON_ONCE(atomic_read(&nr_cpus_paused));
+
+	cpu_paused = true;
+
+	smp_cross_call(&cpus_to_pause, IPI_CPU_STOP_NMI);
+
+	while (atomic_read(&nr_cpus_paused) != nr_cpus_to_pause)
+		cpu_relax();
+
+	raw_spin_unlock(&cpu_pause_lock);
+}
+
+void resume_remote_cpus(void)
+{
+	if (!cpu_paused)
+		return;
+
+	raw_spin_lock(&cpu_pause_lock);
+
+	WRITE_ONCE(cpu_paused, false);
+
+	while (atomic_read(&nr_cpus_paused))
+		cpu_relax();
+
+	raw_spin_unlock(&cpu_pause_lock);
+}
+
 static void arm64_backtrace_ipi(cpumask_t *mask)
 {
	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
@@ -970,7 +1041,9 @@ static void do_handle_IPI(int ipinr)

	case IPI_CPU_STOP:
	case IPI_CPU_STOP_NMI:
-		if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
+		if (!test_bit(SEND_STOP, &stop_in_progress)) {
+			pause_local_cpu();
+		} else if (test_bit(CRASH_STOP, &stop_in_progress)) {
			ipi_cpu_crash_stop(cpu, get_irq_regs());
			unreachable();
		} else {
@@ -1142,7 +1215,6 @@ static inline unsigned int num_other_online_cpus(void)

 void smp_send_stop(void)
 {
-	static unsigned long stop_in_progress;
	cpumask_t mask;
	unsigned long timeout;

@@ -1154,7 +1226,7 @@ void smp_send_stop(void)
		goto skip_ipi;

	/* Only proceed if this is the first CPU to reach this code */
-	if (test_and_set_bit(0, &stop_in_progress))
+	if (test_and_set_bit(SEND_STOP, &stop_in_progress))
		return;

	/*
@@ -1230,12 +1302,11 @@ void crash_smp_send_stop(void)
	 * This function can be called twice in panic path, but obviously
	 * we execute this only once.
	 *
-	 * We use this same boolean to tell whether the IPI we send was a
+	 * We use the CRASH_STOP bit to tell whether the IPI we send was a
	 * stop or a "crash stop".
	 */
-	if (crash_stop)
+	if (test_and_set_bit(CRASH_STOP, &stop_in_progress))
		return;
-	crash_stop = 1;

	smp_send_stop();

From patchwork Thu Nov 7 20:20:32 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867067
Date: Thu, 7 Nov 2024 13:20:32 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-6-yuzhao@google.com>
Subject: [PATCH v2 5/6] arm64: pause remote CPUs to update vmemmap
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

Pause remote CPUs so that the local CPU can follow the proper BBM
sequence to safely update the vmemmap mapping `struct page` areas.

While updating the vmemmap, it is guaranteed that neither the local CPU
nor the remote ones will access the `struct page` area being updated,
and therefore they should not trigger kernel PFs.

Signed-off-by: Yu Zhao
---
 arch/arm64/include/asm/pgalloc.h | 69 ++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..f50f79f57c1e 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include

 #define __HAVE_ARCH_PGD_FREE
 #define __HAVE_ARCH_PUD_FREE
@@ -137,4 +138,72 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
 }

+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	(VMEMMAP_SPLIT_NO_TLB_FLUSH | VMEMMAP_REMAP_NO_TLB_FLUSH)
+
+#define vmemmap_update_supported vmemmap_update_supported
+static inline bool vmemmap_update_supported(void)
+{
+	return system_uses_irq_prio_masking();
+}
+
+#define vmemmap_update_lock vmemmap_update_lock
+static inline void vmemmap_update_lock(void)
+{
+	cpus_read_lock();
+}
+
+#define vmemmap_update_unlock vmemmap_update_unlock
+static inline void vmemmap_update_unlock(void)
+{
+	cpus_read_unlock();
+}
+
+#define vmemmap_update_pte_range_start vmemmap_update_pte_range_start
+static inline void vmemmap_update_pte_range_start(pte_t *pte,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PAGE_SIZE, pte++)
+		pte_clear(&init_mm, addr, pte);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pte_range_end vmemmap_update_pte_range_end
+static inline void vmemmap_update_pte_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#define vmemmap_update_pmd_range_start vmemmap_update_pmd_range_start
+static inline void vmemmap_update_pmd_range_start(pmd_t *pmd,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PMD_SIZE, pmd++)
+		pmd_clear(pmd);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pmd_range_end vmemmap_update_pmd_range_end
+static inline void vmemmap_update_pmd_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
+
 #endif

From patchwork Thu Nov 7 20:20:33 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867068
Date: Thu, 7 Nov 2024 13:20:33 -0700
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
Message-ID: <20241107202033.2721681-7-yuzhao@google.com>
Subject: [PATCH v2 6/6] arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao
To use HVO, make sure the kernel is booted with pseudo-NMI enabled via
"irqchip.gicv3_pseudo_nmi=1", as well as "hugetlb_free_vmemmap=on" unless
HVO is enabled by default. Note that HVO checks for the pseudo-NMI
capability and is disabled at runtime if pseudo-NMIs turn out not to be
supported.

Successfully enabling HVO should show the following:

  # dmesg | grep NMI
  GICv3: Pseudo-NMIs enabled using ...
  # sysctl vm.hugetlb_optimize_vmemmap
  vm.hugetlb_optimize_vmemmap = 1

For comparison purposes, the whole series was measured against this patch
only, to show the overhead from pausing remote CPUs:

  HugeTLB operations            This patch only   The whole series   Change
  Alloc  600 1GB                0m3.526s          0m3.649s             +4%
  Free   600 1GB                0m0.880s          0m0.917s             +4%
  Demote 600 1GB to 307200 2MB  0m1.575s          0m3.640s           +231%
  Free   307200 2MB             0m0.946s          0m2.921s           +309%

Signed-off-by: Yu Zhao
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd9df6dcc593..e93745f819d9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -109,6 +109,7 @@ config ARM64
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
+	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_EXECMEM_LATE if EXECMEM
 	select ARCH_WANTS_NO_INSTR