From patchwork Thu Nov 7 20:20:29 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13867064
Date: Thu, 7 Nov 2024 13:20:29 -0700
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v2 2/6] mm/hugetlb_vmemmap: add arch-independent helpers
Message-ID: <20241107202033.2721681-3-yuzhao@google.com>
In-Reply-To: <20241107202033.2721681-1-yuzhao@google.com>
References: <20241107202033.2721681-1-yuzhao@google.com>
X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
    Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Yu Zhao

Add architecture-independent helpers to allow individual architectures
to work around their own limitations when updating vmemmap.
Specifically, the current remap workflow requires break-before-make
(BBM) on arm64. By overriding the default helpers later in this
series, arm64 will be able to support the current HVO implementation.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
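
Note (this aside sits between the "---" above and the diffstat, so it
is dropped by git-am and is not part of the applied patch): the
override mechanism relies on the #ifndef guards added below, so an
architecture opts in by defining a helper as a macro that aliases its
own implementation, in a header visible to mm/hugetlb_vmemmap.c. A
minimal sketch of that pattern follows; the helper bodies and the flag
choice here are hypothetical, and the real arm64 overrides come later
in this series:

	/* hypothetical arch header, for illustration only */
	#define vmemmap_update_supported vmemmap_update_supported
	static inline bool vmemmap_update_supported(void)
	{
		/* e.g., report whether the required CPU feature is present */
		return true;
	}

	#define vmemmap_update_lock vmemmap_update_lock
	static inline void vmemmap_update_lock(void)
	{
		/* e.g., acquire whatever serializes arch vmemmap updates */
	}

	#define vmemmap_update_unlock vmemmap_update_unlock
	static inline void vmemmap_update_unlock(void)
	{
		/* release whatever vmemmap_update_lock() acquired */
	}

	/* take over the TLB flushes the core code would otherwise issue */
	#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS \
		(VMEMMAP_SPLIT_NO_TLB_FLUSH | VMEMMAP_REMAP_NO_TLB_FLUSH)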

 include/linux/mm_types.h |  7 +++
 mm/hugetlb_vmemmap.c     | 99 ++++++++++++++++++++++++++++++++++------
 2 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..0f3ae6e173f6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1499,4 +1499,11 @@ enum {
 	/* See also internal only FOLL flags in mm/internal.h */
 };
 
+/* Skip the TLB flush when we split the PMD */
+#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
+/* Skip the TLB flush when we remap the PTE */
+#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
+/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
+#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
+
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 46befab48d41..e50a196399f5 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -38,16 +38,56 @@ struct vmemmap_remap_walk {
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
 	struct list_head	*vmemmap_pages;
-
-/* Skip the TLB flush when we split the PMD */
-#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
-/* Skip the TLB flush when we remap the PTE */
-#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
-/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
-#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
 	unsigned long		flags;
 };
 
+#ifndef VMEMMAP_ARCH_TLB_FLUSH_FLAGS
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	0
+#endif
+
+#ifndef vmemmap_update_supported
+static bool vmemmap_update_supported(void)
+{
+	return true;
+}
+#endif
+
+#ifndef vmemmap_update_lock
+static void vmemmap_update_lock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_unlock
+static void vmemmap_update_unlock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_start
+static void vmemmap_update_pte_range_start(pte_t *pte, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_end
+static void vmemmap_update_pte_range_end(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_start
+static void vmemmap_update_pmd_range_start(pmd_t *pmd, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_end
+static void vmemmap_update_pmd_range_end(void)
+{
+}
+#endif
+
 static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 			     struct vmemmap_remap_walk *walk)
 {
@@ -83,7 +123,9 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
+		vmemmap_update_pmd_range_start(pmd, start, start + PMD_SIZE);
 		pmd_populate_kernel(&init_mm, pmd, pgtable);
+		vmemmap_update_pmd_range_end();
 		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
 			flush_tlb_kernel_range(start, start + PMD_SIZE);
 	} else {
@@ -164,10 +206,12 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 
 	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
+	vmemmap_update_lock();
 	mmap_read_lock(&init_mm);
 	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
 				    NULL, walk);
 	mmap_read_unlock(&init_mm);
+	vmemmap_update_unlock();
 	if (ret)
 		return ret;
 
@@ -228,6 +272,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 		smp_wmb();
 	}
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -242,6 +288,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /*
@@ -287,6 +335,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 	 */
 	smp_wmb();
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -296,6 +346,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 		val = mk_pte(page, PAGE_KERNEL);
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /**
@@ -513,7 +565,8 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
  */
 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 {
-	return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU);
+	return __hugetlb_vmemmap_restore_folio(h, folio,
+					       VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 }
 
 /**
@@ -553,7 +606,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			list_move(&folio->lru, non_hvo_folios);
 	}
 
-	if (restored)
+	if (restored && !(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_all();
 	if (!ret)
 		ret = restored;
@@ -641,7 +694,8 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 {
 	LIST_HEAD(vmemmap_pages);
 
-	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, VMEMMAP_SYNCHRONIZE_RCU);
+	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+					 VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 
 	free_vmemmap_page_list(&vmemmap_pages);
 }
@@ -683,7 +737,8 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 			break;
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_SPLIT_NO_TLB_FLUSH))
+		flush_tlb_all();
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
 
@@ -701,24 +756,35 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 		 * allowing more vmemmap remaps to occur.
 		 */
 		if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
-			flush_tlb_all();
+			if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+				flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
 			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
 		}
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+		flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+static int hugetlb_vmemmap_sysctl(const struct ctl_table *ctl, int write,
+				  void *buffer, size_t *lenp, loff_t *ppos)
+{
+	if (!vmemmap_update_supported())
+		return -ENODEV;
+
+	return proc_dobool(ctl, write, buffer, lenp, ppos);
+}
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
 		.data		= &vmemmap_optimize_enabled,
 		.maxlen		= sizeof(vmemmap_optimize_enabled),
 		.mode		= 0644,
-		.proc_handler	= proc_dobool,
+		.proc_handler	= hugetlb_vmemmap_sysctl,
 	},
 };
 
@@ -729,6 +795,11 @@ static int __init hugetlb_vmemmap_init(void)
 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);
 
+	if (READ_ONCE(vmemmap_optimize_enabled) && !vmemmap_update_supported()) {
+		pr_warn("HugeTLB: disabling HVO due to missing support.\n");
+		WRITE_ONCE(vmemmap_optimize_enabled, false);
+	}
+
 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
 			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
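
A note on the pte range hooks, with a deliberately simplified sketch
(hypothetical, for illustration only; the real arm64 implementation
arrives later in this series and must also keep concurrent struct page
readers safe while the entries are transiently invalid): under strict
break-before-make, an architecture has to invalidate the old entries
and flush the TLB before the core code installs the new ones via
set_pte_at(), which is exactly the window the start/end pair brackets:

	#define vmemmap_update_pte_range_start vmemmap_update_pte_range_start
	static inline void vmemmap_update_pte_range_start(pte_t *pte,
							  unsigned long start,
							  unsigned long end)
	{
		unsigned long addr;

		/* "break": clear the old entries and flush stale translations */
		for (addr = start; addr != end; addr += PAGE_SIZE, pte++)
			pte_clear(&init_mm, addr, pte);
		flush_tlb_kernel_range(start, end);
	}

	#define vmemmap_update_pte_range_end vmemmap_update_pte_range_end
	static inline void vmemmap_update_pte_range_end(void)
	{
		/* "make" is done by set_pte_at() in the caller; nothing left here */
	}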