From patchwork Tue Sep 10 23:43:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799483
Date: Tue, 10 Sep 2024 23:43:41 +0000
X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog
Message-ID: <083829f3f633d6d24d64d4639f92d163355b24fd.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 10/39] mm: hugetlb: Add option to create new subpool without using surplus
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
 jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com,
 fvdl@google.com, jthoughton@google.com, seanjc@google.com,
 pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
 jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
 mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com,
 qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
 shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
 kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
 richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
 ajones@ventanamicro.com, vkuznets@redhat.com,
 maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org

hugetlb_acct_memory() today does more than just memory accounting.
When there are insufficient HugeTLB pages, it will also attempt to get
surplus pages.

This change adds a flag, use_surplus, to disable getting surplus pages
if there are insufficient HugeTLB pages. hugepage_new_subpool() takes
the flag and passes it to a new __hugetlb_acct_memory();
hugetlb_acct_memory() keeps its existing behavior by passing true.

Signed-off-by: Ackerley Tng
---
 fs/hugetlbfs/inode.c    |  2 +-
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 43 ++++++++++++++++++++++++++++++-----------
 3 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9f6cff356796..300a6ef300c1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1488,7 +1488,7 @@ hugetlbfs_fill_super(struct super_block *sb, struct fs_context *fc)
 	if (ctx->max_hpages != -1 || ctx->min_hpages != -1) {
 		sbinfo->spool = hugepage_new_subpool(ctx->hstate,
 						     ctx->max_hpages,
-						     ctx->min_hpages);
+						     ctx->min_hpages, true);
 		if (!sbinfo->spool)
 			goto out_free;
 	}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 907cfbbd9e24..9ef1adbd3207 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -116,7 +116,7 @@ extern int hugetlb_max_hstate __read_mostly;
 	for ((h) = hstates; (h) < &hstates[hugetlb_max_hstate]; (h)++)
 
 struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
-						long min_hpages);
+						long min_hpages, bool use_surplus);
 void hugepage_put_subpool(struct hugepage_subpool *spool);
 long hugepage_subpool_get_pages(struct hugepage_subpool *spool,
 				      long delta);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 808915108126..efdb5772b367 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -92,6 +92,7 @@ static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
 /* Forward declaration */
+static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus);
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
 static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
@@ -129,7 +130,7 @@ static inline void unlock_or_release_subpool(struct hugepage_subpool *spool,
 }
 
 struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
-						long min_hpages)
+						long min_hpages, bool use_surplus)
 {
 	struct hugepage_subpool *spool;
 
@@ -143,7 +144,8 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
 	spool->hstate = h;
 	spool->min_hpages = min_hpages;
 
-	if (min_hpages != -1 && hugetlb_acct_memory(h, min_hpages)) {
+	if (min_hpages != -1 &&
+	    __hugetlb_acct_memory(h, min_hpages, use_surplus)) {
 		kfree(spool);
 		return NULL;
 	}
@@ -2592,6 +2594,21 @@ static nodemask_t *policy_mbind_nodemask(gfp_t gfp)
 	return NULL;
 }
 
+static int hugetlb_hstate_reserve_pages(struct hstate *h,
+					long num_pages_to_reserve)
+	__must_hold(&hugetlb_lock)
+{
+	long needed;
+
+	needed = (h->resv_huge_pages + num_pages_to_reserve) - h->free_huge_pages;
+	if (needed <= 0) {
+		h->resv_huge_pages += num_pages_to_reserve;
+		return 0;
+	}
+
+	return needed;
+}
+
 /*
  * Increase the hugetlb pool such that it can accommodate a reservation
  * of size 'delta'.
@@ -2608,13 +2625,7 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 	int node;
 	nodemask_t *mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h));
 
-	lockdep_assert_held(&hugetlb_lock);
-	needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
-	if (needed <= 0) {
-		h->resv_huge_pages += delta;
-		return 0;
-	}
-
+	needed = delta;
 	allocated = 0;
 
 	ret = -ENOMEM;
@@ -5104,7 +5115,7 @@ unsigned long hugetlb_total_pages(void)
 	return nr_total_pages;
 }
 
-static int hugetlb_acct_memory(struct hstate *h, long delta)
+static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus)
 {
 	int ret = -ENOMEM;
 
@@ -5136,7 +5147,12 @@ static int hugetlb_acct_memory(struct hstate *h, long delta)
 	 * above.
 	 */
 	if (delta > 0) {
-		if (gather_surplus_pages(h, delta) < 0)
+		long required_surplus = hugetlb_hstate_reserve_pages(h, delta);
+
+		if (!use_surplus && required_surplus > 0)
+			goto out;
+
+		if (gather_surplus_pages(h, required_surplus) < 0)
 			goto out;
 
 		if (delta > allowed_mems_nr(h)) {
@@ -5154,6 +5170,11 @@ static int hugetlb_acct_memory(struct hstate *h, long delta)
 	return ret;
 }
 
+static int hugetlb_acct_memory(struct hstate *h, long delta)
+{
+	return __hugetlb_acct_memory(h, delta, true);
+}
+
 static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 {
 	struct resv_map *resv = vma_resv_map(vma);
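
For readers who want to see the effect of use_surplus in isolation, here is a
rough userspace sketch of the delta > 0 path. It is only a model, not kernel
code: struct hstate_model, model_reserve_pages() and model_acct_memory() are
hypothetical names, locking and the allowed_mems_nr() check are left out, and
surplus allocation is assumed to always succeed, which the real
gather_surplus_pages() does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct hstate counters used here. */
struct hstate_model {
	long free_huge_pages;
	long resv_huge_pages;
};

/*
 * Modeled on hugetlb_hstate_reserve_pages() from the patch: reserve out of
 * the free pool when it is large enough, otherwise return the shortfall
 * and leave the counters untouched.
 */
long model_reserve_pages(struct hstate_model *h, long num_pages)
{
	long needed = (h->resv_huge_pages + num_pages) - h->free_huge_pages;

	if (needed <= 0) {
		h->resv_huge_pages += num_pages;
		return 0;
	}
	return needed;
}

/*
 * Models only the delta > 0 branch of __hugetlb_acct_memory().
 * Returns 0 on success and -1 (standing in for -ENOMEM) on failure.
 */
int model_acct_memory(struct hstate_model *h, long delta, bool use_surplus)
{
	long shortfall;

	assert(delta > 0);
	shortfall = model_reserve_pages(h, delta);
	if (shortfall == 0)
		return 0;

	/* New behavior: callers may refuse to dip into surplus pages. */
	if (!use_surplus)
		return -1;

	/* Assumption: the surplus allocation succeeds and grows the pool. */
	h->free_huge_pages += shortfall;
	h->resv_huge_pages += delta;
	return 0;
}
```

With 4 free and 1 reserved page in the model, reserving 3 more succeeds either
way; a further reservation of 1 fails when use_surplus is false and leaves the
counters unchanged, while the same request with use_surplus set to true grows
the modeled pool and succeeds.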