From patchwork Fri Nov 8 16:20:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13868444
Date: Fri, 8 Nov 2024 16:20:40 +0000
In-Reply-To: <20241108162040.159038-1-tabba@google.com>
References: <20241108162040.159038-1-tabba@google.com>
X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog
Message-ID: <20241108162040.159038-11-tabba@google.com>
Subject: [RFC PATCH v1 10/10] mm: hugetlb: Use owner_ops on folio_put for hugetlb
From: Fuad Tabba
To: linux-mm@kvack.org
Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org,
 jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev,
 simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com,
 willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com,
 ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name,
 kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 tabba@google.com

Now that we have the folio_owner_ops callback, use it for hugetlb
pages instead of using a dedicated callback.

Since owner_ops is overlaid with lru, we need to unset owner_ops to
allow the use of lru when the folio is isolated. At that point we know
that the reference count is elevated and will not reach 0, and thus
will not trigger the callback. Therefore, it is safe to do so, provided
we restore owner_ops before we put the folio back.

Signed-off-by: Fuad Tabba
---
 include/linux/hugetlb.h |  2 --
 mm/hugetlb.c            | 57 +++++++++++++++++++++++++++++++++--------
 mm/swap.c               | 14 ----------
 3 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e846d7dac77c..500848862702 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -20,8 +20,6 @@ struct user_struct;
 struct mmu_gather;
 struct node;

-void free_huge_folio(struct folio *folio);
-
 #ifdef CONFIG_HUGETLB_PAGE

 #include
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2308e94d8615..4e1c87e37968 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -89,6 +89,33 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma);
 static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end);
 static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
+static void free_huge_folio(struct folio *folio);
+
+static const struct folio_owner_ops hugetlb_owner_ops = {
+	.free = free_huge_folio,
+};
+
+/*
+ * Mark this folio as a hugetlb-owned folio.
+ *
+ * Set the folio hugetlb flag and owner operations.
+ */
+static void folio_set_hugetlb_owner(struct folio *folio)
+{
+	__folio_set_hugetlb(folio);
+	folio_set_owner_ops(folio, &hugetlb_owner_ops);
+}
+
+/*
+ * Unmark this folio from being a hugetlb-owned folio.
+ *
+ * Clear the folio hugetlb flag and owner operations.
+ */
+static void folio_clear_hugetlb_owner(struct folio *folio)
+{
+	folio_clear_owner_ops(folio);
+	__folio_clear_hugetlb(folio);
+}

 static void hugetlb_free_folio(struct folio *folio)
 {
@@ -1617,7 +1644,7 @@ static void remove_hugetlb_folio(struct hstate *h, struct folio *folio,
 	 * to tail struct pages.
 	 */
 	if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
-		__folio_clear_hugetlb(folio);
+		folio_clear_hugetlb_owner(folio);
 	}

 	h->nr_huge_pages--;
@@ -1641,7 +1668,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
 		h->surplus_huge_pages++;
 		h->surplus_huge_pages_node[nid]++;
 	}

-	__folio_set_hugetlb(folio);
+	folio_set_hugetlb_owner(folio);
 	folio_change_private(folio, NULL);
 	/*
@@ -1692,7 +1719,7 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
 	 */
 	if (folio_test_hugetlb(folio)) {
 		spin_lock_irq(&hugetlb_lock);
-		__folio_clear_hugetlb(folio);
+		folio_clear_hugetlb_owner(folio);
 		spin_unlock_irq(&hugetlb_lock);
 	}

@@ -1793,7 +1820,7 @@ static void bulk_vmemmap_restore_error(struct hstate *h,
 		list_for_each_entry_safe(folio, t_folio, non_hvo_folios, _hugetlb_list) {
 			list_del(&folio->_hugetlb_list);
 			spin_lock_irq(&hugetlb_lock);
-			__folio_clear_hugetlb(folio);
+			folio_clear_hugetlb_owner(folio);
 			spin_unlock_irq(&hugetlb_lock);
 			update_and_free_hugetlb_folio(h, folio, false);
 			cond_resched();
@@ -1818,7 +1845,7 @@ static void bulk_vmemmap_restore_error(struct hstate *h,
 			} else {
 				list_del(&folio->_hugetlb_list);
 				spin_lock_irq(&hugetlb_lock);
-				__folio_clear_hugetlb(folio);
+				folio_clear_hugetlb_owner(folio);
 				spin_unlock_irq(&hugetlb_lock);
 				update_and_free_hugetlb_folio(h, folio, false);
 				cond_resched();
@@ -1851,14 +1878,14 @@ static void update_and_free_pages_bulk(struct hstate *h,
 	 * should only be pages on the non_hvo_folios list.
 	 * Do note that the non_hvo_folios list could be empty.
 	 * Without HVO enabled, ret will be 0 and there is no need to call
-	 * __folio_clear_hugetlb as this was done previously.
+	 * folio_clear_hugetlb_owner as this was done previously.
 	 */
 	VM_WARN_ON(!list_empty(folio_list));
 	VM_WARN_ON(ret < 0);
 	if (!list_empty(&non_hvo_folios) && ret) {
 		spin_lock_irq(&hugetlb_lock);
 		list_for_each_entry(folio, &non_hvo_folios, _hugetlb_list)
-			__folio_clear_hugetlb(folio);
+			folio_clear_hugetlb_owner(folio);
 		spin_unlock_irq(&hugetlb_lock);
 	}

@@ -1879,7 +1906,7 @@ struct hstate *size_to_hstate(unsigned long size)
 	return NULL;
 }

-void free_huge_folio(struct folio *folio)
+static void free_huge_folio(struct folio *folio)
 {
 	/*
 	 * Can't pass hstate in here because it is called from the
@@ -1959,7 +1986,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)

 static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
-	__folio_set_hugetlb(folio);
+	folio_set_hugetlb_owner(folio);
 	INIT_LIST_HEAD(&folio->_hugetlb_list);
 	hugetlb_set_folio_subpool(folio, NULL);
 	set_hugetlb_cgroup(folio, NULL);
@@ -7428,6 +7455,14 @@ bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list)
 		goto unlock;
 	}
 	folio_clear_hugetlb_migratable(folio);
+	/*
+	 * Clear folio->owner_ops; now we can use folio->lru.
+	 * Note that the folio cannot get freed because we are holding a
+	 * reference. The reference will be put in folio_putback_hugetlb(),
+	 * after restoring folio->owner_ops.
+	 */
+	folio_clear_owner_ops(folio);
+	INIT_LIST_HEAD(&folio->lru);
 	list_del_init(&folio->_hugetlb_list);
 	list_add_tail(&folio->lru, list);
 unlock:
@@ -7480,7 +7515,9 @@ void folio_putback_hugetlb(struct folio *folio)
 {
 	spin_lock_irq(&hugetlb_lock);
 	folio_set_hugetlb_migratable(folio);
-	list_del_init(&folio->lru);
+	list_del(&folio->lru);
+	/* Restore folio->owner_ops since we can no longer use folio->lru. */
+	folio_set_owner_ops(folio, &hugetlb_owner_ops);
 	list_add_tail(&folio->_hugetlb_list, &(folio_hstate(folio))->hugepage_activelist);
 	spin_unlock_irq(&hugetlb_lock);
 	folio_put(folio);
diff --git a/mm/swap.c b/mm/swap.c
index d2578465e270..9798ca47f26a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -117,11 +117,6 @@ void __folio_put(struct folio *folio)
 		return;
 	}

-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
-		return;
-	}
-
 	page_cache_release(folio);
 	folio_unqueue_deferred_split(folio);
 	mem_cgroup_uncharge(folio);
@@ -953,15 +948,6 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;

-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
-			if (lruvec) {
-				unlock_page_lruvec_irqrestore(lruvec, flags);
-				lruvec = NULL;
-			}
-			free_huge_folio(folio);
-			continue;
-		}
 		folio_unqueue_deferred_split(folio);
 		__page_cache_release(folio, &lruvec, &flags);

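
Illustration only, not part of the patch: the commit message argues that
owner_ops can be cleared while a folio is isolated because the held
reference keeps the refcount above zero, so no free callback can fire
before owner_ops is restored. The userspace sketch below models that
ordering with a plain union. Every name in it (toy_folio, toy_owner_ops,
toy_isolate, toy_putback, toy_demo and so on) is hypothetical, and the
has_owner_ops flag is an assumption standing in for however the earlier
patches in this series actually encode owner_ops in the lru field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct list_node { struct list_node *prev, *next; };

struct toy_folio;
struct toy_owner_ops { void (*free)(struct toy_folio *f); };

struct toy_folio {
	int refcount;
	int has_owner_ops;	/* assumption: models "owner_ops is valid" */
	union {			/* the overlay: only one member is live */
		struct list_node lru;
		const struct toy_owner_ops *owner_ops;
	};
};

static int freed_by_owner;

static void toy_free(struct toy_folio *f) { (void)f; freed_by_owner++; }
static const struct toy_owner_ops toy_ops = { .free = toy_free };

static void toy_set_owner_ops(struct toy_folio *f)
{
	f->owner_ops = &toy_ops;	/* overwrites the lru storage */
	f->has_owner_ops = 1;
}

static void toy_clear_owner_ops(struct toy_folio *f)
{
	f->owner_ops = NULL;
	f->has_owner_ops = 0;
}

/* Drop one reference; the owner callback runs only when it hits zero. */
static void toy_put(struct toy_folio *f)
{
	if (--f->refcount == 0) {
		assert(f->has_owner_ops);	/* must be restored by now */
		f->owner_ops->free(f);
	}
}

/* Take a reference, then hand the shared storage over to the list. */
static void toy_isolate(struct toy_folio *f, struct list_node *head)
{
	f->refcount++;
	toy_clear_owner_ops(f);
	f->lru.prev = head->prev;	/* open-coded add to tail */
	f->lru.next = head;
	head->prev->next = &f->lru;
	head->prev = &f->lru;
}

/* Unlink, restore owner_ops, and only then drop the reference. */
static void toy_putback(struct toy_folio *f)
{
	f->lru.prev->next = f->lru.next;	/* open-coded list delete */
	f->lru.next->prev = f->lru.prev;
	toy_set_owner_ops(f);
	toy_put(f);
}

/* Walk one folio through isolate, putback, and the final put. */
static int toy_demo(void)
{
	struct list_node head = { &head, &head };
	struct toy_folio f = { .refcount = 1 };

	freed_by_owner = 0;
	toy_set_owner_ops(&f);

	toy_isolate(&f, &head);
	assert(f.refcount == 2 && !f.has_owner_ops && head.next == &f.lru);

	toy_putback(&f);
	assert(head.next == &head && f.refcount == 1 && f.has_owner_ops);

	toy_put(&f);	/* last reference: the owner callback runs here */
	return freed_by_owner;
}
```

Calling toy_demo() from a small main() exercises the whole sequence. In
this model, swapping the last two calls in toy_putback() would let the
refcount reach zero with owner_ops still cleared, which is the ordering
hazard the commit message rules out.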