From patchwork Fri Nov 8 16:20:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13868441
Date: Fri, 8 Nov 2024 16:20:37 +0000
In-Reply-To: <20241108162040.159038-1-tabba@google.com>
References: <20241108162040.159038-1-tabba@google.com>
X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog
Message-ID: <20241108162040.159038-8-tabba@google.com>
Subject: [RFC PATCH v1 07/10] mm: Introduce struct folio_owner_ops
From: Fuad Tabba
To: linux-mm@kvack.org
Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org,
 jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev,
 simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com,
 seanjc@google.com, willy@infradead.org, jgg@nvidia.com,
 jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com,
 mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com,
 quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 tabba@google.com
Sender:
 owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

Introduce struct folio_owner_ops, a method table that contains
callbacks to owners of folios that need special handling for certain
operations.

For now, it only contains a callback for folio free(), which is called
immediately after the folio refcount drops to 0.

Add a pointer to this struct overlaid on struct page compound_head,
pgmap, and struct page/folio lru. The users of this struct either will
not use lru (e.g., zone device), or would be able to easily isolate
when lru is being used (e.g., hugetlb) and handle it accordingly. While
folios are isolated, they cannot get freed and the owner_ops are
unstable. This is sufficient for the current use case of returning
these folios to a custom allocator.

To identify that a folio has owner_ops, we set bit 1 of the field,
similar to how bit 0 of compound_head is used to identify compound
pages.

Signed-off-by: Fuad Tabba
---
 include/linux/mm_types.h | 64 +++++++++++++++++++++++++++++++++++++---
 mm/swap.c                | 19 ++++++++++++
 2 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 365c73be0bb4..6e06286f44f1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -41,10 +41,12 @@ struct mem_cgroup;
  *
  * If you allocate the page using alloc_pages(), you can use some of the
  * space in struct page for your own purposes. The five words in the main
- * union are available, except for bit 0 of the first word which must be
- * kept clear. Many users use this word to store a pointer to an object
- * which is guaranteed to be aligned. If you use the same storage as
- * page->mapping, you must restore it to NULL before freeing the page.
+ * union are available, except for bit 0 (used for compound_head pages)
+ * and bit 1 (used for owner_ops) of the first word, which must be kept
+ * clear and used with care. Many users use this word to store a pointer
+ * to an object which is guaranteed to be aligned. If you use the same
+ * storage as page->mapping, you must restore it to NULL before freeing
+ * the page.
  *
  * The mapcount field must not be used for own purposes.
  *
@@ -283,10 +285,16 @@ typedef struct {
 	unsigned long val;
 } swp_entry_t;
 
+struct folio_owner_ops;
+
 /**
  * struct folio - Represents a contiguous set of bytes.
  * @flags: Identical to the page flags.
  * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @owner_ops: Pointer to callback operations of the folio owner. Valid if bit 1
+ *    is set.
+ *    NOTE: Cannot be used with lru, since it is overlaid with it. To use lru,
+ *    owner_ops must be cleared first, and restored once done with lru.
  * @mlock_count: Number of times this folio has been pinned by mlock().
  * @mapping: The file this page belongs to, or refers to the anon_vma for
  *    anonymous memory.
@@ -330,6 +338,7 @@ struct folio {
 			unsigned long flags;
 			union {
 				struct list_head lru;
+				const struct folio_owner_ops *owner_ops; /* Bit 1 is set */
 	/* private: avoid cluttering the output */
 				struct {
 					void *__filler;
@@ -417,6 +426,7 @@ FOLIO_MATCH(flags, flags);
 FOLIO_MATCH(lru, lru);
 FOLIO_MATCH(mapping, mapping);
 FOLIO_MATCH(compound_head, lru);
+FOLIO_MATCH(compound_head, owner_ops);
 FOLIO_MATCH(index, index);
 FOLIO_MATCH(private, private);
 FOLIO_MATCH(_mapcount, _mapcount);
@@ -452,6 +462,13 @@ FOLIO_MATCH(flags, _flags_3);
 FOLIO_MATCH(compound_head, _head_3);
 #undef FOLIO_MATCH
 
+struct folio_owner_ops {
+	/*
+	 * Called once the folio refcount reaches 0.
+	 */
+	void (*free)(struct folio *folio);
+};
+
 /**
  * struct ptdesc - Memory descriptor for page tables.
  * @__page_flags: Same as page flags. Powerpc only.
@@ -560,6 +577,45 @@ static inline void *folio_get_private(struct folio *folio)
 	return folio->private;
 }
 
+/*
+ * Use bit 1, since bit 0 is used to indicate a compound page in compound_head,
+ * which owner_ops is overlaid with.
+ */
+#define FOLIO_OWNER_OPS_BIT 1UL
+#define FOLIO_OWNER_OPS (1UL << FOLIO_OWNER_OPS_BIT)
+
+/*
+ * Set the folio owner_ops as well as bit 1 of the pointer to indicate that the
+ * folio has owner_ops.
+ */
+static inline void folio_set_owner_ops(struct folio *folio, const struct folio_owner_ops *owner_ops)
+{
+	owner_ops = (const struct folio_owner_ops *)((unsigned long)owner_ops | FOLIO_OWNER_OPS);
+	folio->owner_ops = owner_ops;
+}
+
+/*
+ * Clear the folio owner_ops including bit 1 of the pointer.
+ */
+static inline void folio_clear_owner_ops(struct folio *folio)
+{
+	folio->owner_ops = NULL;
+}
+
+/*
+ * Return the folio's owner_ops if it has them, otherwise, return NULL.
+ */
+static inline const struct folio_owner_ops *folio_get_owner_ops(struct folio *folio)
+{
+	const struct folio_owner_ops *owner_ops = folio->owner_ops;
+
+	if (!((unsigned long)owner_ops & FOLIO_OWNER_OPS))
+		return NULL;
+
+	owner_ops = (const struct folio_owner_ops *)((unsigned long)owner_ops & ~FOLIO_OWNER_OPS);
+	return owner_ops;
+}
+
 struct page_frag_cache {
 	void * va;
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/mm/swap.c b/mm/swap.c
index 638a3f001676..767ff6d8f47b 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -110,6 +110,13 @@ static void page_cache_release(struct folio *folio)
 
 void __folio_put(struct folio *folio)
 {
+	const struct folio_owner_ops *owner_ops = folio_get_owner_ops(folio);
+
+	if (unlikely(owner_ops)) {
+		owner_ops->free(folio);
+		return;
+	}
+
 	if (unlikely(folio_is_zone_device(folio))) {
 		free_zone_device_folio(folio);
 		return;
@@ -929,10 +936,22 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 	for (i = 0, j = 0; i < folios->nr; i++) {
 		struct folio *folio = folios->folios[i];
 		unsigned int nr_refs = refs ? refs[i] : 1;
+		const struct folio_owner_ops *owner_ops;
 
 		if (is_huge_zero_folio(folio))
 			continue;
 
+		owner_ops = folio_get_owner_ops(folio);
+		if (unlikely(owner_ops)) {
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
+			}
+			if (folio_ref_sub_and_test(folio, nr_refs))
+				owner_ops->free(folio);
+			continue;
+		}
+
 		if (folio_is_zone_device(folio)) {
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);