From patchwork Fri Nov 8 16:20:31 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13868447
Date: Fri, 8 Nov 2024 16:20:31 +0000
In-Reply-To: <20241108162040.159038-1-tabba@google.com>
References: <20241108162040.159038-1-tabba@google.com>
Message-ID: <20241108162040.159038-2-tabba@google.com>
Subject: [RFC PATCH v1 01/10] mm/hugetlb: rename isolate_hugetlb() to folio_isolate_hugetlb()
From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: David Hildenbrand Let's make the function name match "folio_isolate_lru()", and add some kernel doc. Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- include/linux/hugetlb.h | 4 ++-- mm/gup.c | 2 +- mm/hugetlb.c | 23 ++++++++++++++++++++--- mm/mempolicy.c | 2 +- mm/migrate.c | 6 +++--- 5 files changed, 27 insertions(+), 10 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index ae4fe8615bb6..b0cf8dbfeb6a 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -153,7 +153,7 @@ bool hugetlb_reserve_pages(struct inode *inode, long from, long to, vm_flags_t vm_flags); long hugetlb_unreserve_pages(struct inode *inode, long start, long end, long freed); -bool isolate_hugetlb(struct folio *folio, struct list_head *list); +bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list); int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison); int get_huge_page_for_hwpoison(unsigned long pfn, int flags, bool *migratable_cleared); @@ -414,7 +414,7 @@ static inline pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, return NULL; } -static inline bool isolate_hugetlb(struct folio *folio, struct list_head *list) +static inline bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list) { return false; } diff --git a/mm/gup.c b/mm/gup.c index 28ae330ec4dd..40bbcffca865 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2301,7 +2301,7 @@ static unsigned long collect_longterm_unpinnable_folios( continue; if (folio_test_hugetlb(folio)) { - isolate_hugetlb(folio, movable_folio_list); + folio_isolate_hugetlb(folio, movable_folio_list); continue; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index cec4b121193f..e17bb2847572 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2868,7 +2868,7 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h, * Fail with -EBUSY if not possible. */ spin_unlock_irq(&hugetlb_lock); - isolated = isolate_hugetlb(old_folio, list); + isolated = folio_isolate_hugetlb(old_folio, list); ret = isolated ? 
0 : -EBUSY; spin_lock_irq(&hugetlb_lock); goto free_new; @@ -2953,7 +2953,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list) if (hstate_is_gigantic(h)) return -ENOMEM; - if (folio_ref_count(folio) && isolate_hugetlb(folio, list)) + if (folio_ref_count(folio) && folio_isolate_hugetlb(folio, list)) ret = 0; else if (!folio_ref_count(folio)) ret = alloc_and_dissolve_hugetlb_folio(h, folio, list); @@ -7396,7 +7396,24 @@ __weak unsigned long hugetlb_mask_last_page(struct hstate *h) #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */ -bool isolate_hugetlb(struct folio *folio, struct list_head *list) +/** + * folio_isolate_hugetlb: try to isolate an allocated hugetlb folio + * @folio: the folio to isolate + * @list: the list to add the folio to on success + * + * Isolate an allocated (refcount > 0) hugetlb folio, marking it as + * isolated/non-migratable, and moving it from the active list to the + * given list. + * + * Isolation will fail if @folio is not an allocated hugetlb folio, or if + * it is already isolated/non-migratable. + * + * On success, an additional folio reference is taken that must be dropped + * using folio_putback_active_hugetlb() to undo the isolation. + * + * Return: True if isolation worked, otherwise False. + */ +bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list) { bool ret = true; diff --git a/mm/mempolicy.c b/mm/mempolicy.c index bb37cd1a51d8..41bdff67757c 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -647,7 +647,7 @@ static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask, */ if ((flags & MPOL_MF_MOVE_ALL) || (!folio_likely_mapped_shared(folio) && !hugetlb_pmd_shared(pte))) - if (!isolate_hugetlb(folio, qp->pagelist)) + if (!folio_isolate_hugetlb(folio, qp->pagelist)) qp->nr_failed++; unlock: spin_unlock(ptl); diff --git a/mm/migrate.c b/mm/migrate.c index dfb5eba3c522..55585b5f57ec 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -136,7 +136,7 @@ static void putback_movable_folio(struct folio *folio) * * This function shall be used whenever the isolated pageset has been * built from lru, balloon, hugetlbfs page. See isolate_migratepages_range() - * and isolate_hugetlb(). + * and folio_isolate_hugetlb(). 
*/ void putback_movable_pages(struct list_head *l) { @@ -177,7 +177,7 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list) bool isolated, lru; if (folio_test_hugetlb(folio)) - return isolate_hugetlb(folio, list); + return folio_isolate_hugetlb(folio, list); lru = !__folio_test_movable(folio); if (lru) @@ -2208,7 +2208,7 @@ static int __add_folio_for_migration(struct folio *folio, int node, return -EACCES; if (folio_test_hugetlb(folio)) { - if (isolate_hugetlb(folio, pagelist)) + if (folio_isolate_hugetlb(folio, pagelist)) return 1; } else if (folio_isolate_lru(folio)) { list_add_tail(&folio->lru, pagelist); From patchwork Fri Nov 8 16:20:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E68D0D5E13C for ; Fri, 8 Nov 2024 16:20:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8BFB010E9FF; Fri, 8 Nov 2024 16:20:50 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="GAZOyzaA"; dkim-atps=neutral Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) by gabe.freedesktop.org (Postfix) with ESMTPS id C0CBD10EA00 for ; Fri, 8 Nov 2024 16:20:48 +0000 (UTC) Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-43163a40ee0so16031765e9.0 for ; Fri, 08 Nov 2024 08:20:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731082847; x=1731687647; darn=lists.freedesktop.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YSD3xsyZZCSV87kyUY/RPMIHEZzslYpSJkMBD8vjd+E=; b=GAZOyzaAR2Chq/jXFOq31u6nckJTUo6BWJgz7tH15e6C3YQwvv49KF4MmrMzXA4PM0 RES/rXvpxA4FINJsqjutg24/fRcOyXUV4uZFakm/QX9VR/5eDdcWdppreTexPNlIZJT8 lOs0G67CWpDri6gRPyRSKudywlzRPHkRQCLhRhgHcb2+6LHLTrAcFlM1Uqmuq1xDYaG2 mc1E2WwmeSyW4PSpFBumCcvGouzExZ/YAFZLC5qQ1jmYfuctTy280j3J37MbBqDWPxON dfQF54vDTQpJVwwJ0Ll5mgPkV1yhFVDMtp0Om2kvwe6BL9JJX+CWTmd1soCUVmXO2IZ6 mzUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731082847; x=1731687647; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YSD3xsyZZCSV87kyUY/RPMIHEZzslYpSJkMBD8vjd+E=; b=ehJfNrM1SZinJvuYMW1vBomPYd86BdHzs/iGor7SXG9XscX6Oez1HH/Pf1wPI1VV2f K/rWUYvydTKvxa0B2XrQCIQ7BLB/klwSVjqFnyJHjJAqgqVGGnetqP+/Pk72NOl4Xx2X XnjgF1HAOD50QQwUk8NgthYSsFZA9ADCdkvqH79TGO4CapJpMrfWyhS8P5qv63eQv66q 3jyrpEkZ+5dyaKH/BCshjEbL6f7TfqcITVJdbtqW1z6QjOUUSVPbQowOWoVwZMt6Ypd1 U/axLQaapXsWDxYhIErIzt92xb1D4rKUgsyHf3UXVXbgcdJOG6Ar9WJ6yM2qrt0XkKFG zFoQ== X-Forwarded-Encrypted: i=1; AJvYcCXvwRuDJ5pz41nS/lQzUED+V5zoP0XoltcYODFSICGkpvxUhdhcCcze/zZOw0Yy4ON+yadjZdDyazo=@lists.freedesktop.org X-Gm-Message-State: AOJu0YxbXXG7wcN+nsSXlcSFt3rPtsNyJicW3yP69XqmGNPYZbNMTqgn LlPc1Yw82KY5npNYGGputFyJFXrq9TXPq7aiwQM8VHYmduyv4TsHg6G0P5qan3PYdpCxIm5Wxw= = X-Google-Smtp-Source: 
AGHT+IFgJCvqHUnZF2sVggmAky7vOXqPzIu1h0GscO2JqqplWhQ8XkVaucC9oHlByVAHMi0UBju5RAkHJw== X-Received: from fuad.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:1613]) (user=tabba job=sendgmr) by 2002:a05:600c:6a84:b0:42c:a8b5:c26 with SMTP id 5b1f17b1804b1-432b74fc1e5mr108415e9.2.1731082847148; Fri, 08 Nov 2024 08:20:47 -0800 (PST) Date: Fri, 8 Nov 2024 16:20:32 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> Mime-Version: 1.0 References: <20241108162040.159038-1-tabba@google.com> X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog Message-ID: <20241108162040.159038-3-tabba@google.com> Subject: [RFC PATCH v1 02/10] mm/migrate: don't call folio_putback_active_hugetlb() on dst hugetlb folio From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: David Hildenbrand We replaced a simple put_page() by a putback_active_hugepage() call in commit 3aaa76e125c1 (" mm: migrate: hugetlb: putback destination hugepage to active list"), to set the "active" flag on the dst hugetlb folio. Nowadays, we decoupled the "active" list from the flag, by calling the flag "migratable". Calling "putback" on something that wasn't allocated is weird and not future proof, especially if we might reach that path when migration failed and we just want to free the freshly allocated hugetlb folio. Let's simply set the "migratable" flag in move_hugetlb_state(), where we know that allocation succeeded, and use simple folio_put() to return our reference. Do we need the hugetlb_lock for setting that flag? Staring at other users of folio_set_hugetlb_migratable(), it does not look like it. After all, the dst folio should already be on the active list, and we are not modifying that list. Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- mm/hugetlb.c | 5 +++++ mm/migrate.c | 8 ++++---- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e17bb2847572..da3fe1840ab8 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7508,6 +7508,11 @@ void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int re } spin_unlock_irq(&hugetlb_lock); } + /* + * Our old folio is isolated and has "migratable" cleared until it + * is putback. As migration succeeded, set the new folio "migratable". + */ + folio_set_hugetlb_migratable(new_folio); } static void hugetlb_unshare_pmds(struct vm_area_struct *vma, diff --git a/mm/migrate.c b/mm/migrate.c index 55585b5f57ec..b129dc41c140 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1547,14 +1547,14 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio, list_move_tail(&src->lru, ret); /* - * If migration was not successful and there's a freeing callback, use - * it. 
Otherwise, put_page() will drop the reference grabbed during - * isolation. + * If migration was not successful and there's a freeing callback, + * return the folio to that special allocator. Otherwise, simply drop + * our additional reference. */ if (put_new_folio) put_new_folio(dst, private); else - folio_putback_active_hugetlb(dst); + folio_put(dst); return rc; } From patchwork Fri Nov 8 16:20:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 82D6DD5E144 for ; Fri, 8 Nov 2024 16:20:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 63C1010EA04; Fri, 8 Nov 2024 16:20:52 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="TIdQRGaR"; dkim-atps=neutral Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1338F10EA02 for ; Fri, 8 Nov 2024 16:20:51 +0000 (UTC) Received: by mail-wr1-f73.google.com with SMTP id ffacd0b85a97d-37d5a3afa84so1365042f8f.3 for ; Fri, 08 Nov 2024 08:20:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731082849; x=1731687649; darn=lists.freedesktop.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=c97rKSzu5OtSEQDwZyqgn1x58KOnaAilRAoGtwwyknw=; b=TIdQRGaRQkOnbts0K12ZCrSRngdYoa8MKqd2lMK2zMO8Vq1a6Wlr8h/7RRF4MHTU6t rEQHsLIRFj3Rd+lQYQWVrFxEETvl16znAj8yNv+iOvaRFRTRiFPy7K4qCFXJSCXynKR2 7YUojqsjAR9RNkB3b408UFSrngfV6Lj+IqvEV38F7q0V9Rf6KXtMsf0OUvxWItlee/Wk K+87j4U44WGbDjx3FBDMXJ35hO5sJhgGBpNGn/k1RjeRoPqGwcCgUAHNfib770N7AlKf pOrQmCqK20R/OPl/bQ9/GhDP9+gEaGDU+EYSqFin1Ixq3ftTIN7yr/gAlRTV2Sm7vBTR DHSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731082849; x=1731687649; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=c97rKSzu5OtSEQDwZyqgn1x58KOnaAilRAoGtwwyknw=; b=lSP1Dtr/spof+AkqLsYpmdcUS3vobt5ubwLKGEUJlu8Uosx2ZIr/IAbb2l83PY8Cq4 HV2DOoYmipcGaKdH837ORsiO2fzqwbnrBuhgIMNGemElpvMmm5B4rexbbrITvcE/6XqR 6tn6B/veITtXFMjR3CyqH2wiGBS3wc1zJjRkYZJHpvll+Xtym1lfuPMZBRqm6bHZuqGN h5bjpF8tIRCjcp5sULlETc3Vq+G13uoPecKmVvwnVs5yN7+gtDEmosxEZV56XebFSCf2 fo/Tv1xFjBL3vxV+DLJJ8m344l4hrmDBX9Y6f6bbSyhx9HzJBXH/mBlxXOMpwMk+4nKn vM2Q== X-Forwarded-Encrypted: i=1; AJvYcCU0nx+CclD14Jvi1WDp//gHPJng1FWqi863dxOpTbkZQ3mKca/85w3fE2py6eR1Y14nU9SQqFnTyGY=@lists.freedesktop.org X-Gm-Message-State: AOJu0YzYewmASH8geWHfSi8nevlNNyOnCW1fRKOGV6nl8nJ+R/j6wkUE 2k1+ZJDSI8SqcVJCgW0Xi9v2XR95+Y1MzNqNxztH/r1ANC2npiJB9YLbet6T9xdb98GtYvq4XQ= = X-Google-Smtp-Source: AGHT+IELhlKA+w5GoDYCQVGGU0Q69yO5Jt+Zri88wGQ6Nd2QAGecV+Ce7InAsjA0a+zrL4cO721qJjhh5w== X-Received: from fuad.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:1613]) (user=tabba job=sendgmr) by 2002:a5d:4f84:0:b0:381:d049:c688 with SMTP id ffacd0b85a97d-381f1884303mr2415f8f.9.1731082849389; Fri, 08 Nov 2024 08:20:49 -0800 (PST) 
Date: Fri, 8 Nov 2024 16:20:33 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> Mime-Version: 1.0 References: <20241108162040.159038-1-tabba@google.com> X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog Message-ID: <20241108162040.159038-4-tabba@google.com> Subject: [RFC PATCH v1 03/10] mm/hugetlb: rename "folio_putback_active_hugetlb()" to "folio_putback_hugetlb()" From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: David Hildenbrand Now that folio_putback_hugetlb() is only called on folios that were previously isolated through folio_isolate_hugetlb(), let's rename it to match folio_putback_lru(). Add some kernel doc to clarify how this function is supposed to be used. Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- include/linux/hugetlb.h | 4 ++-- mm/hugetlb.c | 15 +++++++++++++-- mm/migrate.c | 6 +++--- 3 files changed, 18 insertions(+), 7 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index b0cf8dbfeb6a..e846d7dac77c 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -157,7 +157,7 @@ bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list); int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison); int get_huge_page_for_hwpoison(unsigned long pfn, int flags, bool *migratable_cleared); -void folio_putback_active_hugetlb(struct folio *folio); +void folio_putback_hugetlb(struct folio *folio); void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason); void hugetlb_fix_reserve_counts(struct inode *inode); extern struct mutex *hugetlb_fault_mutex_table; @@ -430,7 +430,7 @@ static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags, return 0; } -static inline void folio_putback_active_hugetlb(struct folio *folio) +static inline void folio_putback_hugetlb(struct folio *folio) { } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index da3fe1840ab8..d58bd815fdf2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7409,7 +7409,7 @@ __weak unsigned long hugetlb_mask_last_page(struct hstate *h) * it is already isolated/non-migratable. * * On success, an additional folio reference is taken that must be dropped - * using folio_putback_active_hugetlb() to undo the isolation. + * using folio_putback_hugetlb() to undo the isolation. * * Return: True if isolation worked, otherwise False. 
*/ @@ -7461,7 +7461,18 @@ int get_huge_page_for_hwpoison(unsigned long pfn, int flags, return ret; } -void folio_putback_active_hugetlb(struct folio *folio) +/** + * folio_putback_hugetlb: unisolate a hugetlb folio + * @folio: the isolated hugetlb folio + * + * Putback/un-isolate the hugetlb folio that was previous isolated using + * folio_isolate_hugetlb(): marking it non-isolated/migratable and putting it + * back onto the active list. + * + * Will drop the additional folio reference obtained through + * folio_isolate_hugetlb(). + */ +void folio_putback_hugetlb(struct folio *folio) { spin_lock_irq(&hugetlb_lock); folio_set_hugetlb_migratable(folio); diff --git a/mm/migrate.c b/mm/migrate.c index b129dc41c140..89292d131148 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -145,7 +145,7 @@ void putback_movable_pages(struct list_head *l) list_for_each_entry_safe(folio, folio2, l, lru) { if (unlikely(folio_test_hugetlb(folio))) { - folio_putback_active_hugetlb(folio); + folio_putback_hugetlb(folio); continue; } list_del(&folio->lru); @@ -1459,7 +1459,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio, if (folio_ref_count(src) == 1) { /* page was freed from under us. So we are done. */ - folio_putback_active_hugetlb(src); + folio_putback_hugetlb(src); return MIGRATEPAGE_SUCCESS; } @@ -1542,7 +1542,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio, folio_unlock(src); out: if (rc == MIGRATEPAGE_SUCCESS) - folio_putback_active_hugetlb(src); + folio_putback_hugetlb(src); else if (rc != -EAGAIN) list_move_tail(&src->lru, ret); From patchwork Fri Nov 8 16:20:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1B543D64063 for ; Fri, 8 Nov 2024 16:20:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D44AF10EA08; Fri, 8 Nov 2024 16:20:53 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="0uBEpR0F"; dkim-atps=neutral Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by gabe.freedesktop.org (Postfix) with ESMTPS id A185410EA07 for ; Fri, 8 Nov 2024 16:20:52 +0000 (UTC) Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-6ea258fe4b6so45052647b3.1 for ; Fri, 08 Nov 2024 08:20:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731082851; x=1731687651; darn=lists.freedesktop.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xoczasfH3EjGVlSybMjc91OekbCn/q1l859VFPlt17E=; b=0uBEpR0FuIlAvP4KNKajv6XI5JU/q5vlCY1aJSFZxVrAHnYC+Ie9adekG2Vg0mnpJz Ep0GZ/aZnf0GL/DujZWc/xBHAz2t3mx+qsLoUGJOn4buap5OtmU3zL8P0ijGcv8HJLPX hEAFDox6i1cym5x7E5FA0sFtPBDPKTHnxNZdq3+XiN2gciVd0d6qo2oSvFIAl/v1a4Wo SYRBCh4hjHMFL/dhMOhvKDIxFmhBUtbLeHBEBdyjVBINqobmW4mNPAgUw5NQ/PlAqNor yNu/OJozIYfU/z2AYHy60o3yM0aZ3eoQnCrHR5p0tIAzK7aQxaFZmN2+IRBviHti4mof FOSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20230601; t=1731082851; x=1731687651; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xoczasfH3EjGVlSybMjc91OekbCn/q1l859VFPlt17E=; b=jaCkv60k6z5WgAmbQFRlhx4XifKGHT1m1bhzHvv3HVUrdTrQ/x29vYjddEResIH4oK DE4T2jbNCZWTGTeBUTcjDmmc7yQfX4B3rE16vy99jgxjc1J/YX2ybY4c7UgOeaAffWmV 7BDPqK13BfJK2+7MUrw7ZxkrX/gw6kDKnThuSzmEE/9PX/Z32Ffm3kERKV8AHo4HwB4T CAhssl9zhpiHAQnE9tkJp0ZO8zvmdNyfcMWbFNNp+bZdBGTjz4+6qrwID503+xZMw+hQ tIh852gDzhBwQ+fFTg6dKfGaukFGYLUjFAawYWk3OYvQFuQHAZww/mKg/WrRMfG3cTUF qbnQ== X-Forwarded-Encrypted: i=1; AJvYcCVR29qJ4RWqTkrzw9AXRdB7R8Hm/ERDm/K2RvFncIzCxHMbtHqn75bI2H0sdwvPfETzPPd5gwd+bGU=@lists.freedesktop.org X-Gm-Message-State: AOJu0YyGnyfD0HeVukfdaEqVCOVfcxck5+cPJKubiIyrgqD1Nwab1Xs/ WLgLIbyirRIJzyETvqddkN7DDxu1cX4v3jLelU8WPt14dje7EZV1bqP0HI3ZJJbPQLUP3fcs6Q= = X-Google-Smtp-Source: AGHT+IGi0dElEr9qu8fzvtM9cId0LpD1vPVzTDBDvUodGuNk9OVUxqdu5KHHQFxgVeqHmDaBFP7u+PMuDw== X-Received: from fuad.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:1613]) (user=tabba job=sendgmr) by 2002:a05:690c:4b13:b0:6ea:decd:84e with SMTP id 00721157ae682-6eadecd0dd3mr590627b3.5.1731082851750; Fri, 08 Nov 2024 08:20:51 -0800 (PST) Date: Fri, 8 Nov 2024 16:20:34 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> Mime-Version: 1.0 References: <20241108162040.159038-1-tabba@google.com> X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog Message-ID: <20241108162040.159038-5-tabba@google.com> Subject: [RFC PATCH v1 04/10] mm/hugetlb-cgroup: convert hugetlb_cgroup_css_offline() to work on folios From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: David Hildenbrand Let's convert hugetlb_cgroup_css_offline() and hugetlb_cgroup_move_parent() to work on folios. hugepage_activelist contains folios, not pages. While at it, rename page_hcg simply to hcg, removing most of the "page" terminology. Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- mm/hugetlb_cgroup.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c index d8d0e665caed..1bdeaf25f640 100644 --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -195,24 +195,23 @@ static void hugetlb_cgroup_css_free(struct cgroup_subsys_state *css) * cannot fail. 
 */ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg, - struct page *page) + struct folio *folio) { unsigned int nr_pages; struct page_counter *counter; - struct hugetlb_cgroup *page_hcg; + struct hugetlb_cgroup *hcg; struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg); - struct folio *folio = page_folio(page); - page_hcg = hugetlb_cgroup_from_folio(folio); + hcg = hugetlb_cgroup_from_folio(folio); /* * We can have pages in active list without any cgroup * ie, hugepage with less than 3 pages. We can safely * ignore those pages. */ - if (!page_hcg || page_hcg != h_cg) + if (!hcg || hcg != h_cg) goto out; - nr_pages = compound_nr(page); + nr_pages = folio_nr_pages(folio); if (!parent) { parent = root_h_cgroup; /* root has no limit */ @@ -235,13 +234,13 @@ static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css) { struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css); struct hstate *h; - struct page *page; + struct folio *folio; do { for_each_hstate(h) { spin_lock_irq(&hugetlb_lock); - list_for_each_entry(page, &h->hugepage_activelist, lru) - hugetlb_cgroup_move_parent(hstate_index(h), h_cg, page); + list_for_each_entry(folio, &h->hugepage_activelist, lru) + hugetlb_cgroup_move_parent(hstate_index(h), h_cg, folio); spin_unlock_irq(&hugetlb_lock); }

From patchwork Fri Nov 8 16:20:35 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13868451
Date: Fri, 8 Nov 2024 16:20:35 +0000
In-Reply-To: <20241108162040.159038-1-tabba@google.com>
References: <20241108162040.159038-1-tabba@google.com>
Message-ID: <20241108162040.159038-6-tabba@google.com>
Subject: [RFC PATCH v1 05/10] mm/hugetlb: use folio->lru int demote_free_hugetlb_folios()
From: Fuad Tabba
To: linux-mm@kvack.org

From: David Hildenbrand

Let's avoid messing with pages.
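For readers skimming the hunk that follows, this is the page-to-folio pattern the change applies, shown in isolation. Illustrative sketch only, not part of the patch; the helper name and its parameters are invented for the example.

/*
 * Sketch: after prep_compound_page() has rebuilt the smaller compound
 * page, look the folio up once with page_folio() and do all further
 * bookkeeping (hugetlb init, list handling) through the folio.
 */
static void demote_chunk_to_folio(struct hstate *dst, struct page *page,
				  struct list_head *dst_list)
{
	struct folio *new_folio;

	page->mapping = NULL;
	clear_compound_head(page);
	prep_compound_page(page, dst->order);

	/* From here on, operate on the folio, not the raw page. */
	new_folio = page_folio(page);
	init_new_hugetlb_folio(dst, new_folio);
	list_add(&new_folio->lru, dst_list);
}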
Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- mm/hugetlb.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d58bd815fdf2..a64852280213 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3806,13 +3806,15 @@ static long demote_free_hugetlb_folios(struct hstate *src, struct hstate *dst, for (i = 0; i < pages_per_huge_page(src); i += pages_per_huge_page(dst)) { struct page *page = folio_page(folio, i); + struct folio *new_folio; page->mapping = NULL; clear_compound_head(page); prep_compound_page(page, dst->order); + new_folio = page_folio(page); - init_new_hugetlb_folio(dst, page_folio(page)); - list_add(&page->lru, &dst_list); + init_new_hugetlb_folio(dst, new_folio); + list_add(&new_folio->lru, &dst_list); } } From patchwork Fri Nov 8 16:20:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868452 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AE928D64061 for ; Fri, 8 Nov 2024 16:20:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2D8FE10EA07; Fri, 8 Nov 2024 16:20:59 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="PiTeQx0g"; dkim-atps=neutral Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) by gabe.freedesktop.org (Postfix) with ESMTPS id E224310EA07 for ; Fri, 8 Nov 2024 16:20:57 +0000 (UTC) Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-431604a3b47so15111985e9.3 for ; Fri, 08 Nov 2024 08:20:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731082856; x=1731687656; darn=lists.freedesktop.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=djjKrUl0oR5lK1iUj5wBqHv7X6lVNwT0SqGCjb6L8ss=; b=PiTeQx0g09KVOLPS0BwdWudDhvb7qQus0o+u+RBWST3K+w6St3TcmyvwMKk/IH9VQX lyjbsAk5itRAd11GgRfSyyMmwFNTdppVnb5+gPgG4VA5Go8qZa2e4css7kDZ2fCbMTng DGRSfOrcJLT9GXl2roVXlOhUkv7XsJp/fEMp2pCn7J7VKVRlZI5yNDoxLeERWHwX+Lvm QX/Sn+K1qjN31ftdtgCB1lBHaVTwsxIGL3cURKbnVmF69+d41rKBm/WSgJ3uPqoXssoH 2JrQuV+gbEb7X+8w3pWcs5LcyVFV/I7LCBtZsH7qfoI+OdKIdryd9nmxDdo34IGKLrQH y+Uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731082856; x=1731687656; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=djjKrUl0oR5lK1iUj5wBqHv7X6lVNwT0SqGCjb6L8ss=; b=IGxioCBVwRvSAA1uAFkuCeBc4L8e791+UffrgtN7HStqX1FUnrD3JbWqjunbRvAv8f DYhvLms5NfkLluy6TT+hmNP9h/wN1B70K24+1N0VR2ayJdq2NcvzbYzi1XmPjUhWJ83M B+UMRA+gbNICuO7N8SxFW2vgsGcVUkMmSmHIQyHRZVsxg43eHUwBIVTlFzV+HX/2JVaX 2GO9WPci8gvFux4QRn7JEQSWvO2Fv8QZDk1qvFr04rTcEmxP1Pcg89fpPDWq1ckRppQZ 9qUCqdbpaGmXu9lPXUUML2thtXNaGJnHY1yk7STL2i91bLFITzdVNSLOYAIzX5d8MEmF uSdQ== X-Forwarded-Encrypted: i=1; AJvYcCVIgqfnYi3K09o/JU0/SDqNJgSFjnJZ0DInDtR1mOOXOfz1zfjpWv1Z3ZmiP/lo5dgLrfEPUWwqhi8=@lists.freedesktop.org X-Gm-Message-State: AOJu0YyvC50JCREjC9Vdom4B73SYbesFbAV78pJH1DhbDg2mt+eS+RKl 
bHZq+lS2HbNfad2jt3WxSnpAXLEYx8SCh1W/lRRff2wUd28B0RwNtATKSRmV+PoqLuMWs/tQZQ= = X-Google-Smtp-Source: AGHT+IG8oqeRtBHhwUKqiIKTnkqXE8W7GFfQboidk/t9PyUEaqaxv663mtEdw0kf/yfOyooXYNVI8Y+B3g== X-Received: from fuad.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:1613]) (user=tabba job=sendgmr) by 2002:a7b:c845:0:b0:42c:b995:20f1 with SMTP id 5b1f17b1804b1-432b7515d74mr74205e9.4.1731082856332; Fri, 08 Nov 2024 08:20:56 -0800 (PST) Date: Fri, 8 Nov 2024 16:20:36 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> Mime-Version: 1.0 References: <20241108162040.159038-1-tabba@google.com> X-Mailer: git-send-email 2.47.0.277.g8800431eea-goog Message-ID: <20241108162040.159038-7-tabba@google.com> Subject: [RFC PATCH v1 06/10] mm/hugetlb: use separate folio->_hugetlb_list for hugetlb-internals From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: David Hildenbrand Let's use a separate list head in the folio, as long as hugetlb folios are not isolated. This way, we can reuse folio->lru for different purpose (e.g., owner_ops) as long as they are not isolated. Consequently, folio->lru will only be used while there is an additional folio reference that cannot be dropped until putback/un-isolated. Signed-off-by: David Hildenbrand Signed-off-by: Fuad Tabba --- include/linux/mm_types.h | 18 +++++++++ mm/hugetlb.c | 81 +++++++++++++++++++++------------------- mm/hugetlb_cgroup.c | 4 +- mm/hugetlb_vmemmap.c | 8 ++-- 4 files changed, 66 insertions(+), 45 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 80fef38d9d64..365c73be0bb4 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -310,6 +310,7 @@ typedef struct { * @_hugetlb_cgroup: Do not use directly, use accessor in hugetlb_cgroup.h. * @_hugetlb_cgroup_rsvd: Do not use directly, use accessor in hugetlb_cgroup.h. * @_hugetlb_hwpoison: Do not use directly, call raw_hwp_list_head(). + * @_hugetlb_list: To be used in hugetlb core code only. * @_deferred_list: Folios to be split under memory pressure. * @_unused_slab_obj_exts: Placeholder to match obj_exts in struct slab. 
* @@ -397,6 +398,17 @@ struct folio { }; struct page __page_2; }; + union { + struct { + unsigned long _flags_3; + unsigned long _head_3; + /* public: */ + struct list_head _hugetlb_list; + /* private: the union with struct page is transitional */ + }; + struct page __page_3; + }; + }; #define FOLIO_MATCH(pg, fl) \ @@ -433,6 +445,12 @@ FOLIO_MATCH(compound_head, _head_2); FOLIO_MATCH(flags, _flags_2a); FOLIO_MATCH(compound_head, _head_2a); #undef FOLIO_MATCH +#define FOLIO_MATCH(pg, fl) \ + static_assert(offsetof(struct folio, fl) == \ + offsetof(struct page, pg) + 3 * sizeof(struct page)) +FOLIO_MATCH(flags, _flags_3); +FOLIO_MATCH(compound_head, _head_3); +#undef FOLIO_MATCH /** * struct ptdesc - Memory descriptor for page tables. diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a64852280213..2308e94d8615 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1316,7 +1316,7 @@ static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio) lockdep_assert_held(&hugetlb_lock); VM_BUG_ON_FOLIO(folio_ref_count(folio), folio); - list_move(&folio->lru, &h->hugepage_freelists[nid]); + list_move(&folio->_hugetlb_list, &h->hugepage_freelists[nid]); h->free_huge_pages++; h->free_huge_pages_node[nid]++; folio_set_hugetlb_freed(folio); @@ -1329,14 +1329,14 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h, bool pin = !!(current->flags & PF_MEMALLOC_PIN); lockdep_assert_held(&hugetlb_lock); - list_for_each_entry(folio, &h->hugepage_freelists[nid], lru) { + list_for_each_entry(folio, &h->hugepage_freelists[nid], _hugetlb_list) { if (pin && !folio_is_longterm_pinnable(folio)) continue; if (folio_test_hwpoison(folio)) continue; - list_move(&folio->lru, &h->hugepage_activelist); + list_move(&folio->_hugetlb_list, &h->hugepage_activelist); folio_ref_unfreeze(folio, 1); folio_clear_hugetlb_freed(folio); h->free_huge_pages--; @@ -1599,7 +1599,7 @@ static void remove_hugetlb_folio(struct hstate *h, struct folio *folio, if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) return; - list_del(&folio->lru); + list_del(&folio->_hugetlb_list); if (folio_test_hugetlb_freed(folio)) { folio_clear_hugetlb_freed(folio); @@ -1616,8 +1616,9 @@ static void remove_hugetlb_folio(struct hstate *h, struct folio *folio, * pages. Otherwise, someone (memory error handling) may try to write * to tail struct pages. */ - if (!folio_test_hugetlb_vmemmap_optimized(folio)) + if (!folio_test_hugetlb_vmemmap_optimized(folio)) { __folio_clear_hugetlb(folio); + } h->nr_huge_pages--; h->nr_huge_pages_node[nid]--; @@ -1632,7 +1633,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio, lockdep_assert_held(&hugetlb_lock); - INIT_LIST_HEAD(&folio->lru); + INIT_LIST_HEAD(&folio->_hugetlb_list); h->nr_huge_pages++; h->nr_huge_pages_node[nid]++; @@ -1640,8 +1641,8 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio, h->surplus_huge_pages++; h->surplus_huge_pages_node[nid]++; } - __folio_set_hugetlb(folio); + folio_change_private(folio, NULL); /* * We have to set hugetlb_vmemmap_optimized again as above @@ -1789,8 +1790,8 @@ static void bulk_vmemmap_restore_error(struct hstate *h, * hugetlb pages with vmemmap we will free up memory so that we * can allocate vmemmap for more hugetlb pages. 
*/ - list_for_each_entry_safe(folio, t_folio, non_hvo_folios, lru) { - list_del(&folio->lru); + list_for_each_entry_safe(folio, t_folio, non_hvo_folios, _hugetlb_list) { + list_del(&folio->_hugetlb_list); spin_lock_irq(&hugetlb_lock); __folio_clear_hugetlb(folio); spin_unlock_irq(&hugetlb_lock); @@ -1808,14 +1809,14 @@ static void bulk_vmemmap_restore_error(struct hstate *h, * If are able to restore vmemmap and free one hugetlb page, we * quit processing the list to retry the bulk operation. */ - list_for_each_entry_safe(folio, t_folio, folio_list, lru) + list_for_each_entry_safe(folio, t_folio, folio_list, _hugetlb_list) if (hugetlb_vmemmap_restore_folio(h, folio)) { - list_del(&folio->lru); + list_del(&folio->_hugetlb_list); spin_lock_irq(&hugetlb_lock); add_hugetlb_folio(h, folio, true); spin_unlock_irq(&hugetlb_lock); } else { - list_del(&folio->lru); + list_del(&folio->_hugetlb_list); spin_lock_irq(&hugetlb_lock); __folio_clear_hugetlb(folio); spin_unlock_irq(&hugetlb_lock); @@ -1856,12 +1857,12 @@ static void update_and_free_pages_bulk(struct hstate *h, VM_WARN_ON(ret < 0); if (!list_empty(&non_hvo_folios) && ret) { spin_lock_irq(&hugetlb_lock); - list_for_each_entry(folio, &non_hvo_folios, lru) + list_for_each_entry(folio, &non_hvo_folios, _hugetlb_list) __folio_clear_hugetlb(folio); spin_unlock_irq(&hugetlb_lock); } - list_for_each_entry_safe(folio, t_folio, &non_hvo_folios, lru) { + list_for_each_entry_safe(folio, t_folio, &non_hvo_folios, _hugetlb_list) { update_and_free_hugetlb_folio(h, folio, false); cond_resched(); } @@ -1959,7 +1960,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid) static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio) { __folio_set_hugetlb(folio); - INIT_LIST_HEAD(&folio->lru); + INIT_LIST_HEAD(&folio->_hugetlb_list); hugetlb_set_folio_subpool(folio, NULL); set_hugetlb_cgroup(folio, NULL); set_hugetlb_cgroup_rsvd(folio, NULL); @@ -2112,7 +2113,7 @@ static void prep_and_add_allocated_folios(struct hstate *h, /* Add all new pool pages to free lists in one lock cycle */ spin_lock_irqsave(&hugetlb_lock, flags); - list_for_each_entry_safe(folio, tmp_f, folio_list, lru) { + list_for_each_entry_safe(folio, tmp_f, folio_list, _hugetlb_list) { __prep_account_new_huge_page(h, folio_nid(folio)); enqueue_hugetlb_folio(h, folio); } @@ -2165,7 +2166,7 @@ static struct folio *remove_pool_hugetlb_folio(struct hstate *h, if ((!acct_surplus || h->surplus_huge_pages_node[node]) && !list_empty(&h->hugepage_freelists[node])) { folio = list_entry(h->hugepage_freelists[node].next, - struct folio, lru); + struct folio, _hugetlb_list); remove_hugetlb_folio(h, folio, acct_surplus); break; } @@ -2491,7 +2492,7 @@ static int gather_surplus_pages(struct hstate *h, long delta) alloc_ok = false; break; } - list_add(&folio->lru, &surplus_list); + list_add(&folio->_hugetlb_list, &surplus_list); cond_resched(); } allocated += i; @@ -2526,7 +2527,7 @@ static int gather_surplus_pages(struct hstate *h, long delta) ret = 0; /* Free the needed pages to the hugetlb pool */ - list_for_each_entry_safe(folio, tmp, &surplus_list, lru) { + list_for_each_entry_safe(folio, tmp, &surplus_list, _hugetlb_list) { if ((--needed) < 0) break; /* Add the page to the hugetlb allocator */ @@ -2539,7 +2540,7 @@ static int gather_surplus_pages(struct hstate *h, long delta) * Free unnecessary surplus pages to the buddy allocator. * Pages have no ref count, call free_huge_folio directly. 
*/ - list_for_each_entry_safe(folio, tmp, &surplus_list, lru) + list_for_each_entry_safe(folio, tmp, &surplus_list, _hugetlb_list) free_huge_folio(folio); spin_lock_irq(&hugetlb_lock); @@ -2588,7 +2589,7 @@ static void return_unused_surplus_pages(struct hstate *h, if (!folio) goto out; - list_add(&folio->lru, &page_list); + list_add(&folio->_hugetlb_list, &page_list); } out: @@ -3051,7 +3052,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, folio_set_hugetlb_restore_reserve(folio); h->resv_huge_pages--; } - list_add(&folio->lru, &h->hugepage_activelist); + list_add(&folio->_hugetlb_list, &h->hugepage_activelist); folio_ref_unfreeze(folio, 1); /* Fall through */ } @@ -3211,7 +3212,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h, /* Send list for bulk vmemmap optimization processing */ hugetlb_vmemmap_optimize_folios(h, folio_list); - list_for_each_entry_safe(folio, tmp_f, folio_list, lru) { + list_for_each_entry_safe(folio, tmp_f, folio_list, _hugetlb_list) { if (!folio_test_hugetlb_vmemmap_optimized(folio)) { /* * If HVO fails, initialize all tail struct pages @@ -3260,7 +3261,7 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid) hugetlb_folio_init_vmemmap(folio, h, HUGETLB_VMEMMAP_RESERVE_PAGES); init_new_hugetlb_folio(h, folio); - list_add(&folio->lru, &folio_list); + list_add(&folio->_hugetlb_list, &folio_list); /* * We need to restore the 'stolen' pages to totalram_pages @@ -3317,7 +3318,7 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid) &node_states[N_MEMORY], NULL); if (!folio) break; - list_add(&folio->lru, &folio_list); + list_add(&folio->_hugetlb_list, &folio_list); } cond_resched(); } @@ -3379,7 +3380,7 @@ static void __init hugetlb_pages_alloc_boot_node(unsigned long start, unsigned l if (!folio) break; - list_move(&folio->lru, &folio_list); + list_move(&folio->_hugetlb_list, &folio_list); cond_resched(); } @@ -3544,13 +3545,13 @@ static void try_to_free_low(struct hstate *h, unsigned long count, for_each_node_mask(i, *nodes_allowed) { struct folio *folio, *next; struct list_head *freel = &h->hugepage_freelists[i]; - list_for_each_entry_safe(folio, next, freel, lru) { + list_for_each_entry_safe(folio, next, freel, _hugetlb_list) { if (count >= h->nr_huge_pages) goto out; if (folio_test_highmem(folio)) continue; remove_hugetlb_folio(h, folio, false); - list_add(&folio->lru, &page_list); + list_add(&folio->_hugetlb_list, &page_list); } } @@ -3703,7 +3704,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid, goto out; } - list_add(&folio->lru, &page_list); + list_add(&folio->_hugetlb_list, &page_list); allocated++; /* Bail for signals. 
Probably ctrl-c from user */ @@ -3750,7 +3751,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid, if (!folio) break; - list_add(&folio->lru, &page_list); + list_add(&folio->_hugetlb_list, &page_list); } /* free the pages after dropping lock */ spin_unlock_irq(&hugetlb_lock); @@ -3793,13 +3794,13 @@ static long demote_free_hugetlb_folios(struct hstate *src, struct hstate *dst, */ mutex_lock(&dst->resize_lock); - list_for_each_entry_safe(folio, next, src_list, lru) { + list_for_each_entry_safe(folio, next, src_list, _hugetlb_list) { int i; if (folio_test_hugetlb_vmemmap_optimized(folio)) continue; - list_del(&folio->lru); + list_del(&folio->_hugetlb_list); split_page_owner(&folio->page, huge_page_order(src), huge_page_order(dst)); pgalloc_tag_split(folio, huge_page_order(src), huge_page_order(dst)); @@ -3814,7 +3815,7 @@ static long demote_free_hugetlb_folios(struct hstate *src, struct hstate *dst, new_folio = page_folio(page); init_new_hugetlb_folio(dst, new_folio); - list_add(&new_folio->lru, &dst_list); + list_add(&new_folio->_hugetlb_list, &dst_list); } } @@ -3847,12 +3848,12 @@ static long demote_pool_huge_page(struct hstate *src, nodemask_t *nodes_allowed, LIST_HEAD(list); struct folio *folio, *next; - list_for_each_entry_safe(folio, next, &src->hugepage_freelists[node], lru) { + list_for_each_entry_safe(folio, next, &src->hugepage_freelists[node], _hugetlb_list) { if (folio_test_hwpoison(folio)) continue; remove_hugetlb_folio(src, folio, false); - list_add(&folio->lru, &list); + list_add(&folio->_hugetlb_list, &list); if (++nr_demoted == nr_to_demote) break; @@ -3864,8 +3865,8 @@ static long demote_pool_huge_page(struct hstate *src, nodemask_t *nodes_allowed, spin_lock_irq(&hugetlb_lock); - list_for_each_entry_safe(folio, next, &list, lru) { - list_del(&folio->lru); + list_for_each_entry_safe(folio, next, &list, _hugetlb_list) { + list_del(&folio->_hugetlb_list); add_hugetlb_folio(src, folio, false); nr_demoted--; @@ -7427,7 +7428,8 @@ bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list) goto unlock; } folio_clear_hugetlb_migratable(folio); - list_move_tail(&folio->lru, list); + list_del_init(&folio->_hugetlb_list); + list_add_tail(&folio->lru, list); unlock: spin_unlock_irq(&hugetlb_lock); return ret; @@ -7478,7 +7480,8 @@ void folio_putback_hugetlb(struct folio *folio) { spin_lock_irq(&hugetlb_lock); folio_set_hugetlb_migratable(folio); - list_move_tail(&folio->lru, &(folio_hstate(folio))->hugepage_activelist); + list_del_init(&folio->lru); + list_add_tail(&folio->_hugetlb_list, &(folio_hstate(folio))->hugepage_activelist); spin_unlock_irq(&hugetlb_lock); folio_put(folio); } diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c index 1bdeaf25f640..ee720eeaf6b1 100644 --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -239,7 +239,7 @@ static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css) do { for_each_hstate(h) { spin_lock_irq(&hugetlb_lock); - list_for_each_entry(folio, &h->hugepage_activelist, lru) + list_for_each_entry(folio, &h->hugepage_activelist, _hugetlb_list) hugetlb_cgroup_move_parent(hstate_index(h), h_cg, folio); spin_unlock_irq(&hugetlb_lock); @@ -933,7 +933,7 @@ void hugetlb_cgroup_migrate(struct folio *old_folio, struct folio *new_folio) /* move the h_cg details to new cgroup */ set_hugetlb_cgroup(new_folio, h_cg); set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd); - list_move(&new_folio->lru, &h->hugepage_activelist); + list_move(&new_folio->_hugetlb_list, &h->hugepage_activelist); 
spin_unlock_irq(&hugetlb_lock); return; } diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 57b7f591eee8..b2cb8d328aac 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -519,7 +519,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h, long ret = 0; unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU; - list_for_each_entry_safe(folio, t_folio, folio_list, lru) { + list_for_each_entry_safe(folio, t_folio, folio_list, _hugetlb_list) { if (folio_test_hugetlb_vmemmap_optimized(folio)) { ret = __hugetlb_vmemmap_restore_folio(h, folio, flags); /* only need to synchronize_rcu() once for each batch */ @@ -531,7 +531,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h, } /* Add non-optimized folios to output list */ - list_move(&folio->lru, non_hvo_folios); + list_move(&folio->_hugetlb_list, non_hvo_folios); } if (restored) @@ -651,7 +651,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l LIST_HEAD(vmemmap_pages); unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU; - list_for_each_entry(folio, folio_list, lru) { + list_for_each_entry(folio, folio_list, _hugetlb_list) { int ret = hugetlb_vmemmap_split_folio(h, folio); /* @@ -666,7 +666,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l flush_tlb_all(); - list_for_each_entry(folio, folio_list, lru) { + list_for_each_entry(folio, folio_list, _hugetlb_list) { int ret; ret = __hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags); From patchwork Fri Nov 8 16:20:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 35B0ED64065 for ; Fri, 8 Nov 2024 16:21:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9124D10EA0E; Fri, 8 Nov 2024 16:21:01 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="oSeBQdXk"; dkim-atps=neutral Received: from mail-wr1-f73.google.com (mail-wr1-f73.google.com [209.85.221.73]) by gabe.freedesktop.org (Postfix) with ESMTPS id 30D3410EA0D for ; Fri, 8 Nov 2024 16:21:00 +0000 (UTC) Received: by mail-wr1-f73.google.com with SMTP id ffacd0b85a97d-37d5016d21eso1218048f8f.3 for ; Fri, 08 Nov 2024 08:21:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731082858; x=1731687658; darn=lists.freedesktop.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Z1UKVDyubm+0FNKQOmqAnK7wACyIXxMp06jV3hLqPe8=; b=oSeBQdXkP8KLppn0kAAqaKGsckQkBsphvEhQm4zAP/M+xO3CYEui4NZEqBiLJQ5ViE rcOp7oQLhYTQFNAjcOcjozJap0pdF1WM5aKJWHBjGSU30ybyC5zSN6pytVy5N4n+4/eh K/ee8nkXzNC27O6Yq90XZbF0SDm19YQqDpcA+U4DzU1b8bFQpgroRDQyT5lIK8ABdNAj huvR3kExiRdGXh2NYKLFmcQhIAqleKSphz46h/1kBmjUwnoC7/g6VHQkPTD5dw3le9ix FyrqNORkoK5Cdr5dv/Ert8nqDIy/+bT4a+YRvq77tiFRazRrq5/aQYwvLGfhb+fsef/F 5/Vg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; 
Date: Fri, 8 Nov 2024 16:20:37 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> References: <20241108162040.159038-1-tabba@google.com> Message-ID: <20241108162040.159038-8-tabba@google.com> Subject: [RFC PATCH v1 07/10] mm: Introduce struct folio_owner_ops From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com

Introduce struct folio_owner_ops, a method table of callbacks to the owners of folios that need special handling for certain operations. For now, it only contains a callback for folio free(), which is called immediately after the folio refcount drops to 0. Add a pointer to this struct, overlaid on struct page compound_head, pgmap, and struct page/folio lru. Users of this struct either do not use lru (e.g., zone device), or can easily tell when lru is in use (e.g., hugetlb) and handle it accordingly. While folios are isolated, they cannot get freed and their owner_ops are unstable; this is sufficient for the current use case of returning these folios to a custom allocator. To identify that a folio has owner_ops, we set bit 1 of the field, similar to how bit 0 of compound_head is used to identify compound pages.
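As an illustrative aside (not part of the patch), the bit-1 tagging described above can be sketched in stand-alone user-space C. The names owner_ops, encode_owner_ops and decode_owner_ops below are invented stand-ins for folio_owner_ops, folio_set_owner_ops() and folio_get_owner_ops(): because the ops structure is word-aligned, its low address bits are clear, so bit 1 can be borrowed as a type tag and masked off before use, while bit 0 stays free for the compound_head convention.

/* build: cc -o tag_sketch tag_sketch.c */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for struct folio_owner_ops. */
struct owner_ops {
	void (*free)(void *obj);
};

#define OWNER_OPS_BIT	1UL
#define OWNER_OPS_FLAG	(1UL << OWNER_OPS_BIT)	/* bit 0 is left for compound_head-style tagging */

/* Store the ops pointer with bit 1 set, so it can be told apart from a plain list/lru pointer. */
static uintptr_t encode_owner_ops(const struct owner_ops *ops)
{
	return (uintptr_t)ops | OWNER_OPS_FLAG;
}

/* Return the ops pointer if the field carries one, otherwise NULL. */
static const struct owner_ops *decode_owner_ops(uintptr_t field)
{
	if (!(field & OWNER_OPS_FLAG))
		return NULL;
	return (const struct owner_ops *)(field & ~OWNER_OPS_FLAG);
}

static void dummy_free(void *obj)
{
	printf("freeing %p via owner_ops\n", obj);
}

int main(void)
{
	static const struct owner_ops ops = { .free = dummy_free };
	int object = 0;

	/* The ops struct is pointer-aligned, so bits 0 and 1 of its address are clear. */
	assert(((uintptr_t)&ops & 3) == 0);

	uintptr_t field = encode_owner_ops(&ops);
	const struct owner_ops *got = decode_owner_ops(field);

	assert(got == &ops);
	got->free(&object);

	/* An untagged pointer value does not decode as owner_ops. */
	assert(decode_owner_ops((uintptr_t)&object & ~3UL) == NULL);
	return 0;
}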
Signed-off-by: Fuad Tabba --- include/linux/mm_types.h | 64 +++++++++++++++++++++++++++++++++++++--- mm/swap.c | 19 ++++++++++++ 2 files changed, 79 insertions(+), 4 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 365c73be0bb4..6e06286f44f1 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -41,10 +41,12 @@ struct mem_cgroup; * * If you allocate the page using alloc_pages(), you can use some of the * space in struct page for your own purposes. The five words in the main - * union are available, except for bit 0 of the first word which must be - * kept clear. Many users use this word to store a pointer to an object - * which is guaranteed to be aligned. If you use the same storage as - * page->mapping, you must restore it to NULL before freeing the page. + * union are available, except for bit 0 (used for compound_head pages) + * and bit 1 (used for owner_ops) of the first word, which must be kept + * clear and used with care. Many users use this word to store a pointer + * to an object which is guaranteed to be aligned. If you use the same + * storage as page->mapping, you must restore it to NULL before freeing + * the page. * * The mapcount field must not be used for own purposes. * @@ -283,10 +285,16 @@ typedef struct { unsigned long val; } swp_entry_t; +struct folio_owner_ops; + /** * struct folio - Represents a contiguous set of bytes. * @flags: Identical to the page flags. * @lru: Least Recently Used list; tracks how recently this folio was used. + * @owner_ops: Pointer to callback operations of the folio owner. Valid if bit 1 + * is set. + * NOTE: Cannot be used with lru, since it is overlaid with it. To use lru, + * owner_ops must be cleared first, and restored once done with lru. * @mlock_count: Number of times this folio has been pinned by mlock(). * @mapping: The file this page belongs to, or refers to the anon_vma for * anonymous memory. @@ -330,6 +338,7 @@ struct folio { unsigned long flags; union { struct list_head lru; + const struct folio_owner_ops *owner_ops; /* Bit 1 is set */ /* private: avoid cluttering the output */ struct { void *__filler; @@ -417,6 +426,7 @@ FOLIO_MATCH(flags, flags); FOLIO_MATCH(lru, lru); FOLIO_MATCH(mapping, mapping); FOLIO_MATCH(compound_head, lru); +FOLIO_MATCH(compound_head, owner_ops); FOLIO_MATCH(index, index); FOLIO_MATCH(private, private); FOLIO_MATCH(_mapcount, _mapcount); @@ -452,6 +462,13 @@ FOLIO_MATCH(flags, _flags_3); FOLIO_MATCH(compound_head, _head_3); #undef FOLIO_MATCH +struct folio_owner_ops { + /* + * Called once the folio refcount reaches 0. + */ + void (*free)(struct folio *folio); +}; + /** * struct ptdesc - Memory descriptor for page tables. * @__page_flags: Same as page flags. Powerpc only. @@ -560,6 +577,45 @@ static inline void *folio_get_private(struct folio *folio) return folio->private; } +/* + * Use bit 1, since bit 0 is used to indicate a compound page in compound_head, + * which owner_ops is overlaid with. + */ +#define FOLIO_OWNER_OPS_BIT 1UL +#define FOLIO_OWNER_OPS (1UL << FOLIO_OWNER_OPS_BIT) + +/* + * Set the folio owner_ops as well as bit 1 of the pointer to indicate that the + * folio has owner_ops. + */ +static inline void folio_set_owner_ops(struct folio *folio, const struct folio_owner_ops *owner_ops) +{ + owner_ops = (const struct folio_owner_ops *)((unsigned long)owner_ops | FOLIO_OWNER_OPS); + folio->owner_ops = owner_ops; +} + +/* + * Clear the folio owner_ops including bit 1 of the pointer. 
+ */ +static inline void folio_clear_owner_ops(struct folio *folio) +{ + folio->owner_ops = NULL; +} + +/* + * Return the folio's owner_ops if it has them, otherwise, return NULL. + */ +static inline const struct folio_owner_ops *folio_get_owner_ops(struct folio *folio) +{ + const struct folio_owner_ops *owner_ops = folio->owner_ops; + + if (!((unsigned long)owner_ops & FOLIO_OWNER_OPS)) + return NULL; + + owner_ops = (const struct folio_owner_ops *)((unsigned long)owner_ops & ~FOLIO_OWNER_OPS); + return owner_ops; +} + struct page_frag_cache { void * va; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) diff --git a/mm/swap.c b/mm/swap.c index 638a3f001676..767ff6d8f47b 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -110,6 +110,13 @@ static void page_cache_release(struct folio *folio) void __folio_put(struct folio *folio) { + const struct folio_owner_ops *owner_ops = folio_get_owner_ops(folio); + + if (unlikely(owner_ops)) { + owner_ops->free(folio); + return; + } + if (unlikely(folio_is_zone_device(folio))) { free_zone_device_folio(folio); return; @@ -929,10 +936,22 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs) for (i = 0, j = 0; i < folios->nr; i++) { struct folio *folio = folios->folios[i]; unsigned int nr_refs = refs ? refs[i] : 1; + const struct folio_owner_ops *owner_ops; if (is_huge_zero_folio(folio)) continue; + owner_ops = folio_get_owner_ops(folio); + if (unlikely(owner_ops)) { + if (lruvec) { + unlock_page_lruvec_irqrestore(lruvec, flags); + lruvec = NULL; + } + if (folio_ref_sub_and_test(folio, nr_refs)) + owner_ops->free(folio); + continue; + } + if (folio_is_zone_device(folio)) { if (lruvec) { unlock_page_lruvec_irqrestore(lruvec, flags); From patchwork Fri Nov 8 16:20:38 2024 X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868454
Date: Fri, 8 Nov 2024 16:20:38 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> References: <20241108162040.159038-1-tabba@google.com> Message-ID: <20241108162040.159038-9-tabba@google.com> Subject: [RFC PATCH v1 08/10] mm: Use getters and setters to access page pgmap From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com

The pointer to pgmap in struct page is overlaid with the folio owner_ops. To indicate that a page/folio has owner_ops, bit 1 is set. Therefore, before we can start using owner_ops, we need to ensure that all accesses to page->pgmap sanitize the pointer value. This patch introduces the accessors, which will be modified in the following patch to sanitize the pointer values. No functional change intended.
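To illustrate why funnelling every access through one getter/setter pair pays off, here is a stand-alone user-space sketch (illustrative only; struct pagemap, struct page, pgmap_word and TAG_BIT are invented stand-ins, and it jumps ahead to the masked form that the next patch adds): once callers only use the accessors, the stored representation can grow a tag bit without any call-site changes.

/* build: cc -o accessor_sketch accessor_sketch.c */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct dev_pagemap and struct page. */
struct pagemap { int type; };
struct page { uintptr_t pgmap_word; };

#define TAG_BIT (1UL << 1)

/* Step 1 (this patch): route every access through one getter/setter pair. */
static struct pagemap *page_get_pgmap(const struct page *page)
{
	/* Step 2 (the follow-up patch): sanitize here, and no caller has to change. */
	return (struct pagemap *)(page->pgmap_word & ~TAG_BIT);
}

static void page_set_pgmap(struct page *page, struct pagemap *pgmap)
{
	page->pgmap_word = (uintptr_t)pgmap | TAG_BIT;
}

int main(void)
{
	struct pagemap pm = { .type = 42 };
	struct page pg;

	page_set_pgmap(&pg, &pm);
	assert(page_get_pgmap(&pg) == &pm);
	assert(page_get_pgmap(&pg)->type == 42);
	return 0;
}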
Signed-off-by: Fuad Tabba --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 4 +++- drivers/pci/p2pdma.c | 8 +++++--- include/linux/memremap.h | 6 +++--- include/linux/mm_types.h | 13 +++++++++++++ lib/test_hmm.c | 2 +- mm/hmm.c | 2 +- mm/memory.c | 2 +- mm/memremap.c | 19 +++++++++++-------- mm/migrate_device.c | 4 ++-- mm/mm_init.c | 2 +- 10 files changed, 41 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 1a072568cef6..d7d9d9476bb0 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -88,7 +88,9 @@ struct nouveau_dmem { static struct nouveau_dmem_chunk *nouveau_page_to_chunk(struct page *page) { - return container_of(page->pgmap, struct nouveau_dmem_chunk, pagemap); + struct dev_pagemap *pgmap = page_get_pgmap(page); + + return container_of(pgmap, struct nouveau_dmem_chunk, pagemap); } static struct nouveau_drm *page_to_drm(struct page *page) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 4f47a13cb500..19519bb4ba56 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -193,7 +193,7 @@ static const struct attribute_group p2pmem_group = { static void p2pdma_page_free(struct page *page) { - struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page->pgmap); + struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page_get_pgmap(page)); /* safe to dereference while a reference is held to the percpu ref */ struct pci_p2pdma *p2pdma = rcu_dereference_protected(pgmap->provider->p2pdma, 1); @@ -1016,8 +1016,10 @@ enum pci_p2pdma_map_type pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev, struct scatterlist *sg) { - if (state->pgmap != sg_page(sg)->pgmap) { - state->pgmap = sg_page(sg)->pgmap; + struct dev_pagemap *pgmap = page_get_pgmap(sg_page(sg)); + + if (state->pgmap != pgmap) { + state->pgmap = pgmap; state->map = pci_p2pdma_map_type(state->pgmap, dev); state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset; } diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 3f7143ade32c..060e27b6aee0 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -161,7 +161,7 @@ static inline bool is_device_private_page(const struct page *page) { return IS_ENABLED(CONFIG_DEVICE_PRIVATE) && is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_PRIVATE; + page_get_pgmap(page)->type == MEMORY_DEVICE_PRIVATE; } static inline bool folio_is_device_private(const struct folio *folio) @@ -173,13 +173,13 @@ static inline bool is_pci_p2pdma_page(const struct page *page) { return IS_ENABLED(CONFIG_PCI_P2PDMA) && is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA; + page_get_pgmap(page)->type == MEMORY_DEVICE_PCI_P2PDMA; } static inline bool is_device_coherent_page(const struct page *page) { return is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_COHERENT; + page_get_pgmap(page)->type == MEMORY_DEVICE_COHERENT; } static inline bool folio_is_device_coherent(const struct folio *folio) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6e06286f44f1..27075ea24e67 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -616,6 +616,19 @@ static inline const struct folio_owner_ops *folio_get_owner_ops(struct folio *fo return owner_ops; } +/* + * Get the page dev_pagemap pgmap pointer. + */ +#define page_get_pgmap(page) ((page)->pgmap) + +/* + * Set the page dev_pagemap pgmap pointer. 
+ */ +static inline void page_set_pgmap(struct page *page, struct dev_pagemap *pgmap) +{ + page->pgmap = pgmap; +} + struct page_frag_cache { void * va; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 056f2e411d7b..d3e3843f57dd 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -195,7 +195,7 @@ static int dmirror_fops_release(struct inode *inode, struct file *filp) static struct dmirror_chunk *dmirror_page_to_chunk(struct page *page) { - return container_of(page->pgmap, struct dmirror_chunk, pagemap); + return container_of(page_get_pgmap(page), struct dmirror_chunk, pagemap); } static struct dmirror_device *dmirror_page_to_device(struct page *page) diff --git a/mm/hmm.c b/mm/hmm.c index 7e0229ae4a5a..b5f5ac218fda 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -248,7 +248,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, * just report the PFN. */ if (is_device_private_entry(entry) && - pfn_swap_entry_to_page(entry)->pgmap->owner == + page_get_pgmap(pfn_swap_entry_to_page(entry))->owner == range->dev_private_owner) { cpu_flags = HMM_PFN_VALID; if (is_writable_device_private_entry(entry)) diff --git a/mm/memory.c b/mm/memory.c index 80850cad0e6f..5853fa5767c7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4276,7 +4276,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) */ get_page(vmf->page); pte_unmap_unlock(vmf->pte, vmf->ptl); - ret = vmf->page->pgmap->ops->migrate_to_ram(vmf); + ret = page_get_pgmap(vmf->page)->ops->migrate_to_ram(vmf); put_page(vmf->page); } else if (is_hwpoison_entry(entry)) { ret = VM_FAULT_HWPOISON; diff --git a/mm/memremap.c b/mm/memremap.c index 40d4547ce514..931bc85da1df 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -458,8 +458,9 @@ EXPORT_SYMBOL_GPL(get_dev_pagemap); void free_zone_device_folio(struct folio *folio) { - if (WARN_ON_ONCE(!folio->page.pgmap->ops || - !folio->page.pgmap->ops->page_free)) + struct dev_pagemap *pgmap = page_get_pgmap(&folio->page); + + if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->page_free)) return; mem_cgroup_uncharge(folio); @@ -486,17 +487,17 @@ void free_zone_device_folio(struct folio *folio) * to clear folio->mapping. */ folio->mapping = NULL; - folio->page.pgmap->ops->page_free(folio_page(folio, 0)); + pgmap->ops->page_free(folio_page(folio, 0)); - if (folio->page.pgmap->type != MEMORY_DEVICE_PRIVATE && - folio->page.pgmap->type != MEMORY_DEVICE_COHERENT) + if (pgmap->type != MEMORY_DEVICE_PRIVATE && + pgmap->type != MEMORY_DEVICE_COHERENT) /* * Reset the refcount to 1 to prepare for handing out the page * again. */ folio_set_count(folio, 1); else - put_dev_pagemap(folio->page.pgmap); + put_dev_pagemap(pgmap); } void zone_device_page_init(struct page *page) @@ -505,7 +506,7 @@ void zone_device_page_init(struct page *page) * Drivers shouldn't be allocating pages after calling * memunmap_pages(). 
*/ - WARN_ON_ONCE(!percpu_ref_tryget_live(&page->pgmap->ref)); + WARN_ON_ONCE(!percpu_ref_tryget_live(&page_get_pgmap(page)->ref)); set_page_count(page, 1); lock_page(page); } @@ -514,7 +515,9 @@ EXPORT_SYMBOL_GPL(zone_device_page_init); #ifdef CONFIG_FS_DAX bool __put_devmap_managed_folio_refs(struct folio *folio, int refs) { - if (folio->page.pgmap->type != MEMORY_DEVICE_FS_DAX) + struct dev_pagemap *pgmap = page_get_pgmap(&folio->page); + + if (pgmap->type != MEMORY_DEVICE_FS_DAX) return false; /* diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 9cf26592ac93..368def358d02 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -135,7 +135,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, page = pfn_swap_entry_to_page(entry); if (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_PRIVATE) || - page->pgmap->owner != migrate->pgmap_owner) + page_get_pgmap(page)->owner != migrate->pgmap_owner) goto next; mpfn = migrate_pfn(page_to_pfn(page)) | @@ -156,7 +156,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, goto next; else if (page && is_device_coherent_page(page) && (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || - page->pgmap->owner != migrate->pgmap_owner)) + page_get_pgmap(page)->owner != migrate->pgmap_owner)) goto next; mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; diff --git a/mm/mm_init.c b/mm/mm_init.c index 1c205b0a86ed..279cdaebfd2b 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -995,7 +995,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, * and zone_device_data. It is a bug if a ZONE_DEVICE page is * ever freed or placed on a driver-private list. */ - page->pgmap = pgmap; + page_set_pgmap(page, pgmap); page->zone_device_data = NULL; /* From patchwork Fri Nov 8 16:20:39 2024 X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868455
Date: Fri, 8 Nov 2024 16:20:39 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> References: <20241108162040.159038-1-tabba@google.com> Message-ID: <20241108162040.159038-10-tabba@google.com> Subject: [RFC PATCH v1 09/10] mm: Use owner_ops on folio_put for zone device pages From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com

Now that we have the folio_owner_ops callback, use it for zone device pages instead of a dedicated callback. Note that the pointer to struct dev_pagemap (pgmap) in struct page is overlaid with the struct folio owner_ops pointer. Therefore, make struct dev_pagemap contain an instance of struct folio_owner_ops as its first member, so that the pgmap pointer can be handled the same way as a folio owner_ops pointer. Also note that, although struct dev_pagemap_ops has a page_free() function, it has neither the same intent nor the same behavior as the folio_owner_ops free() callback: page_free() is an optional callback that informs zone device drivers that the page is being freed.
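The layout trick relied on here, keeping the embedded method table as the first member so that a pointer to the containing structure can be reinterpreted as a pointer to the table, can be shown in a stand-alone C sketch (illustrative only; struct pagemap and struct owner_ops are invented stand-ins for struct dev_pagemap and struct folio_owner_ops):

/* build: cc -std=c11 -o overlay_sketch overlay_sketch.c */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for struct folio_owner_ops. */
struct owner_ops {
	void (*free)(void *obj);
};

/* Stand-in for struct dev_pagemap. */
struct pagemap {
	struct owner_ops folio_ops;	/* must stay the first member */
	int type;
};

/* Mirrors the static_assert() in the patch: because the offset is 0, a pagemap
 * pointer and a pointer to its embedded ops are numerically identical, so one
 * storage word can serve as either, depending on how the reader interprets it. */
_Static_assert(offsetof(struct pagemap, folio_ops) == 0,
	       "folio_ops must be the first member");

static void pagemap_free(void *obj)
{
	printf("zone-device style free of %p\n", obj);
}

int main(void)
{
	struct pagemap pm = { .folio_ops = { .free = pagemap_free }, .type = 7 };
	int object = 0;

	/* The same address can be read as either type. */
	const struct owner_ops *ops = (const struct owner_ops *)&pm;

	assert((void *)ops == (void *)&pm);
	ops->free(&object);
	return 0;
}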
Signed-off-by: Fuad Tabba --- include/linux/memremap.h | 8 +++++++ include/linux/mm_types.h | 16 ++++++++++++-- mm/internal.h | 1 - mm/memremap.c | 44 -------------------------------------- mm/mm_init.c | 46 ++++++++++++++++++++++++++++++++++++++++ mm/swap.c | 18 ++-------------- 6 files changed, 70 insertions(+), 63 deletions(-) diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 060e27b6aee0..5b68bbc588a3 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -106,6 +106,7 @@ struct dev_pagemap_ops { /** * struct dev_pagemap - metadata for ZONE_DEVICE mappings + * @folio_ops: method table for folio operations. * @altmap: pre-allocated/reserved memory for vmemmap allocations * @ref: reference count that pins the devm_memremap_pages() mapping * @done: completion for @ref @@ -125,6 +126,7 @@ struct dev_pagemap_ops { * @ranges: array of ranges to be mapped when nr_range > 1 */ struct dev_pagemap { + struct folio_owner_ops folio_ops; struct vmem_altmap altmap; struct percpu_ref ref; struct completion done; @@ -140,6 +142,12 @@ struct dev_pagemap { }; }; +/* + * The folio_owner_ops structure needs to be first since pgmap in struct page is + * overlaid with owner_ops in struct folio. + */ +static_assert(offsetof(struct dev_pagemap, folio_ops) == 0); + static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap) { return pgmap->ops && pgmap->ops->memory_failure; } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 27075ea24e67..a72fda20d5e9 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -427,6 +427,7 @@ FOLIO_MATCH(lru, lru); FOLIO_MATCH(mapping, mapping); FOLIO_MATCH(compound_head, lru); FOLIO_MATCH(compound_head, owner_ops); +FOLIO_MATCH(pgmap, owner_ops); FOLIO_MATCH(index, index); FOLIO_MATCH(private, private); FOLIO_MATCH(_mapcount, _mapcount); @@ -618,15 +619,26 @@ static inline const struct folio_owner_ops *folio_get_owner_ops(struct folio *fo /* * Get the page dev_pagemap pgmap pointer. + * + * The page pgmap is overlaid with the folio owner_ops, where bit 1 is used to + * indicate that the page/folio has owner ops. The dev_pagemap contains + * owner_ops and is handled the same way. The getter returns a sanitized + * pointer. */ -#define page_get_pgmap(page) ((page)->pgmap) +#define page_get_pgmap(page) \ + ((struct dev_pagemap *)((unsigned long)(page)->pgmap & ~FOLIO_OWNER_OPS)) /* * Set the page dev_pagemap pgmap pointer. + * + * The page pgmap is overlaid with the folio owner_ops, where bit 1 is used to + * indicate that the page/folio has owner ops. The dev_pagemap contains + * owner_ops and is handled the same way. The setter sets bit 1 to indicate + * that the page has owner_ops.
*/ static inline void page_set_pgmap(struct page *page, struct dev_pagemap *pgmap) { - page->pgmap = pgmap; + page->pgmap = (struct dev_pagemap *)((unsigned long)pgmap | FOLIO_OWNER_OPS); } struct page_frag_cache { diff --git a/mm/internal.h b/mm/internal.h index 5a7302baeed7..a041247bed10 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1262,7 +1262,6 @@ int numa_migrate_check(struct folio *folio, struct vm_fault *vmf, unsigned long addr, int *flags, bool writable, int *last_cpupid); -void free_zone_device_folio(struct folio *folio); int migrate_device_coherent_folio(struct folio *folio); struct vm_struct *__get_vm_area_node(unsigned long size, diff --git a/mm/memremap.c b/mm/memremap.c index 931bc85da1df..9fd5f57219eb 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -456,50 +456,6 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn, } EXPORT_SYMBOL_GPL(get_dev_pagemap); -void free_zone_device_folio(struct folio *folio) -{ - struct dev_pagemap *pgmap = page_get_pgmap(&folio->page); - - if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->page_free)) - return; - - mem_cgroup_uncharge(folio); - - /* - * Note: we don't expect anonymous compound pages yet. Once supported - * and we could PTE-map them similar to THP, we'd have to clear - * PG_anon_exclusive on all tail pages. - */ - if (folio_test_anon(folio)) { - VM_BUG_ON_FOLIO(folio_test_large(folio), folio); - __ClearPageAnonExclusive(folio_page(folio, 0)); - } - - /* - * When a device managed page is freed, the folio->mapping field - * may still contain a (stale) mapping value. For example, the - * lower bits of folio->mapping may still identify the folio as an - * anonymous folio. Ultimately, this entire field is just stale - * and wrong, and it will cause errors if not cleared. - * - * For other types of ZONE_DEVICE pages, migration is either - * handled differently or not done at all, so there is no need - * to clear folio->mapping. - */ - folio->mapping = NULL; - pgmap->ops->page_free(folio_page(folio, 0)); - - if (pgmap->type != MEMORY_DEVICE_PRIVATE && - pgmap->type != MEMORY_DEVICE_COHERENT) - /* - * Reset the refcount to 1 to prepare for handing out the page - * again. - */ - folio_set_count(folio, 1); - else - put_dev_pagemap(pgmap); -} - void zone_device_page_init(struct page *page) { /* diff --git a/mm/mm_init.c b/mm/mm_init.c index 279cdaebfd2b..47c1f8fd4914 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -974,6 +974,51 @@ static void __init memmap_init(void) } #ifdef CONFIG_ZONE_DEVICE + +static void free_zone_device_folio(struct folio *folio) +{ + struct dev_pagemap *pgmap = page_get_pgmap(&folio->page); + + if (WARN_ON_ONCE(!pgmap->ops || !pgmap->ops->page_free)) + return; + + mem_cgroup_uncharge(folio); + + /* + * Note: we don't expect anonymous compound pages yet. Once supported + * and we could PTE-map them similar to THP, we'd have to clear + * PG_anon_exclusive on all tail pages. + */ + if (folio_test_anon(folio)) { + VM_BUG_ON_FOLIO(folio_test_large(folio), folio); + __ClearPageAnonExclusive(folio_page(folio, 0)); + } + + /* + * When a device managed page is freed, the folio->mapping field + * may still contain a (stale) mapping value. For example, the + * lower bits of folio->mapping may still identify the folio as an + * anonymous folio. Ultimately, this entire field is just stale + * and wrong, and it will cause errors if not cleared. + * + * For other types of ZONE_DEVICE pages, migration is either + * handled differently or not done at all, so there is no need + * to clear folio->mapping. 
+ */ + folio->mapping = NULL; + pgmap->ops->page_free(folio_page(folio, 0)); + + if (pgmap->type != MEMORY_DEVICE_PRIVATE && + pgmap->type != MEMORY_DEVICE_COHERENT) + /* + * Reset the refcount to 1 to prepare for handing out the page + * again. + */ + folio_set_count(folio, 1); + else + put_dev_pagemap(pgmap); +} + static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, unsigned long zone_idx, int nid, struct dev_pagemap *pgmap) @@ -995,6 +1040,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, * and zone_device_data. It is a bug if a ZONE_DEVICE page is * ever freed or placed on a driver-private list. */ + pgmap->folio_ops.free = free_zone_device_folio; page_set_pgmap(page, pgmap); page->zone_device_data = NULL; diff --git a/mm/swap.c b/mm/swap.c index 767ff6d8f47b..d2578465e270 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -117,11 +117,6 @@ void __folio_put(struct folio *folio) return; } - if (unlikely(folio_is_zone_device(folio))) { - free_zone_device_folio(folio); - return; - } - if (folio_test_hugetlb(folio)) { free_huge_folio(folio); return; @@ -947,20 +942,11 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs) unlock_page_lruvec_irqrestore(lruvec, flags); lruvec = NULL; } - if (folio_ref_sub_and_test(folio, nr_refs)) - owner_ops->free(folio); - continue; - } - - if (folio_is_zone_device(folio)) { - if (lruvec) { - unlock_page_lruvec_irqrestore(lruvec, flags); - lruvec = NULL; - } + /* fenced by folio_is_zone_device() */ if (put_devmap_managed_folio_refs(folio, nr_refs)) continue; if (folio_ref_sub_and_test(folio, nr_refs)) - free_zone_device_folio(folio); + owner_ops->free(folio); continue; } From patchwork Fri Nov 8 16:20:40 2024 X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 13868456
Date: Fri, 8 Nov 2024 16:20:40 +0000 In-Reply-To: <20241108162040.159038-1-tabba@google.com> References: <20241108162040.159038-1-tabba@google.com> Message-ID: <20241108162040.159038-11-tabba@google.com> Subject: [RFC PATCH v1 10/10] mm: hugetlb: Use owner_ops on folio_put for hugetlb From: Fuad Tabba To: linux-mm@kvack.org Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, david@redhat.com, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jgg@nvidia.com, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, tabba@google.com

Now that we have the folio_owner_ops callback, use it for hugetlb pages instead of a dedicated callback. Since owner_ops is overlaid with lru, we need to unset owner_ops to allow the use of lru while a folio is isolated. At that point we know that the reference count is elevated and will not reach 0, so the free() callback cannot be triggered. Therefore, it is safe to clear owner_ops, provided we restore it before putting the folio back.
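The isolate/putback dance described above can be reduced to a stand-alone C sketch (illustrative only; struct item, its refcount field and the helper names are invented, and the real kernel code of course uses atomic refcounts and locking): the union member that normally holds the ops pointer is temporarily reused as a list link, which is only safe because an extra reference guarantees the free path cannot run in between.

/* build: cc -std=c11 -o isolate_sketch isolate_sketch.c */
#include <assert.h>
#include <stdio.h>

struct owner_ops {
	void (*free)(void *obj);
};

struct node {
	struct node *next;
};

/* Stand-in for the folio field that holds either an lru list link or owner_ops. */
struct item {
	union {
		struct node lru;
		const struct owner_ops *owner_ops;
	};
	int refcount;
};

static void item_free(void *obj)
{
	printf("custom free of %p\n", obj);
}

static const struct owner_ops item_ops = { .free = item_free };

int main(void)
{
	struct item it = { .owner_ops = &item_ops, .refcount = 1 };
	struct node head = { .next = NULL };

	/* Isolate: take a reference so the item cannot be freed while owner_ops
	 * is cleared, then reuse the same storage as a list link. */
	it.refcount++;
	it.owner_ops = NULL;
	it.lru.next = head.next;
	head.next = &it.lru;

	/* ... the item sits on the private list and is worked on here ... */

	/* Put back: unlink, restore owner_ops, then drop the extra reference. */
	head.next = it.lru.next;
	it.owner_ops = &item_ops;
	if (--it.refcount == 0)
		it.owner_ops->free(&it);

	/* The original reference is still held, so free was not called above. */
	assert(it.refcount == 1);

	/* Dropping the last reference now goes through owner_ops->free(). */
	if (--it.refcount == 0)
		it.owner_ops->free(&it);
	return 0;
}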
Signed-off-by: Fuad Tabba --- include/linux/hugetlb.h | 2 -- mm/hugetlb.c | 57 +++++++++++++++++++++++++++++++++-------- mm/swap.c | 14 ---------- 3 files changed, 47 insertions(+), 26 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index e846d7dac77c..500848862702 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -20,8 +20,6 @@ struct user_struct; struct mmu_gather; struct node; -void free_huge_folio(struct folio *folio); - #ifdef CONFIG_HUGETLB_PAGE #include diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2308e94d8615..4e1c87e37968 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -89,6 +89,33 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma); static void hugetlb_unshare_pmds(struct vm_area_struct *vma, unsigned long start, unsigned long end); static struct resv_map *vma_resv_map(struct vm_area_struct *vma); +static void free_huge_folio(struct folio *folio); + +static const struct folio_owner_ops hugetlb_owner_ops = { + .free = free_huge_folio, +}; + +/* + * Mark this folio as a hugetlb-owned folio. + * + * Set the folio hugetlb flag and owner operations. + */ +static void folio_set_hugetlb_owner(struct folio *folio) +{ + __folio_set_hugetlb(folio); + folio_set_owner_ops(folio, &hugetlb_owner_ops); +} + +/* + * Unmark this folio from being a hugetlb-owned folio. + * + * Clear the folio hugetlb flag and owner operations. + */ +static void folio_clear_hugetlb_owner(struct folio *folio) +{ + folio_clear_owner_ops(folio); + __folio_clear_hugetlb(folio); +} static void hugetlb_free_folio(struct folio *folio) { @@ -1617,7 +1644,7 @@ static void remove_hugetlb_folio(struct hstate *h, struct folio *folio, * to tail struct pages. */ if (!folio_test_hugetlb_vmemmap_optimized(folio)) { - __folio_clear_hugetlb(folio); + folio_clear_hugetlb_owner(folio); } h->nr_huge_pages--; @@ -1641,7 +1668,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio, h->surplus_huge_pages++; h->surplus_huge_pages_node[nid]++; } - __folio_set_hugetlb(folio); + folio_set_hugetlb_owner(folio); folio_change_private(folio, NULL); /* @@ -1692,7 +1719,7 @@ static void __update_and_free_hugetlb_folio(struct hstate *h, */ if (folio_test_hugetlb(folio)) { spin_lock_irq(&hugetlb_lock); - __folio_clear_hugetlb(folio); + folio_clear_hugetlb_owner(folio); spin_unlock_irq(&hugetlb_lock); } @@ -1793,7 +1820,7 @@ static void bulk_vmemmap_restore_error(struct hstate *h, list_for_each_entry_safe(folio, t_folio, non_hvo_folios, _hugetlb_list) { list_del(&folio->_hugetlb_list); spin_lock_irq(&hugetlb_lock); - __folio_clear_hugetlb(folio); + folio_clear_hugetlb_owner(folio); spin_unlock_irq(&hugetlb_lock); update_and_free_hugetlb_folio(h, folio, false); cond_resched(); @@ -1818,7 +1845,7 @@ static void bulk_vmemmap_restore_error(struct hstate *h, } else { list_del(&folio->_hugetlb_list); spin_lock_irq(&hugetlb_lock); - __folio_clear_hugetlb(folio); + folio_clear_hugetlb_owner(folio); spin_unlock_irq(&hugetlb_lock); update_and_free_hugetlb_folio(h, folio, false); cond_resched(); @@ -1851,14 +1878,14 @@ static void update_and_free_pages_bulk(struct hstate *h, * should only be pages on the non_hvo_folios list. * Do note that the non_hvo_folios list could be empty. * Without HVO enabled, ret will be 0 and there is no need to call - * __folio_clear_hugetlb as this was done previously. + * folio_clear_hugetlb_owner as this was done previously. 
*/ VM_WARN_ON(!list_empty(folio_list)); VM_WARN_ON(ret < 0); if (!list_empty(&non_hvo_folios) && ret) { spin_lock_irq(&hugetlb_lock); list_for_each_entry(folio, &non_hvo_folios, _hugetlb_list) - __folio_clear_hugetlb(folio); + folio_clear_hugetlb_owner(folio); spin_unlock_irq(&hugetlb_lock); } @@ -1879,7 +1906,7 @@ struct hstate *size_to_hstate(unsigned long size) return NULL; } -void free_huge_folio(struct folio *folio) +static void free_huge_folio(struct folio *folio) { /* * Can't pass hstate in here because it is called from the @@ -1959,7 +1986,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid) static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio) { - __folio_set_hugetlb(folio); + folio_set_hugetlb_owner(folio); INIT_LIST_HEAD(&folio->_hugetlb_list); hugetlb_set_folio_subpool(folio, NULL); set_hugetlb_cgroup(folio, NULL); @@ -7428,6 +7455,14 @@ bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list) goto unlock; } folio_clear_hugetlb_migratable(folio); + /* + * Clear folio->owner_ops; now we can use folio->lru. + * Note that the folio cannot get freed because we are holding a + * reference. The reference will be put in folio_putback_hugetlb(), + * after restoring folio->owner_ops. + */ + folio_clear_owner_ops(folio); + INIT_LIST_HEAD(&folio->lru); list_del_init(&folio->_hugetlb_list); list_add_tail(&folio->lru, list); unlock: @@ -7480,7 +7515,9 @@ void folio_putback_hugetlb(struct folio *folio) { spin_lock_irq(&hugetlb_lock); folio_set_hugetlb_migratable(folio); - list_del_init(&folio->lru); + list_del(&folio->lru); + /* Restore folio->owner_ops since we can no longer use folio->lru. */ + folio_set_owner_ops(folio, &hugetlb_owner_ops); list_add_tail(&folio->_hugetlb_list, &(folio_hstate(folio))->hugepage_activelist); spin_unlock_irq(&hugetlb_lock); folio_put(folio); diff --git a/mm/swap.c b/mm/swap.c index d2578465e270..9798ca47f26a 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -117,11 +117,6 @@ void __folio_put(struct folio *folio) return; } - if (folio_test_hugetlb(folio)) { - free_huge_folio(folio); - return; - } - page_cache_release(folio); folio_unqueue_deferred_split(folio); mem_cgroup_uncharge(folio); @@ -953,15 +948,6 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs) if (!folio_ref_sub_and_test(folio, nr_refs)) continue; - /* hugetlb has its own memcg */ - if (folio_test_hugetlb(folio)) { - if (lruvec) { - unlock_page_lruvec_irqrestore(lruvec, flags); - lruvec = NULL; - } - free_huge_folio(folio); - continue; - } folio_unqueue_deferred_split(folio); __page_cache_release(folio, &lruvec, &flags);