[v3,1/3] mm,page_owner: Update metada for tail pages

Message ID: 20240326063036.6242-2-osalvador@suse.de
State: New
Series: page_owner: Fix refcount imbalance
  [v3,0/3] page_owner: Fix refcount imbalance
  [v3,1/3] mm,page_owner: Update metada for tail pages
  [v3,2/3] mm,page_owner: Fix refcount imbalance
  [v3,3/3] mm,page_owner: Fix accounting of pages when migrating

From: Oscar Salvador <osalvador@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,
	Michal Hocko <mhocko@suse.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Marco Elver <elver@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Oscar Salvador <osalvador@suse.de>
Subject: [PATCH v3 1/3] mm,page_owner: Update metada for tail pages
Date: Tue, 26 Mar 2024 07:30:34 +0100
Message-ID: <20240326063036.6242-2-osalvador@suse.de>
In-Reply-To: <20240326063036.6242-1-osalvador@suse.de>
References: <20240326063036.6242-1-osalvador@suse.de>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Commit Message

Oscar Salvador March 26, 2024, 6:30 a.m. UTC

__set_page_owner_handle() and __reset_page_owner() update the metadata
of all pages when the page is of a higher order, but we fail to do the
same when the pages are migrated.
__folio_copy_owner() only updates the metadata of the head page, meaning
that the information stored in the head page and the tail pages will not
match.

Strictly speaking that is not a big problem because 1) we do not print
tail pages and 2) upon splitting all tail pages will inherit the
metadata of the head page, but it is better to keep all metadata
consistent, as that eases debugging should there be any problem.

For that purpose, a couple of helpers are created:
__update_page_owner_handle(), which updates the metadata on allocation,
and __update_page_owner_free_handle(), which does the same when the page
is freed.

__folio_copy_owner() will make use of both as it needs to entirely replace
the page_owner metadata for the new page.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/page_owner.c | 137 ++++++++++++++++++++++++++----------------------
 1 file changed, 74 insertions(+), 63 deletions(-)
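
The head/tail mismatch described in the commit message can be modelled in
userspace. The following is only a sketch under stated assumptions, not
kernel code: struct owner_model, copy_head_only(), update_all() and
tails_match_head() are hypothetical names standing in for the per-page
page_owner records and for the loop over (1 << order) entries that the new
helpers perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ORDER_MODEL 4	/* hypothetical cap, for this model only */

/* Hypothetical stand-in for struct page_owner: one record per base page. */
struct owner_model {
	unsigned int handle;	/* stand-in for depot_stack_handle_t */
	unsigned short order;
	int pid;
};

/* Copy only the head record, which is what leaves the tails stale. */
static void copy_head_only(struct owner_model *dst,
			   const struct owner_model *src)
{
	dst[0] = src[0];
}

/* Write the same metadata into every record of an order-N page. */
static void update_all(struct owner_model *recs, unsigned int handle,
		       unsigned short order, int pid)
{
	assert(order <= MAX_ORDER_MODEL);
	for (size_t i = 0; i < ((size_t)1 << order); i++) {
		recs[i].handle = handle;
		recs[i].order = order;
		recs[i].pid = pid;
	}
}

/* True when every tail record carries the same metadata as the head. */
static bool tails_match_head(const struct owner_model *recs,
			     unsigned short order)
{
	for (size_t i = 1; i < ((size_t)1 << order); i++) {
		if (recs[i].handle != recs[0].handle ||
		    recs[i].order != recs[0].order ||
		    recs[i].pid != recs[0].pid)
			return false;
	}
	return true;
}
```

In this model, calling copy_head_only() on an order-2 destination that was
set up with different metadata leaves tails_match_head() false, while
calling update_all() with the source head's values makes it true again.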

Comments

Vlastimil Babka April 2, 2024, 10:13 a.m. UTC | #1

Subject: metada -> metadata

On 3/26/24 7:30 AM, Oscar Salvador wrote:
> __set_page_owner_handle() and __reset_page_owner() update the metadata
> of all pages when the page is of a higher-order, but we miss to do the
> same when the pages are migrated.
> __folio_copy_owner() only updates the metadata of the head page, meaning
> that the information stored in the first page and the tail pages will not
> match.
> 
> Strictly speaking that is not a big problem because 1) we do not print
> tail pages and 2) upon splitting all tail pages will inherit the
> metada of the head page, but it is better to have all metadata in check

  metadata

> should there be any problem, so it can ease debugging.
> 
> For that purpose, a couple of helpers are created
> __update_page_owner_handle() which updates the metadata on allocation,
> and __update_page_owner_free_handle() which does the same when the page
> is freed.
> 
> __folio_copy_owner() will make use of both as it needs to entirely replace
> the page_owner metadata for the new page.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Also I think this series should move to mm-hotfixes due to fixing bugs from rc1.

Some more nits:

> @@ -355,31 +375,21 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
>  	}
>  
>  	old_page_owner = get_page_owner(old_ext);
> -	new_page_owner = get_page_owner(new_ext);
> -	new_page_owner->order = old_page_owner->order;
> -	new_page_owner->gfp_mask = old_page_owner->gfp_mask;
> -	new_page_owner->last_migrate_reason =
> -		old_page_owner->last_migrate_reason;
> -	new_page_owner->handle = old_page_owner->handle;
> -	new_page_owner->pid = old_page_owner->pid;
> -	new_page_owner->tgid = old_page_owner->tgid;
> -	new_page_owner->free_pid = old_page_owner->free_pid;
> -	new_page_owner->free_tgid = old_page_owner->free_tgid;
> -	new_page_owner->ts_nsec = old_page_owner->ts_nsec;
> -	new_page_owner->free_ts_nsec = old_page_owner->ts_nsec;
> -	strcpy(new_page_owner->comm, old_page_owner->comm);
> -
> +	__update_page_owner_handle(new_ext, old_page_owner->handle,
> +				   old_page_owner->order, old_page_owner->gfp_mask,
> +				   old_page_owner->last_migrate_reason,
> +				   old_page_owner->ts_nsec, old_page_owner->pid,
> +				   old_page_owner->tgid, old_page_owner->comm);
>  	/*
> -	 * We don't clear the bit on the old folio as it's going to be freed
> -	 * after migration. Until then, the info can be useful in case of
> -	 * a bug, and the overall stats will be off a bit only temporarily.
> -	 * Also, migrate_misplaced_transhuge_page() can still fail the
> -	 * migration and then we want the old folio to retain the info. But
> -	 * in that case we also don't need to explicitly clear the info from
> -	 * the new page, which will be freed.
> +	 * Do not proactively clear PAGE_EXT_OWNER{_ALLOCATED} bits as the folio
> +	 * will be freed after migration. Keep them until then as they may be
> +	 * useful.
>  	 */

The full old comment made sense, the new one sounds like it's talking about
the old folio ("will be freed after migration") but we're modifying the new
folio here. IIUC it means the case of migration failing and then the new
folio MIGHT be freed. So I think you made the comment too concise to be
immediately clear?

> -	__set_bit(PAGE_EXT_OWNER, &new_ext->flags);
> -	__set_bit(PAGE_EXT_OWNER_ALLOCATED, &new_ext->flags);
> +	__update_page_owner_free_handle(new_ext, 0, old_page_owner->order,
> +					old_page_owner->free_pid,
> +					old_page_owner->free_tgid,
> +					old_page_owner->free_ts_nsec);
> +
>  	page_ext_put(new_ext);
>  	page_ext_put(old_ext);
>  }
> @@ -787,8 +797,9 @@ static void init_pages_in_zone(pg_data_t *pgdat, struct zone *zone)
>  				goto ext_put_continue;
>  
>  			/* Found early allocated page */
> -			__set_page_owner_handle(page_ext, early_handle,
> -						0, 0);
> +			__update_page_owner_handle(page_ext, early_handle, 0, 0,
> +						   -1, local_clock(), current->pid,
> +						   current->tgid, current->comm);
>  			count++;
>  ext_put_continue:
>  			page_ext_put(page_ext);

Oscar Salvador April 2, 2024, 11:19 a.m. UTC | #2

On Tue, Apr 02, 2024 at 12:13:37PM +0200, Vlastimil Babka wrote:
> Subject: metada -> metadata

Ooops.

> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> 
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!

> > @@ -355,31 +375,21 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
> > -	 * We don't clear the bit on the old folio as it's going to be freed
> > -	 * after migration. Until then, the info can be useful in case of
> > -	 * a bug, and the overall stats will be off a bit only temporarily.
> > -	 * Also, migrate_misplaced_transhuge_page() can still fail the
> > -	 * migration and then we want the old folio to retain the info. But
> > -	 * in that case we also don't need to explicitly clear the info from
> > -	 * the new page, which will be freed.
> > +	 * Do not proactively clear PAGE_EXT_OWNER{_ALLOCATED} bits as the folio
> > +	 * will be freed after migration. Keep them until then as they may be
> > +	 * useful.
> >  	 */
> 
> The full old comment made sense, the new one sounds like it's talking about
> the old folio ("will be freed after migration") but we're modifying the new
> folio here. IIUC it means the case of migration failing and then the new
> folio MIGHT be freed. So I think you made the comment too concise to be
> immediately clear?

It probably could be improved by saying that there is no need to clear
the bit from the old folio since that will be done when __reset_page_owner()
gets called on the old folio.

Now, answering your question about whether we can fail or not at this
stage.
I looked into this a few weeks ago and concluded that no, we cannot
fail at this stage. My reasoning is as follows.

This is the callchain that leads to folio_copy_owner:

migrate_folio_move
 move_to_new_folio
  migrate_folio
   migrate_folio_extra
    folio_migrate_copy
     folio_copy
     folio_migrate_flags
      folio_copy_owner

folio_copy_owner() gets called only from folio_migrate_flags(), and all
the functions that call folio_migrate_flags() return
MIGRATEPAGE_SUCCESS right after calling it, so it is effectively the last
step of the migration.

So no, we cannot fail at this stage, so we do not have to worry about
undoing this.
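
The ordering argument above can be illustrated with a small userspace
model. This is a sketch only, under the assumption that every fallible
check runs before the owner copy. struct folio_model, check_can_move(),
copy_owner_model(), migrate_model() and the MODEL_* codes are all
hypothetical and do not correspond to real kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes for the model. */
enum { MODEL_SUCCESS = 0, MODEL_EAGAIN = -11 };

/* Hypothetical folio: some payload plus one piece of owner metadata. */
struct folio_model {
	int payload;
	unsigned int owner_handle;
};

/* The only step in this model that is allowed to fail. */
static int check_can_move(bool src_busy)
{
	return src_busy ? MODEL_EAGAIN : MODEL_SUCCESS;
}

/* Infallible metadata copy, playing the role of the owner copy. */
static void copy_owner_model(struct folio_model *dst,
			     const struct folio_model *src)
{
	dst->owner_handle = src->owner_handle;
}

static int migrate_model(struct folio_model *dst,
			 const struct folio_model *src, bool src_busy)
{
	int rc;

	assert(dst && src);

	/* Fallible work first: on error, dst has not been touched. */
	rc = check_can_move(src_busy);
	if (rc != MODEL_SUCCESS)
		return rc;

	/* Past this point nothing can fail, so nothing needs undoing. */
	dst->payload = src->payload;
	copy_owner_model(dst, src);
	return MODEL_SUCCESS;
}
```

Because the owner copy is the last step before returning success, a failed
call leaves the destination's owner metadata exactly as it was.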

Patch

diff --git a/mm/page_owner.c b/mm/page_owner.c
index d17d1351ec84..52d1ced0b57f 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -228,9 +228,58 @@  static void dec_stack_record_count(depot_stack_handle_t handle)
 		refcount_dec(&stack_record->count);
 }
 
-void __reset_page_owner(struct page *page, unsigned short order)
+static inline void __update_page_owner_handle(struct page_ext *page_ext,
+					      depot_stack_handle_t handle,
+					      unsigned short order,
+					      gfp_t gfp_mask,
+					      short last_migrate_reason, u64 ts_nsec,
+					      pid_t pid, pid_t tgid, char *comm)
 {
 	int i;
+	struct page_owner *page_owner;
+
+	for (i = 0; i < (1 << order); i++) {
+		page_owner = get_page_owner(page_ext);
+		page_owner->handle = handle;
+		page_owner->order = order;
+		page_owner->gfp_mask = gfp_mask;
+		page_owner->last_migrate_reason = last_migrate_reason;
+		page_owner->pid = pid;
+		page_owner->tgid = tgid;
+		page_owner->ts_nsec = ts_nsec;
+		strscpy(page_owner->comm, comm,
+			sizeof(page_owner->comm));
+		__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
+		__set_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
+		page_ext = page_ext_next(page_ext);
+	}
+}
+
+static inline void __update_page_owner_free_handle(struct page_ext *page_ext,
+						   depot_stack_handle_t handle,
+						   unsigned short order,
+						   pid_t pid, pid_t tgid,
+						   u64 free_ts_nsec)
+{
+	int i;
+	struct page_owner *page_owner;
+
+	for (i = 0; i < (1 << order); i++) {
+		page_owner = get_page_owner(page_ext);
+		/* Only __reset_page_owner() wants to clear the bit */
+		if (handle) {
+			__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
+			page_owner->free_handle = handle;
+		}
+		page_owner->free_ts_nsec = free_ts_nsec;
+		page_owner->free_pid = pid;
+		page_owner->free_tgid = tgid;
+		page_ext = page_ext_next(page_ext);
+	}
+}
+
+void __reset_page_owner(struct page *page, unsigned short order)
+{
 	struct page_ext *page_ext;
 	depot_stack_handle_t handle;
 	depot_stack_handle_t alloc_handle;
@@ -245,16 +294,10 @@  void __reset_page_owner(struct page *page, unsigned short order)
 	alloc_handle = page_owner->handle;
 
 	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
-	for (i = 0; i < (1 << order); i++) {
-		__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
-		page_owner->free_handle = handle;
-		page_owner->free_ts_nsec = free_ts_nsec;
-		page_owner->free_pid = current->pid;
-		page_owner->free_tgid = current->tgid;
-		page_ext = page_ext_next(page_ext);
-		page_owner = get_page_owner(page_ext);
-	}
+	__update_page_owner_free_handle(page_ext, handle, order, current->pid,
+					current->tgid, free_ts_nsec);
 	page_ext_put(page_ext);
+
 	if (alloc_handle != early_handle)
 		/*
 		 * early_handle is being set as a handle for all those
@@ -266,36 +309,11 @@  void __reset_page_owner(struct page *page, unsigned short order)
 		dec_stack_record_count(alloc_handle);
 }
 
-static inline void __set_page_owner_handle(struct page_ext *page_ext,
-					depot_stack_handle_t handle,
-					unsigned short order, gfp_t gfp_mask)
-{
-	struct page_owner *page_owner;
-	int i;
-	u64 ts_nsec = local_clock();
-
-	for (i = 0; i < (1 << order); i++) {
-		page_owner = get_page_owner(page_ext);
-		page_owner->handle = handle;
-		page_owner->order = order;
-		page_owner->gfp_mask = gfp_mask;
-		page_owner->last_migrate_reason = -1;
-		page_owner->pid = current->pid;
-		page_owner->tgid = current->tgid;
-		page_owner->ts_nsec = ts_nsec;
-		strscpy(page_owner->comm, current->comm,
-			sizeof(page_owner->comm));
-		__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
-		__set_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
-
-		page_ext = page_ext_next(page_ext);
-	}
-}
-
 noinline void __set_page_owner(struct page *page, unsigned short order,
 					gfp_t gfp_mask)
 {
 	struct page_ext *page_ext;
+	u64 ts_nsec = local_clock();
 	depot_stack_handle_t handle;
 
 	handle = save_stack(gfp_mask);
@@ -303,7 +321,9 @@  noinline void __set_page_owner(struct page *page, unsigned short order,
 	page_ext = page_ext_get(page);
 	if (unlikely(!page_ext))
 		return;
-	__set_page_owner_handle(page_ext, handle, order, gfp_mask);
+	__update_page_owner_handle(page_ext, handle, order, gfp_mask, -1,
+				   ts_nsec, current->pid, current->tgid,
+				   current->comm);
 	page_ext_put(page_ext);
 	inc_stack_record_count(handle, gfp_mask);
 }
@@ -342,7 +362,7 @@  void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 {
 	struct page_ext *old_ext;
 	struct page_ext *new_ext;
-	struct page_owner *old_page_owner, *new_page_owner;
+	struct page_owner *old_page_owner;
 
 	old_ext = page_ext_get(&old->page);
 	if (unlikely(!old_ext))
@@ -355,31 +375,21 @@  void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 	}
 
 	old_page_owner = get_page_owner(old_ext);
-	new_page_owner = get_page_owner(new_ext);
-	new_page_owner->order = old_page_owner->order;
-	new_page_owner->gfp_mask = old_page_owner->gfp_mask;
-	new_page_owner->last_migrate_reason =
-		old_page_owner->last_migrate_reason;
-	new_page_owner->handle = old_page_owner->handle;
-	new_page_owner->pid = old_page_owner->pid;
-	new_page_owner->tgid = old_page_owner->tgid;
-	new_page_owner->free_pid = old_page_owner->free_pid;
-	new_page_owner->free_tgid = old_page_owner->free_tgid;
-	new_page_owner->ts_nsec = old_page_owner->ts_nsec;
-	new_page_owner->free_ts_nsec = old_page_owner->ts_nsec;
-	strcpy(new_page_owner->comm, old_page_owner->comm);
-
+	__update_page_owner_handle(new_ext, old_page_owner->handle,
+				   old_page_owner->order, old_page_owner->gfp_mask,
+				   old_page_owner->last_migrate_reason,
+				   old_page_owner->ts_nsec, old_page_owner->pid,
+				   old_page_owner->tgid, old_page_owner->comm);
 	/*
-	 * We don't clear the bit on the old folio as it's going to be freed
-	 * after migration. Until then, the info can be useful in case of
-	 * a bug, and the overall stats will be off a bit only temporarily.
-	 * Also, migrate_misplaced_transhuge_page() can still fail the
-	 * migration and then we want the old folio to retain the info. But
-	 * in that case we also don't need to explicitly clear the info from
-	 * the new page, which will be freed.
+	 * Do not proactively clear PAGE_EXT_OWNER{_ALLOCATED} bits as the folio
+	 * will be freed after migration. Keep them until then as they may be
+	 * useful.
 	 */
-	__set_bit(PAGE_EXT_OWNER, &new_ext->flags);
-	__set_bit(PAGE_EXT_OWNER_ALLOCATED, &new_ext->flags);
+	__update_page_owner_free_handle(new_ext, 0, old_page_owner->order,
+					old_page_owner->free_pid,
+					old_page_owner->free_tgid,
+					old_page_owner->free_ts_nsec);
+
 	page_ext_put(new_ext);
 	page_ext_put(old_ext);
 }
@@ -787,8 +797,9 @@  static void init_pages_in_zone(pg_data_t *pgdat, struct zone *zone)
 				goto ext_put_continue;
 
 			/* Found early allocated page */
-			__set_page_owner_handle(page_ext, early_handle,
-						0, 0);
+			__update_page_owner_handle(page_ext, early_handle, 0, 0,
+						   -1, local_clock(), current->pid,
+						   current->tgid, current->comm);
 			count++;
 ext_put_continue:
 			page_ext_put(page_ext);
