diff mbox series

[net-next] mm/page_pool: catch page_pool memory leaks

Message ID 170082101266.1085481.12199867179160710331.stgit@firesoul (mailing list archive)
State New
Headers show
Series [net-next] mm/page_pool: catch page_pool memory leaks | expand

Commit Message

Jesper Dangaard Brouer Nov. 24, 2023, 10:16 a.m. UTC
Pages belonging to a page_pool (PP) instance must be freed through the
PP APIs in-order to correctly release any DMA mappings and release
refcnt on the DMA device when freeing PP instance. When PP release a
page (page_pool_release_page) the page->pp_magic value is cleared.

This patch detect a leaked PP page in free_page_is_bad() via
unexpected state of page->pp_magic value being PP_SIGNATURE.

We choose to report and treat it as a bad page. It would be possible
to release the page via returning it to the PP instance as the
page->pp pointer is likely still valid.

Notice this code is only activated when either compiled with
CONFIG_DEBUG_VM or boot cmdline debug_pagealloc=on, and
CONFIG_PAGE_POOL.

Reduced example output of leak with PP_SIGNATURE = dead000000000040:

 BUG: Bad page state in process swapper/4  pfn:141fa6
 page:000000006dbf8062 refcount:0 mapcount:0 mapping:0000000000000000 index:0x141fa6000 pfn:0x141fa6
 flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
 page_type: 0xffffffff()
 raw: 002fffff80000000 dead000000000040 ffff88814888a000 0000000000000000
 raw: 0000000141fa6000 0000000000000001 00000000ffffffff 0000000000000000
 page dumped because: page_pool leak
 [...]
 Call Trace:
  <IRQ>
  dump_stack_lvl+0x32/0x50
  bad_page+0x70/0xf0
  free_unref_page_prepare+0x263/0x430
  free_unref_page+0x34/0x130
  mlx5e_free_rx_mpwqe+0x190/0x1c0 [mlx5_core]
  mlx5e_post_rx_mpwqes+0x1ac/0x280 [mlx5_core]
  mlx5e_napi_poll+0x12b/0x710 [mlx5_core]
  ? skb_free_head+0x4f/0x90
  __napi_poll+0x2b/0x1c0
  net_rx_action+0x27b/0x360

The advantage is the Call Trace directly points to the function
leaking the PP page, which in this case is an on purpose bug
introduced into the mlx5 driver to test this code change.

Currently PP will periodically in page_pool_release_retry()
printk warning "stalled pool shutdown" which cannot be directly
corrolated to leaking and might as well be a false positive
due to SKBs being stuck on a socket for an extended period.
After this patch we should be able to remove this printk.

Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
 mm/page_alloc.c |    7 +++++++
 1 file changed, 7 insertions(+)

Comments

patchwork-bot+netdevbpf@kernel.org Nov. 26, 2023, 3:20 p.m. UTC | #1
Hello:

This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Fri, 24 Nov 2023 11:16:52 +0100 you wrote:
> Pages belonging to a page_pool (PP) instance must be freed through the
> PP APIs in-order to correctly release any DMA mappings and release
> refcnt on the DMA device when freeing PP instance. When PP release a
> page (page_pool_release_page) the page->pp_magic value is cleared.
> 
> This patch detect a leaked PP page in free_page_is_bad() via
> unexpected state of page->pp_magic value being PP_SIGNATURE.
> 
> [...]

Here is the summary with links:
  - [net-next] mm/page_pool: catch page_pool memory leaks
    https://git.kernel.org/netdev/net-next/c/dba1b8a7ab68

You are awesome, thank you!
diff mbox series

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e7e0ba..37ca4f4b62bf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -915,6 +915,9 @@  static inline bool page_expected_state(struct page *page,
 			page_ref_count(page) |
 #ifdef CONFIG_MEMCG
 			page->memcg_data |
+#endif
+#ifdef CONFIG_PAGE_POOL
+			((page->pp_magic & ~0x3UL) == PP_SIGNATURE) |
 #endif
 			(page->flags & check_flags)))
 		return false;
@@ -941,6 +944,10 @@  static const char *page_bad_reason(struct page *page, unsigned long flags)
 #ifdef CONFIG_MEMCG
 	if (unlikely(page->memcg_data))
 		bad_reason = "page still charged to cgroup";
+#endif
+#ifdef CONFIG_PAGE_POOL
+	if (unlikely((page->pp_magic & ~0x3UL) == PP_SIGNATURE))
+		bad_reason = "page_pool leak";
 #endif
 	return bad_reason;
 }