
[net-next,v1,10/16] page_pool: don't release iov on elevated refcount

Message ID 20231208005250.2910004-11-almasrymina@google.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Device Memory TCP

Checks

Context Check Description
netdev/series_format fail Series longer than 15 patches (and no cover letter)
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; GEN HAS DIFF 2 files changed, 185 insertions(+);
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1117 this patch: 1117
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1144 this patch: 1144
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 29 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Mina Almasry Dec. 8, 2023, 12:52 a.m. UTC
Currently, the page_pool considers a page for recycling only once: the
first time __page_pool_put_page() is called on it.

This works because, in practice, the net stack holds only 1 reference to
the skb frags. In that case page_pool recycling works as expected: the
skb frags hold a single reference on the pages from the net stack when
__page_pool_put_page() is called (provided the driver is not holding
extra references for recycling), and so the page is recycled.
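
To make that refcount check concrete, here is a minimal userspace C
sketch that models the decision. It is not the kernel code: the names
model_page and model_put_page are hypothetical stand-ins for the real
struct page / __page_pool_put_page() machinery, and the logic is
simplified to the single branch discussed above.

/* Hedged sketch: userspace model of the single-reference recycling
 * decision. Illustrative only; does not match net/core/page_pool.c.
 */
#include <stdbool.h>
#include <stdio.h>

struct model_page {
	int refcount;		/* stand-in for page_ref_count() */
};

/* Returns true if the page would be kept as a recycling candidate,
 * false if it would be released back to the allocator.
 */
static bool model_put_page(struct model_page *page)
{
	if (page->refcount == 1)
		return true;	/* sole owner: candidate for recycling */

	/* Elevated refcount: current behavior gives up and releases. */
	page->refcount--;
	return false;
}

int main(void)
{
	struct model_page exclusive = { .refcount = 1 };
	struct model_page shared = { .refcount = 2 };

	printf("refcount 1 -> recycled: %d\n", model_put_page(&exclusive));
	printf("refcount 2 -> recycled: %d\n", model_put_page(&shared));
	return 0;
}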

However, this is not compatible with devmem TCP. For devmem TCP, the net
stack holds 2 references for each frag: 1 reference is part of the SKB,
and the second is held by the user until they call SO_DEVMEM_DONTNEED.
This breaks page_pool recycling: when the skb is freed, the reference
count drops from 2 to 1, the page_pool sees a pending reference,
releases the page, and so no devmem iovs ever get recycled.
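
Continuing the same illustrative model (again with hypothetical names,
not the real devmem TCP code path), the sequence below shows the
failure: freeing the skb drops the refcount from 2 to 1, the
elevated-refcount path releases the iov, and the later
SO_DEVMEM_DONTNEED put finds nothing left to recycle.

/* Hedged sketch of the devmem TCP failure mode described above.
 * All names are hypothetical; this is not kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

struct model_iov {
	int refcount;		/* 1 for the skb + 1 for the user */
	bool released;		/* given back to the memory provider */
	bool recycled;		/* returned to the page_pool cache */
};

static void model_put(struct model_iov *iov)
{
	if (iov->released)
		return;			/* already gone, nothing to recycle */

	if (iov->refcount == 1) {
		iov->recycled = true;	/* would go back to the pool */
		return;
	}

	/* Current behavior: elevated refcount => release, not recycle. */
	iov->refcount--;
	iov->released = true;
}

int main(void)
{
	struct model_iov iov = { .refcount = 2 };

	model_put(&iov);	/* skb freed: 2 -> 1, iov released */
	model_put(&iov);	/* SO_DEVMEM_DONTNEED: too late to recycle */

	printf("recycled: %d, released: %d\n", iov.recycled, iov.released);
	return 0;
}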

To fix this, don't release iovs on elevated refcount.

Signed-off-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Patch

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f0148d66371b..dc2a148f5b06 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -731,6 +731,29 @@  __page_pool_put_page(struct page_pool *pool, struct page *page,
 		/* Page found as candidate for recycling */
 		return page;
 	}
+
+	if (page_is_page_pool_iov(page)) {
+		/* With devmem TCP and ppiovs, we can't release pages if the
+		 * refcount is > 1. This is because the net stack holds
+		 * 2 references:
+		 *	- 1 for the skb, and
+		 *	- 1 for the user until they call SO_DEVMEM_DONTNEED.
+		 * Releasing pages for elevated refcounts completely disables
+		 * page_pool recycling. Instead, simply don't release pages and
+		 * the next call to napi_pp_put_page() via SO_DEVMEM_DONTNEED
+		 * will consider the page again for recycling. As a result,
+		 * devmem TCP is incompatible with drivers doing refcnt-based
+		 * recycling unless those drivers:
+		 *
+		 * - don't call skb_mark_for_recycle(), and
+		 * - are sure to release the last reference with
+		 *   page_pool_put_full_page() to consider the page for
+		 *   page_pool recycling.
+		 */
+		page_pool_page_put_many(page, 1);
+		return NULL;
+	}
+
 	/* Fallback/non-XDP mode: API user have elevated refcnt.
 	 *
 	 * Many drivers split up the page into fragments, and some