| Message ID | 20230612130256.4572-6-linyunsheng@huawei.com (mailing list archive) |
|---|---|
| State | Changes Requested |
| Delegated to | Netdev Maintainers |
| Series | introduce page_pool_alloc() API |
On Mon, 12 Jun 2023 21:02:56 +0800 Yunsheng Lin wrote:
> +2. page_pool_alloc_frag(): allocate memory with page splitting when driver knows
> +   that the memory it need is always smaller than or equal to half of the page
> +   allocated from page pool. Page splitting enables memory saving and thus avoid
> +   TLB/cache miss for data access, but there also is some cost to implement page
> +   splitting, mainly some cache line dirtying/bouncing for 'struct page' and
> +   atomic operation for page->pp_frag_count.
> +
> +3. page_pool_alloc(): allocate memory with or without page splitting depending
> +   on the requested memory size when driver doesn't know the size of memory it
> +   need beforehand. It is a mix of the above two case, so it is a wrapper of the
> +   above API to simplify driver's interface for memory allocation with least
> +   memory utilization and performance penalty.

Seems like the semantics of page_pool_alloc() are always better than
page_pool_alloc_frag(). Is there a reason to keep these two separate?
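For reference alongside this question, the wrapper semantics being discussed can be pictured roughly as below. This is a simplified sketch reconstructed from the descriptions in this thread (including the 'size << 1 > max_size' check mentioned later in it), not the exact code from the patch:

    #include <net/page_pool.h>

    /* Simplified sketch, not the patch itself: choose between whole-page
     * and page-splitting allocation based on the requested size.
     */
    static inline struct page *sketch_page_pool_alloc(struct page_pool *pool,
                                                      unsigned int *offset,
                                                      unsigned int *size,
                                                      gfp_t gfp)
    {
            unsigned int max_size = PAGE_SIZE << pool->p.order;

            if ((*size << 1) > max_size) {
                    /* Bigger than half a page: hand out a whole page, with
                     * no pp_frag_count atomics and no extra 'struct page'
                     * cache line dirtying on recycle.
                     */
                    *size = max_size;
                    *offset = 0;
                    return page_pool_alloc_pages(pool, gfp);
            }

            /* Small enough to share a page: page-splitting path. */
            return page_pool_alloc_frag(pool, offset, *size, gfp);
    }

The point of contention is only the first branch: a driver that knows it never takes it can call page_pool_alloc_frag() directly and skip the check.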
On 2023/6/14 12:40, Jakub Kicinski wrote:
> On Mon, 12 Jun 2023 21:02:56 +0800 Yunsheng Lin wrote:
>> +2. page_pool_alloc_frag(): allocate memory with page splitting when driver knows
>> +   that the memory it need is always smaller than or equal to half of the page
>> +   allocated from page pool. Page splitting enables memory saving and thus avoid
>> +   TLB/cache miss for data access, but there also is some cost to implement page
>> +   splitting, mainly some cache line dirtying/bouncing for 'struct page' and
>> +   atomic operation for page->pp_frag_count.
>> +
>> +3. page_pool_alloc(): allocate memory with or without page splitting depending
>> +   on the requested memory size when driver doesn't know the size of memory it
>> +   need beforehand. It is a mix of the above two case, so it is a wrapper of the
>> +   above API to simplify driver's interface for memory allocation with least
>> +   memory utilization and performance penalty.
>
> Seems like the semantics of page_pool_alloc() are always better than
> page_pool_alloc_frag(). Is there a reason to keep these two separate?

I agree the semantics of page_pool_alloc() are better; I was thinking
about combining those two as well.
The reason for keeping page_pool_alloc_frag() is NIC hardware with a
fixed buffer size for each descriptor, where that buffer size is always
smaller than or equal to half of the page allocated from the page pool,
so the driver need not bother with the 'size << 1 > max_size' check and
does not care about the actual truesize.
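To make the fixed-buffer case above concrete, a hypothetical driver (all names here are invented) could refill a descriptor like this, assuming 4KB pages and a pool created with PP_FLAG_DMA_MAP:

    #include <net/page_pool.h>

    #define FIXED_RX_BUF_SIZE       2048    /* fixed size per rx descriptor */

    /* 2KB can never exceed half of a 4KB page, so the frag API always
     * applies: no 'size << 1 > max_size' check, and truesize is simply
     * the constant FIXED_RX_BUF_SIZE. Returns the buffer's DMA address,
     * or 0 on allocation failure.
     */
    static dma_addr_t fixed_refill_one(struct page_pool *pool)
    {
            unsigned int offset;
            struct page *page;

            page = page_pool_dev_alloc_frag(pool, &offset, FIXED_RX_BUF_SIZE);
            if (!page)
                    return 0;

            return page_pool_get_dma_addr(page) + offset;
    }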
On Wed, 14 Jun 2023 20:04:39 +0800 Yunsheng Lin wrote:
>> Seems like the semantics of page_pool_alloc() are always better than
>> page_pool_alloc_frag(). Is there a reason to keep these two separate?
>
> I agree the semantics of page_pool_alloc() are better; I was thinking
> about combining those two as well.
> The reason for keeping page_pool_alloc_frag() is NIC hardware with a
> fixed buffer size for each descriptor, where that buffer size is always
> smaller than or equal to half of the page allocated from the page pool,
> so the driver need not bother with the 'size << 1 > max_size' check and
> does not care about the actual truesize.

I see. Let's reorg the documentation, then? Something along the lines
of, maybe:

The page_pool allocator is optimized for recycling page or page frag used
by skb packet and xdp frame.

Basic use involves replacing napi_alloc_frag() and alloc_pages() calls
with page_pool_alloc(). page_pool_alloc() allocates memory with or
without page splitting depending on the requested memory size.

If the driver knows that it always requires full pages or its allocates
are always smaller than half a page, it can use one of the more specific
API calls:

1. page_pool_alloc_pages(): allocate memory without page splitting when
   driver knows that the memory it need is always bigger than half of the
   page allocated from page pool. There is no cache line dirtying for
   'struct page' when a page is recycled back to the page pool.

2. page_pool_alloc_frag(): allocate memory with page splitting when driver
   knows that the memory it need is always smaller than or equal to half
   of the page allocated from page pool. Page splitting enables memory
   saving and thus avoid TLB/cache miss for data access, but there also is
   some cost to implement page splitting, mainly some cache line
   dirtying/bouncing for 'struct page' and atomic operation for
   page->pp_frag_count.
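A minimal pool-setup sketch to accompany the proposed "basic use" wording; the parameter values are illustrative, and PP_FLAG_PAGE_FRAG is the flag the frag API required at the time of this discussion:

    #include <net/page_pool.h>

    static struct page_pool *hypo_create_pool(struct device *dev)
    {
            struct page_pool_params pp_params = {
                    .flags          = PP_FLAG_DMA_MAP | PP_FLAG_PAGE_FRAG,
                    .order          = 0,            /* order-0 (4KB) pages */
                    .pool_size      = 256,          /* about the rx ring size */
                    .nid            = NUMA_NO_NODE,
                    .dev            = dev,
                    .dma_dir        = DMA_FROM_DEVICE,
            };

            /* Returns a usable pool or an ERR_PTR() on failure. */
            return page_pool_create(&pp_params);
    }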
On 2023/6/15 0:56, Jakub Kicinski wrote:
> On Wed, 14 Jun 2023 20:04:39 +0800 Yunsheng Lin wrote:
>>> Seems like the semantics of page_pool_alloc() are always better than
>>> page_pool_alloc_frag(). Is there a reason to keep these two separate?
>>
>> I agree the semantics of page_pool_alloc() are better; I was thinking
>> about combining those two as well.
>> The reason for keeping page_pool_alloc_frag() is NIC hardware with a
>> fixed buffer size for each descriptor, where that buffer size is always
>> smaller than or equal to half of the page allocated from the page pool,
>> so the driver need not bother with the 'size << 1 > max_size' check and
>> does not care about the actual truesize.
>
> I see. Let's reorg the documentation, then? Something along the lines
> of, maybe:

There is still one thing I am not sure about in the page_pool_alloc()
API: it uses *size as both input and output. I am not sure whether that
is a common practice, or whether there is a better approach than this.
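To illustrate the in/out contract in question: the caller passes in the size it needs and reads back the size the pool actually reserved, which is the value that matters for truesize accounting. A hypothetical caller, assuming the signature proposed in this series:

    #include <linux/skbuff.h>
    #include <net/page_pool.h>

    static bool hypo_add_rx_frag(struct page_pool *pool, struct sk_buff *skb,
                                 unsigned int len)
    {
            unsigned int truesize = len;    /* in: the size the caller needs */
            unsigned int offset;
            struct page *page;

            page = page_pool_dev_alloc(pool, &offset, &truesize);
            if (!page)
                    return false;

            /* out: truesize now holds what the pool really reserved
             * (possibly a whole page), the honest value for socket
             * memory accounting.
             */
            skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, offset,
                            len, truesize);
            return true;
    }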
diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index 873efd97f822..df3e28728008 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -4,12 +4,28 @@
 Page Pool API
 =============
 
-The page_pool allocator is optimized for the XDP mode that uses one frame
-per-page, but it can fallback on the regular page allocator APIs.
-
-Basic use involves replacing alloc_pages() calls with the
-page_pool_alloc_pages() call. Drivers should use page_pool_dev_alloc_pages()
-replacing dev_alloc_pages().
+The page_pool allocator is optimized for recycling page or page frag used by skb
+packet and xdp frame.
+
+Basic use involves replacing alloc_pages() calls with different page pool
+allocator API based on different use case:
+1. page_pool_alloc_pages(): allocate memory without page splitting when driver
+   knows that the memory it need is always bigger than half of the page
+   allocated from page pool. There is no cache line dirtying for 'struct page'
+   when a page is recycled back to the page pool.
+
+2. page_pool_alloc_frag(): allocate memory with page splitting when driver knows
+   that the memory it need is always smaller than or equal to half of the page
+   allocated from page pool. Page splitting enables memory saving and thus avoid
+   TLB/cache miss for data access, but there also is some cost to implement page
+   splitting, mainly some cache line dirtying/bouncing for 'struct page' and
+   atomic operation for page->pp_frag_count.
+
+3. page_pool_alloc(): allocate memory with or without page splitting depending
+   on the requested memory size when driver doesn't know the size of memory it
+   need beforehand. It is a mix of the above two case, so it is a wrapper of the
+   above API to simplify driver's interface for memory allocation with least
+   memory utilization and performance penalty.
 
 API keeps track of in-flight pages, in order to let API user
 know when it is safe to free a page_pool object. Thus, API users
@@ -93,6 +109,12 @@ a page will cause no race conditions is enough.
 * page_pool_dev_alloc_pages(): Get a page from the page allocator or
   page_pool caches.
 
+* page_pool_dev_alloc_frag(): Get a page frag from the page allocator or
+  page_pool caches.
+
+* page_pool_dev_alloc(): Get a page or page frag from the page allocator or
+  page_pool caches.
+
 * page_pool_get_dma_addr(): Retrieve the stored DMA address.
 
 * page_pool_get_dma_dir(): Retrieve the stored DMA direction.

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index f4fc339ff020..5fea37fd7767 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -5,28 +5,6 @@
  * Copyright (C) 2016 Red Hat, Inc.
  */
 
-/**
- * DOC: page_pool allocator
- *
- * This page_pool allocator is optimized for the XDP mode that
- * uses one-frame-per-page, but have fallbacks that act like the
- * regular page allocator APIs.
- *
- * Basic use involve replacing alloc_pages() calls with the
- * page_pool_alloc_pages() call. Drivers should likely use
- * page_pool_dev_alloc_pages() replacing dev_alloc_pages().
- *
- * API keeps track of in-flight pages, in-order to let API user know
- * when it is safe to dealloactor page_pool object. Thus, API users
- * must make sure to call page_pool_release_page() when a page is
- * "leaving" the page_pool. Or call page_pool_put_page() where
- * appropiate. For maintaining correct accounting.
- *
- * API user must only call page_pool_put_page() once on a page, as it
- * will either recycle the page, or in case of elevated refcnt, it
- * will release the DMA mapping and in-flight state accounting. We
- * hope to lift this requirement in the future.
- */
 #ifndef _NET_PAGE_POOL_H
 #define _NET_PAGE_POOL_H
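As a sketch tying together the helpers listed in the diff above (the descriptor layout is invented for illustration), a fill path might look like:

    #include <net/page_pool.h>

    struct hypo_rx_desc {
            __le64 addr;
            __le32 len;
    };

    static int hypo_fill_desc(struct page_pool *pool, struct hypo_rx_desc *desc,
                              unsigned int size)
    {
            unsigned int offset;
            struct page *page;

            /* Whole page or page frag, decided by the requested size. */
            page = page_pool_dev_alloc(pool, &offset, &size);
            if (!page)
                    return -ENOMEM;

            /* Pages are pre-mapped when the pool is created with
             * PP_FLAG_DMA_MAP; page_pool_get_dma_dir() reports the
             * mapping direction if the driver needs it.
             */
            desc->addr = cpu_to_le64(page_pool_get_dma_addr(page) + offset);
            desc->len = cpu_to_le32(size);
            return 0;
    }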
As more drivers begin to use the frag API, update the document about how
the driver author should decide which API to use. Also, there is a
similar document in page_pool.h, so remove it to avoid the duplication.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
---
 Documentation/networking/page_pool.rst | 34 +++++++++++++++++++++-----
 include/net/page_pool.h                | 22 -----------------
 2 files changed, 28 insertions(+), 28 deletions(-)