Message ID | 1689259687-5231-1-git-send-email-haiyangz@microsoft.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [net-next] net: mana: Add page pool for RX buffers | expand |
On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: > Add page pool for RX buffers for faster buffer cycle and reduce CPU > usage. > > Get an extra ref count of a page after allocation, so after upper > layers put the page, it's still referenced by the pool. We can reuse > it as RX buffer without alloc a new page. Please use the real page_pool API from include/net/page_pool.h We've moved past every driver reinventing the wheel, sorry.
On 14/07/2023 05.53, Jakub Kicinski wrote: > On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: >> Add page pool for RX buffers for faster buffer cycle and reduce CPU >> usage. >> >> Get an extra ref count of a page after allocation, so after upper >> layers put the page, it's still referenced by the pool. We can reuse >> it as RX buffer without alloc a new page. > > Please use the real page_pool API from include/net/page_pool.h > We've moved past every driver reinventing the wheel, sorry. +1 Quoting[1]: Documentation/networking/page_pool.rst Basic use involves replacing alloc_pages() calls with the page_pool_alloc_pages() call. Drivers should use page_pool_dev_alloc_pages() replacing dev_alloc_pages(). [1] https://kernel.org/doc/html/latest/networking/page_pool.html
> -----Original Message----- > From: Jesper Dangaard Brouer <jbrouer@redhat.com> > On 14/07/2023 05.53, Jakub Kicinski wrote: > > On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: > >> Add page pool for RX buffers for faster buffer cycle and reduce CPU > >> usage. > >> > >> Get an extra ref count of a page after allocation, so after upper > >> layers put the page, it's still referenced by the pool. We can reuse > >> it as RX buffer without alloc a new page. > > > > Please use the real page_pool API from include/net/page_pool.h > > We've moved past every driver reinventing the wheel, sorry. > > +1 > > Quoting[1]: Documentation/networking/page_pool.rst > > Basic use involves replacing alloc_pages() calls with the > page_pool_alloc_pages() call. > Drivers should use page_pool_dev_alloc_pages() replacing > dev_alloc_pages(). Thank Jakub and Jesper for the reviews. I'm aware of the page_pool.rst doc, and actually tried it before this patch, but I got lower perf. If I understand correctly, we should call page_pool_release_page() before passing the SKB to napi_gro_receive(). I found the page_pool_dev_alloc_pages() goes through the slow path, because the page_pool_release_page() let the page leave the pool. Do we have to call page_pool_release_page() before passing the SKB to napi_gro_receive()? Any better way to recycle the pages from the upper layer of non-XDP case? Thanks, - Haiyang
On 14/07/2023 14.51, Haiyang Zhang wrote: > > >> -----Original Message----- >> From: Jesper Dangaard Brouer <jbrouer@redhat.com> >> On 14/07/2023 05.53, Jakub Kicinski wrote: >>> On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: >>>> Add page pool for RX buffers for faster buffer cycle and reduce CPU >>>> usage. >>>> >>>> Get an extra ref count of a page after allocation, so after upper >>>> layers put the page, it's still referenced by the pool. We can reuse >>>> it as RX buffer without alloc a new page. >>> >>> Please use the real page_pool API from include/net/page_pool.h >>> We've moved past every driver reinventing the wheel, sorry. >> >> +1 >> >> Quoting[1]: Documentation/networking/page_pool.rst >> >> Basic use involves replacing alloc_pages() calls with the >> page_pool_alloc_pages() call. >> Drivers should use page_pool_dev_alloc_pages() replacing >> dev_alloc_pages(). > > Thank Jakub and Jesper for the reviews. > I'm aware of the page_pool.rst doc, and actually tried it before this > patch, but I got lower perf. If I understand correctly, we should call > page_pool_release_page() before passing the SKB to napi_gro_receive(). > > I found the page_pool_dev_alloc_pages() goes through the slow path, > because the page_pool_release_page() let the page leave the pool. > > Do we have to call page_pool_release_page() before passing the SKB > to napi_gro_receive()? Any better way to recycle the pages from the > upper layer of non-XDP case? > Today SKB "upper layers" can recycle page_pool backed packet data/page. Just use skb_mark_for_recycle(skb), then you don't need page_pool_release_page(). I guess, we should update the documentation, mentioning this. --Jesper
On Fri, 14 Jul 2023 15:13:15 +0200 Jesper Dangaard Brouer wrote: > > Thank Jakub and Jesper for the reviews. > > I'm aware of the page_pool.rst doc, and actually tried it before this > > patch, but I got lower perf. If I understand correctly, we should call > > page_pool_release_page() before passing the SKB to napi_gro_receive(). > > > > I found the page_pool_dev_alloc_pages() goes through the slow path, > > because the page_pool_release_page() let the page leave the pool. > > > > Do we have to call page_pool_release_page() before passing the SKB > > to napi_gro_receive()? Any better way to recycle the pages from the > > upper layer of non-XDP case? > > > > Today SKB "upper layers" can recycle page_pool backed packet data/page. > > Just use skb_mark_for_recycle(skb), then you don't need > page_pool_release_page(). > > I guess, we should update the documentation, mentioning this. Ah, I should probably send in the few cleanups form the huge page series. It looks like all users of page_pool_release_page() can be converted to skb recycling, so we should hide it and remove from docs?
> -----Original Message----- > From: Jesper Dangaard Brouer <jbrouer@redhat.com> > Sent: Friday, July 14, 2023 9:13 AM > To: Haiyang Zhang <haiyangz@microsoft.com>; Jesper Dangaard Brouer > <jbrouer@redhat.com>; Jakub Kicinski <kuba@kernel.org> > Cc: brouer@redhat.com; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; > Dexuan Cui <decui@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Paul > Rosswurm <paulros@microsoft.com>; olaf@aepfle.de; vkuznets@redhat.com; > davem@davemloft.net; wei.liu@kernel.org; edumazet@google.com; > pabeni@redhat.com; leon@kernel.org; Long Li <longli@microsoft.com>; > ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org; > daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org; > ast@kernel.org; Ajay Sharma <sharmaajay@microsoft.com>; hawk@kernel.org; > tglx@linutronix.de; shradhagupta@linux.microsoft.com; linux- > kernel@vger.kernel.org; Ilias Apalodimas <ilias.apalodimas@linaro.org> > Subject: Re: [PATCH net-next] net: mana: Add page pool for RX buffers > > [You don't often get email from jbrouer@redhat.com. Learn why this is > important at https://aka.ms/LearnAboutSenderIdentification ] > > On 14/07/2023 14.51, Haiyang Zhang wrote: > > > > > >> -----Original Message----- > >> From: Jesper Dangaard Brouer <jbrouer@redhat.com> > >> On 14/07/2023 05.53, Jakub Kicinski wrote: > >>> On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: > >>>> Add page pool for RX buffers for faster buffer cycle and reduce CPU > >>>> usage. > >>>> > >>>> Get an extra ref count of a page after allocation, so after upper > >>>> layers put the page, it's still referenced by the pool. We can reuse > >>>> it as RX buffer without alloc a new page. > >>> > >>> Please use the real page_pool API from include/net/page_pool.h > >>> We've moved past every driver reinventing the wheel, sorry. > >> > >> +1 > >> > >> Quoting[1]: Documentation/networking/page_pool.rst > >> > >> Basic use involves replacing alloc_pages() calls with the > >> page_pool_alloc_pages() call. > >> Drivers should use page_pool_dev_alloc_pages() replacing > >> dev_alloc_pages(). > > > > Thank Jakub and Jesper for the reviews. > > I'm aware of the page_pool.rst doc, and actually tried it before this > > patch, but I got lower perf. If I understand correctly, we should call > > page_pool_release_page() before passing the SKB to napi_gro_receive(). > > > > I found the page_pool_dev_alloc_pages() goes through the slow path, > > because the page_pool_release_page() let the page leave the pool. > > > > Do we have to call page_pool_release_page() before passing the SKB > > to napi_gro_receive()? Any better way to recycle the pages from the > > upper layer of non-XDP case? > > > > Today SKB "upper layers" can recycle page_pool backed packet data/page. > > Just use skb_mark_for_recycle(skb), then you don't need > page_pool_release_page(). Will do. Thanks a lot! - Haiyang
在 2023/7/14 20:51, Haiyang Zhang 写道: > > >> -----Original Message----- >> From: Jesper Dangaard Brouer <jbrouer@redhat.com> >> On 14/07/2023 05.53, Jakub Kicinski wrote: >>> On Thu, 13 Jul 2023 14:48:45 +0000 Haiyang Zhang wrote: >>>> Add page pool for RX buffers for faster buffer cycle and reduce CPU >>>> usage. >>>> >>>> Get an extra ref count of a page after allocation, so after upper >>>> layers put the page, it's still referenced by the pool. We can reuse >>>> it as RX buffer without alloc a new page. >>> >>> Please use the real page_pool API from include/net/page_pool.h >>> We've moved past every driver reinventing the wheel, sorry. >> >> +1 >> >> Quoting[1]: Documentation/networking/page_pool.rst >> >> Basic use involves replacing alloc_pages() calls with the >> page_pool_alloc_pages() call. >> Drivers should use page_pool_dev_alloc_pages() replacing >> dev_alloc_pages(). > > Thank Jakub and Jesper for the reviews. > I'm aware of the page_pool.rst doc, and actually tried it before this > patch, but I got lower perf. If I understand correctly, we should call > page_pool_release_page() before passing the SKB to napi_gro_receive(). > If I get this commit correctly, this commit is to use page pool to get better performance. IIRC, folio is to make memory optimization. From the performance results, with folio, the performance will get about 10%. So not sure if the folio can be used in this commit to get better performance. That is my 2 cent. Zhu Yanjun > I found the page_pool_dev_alloc_pages() goes through the slow path, > because the page_pool_release_page() let the page leave the pool. > > Do we have to call page_pool_release_page() before passing the SKB > to napi_gro_receive()? Any better way to recycle the pages from the > upper layer of non-XDP case? > > Thanks, > - Haiyang >
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index a499e460594b..6444a8e47852 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1507,6 +1507,34 @@ static void mana_rx_skb(void *buf_va, struct mana_rxcomp_oob *cqe, return; } +static struct page *mana_get_page_from_pool(struct mana_rxq *rxq) +{ + struct page *page; + int i; + + i = rxq->pl_last + 1; + if (i >= MANA_POOL_SIZE) + i = 0; + + rxq->pl_last = i; + + page = rxq->pool[i]; + if (page_ref_count(page) == 1) { + get_page(page); + return page; + } + + page = dev_alloc_page(); + if (page) { + put_page(rxq->pool[i]); + + get_page(page); + rxq->pool[i] = page; + } + + return page; +} + static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev, dma_addr_t *da, bool is_napi) { @@ -1533,7 +1561,7 @@ static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev, return NULL; } } else { - page = dev_alloc_page(); + page = mana_get_page_from_pool(rxq); if (!page) return NULL; @@ -1873,6 +1901,21 @@ static int mana_create_txq(struct mana_port_context *apc, return err; } +static void mana_release_rxq_pool(struct mana_rxq *rxq) +{ + struct page *page; + int i; + + for (i = 0; i < MANA_POOL_SIZE; i++) { + page = rxq->pool[i]; + + if (page) + put_page(page); + + rxq->pool[i] = NULL; + } +} + static void mana_destroy_rxq(struct mana_port_context *apc, struct mana_rxq *rxq, bool validate_state) @@ -1917,6 +1960,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc, rx_oob->buf_va = NULL; } + mana_release_rxq_pool(rxq); + if (rxq->gdma_rq) mana_gd_destroy_queue(gc, rxq->gdma_rq); @@ -2008,6 +2053,27 @@ static int mana_push_wqe(struct mana_rxq *rxq) return 0; } +static int mana_alloc_rxq_pool(struct mana_rxq *rxq) +{ + struct page *page; + int i; + + for (i = 0; i < MANA_POOL_SIZE; i++) { + page = dev_alloc_page(); + if (!page) + goto err; + + rxq->pool[i] = page; + } + + return 0; + +err: + mana_release_rxq_pool(rxq); + + return -ENOMEM; +} + static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, u32 rxq_idx, struct mana_eq *eq, struct net_device *ndev) @@ -2029,6 +2095,11 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, if (!rxq) return NULL; + if (mana_alloc_rxq_pool(rxq)) { + kfree(rxq); + return NULL; + } + rxq->ndev = ndev; rxq->num_rx_buf = RX_BUFFERS_PER_QUEUE; rxq->rxq_idx = rxq_idx; diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 024ad8ddb27e..8f1f09f9e4ab 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -297,6 +297,8 @@ struct mana_recv_buf_oob { #define MANA_XDP_MTU_MAX (PAGE_SIZE - MANA_RXBUF_PAD - XDP_PACKET_HEADROOM) +#define MANA_POOL_SIZE (RX_BUFFERS_PER_QUEUE * 2) + struct mana_rxq { struct gdma_queue *gdma_rq; /* Cache the gdma receive queue id */ @@ -330,6 +332,9 @@ struct mana_rxq { bool xdp_flush; int xdp_rc; /* XDP redirect return code */ + struct page *pool[MANA_POOL_SIZE]; + int pl_last; + /* MUST BE THE LAST MEMBER: * Each receive buffer has an associated mana_recv_buf_oob. */
Add page pool for RX buffers for faster buffer cycle and reduce CPU usage. Get an extra ref count of a page after allocation, so after upper layers put the page, it's still referenced by the pool. We can reuse it as RX buffer without alloc a new page. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> --- drivers/net/ethernet/microsoft/mana/mana_en.c | 73 ++++++++++++++++++- include/net/mana/mana.h | 5 ++ 2 files changed, 77 insertions(+), 1 deletion(-)