Message ID | 1681334163-31084-3-git-send-email-haiyangz@microsoft.com (mailing list archive)
State      | Not Applicable
Series     | net: mana: Add support for jumbo frame
On Wed, Apr 12, 2023 at 02:16:01PM -0700, Haiyang Zhang wrote:
> Move out common buffer allocation code from mana_process_rx_cqe() and
> mana_alloc_rx_wqe() to helper functions.
> Refactor related variables so they can be changed in one place, and buffer
> sizes are in sync.
>
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
> V3:
> Refactored to multiple patches for readability. Suggested by Jacob Keller.
>
> V2:
> Refactored to multiple patches for readability. Suggested by Yunsheng Lin.
>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 154 ++++++++++--------
>  include/net/mana/mana.h                       |   6 +-
>  2 files changed, 91 insertions(+), 69 deletions(-)

<...>

> +static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
> +                             dma_addr_t *da, bool is_napi)
> +{
> +        struct page *page;
> +        void *va;
> +
> +        /* Reuse XDP dropped page if available */
> +        if (rxq->xdp_save_va) {
> +                va = rxq->xdp_save_va;
> +                rxq->xdp_save_va = NULL;
> +        } else {
> +                page = dev_alloc_page();

Documentation/networking/page_pool.rst
 10 Basic use involves replacing alloc_pages() calls with the
 11 page_pool_alloc_pages() call. Drivers should use page_pool_dev_alloc_pages()
 12 replacing dev_alloc_pages().

General question, is this sentence applicable to all new code or only
for XDP related paths?

Thanks
> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Thursday, April 13, 2023 9:04 AM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; Dexuan Cui
> <decui@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Paul Rosswurm
> <paulros@microsoft.com>; olaf@aepfle.de; vkuznets@redhat.com;
> davem@davemloft.net; wei.liu@kernel.org; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; Long Li <longli@microsoft.com>;
> ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org;
> daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org;
> ast@kernel.org; Ajay Sharma <sharmaajay@microsoft.com>;
> hawk@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH V3,net-next, 2/4] net: mana: Refactor RX buffer allocation code to prepare for various MTU
>
> On Wed, Apr 12, 2023 at 02:16:01PM -0700, Haiyang Zhang wrote:
> > Move out common buffer allocation code from mana_process_rx_cqe() and
> > mana_alloc_rx_wqe() to helper functions.
> > Refactor related variables so they can be changed in one place, and buffer
> > sizes are in sync.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > ---
> > V3:
> > Refactored to multiple patches for readability. Suggested by Jacob Keller.
> >
> > V2:
> > Refactored to multiple patches for readability. Suggested by Yunsheng Lin.
> >
> > ---
> >  drivers/net/ethernet/microsoft/mana/mana_en.c | 154 ++++++++++--------
> >  include/net/mana/mana.h                       |   6 +-
> >  2 files changed, 91 insertions(+), 69 deletions(-)
>
> <...>
>
> > +static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
> > +                             dma_addr_t *da, bool is_napi)
> > +{
> > +        struct page *page;
> > +        void *va;
> > +
> > +        /* Reuse XDP dropped page if available */
> > +        if (rxq->xdp_save_va) {
> > +                va = rxq->xdp_save_va;
> > +                rxq->xdp_save_va = NULL;
> > +        } else {
> > +                page = dev_alloc_page();
>
> Documentation/networking/page_pool.rst
>  10 Basic use involves replacing alloc_pages() calls with the
>  11 page_pool_alloc_pages() call. Drivers should use
> page_pool_dev_alloc_pages()
>  12 replacing dev_alloc_pages().
>
> General question, is this sentence applicable to all new code or only
> for XDP related paths?

Quote from the context before that sentence --

=============
Page Pool API
=============
The page_pool allocator is optimized for the XDP mode that uses one frame
per-page, but it can fallback on the regular page allocator APIs.
Basic use involves replacing alloc_pages() calls with the
page_pool_alloc_pages() call. Drivers should use page_pool_dev_alloc_pages()
replacing dev_alloc_pages().

--unquote

So the page pool is optimized for the XDP, and that sentence is applicable to
drivers that have set up a page pool for XDP optimization.

static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool) // needs a pool to have been set up

Back to our mana driver, we don't have page pool setup yet. (will consider in
the future) So we cannot call page_pool_dev_alloc_pages(pool) in this place yet.

Thanks,
- Haiyang
On Thu, Apr 13, 2023 at 02:03:50PM +0000, Haiyang Zhang wrote:
>
>
> > -----Original Message-----
> > From: Leon Romanovsky <leon@kernel.org>
> > Sent: Thursday, April 13, 2023 9:04 AM
> > To: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; Dexuan Cui
> > <decui@microsoft.com>; KY Srinivasan <kys@microsoft.com>; Paul Rosswurm
> > <paulros@microsoft.com>; olaf@aepfle.de; vkuznets@redhat.com;
> > davem@davemloft.net; wei.liu@kernel.org; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; Long Li <longli@microsoft.com>;
> > ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org;
> > daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org;
> > ast@kernel.org; Ajay Sharma <sharmaajay@microsoft.com>;
> > hawk@kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH V3,net-next, 2/4] net: mana: Refactor RX buffer allocation code to prepare for various MTU
> >
> > On Wed, Apr 12, 2023 at 02:16:01PM -0700, Haiyang Zhang wrote:
> > > Move out common buffer allocation code from mana_process_rx_cqe() and
> > > mana_alloc_rx_wqe() to helper functions.
> > > Refactor related variables so they can be changed in one place, and buffer
> > > sizes are in sync.
> > >
> > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > > Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > > ---
> > > V3:
> > > Refactored to multiple patches for readability. Suggested by Jacob Keller.
> > >
> > > V2:
> > > Refactored to multiple patches for readability. Suggested by Yunsheng Lin.
> > >
> > > ---
> > >  drivers/net/ethernet/microsoft/mana/mana_en.c | 154 ++++++++++--------
> > >  include/net/mana/mana.h                       |   6 +-
> > >  2 files changed, 91 insertions(+), 69 deletions(-)
> >
> > <...>
> >
> > > +static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
> > > +                             dma_addr_t *da, bool is_napi)
> > > +{
> > > +        struct page *page;
> > > +        void *va;
> > > +
> > > +        /* Reuse XDP dropped page if available */
> > > +        if (rxq->xdp_save_va) {
> > > +                va = rxq->xdp_save_va;
> > > +                rxq->xdp_save_va = NULL;
> > > +        } else {
> > > +                page = dev_alloc_page();
> >
> > Documentation/networking/page_pool.rst
> >  10 Basic use involves replacing alloc_pages() calls with the
> >  11 page_pool_alloc_pages() call. Drivers should use
> > page_pool_dev_alloc_pages()
> >  12 replacing dev_alloc_pages().
> >
> > General question, is this sentence applicable to all new code or only
> > for XDP related paths?
>
> Quote from the context before that sentence --
>
> =============
> Page Pool API
> =============
> The page_pool allocator is optimized for the XDP mode that uses one frame
> per-page, but it can fallback on the regular page allocator APIs.
> Basic use involves replacing alloc_pages() calls with the
> page_pool_alloc_pages() call. Drivers should use page_pool_dev_alloc_pages()
> replacing dev_alloc_pages().
>
> --unquote
>
> So the page pool is optimized for the XDP, and that sentence is applicable to drivers
> that have set up a page pool for XDP optimization.

"but it can fallback on the regular page allocator APIs."

> static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool) // needs a pool to have been set up
>
> Back to our mana driver, we don't have page pool setup yet. (will consider in the future)
> So we cannot call page_pool_dev_alloc_pages(pool) in this place yet.

ok, thanks

>
> Thanks,
> - Haiyang
>
On Thu, 13 Apr 2023 19:30:59 +0300 Leon Romanovsky wrote:
> > So the page pool is optimized for the XDP, and that sentence is applicable to drivers
> > that have set up a page pool for XDP optimization.
>
> "but it can fallback on the regular page allocator APIs."

The XDP thing is historic AFAIU, page_pool has been expanded to cover
all use cases, and is "just better" (tm) than using the page allocator
directly. Maybe we should update the doc.
On Thu, Apr 13, 2023 at 09:40:03AM -0700, Jakub Kicinski wrote:
> On Thu, 13 Apr 2023 19:30:59 +0300 Leon Romanovsky wrote:
> > > So the page pool is optimized for the XDP, and that sentence is applicable to drivers
> > > that have set up a page pool for XDP optimization.
> >
> > "but it can fallback on the regular page allocator APIs."
>
> The XDP thing is historic AFAIU, page_pool has been expanded to cover
> all use cases, and is "just better" (tm) than using the page allocator
> directly. Maybe we should update the doc.

The last sentence is always true :)
On Thu, 13 Apr 2023 19:59:29 +0300 Leon Romanovsky wrote:
>> Maybe we should update the doc.
>
> The last sentence is always true :)

Yeah, I felt bad when writing it :)
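For readers following this subthread, here is a minimal sketch of the page_pool
usage the reviewers are referring to. It is illustrative only -- the helper name,
the pool size, and the placement are made up, and the mana driver had not adopted
page_pool at the time of this thread; the header location has also moved between
kernel versions.

        #include <net/page_pool.h>
        #include <linux/dma-mapping.h>

        /* Hypothetical per-RXQ pool, created once at queue setup time. */
        static struct page_pool *example_create_rx_pool(struct device *dev, int nid)
        {
                struct page_pool_params pp = {
                        .order     = 0,                 /* one page per RX frame */
                        .flags     = PP_FLAG_DMA_MAP,   /* pool maps pages for DMA */
                        .pool_size = 512,               /* e.g. buffers per queue */
                        .nid       = nid,
                        .dev       = dev,
                        .dma_dir   = DMA_FROM_DEVICE,
                };

                return page_pool_create(&pp);           /* ERR_PTR() on failure */
        }

        /* In the RX refill path, instead of dev_alloc_page():
         *      page = page_pool_dev_alloc_pages(pool);
         *      da   = page_pool_get_dma_addr(page);
         * and on the drop/recycle path, instead of put_page():
         *      page_pool_put_full_page(pool, page, true);
         */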
On Wed, 12 Apr 2023 14:16:01 -0700 Haiyang Zhang wrote: > +/* Allocate frag for rx buffer, and save the old buf */ > +static void mana_refill_rxoob(struct device *dev, struct mana_rxq *rxq, The fill function is spelled with a _ between rx and oob, please be consistent. > + struct mana_recv_buf_oob *rxoob, void **old_buf) > +{ > + dma_addr_t da; > + void *va; > + > + va = mana_get_rxfrag(rxq, dev, &da, true); > + no empty lines between function call and error check. Please fix this in all the code this patch set is touching. > + if (!va) > + return;
On Wed, 12 Apr 2023 14:16:01 -0700 Haiyang Zhang wrote:
> +/* Allocate frag for rx buffer, and save the old buf */
> +static void mana_refill_rxoob(struct device *dev, struct mana_rxq *rxq,

The fill function is spelled with a _ between rx and oob,
please be consistent.

> +                              struct mana_recv_buf_oob *rxoob, void **old_buf)
> +{
> +        dma_addr_t da;
> +        void *va;
> +
> +        va = mana_get_rxfrag(rxq, dev, &da, true);
> +

no empty lines between function call and error check.
Please fix this in all the code this patch set is touching.

> +        if (!va)
> +                return;
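For reference, the style being requested is simply to drop the blank line so the
error check follows the call directly; applied to the helper quoted above, the
refill path would read:

        va = mana_get_rxfrag(rxq, dev, &da, true);
        if (!va)        /* check immediately after the call, no blank line */
                return;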
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 112c642dc89b..911954ff84ee 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1282,14 +1282,64 @@ static void mana_rx_skb(void *buf_va, struct mana_rxcomp_oob *cqe,
         u64_stats_update_end(&rx_stats->syncp);
 
 drop:
-        WARN_ON_ONCE(rxq->xdp_save_page);
-        rxq->xdp_save_page = virt_to_page(buf_va);
+        WARN_ON_ONCE(rxq->xdp_save_va);
+        /* Save for reuse */
+        rxq->xdp_save_va = buf_va;
 
         ++ndev->stats.rx_dropped;
 
         return;
 }
 
+static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
+                             dma_addr_t *da, bool is_napi)
+{
+        struct page *page;
+        void *va;
+
+        /* Reuse XDP dropped page if available */
+        if (rxq->xdp_save_va) {
+                va = rxq->xdp_save_va;
+                rxq->xdp_save_va = NULL;
+        } else {
+                page = dev_alloc_page();
+                if (!page)
+                        return NULL;
+
+                va = page_to_virt(page);
+        }
+
+        *da = dma_map_single(dev, va + XDP_PACKET_HEADROOM, rxq->datasize,
+                             DMA_FROM_DEVICE);
+
+        if (dma_mapping_error(dev, *da)) {
+                put_page(virt_to_head_page(va));
+                return NULL;
+        }
+
+        return va;
+}
+
+/* Allocate frag for rx buffer, and save the old buf */
+static void mana_refill_rxoob(struct device *dev, struct mana_rxq *rxq,
+                              struct mana_recv_buf_oob *rxoob, void **old_buf)
+{
+        dma_addr_t da;
+        void *va;
+
+        va = mana_get_rxfrag(rxq, dev, &da, true);
+
+        if (!va)
+                return;
+
+        dma_unmap_single(dev, rxoob->sgl[0].address, rxq->datasize,
+                         DMA_FROM_DEVICE);
+        *old_buf = rxoob->buf_va;
+
+        rxoob->buf_va = va;
+        rxoob->sgl[0].address = da;
+}
+
 static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
                                 struct gdma_comp *cqe)
 {
@@ -1299,10 +1349,8 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
         struct mana_recv_buf_oob *rxbuf_oob;
         struct mana_port_context *apc;
         struct device *dev = gc->dev;
-        void *new_buf, *old_buf;
-        struct page *new_page;
+        void *old_buf = NULL;
         u32 curr, pktlen;
-        dma_addr_t da;
 
         apc = netdev_priv(ndev);
 
@@ -1345,40 +1393,11 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
         rxbuf_oob = &rxq->rx_oobs[curr];
         WARN_ON_ONCE(rxbuf_oob->wqe_inf.wqe_size_in_bu != 1);
 
-        /* Reuse XDP dropped page if available */
-        if (rxq->xdp_save_page) {
-                new_page = rxq->xdp_save_page;
-                rxq->xdp_save_page = NULL;
-        } else {
-                new_page = alloc_page(GFP_ATOMIC);
-        }
-
-        if (new_page) {
-                da = dma_map_page(dev, new_page, XDP_PACKET_HEADROOM, rxq->datasize,
-                                  DMA_FROM_DEVICE);
-
-                if (dma_mapping_error(dev, da)) {
-                        __free_page(new_page);
-                        new_page = NULL;
-                }
-        }
-
-        new_buf = new_page ? page_to_virt(new_page) : NULL;
-
-        if (new_buf) {
-                dma_unmap_page(dev, rxbuf_oob->buf_dma_addr, rxq->datasize,
-                               DMA_FROM_DEVICE);
-
-                old_buf = rxbuf_oob->buf_va;
-
-                /* refresh the rxbuf_oob with the new page */
-                rxbuf_oob->buf_va = new_buf;
-                rxbuf_oob->buf_dma_addr = da;
-                rxbuf_oob->sgl[0].address = rxbuf_oob->buf_dma_addr;
-        } else {
-                old_buf = NULL; /* drop the packet if no memory */
-        }
+        mana_refill_rxoob(dev, rxq, rxbuf_oob, &old_buf);
+        /* Unsuccessful refill will have old_buf == NULL.
+         * In this case, mana_rx_skb() will drop the packet.
+         */
 
         mana_rx_skb(old_buf, oob, rxq);
 
 drop:
@@ -1659,8 +1678,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
 
         mana_deinit_cq(apc, &rxq->rx_cq);
 
-        if (rxq->xdp_save_page)
-                __free_page(rxq->xdp_save_page);
+        if (rxq->xdp_save_va)
+                put_page(virt_to_head_page(rxq->xdp_save_va));
 
         for (i = 0; i < rxq->num_rx_buf; i++) {
                 rx_oob = &rxq->rx_oobs[i];
@@ -1668,10 +1687,10 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
                 if (!rx_oob->buf_va)
                         continue;
 
-                dma_unmap_page(dev, rx_oob->buf_dma_addr, rxq->datasize,
-                               DMA_FROM_DEVICE);
+                dma_unmap_single(dev, rx_oob->sgl[0].address,
+                                 rx_oob->sgl[0].size, DMA_FROM_DEVICE);
 
-                free_page((unsigned long)rx_oob->buf_va);
+                put_page(virt_to_head_page(rx_oob->buf_va));
                 rx_oob->buf_va = NULL;
         }
 
@@ -1681,6 +1700,26 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
         kfree(rxq);
 }
 
+static int mana_fill_rx_oob(struct mana_recv_buf_oob *rx_oob, u32 mem_key,
+                            struct mana_rxq *rxq, struct device *dev)
+{
+        dma_addr_t da;
+        void *va;
+
+        va = mana_get_rxfrag(rxq, dev, &da, false);
+
+        if (!va)
+                return -ENOMEM;
+
+        rx_oob->buf_va = va;
+
+        rx_oob->sgl[0].address = da;
+        rx_oob->sgl[0].size = rxq->datasize;
+        rx_oob->sgl[0].mem_key = mem_key;
+
+        return 0;
+}
+
 #define MANA_WQE_HEADER_SIZE 16
 #define MANA_WQE_SGE_SIZE 16
 
@@ -1690,9 +1729,8 @@ static int mana_alloc_rx_wqe(struct mana_port_context *apc,
         struct gdma_context *gc = apc->ac->gdma_dev->gdma_context;
         struct mana_recv_buf_oob *rx_oob;
         struct device *dev = gc->dev;
-        struct page *page;
-        dma_addr_t da;
         u32 buf_idx;
+        int ret;
 
         WARN_ON(rxq->datasize == 0 || rxq->datasize > PAGE_SIZE);
 
@@ -1703,25 +1741,12 @@ static int mana_alloc_rx_wqe(struct mana_port_context *apc,
                 rx_oob = &rxq->rx_oobs[buf_idx];
                 memset(rx_oob, 0, sizeof(*rx_oob));
 
-                page = alloc_page(GFP_KERNEL);
-                if (!page)
-                        return -ENOMEM;
-
-                da = dma_map_page(dev, page, XDP_PACKET_HEADROOM, rxq->datasize,
-                                  DMA_FROM_DEVICE);
-
-                if (dma_mapping_error(dev, da)) {
-                        __free_page(page);
-                        return -ENOMEM;
-                }
-
-                rx_oob->buf_va = page_to_virt(page);
-                rx_oob->buf_dma_addr = da;
-
                 rx_oob->num_sge = 1;
-                rx_oob->sgl[0].address = rx_oob->buf_dma_addr;
-                rx_oob->sgl[0].size = rxq->datasize;
-                rx_oob->sgl[0].mem_key = apc->ac->gdma_dev->gpa_mkey;
+
+                ret = mana_fill_rx_oob(rx_oob, apc->ac->gdma_dev->gpa_mkey, rxq,
+                                       dev);
+                if (ret)
+                        return ret;
 
                 rx_oob->wqe_req.sgl = rx_oob->sgl;
                 rx_oob->wqe_req.num_sge = rx_oob->num_sge;
@@ -1780,9 +1805,10 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc,
         rxq->ndev = ndev;
         rxq->num_rx_buf = RX_BUFFERS_PER_QUEUE;
         rxq->rxq_idx = rxq_idx;
-        rxq->datasize = ALIGN(MAX_FRAME_SIZE, 64);
         rxq->rxobj = INVALID_MANA_HANDLE;
 
+        rxq->datasize = ALIGN(ETH_FRAME_LEN, 64);
+
         err = mana_alloc_rx_wqe(apc, rxq, &rq_size, &cq_size);
         if (err)
                 goto out;
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index bb11a6535d80..037bcabf6b98 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -36,9 +36,6 @@ enum TRI_STATE {
 
 #define COMP_ENTRY_SIZE 64
 
-#define ADAPTER_MTU_SIZE 1500
-#define MAX_FRAME_SIZE (ADAPTER_MTU_SIZE + 14)
-
 #define RX_BUFFERS_PER_QUEUE 512
 
 #define MAX_SEND_BUFFERS_PER_QUEUE 256
@@ -282,7 +279,6 @@ struct mana_recv_buf_oob {
         struct gdma_wqe_request wqe_req;
 
         void *buf_va;
-        dma_addr_t buf_dma_addr;
 
         /* SGL of the buffer going to be sent has part of the work request. */
         u32 num_sge;
@@ -322,7 +318,7 @@ struct mana_rxq {
 
         struct bpf_prog __rcu *bpf_prog;
         struct xdp_rxq_info xdp_rxq;
-        struct page *xdp_save_page;
+        void *xdp_save_va; /* for reusing */
         bool xdp_flush;
         int xdp_rc; /* XDP redirect return code */
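A note on the rxq->datasize change in the hunk above: ETH_FRAME_LEN is 1514, the
same value the removed MAX_FRAME_SIZE macro produced (ADAPTER_MTU_SIZE 1500 plus
the 14-byte Ethernet header), so the computed RX buffer size is unchanged by this
refactoring patch:

        /* Old: ALIGN(MAX_FRAME_SIZE, 64) = ALIGN(1500 + 14, 64) = 1536 */
        /* New: ALIGN(ETH_FRAME_LEN, 64)  = ALIGN(1514, 64)      = 1536 */
        rxq->datasize = ALIGN(ETH_FRAME_LEN, 64);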