Message ID | 20240911145555.318605-2-ap420073@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | bnxt_en: implement tcp-data-split ethtool command | expand |
On 9/11/2024 7:55 AM, Taehee Yoo wrote: > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > The bnxt_en driver supports rx-copybreak, but it couldn't be set by > userspace. Only the default value(256) has worked. > This patch makes the bnxt_en driver support following command. > `ethtool --set-tunable <devname> rx-copybreak <value> ` and > `ethtool --get-tunable <devname> rx-copybreak`. > > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > --- > > v2: > - Define max/vim rx_copybreak value. > > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 24 ++++++---- > drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 ++- > .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 47 ++++++++++++++++++- > 3 files changed, 66 insertions(+), 11 deletions(-) > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c <snip> > +static void bnxt_init_ring_params(struct bnxt *bp) > +{ > + bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK; > +} > + > /* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must > * be set on entry. > */ > @@ -4465,7 +4470,6 @@ void bnxt_set_ring_params(struct bnxt *bp) > rx_space = rx_size + ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) + > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > > - bp->rx_copy_thresh = BNXT_RX_COPY_THRESH; > ring_size = bp->rx_ring_size; > bp->rx_agg_ring_size = 0; > bp->rx_agg_nr_pages = 0; > @@ -4510,7 +4514,8 @@ void bnxt_set_ring_params(struct bnxt *bp) > ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) - > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > } else { > - rx_size = SKB_DATA_ALIGN(BNXT_RX_COPY_THRESH + NET_IP_ALIGN); > + rx_size = SKB_DATA_ALIGN(bp->rx_copybreak + > + NET_IP_ALIGN); Tiny nit, but why did you wrap NET_IP_ALIGN to the next line? > rx_space = rx_size + NET_SKB_PAD + > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > } > @@ -6424,8 +6429,8 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic) > VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6); > req->enables |= > cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID); > - req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh); > - req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh); > + req->jumbo_thresh = cpu_to_le16(bp->rx_copybreak); > + req->hds_threshold = cpu_to_le16(bp->rx_copybreak); > } > req->vnic_id = cpu_to_le32(vnic->fw_vnic_id); > return hwrm_req_send(bp, req); > @@ -15864,6 +15869,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) > bnxt_init_l2_fltr_tbl(bp); > bnxt_set_rx_skb_mode(bp, false); > bnxt_set_tpa_flags(bp); > + bnxt_init_ring_params(bp); > bnxt_set_ring_params(bp); > bnxt_rdma_aux_device_init(bp); > rc = bnxt_set_dflt_rings(bp, true); <snip> > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > index f71cc8188b4e..201c3fcba04e 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > @@ -4319,6 +4319,49 @@ static int bnxt_get_eee(struct net_device *dev, struct ethtool_keee *edata) > return 0; > } > > +static int bnxt_set_tunable(struct net_device *dev, > + const struct ethtool_tunable *tuna, > + const void *data) > +{ > + struct bnxt *bp = netdev_priv(dev); > + u32 rx_copybreak; > + > + switch (tuna->id) { > + case ETHTOOL_RX_COPYBREAK: > + rx_copybreak = *(u32 *)data; > + if (rx_copybreak < BNXT_MIN_RX_COPYBREAK || > + rx_copybreak > BNXT_MAX_RX_COPYBREAK) > + return -EINVAL; > + if (rx_copybreak != bp->rx_copybreak) { > + bp->rx_copybreak = rx_copybreak; Should bp->rx_copybreak get set before closing the interface in the netif_running case? Can changing this while traffic is running cause any unexpected issues? I wonder if this would be better/safer? if (netif_running(dev)) { bnxt_close_nic(bp, false, false); bp->rx_copybreak = rx_copybreak; bnxt_set_ring_params(bp); bnxt_open_nic(bp, false, false); } else { bp->rx_copybreak = rx_copybreak; } Thanks, Brett > + if (netif_running(dev)) { > + bnxt_close_nic(bp, false, false); > + bnxt_set_ring_params(bp); > + bnxt_open_nic(bp, false, false); > + } > + } > + return 0; > + default: > + return -EOPNOTSUPP; > + } > +} > + <snip>
On Thu, Sep 12, 2024 at 12:36 AM Brett Creeley <bcreeley@amd.com> wrote: Hi Brett, Thank you so much for your review! > > > > On 9/11/2024 7:55 AM, Taehee Yoo wrote: > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > The bnxt_en driver supports rx-copybreak, but it couldn't be set by > > userspace. Only the default value(256) has worked. > > This patch makes the bnxt_en driver support following command. > > `ethtool --set-tunable <devname> rx-copybreak <value> ` and > > `ethtool --get-tunable <devname> rx-copybreak`. > > > > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > > --- > > > > v2: > > - Define max/vim rx_copybreak value. > > > > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 24 ++++++---- > > drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 ++- > > .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 47 ++++++++++++++++++- > > 3 files changed, 66 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > > <snip> > > > +static void bnxt_init_ring_params(struct bnxt *bp) > > +{ > > + bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK; > > +} > > + > > /* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must > > * be set on entry. > > */ > > @@ -4465,7 +4470,6 @@ void bnxt_set_ring_params(struct bnxt *bp) > > rx_space = rx_size + ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) + > > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > > > > - bp->rx_copy_thresh = BNXT_RX_COPY_THRESH; > > ring_size = bp->rx_ring_size; > > bp->rx_agg_ring_size = 0; > > bp->rx_agg_nr_pages = 0; > > @@ -4510,7 +4514,8 @@ void bnxt_set_ring_params(struct bnxt *bp) > > ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) - > > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > > } else { > > - rx_size = SKB_DATA_ALIGN(BNXT_RX_COPY_THRESH + NET_IP_ALIGN); > > + rx_size = SKB_DATA_ALIGN(bp->rx_copybreak + > > + NET_IP_ALIGN); > > Tiny nit, but why did you wrap NET_IP_ALIGN to the next line? Because It exceeds 80 characters line. > > > rx_space = rx_size + NET_SKB_PAD + > > SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > > } > > @@ -6424,8 +6429,8 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic) > > VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6); > > req->enables |= > > cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID); > > - req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh); > > - req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh); > > + req->jumbo_thresh = cpu_to_le16(bp->rx_copybreak); > > + req->hds_threshold = cpu_to_le16(bp->rx_copybreak); > > } > > req->vnic_id = cpu_to_le32(vnic->fw_vnic_id); > > return hwrm_req_send(bp, req); > > @@ -15864,6 +15869,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) > > bnxt_init_l2_fltr_tbl(bp); > > bnxt_set_rx_skb_mode(bp, false); > > bnxt_set_tpa_flags(bp); > > + bnxt_init_ring_params(bp); > > bnxt_set_ring_params(bp); > > bnxt_rdma_aux_device_init(bp); > > rc = bnxt_set_dflt_rings(bp, true); > > <snip> > > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > > index f71cc8188b4e..201c3fcba04e 100644 > > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c > > @@ -4319,6 +4319,49 @@ static int bnxt_get_eee(struct net_device *dev, struct ethtool_keee *edata) > > return 0; > > } > > > > +static int bnxt_set_tunable(struct net_device *dev, > > + const struct ethtool_tunable *tuna, > > + const void *data) > > +{ > > + struct bnxt *bp = netdev_priv(dev); > > + u32 rx_copybreak; > > + > > + switch (tuna->id) { > > + case ETHTOOL_RX_COPYBREAK: > > + rx_copybreak = *(u32 *)data; > > + if (rx_copybreak < BNXT_MIN_RX_COPYBREAK || > > + rx_copybreak > BNXT_MAX_RX_COPYBREAK) > > + return -EINVAL; > > + if (rx_copybreak != bp->rx_copybreak) { > > + bp->rx_copybreak = rx_copybreak; > > Should bp->rx_copybreak get set before closing the interface in the > netif_running case? Can changing this while traffic is running cause any > unexpected issues? > > I wonder if this would be better/safer? > > if (netif_running(dev)) { > bnxt_close_nic(bp, false, false); > bp->rx_copybreak = rx_copybreak; > bnxt_set_ring_params(bp); > bnxt_open_nic(bp, false, false); > } else { > bp->rx_copybreak = rx_copybreak; > } I think your suggestion is much safer! I will use your suggestion in the v3 patch. > > Thanks, > > Brett > > > + if (netif_running(dev)) { > > + bnxt_close_nic(bp, false, false); > > + bnxt_set_ring_params(bp); > > + bnxt_open_nic(bp, false, false); > > + } > > + } > > + return 0; > > + default: > > + return -EOPNOTSUPP; > > + } > > +} > > + > > <snip> Thanks a lot! Taehee Yoo
On Thu, 12 Sep 2024 00:53:31 +0900 Taehee Yoo wrote: > > if (netif_running(dev)) { > > bnxt_close_nic(bp, false, false); > > bp->rx_copybreak = rx_copybreak; > > bnxt_set_ring_params(bp); > > bnxt_open_nic(bp, false, false); > > } else { > > bp->rx_copybreak = rx_copybreak; > > } > > I think your suggestion is much safer! > I will use your suggestion in the v3 patch. This is better but Andy mentioned on another thread that queue reset should work, so instead of full close / open maybe we can just do: for (/* all Rx queues */) { bnxt_queue_stop(); bnxt_queue_start(); } when the device is already running?
On Wed, Sep 11, 2024 at 5:22 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 12 Sep 2024 00:53:31 +0900 Taehee Yoo wrote: > > > if (netif_running(dev)) { > > > bnxt_close_nic(bp, false, false); > > > bp->rx_copybreak = rx_copybreak; > > > bnxt_set_ring_params(bp); > > > bnxt_open_nic(bp, false, false); > > > } else { > > > bp->rx_copybreak = rx_copybreak; > > > } > > > > I think your suggestion is much safer! > > I will use your suggestion in the v3 patch. > > This is better but Andy mentioned on another thread that queue reset > should work, so instead of full close / open maybe we can just do: > > for (/* all Rx queues */) { > bnxt_queue_stop(); > bnxt_queue_start(); > } > > when the device is already running? If the copybreak value changes, I don't think queue restart will work. We need to size and allocate the buffers differently than before, so I think we need to do close/open.
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 6e422e24750a..8da211e083a4 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -81,7 +81,6 @@ MODULE_DESCRIPTION("Broadcom NetXtreme network driver"); #define BNXT_RX_OFFSET (NET_SKB_PAD + NET_IP_ALIGN) #define BNXT_RX_DMA_OFFSET NET_SKB_PAD -#define BNXT_RX_COPY_THRESH 256 #define BNXT_TX_PUSH_THRESH 164 @@ -1330,13 +1329,13 @@ static struct sk_buff *bnxt_copy_data(struct bnxt_napi *bnapi, u8 *data, if (!skb) return NULL; - dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copy_thresh, + dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copybreak, bp->rx_dir); memcpy(skb->data - NET_IP_ALIGN, data - NET_IP_ALIGN, len + NET_IP_ALIGN); - dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copy_thresh, + dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copybreak, bp->rx_dir); skb_put(skb, len); @@ -1829,7 +1828,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp, return NULL; } - if (len <= bp->rx_copy_thresh) { + if (len <= bp->rx_copybreak) { skb = bnxt_copy_skb(bnapi, data_ptr, len, mapping); if (!skb) { bnxt_abort_tpa(cpr, idx, agg_bufs); @@ -1931,6 +1930,7 @@ static void bnxt_deliver_skb(struct bnxt *bp, struct bnxt_napi *bnapi, bnxt_vf_rep_rx(bp, skb); return; } + skb_record_rx_queue(skb, bnapi->index); napi_gro_receive(&bnapi->napi, skb); } @@ -2162,7 +2162,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, } } - if (len <= bp->rx_copy_thresh) { + if (len <= bp->rx_copybreak) { if (!xdp_active) skb = bnxt_copy_skb(bnapi, data_ptr, len, dma_addr); else @@ -4451,6 +4451,11 @@ void bnxt_set_tpa_flags(struct bnxt *bp) bp->flags |= BNXT_FLAG_GRO; } +static void bnxt_init_ring_params(struct bnxt *bp) +{ + bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK; +} + /* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must * be set on entry. */ @@ -4465,7 +4470,6 @@ void bnxt_set_ring_params(struct bnxt *bp) rx_space = rx_size + ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - bp->rx_copy_thresh = BNXT_RX_COPY_THRESH; ring_size = bp->rx_ring_size; bp->rx_agg_ring_size = 0; bp->rx_agg_nr_pages = 0; @@ -4510,7 +4514,8 @@ void bnxt_set_ring_params(struct bnxt *bp) ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); } else { - rx_size = SKB_DATA_ALIGN(BNXT_RX_COPY_THRESH + NET_IP_ALIGN); + rx_size = SKB_DATA_ALIGN(bp->rx_copybreak + + NET_IP_ALIGN); rx_space = rx_size + NET_SKB_PAD + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); } @@ -6424,8 +6429,8 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic) VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6); req->enables |= cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID); - req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh); - req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh); + req->jumbo_thresh = cpu_to_le16(bp->rx_copybreak); + req->hds_threshold = cpu_to_le16(bp->rx_copybreak); } req->vnic_id = cpu_to_le32(vnic->fw_vnic_id); return hwrm_req_send(bp, req); @@ -15864,6 +15869,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) bnxt_init_l2_fltr_tbl(bp); bnxt_set_rx_skb_mode(bp, false); bnxt_set_tpa_flags(bp); + bnxt_init_ring_params(bp); bnxt_set_ring_params(bp); bnxt_rdma_aux_device_init(bp); rc = bnxt_set_dflt_rings(bp, true); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 69231e85140b..cff031993223 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -34,6 +34,10 @@ #include <linux/firmware/broadcom/tee_bnxt_fw.h> #endif +#define BNXT_DEFAULT_RX_COPYBREAK 256 +#define BNXT_MIN_RX_COPYBREAK 65 +#define BNXT_MAX_RX_COPYBREAK 1024 + extern struct list_head bnxt_block_cb_list; struct page_pool; @@ -2299,7 +2303,7 @@ struct bnxt { enum dma_data_direction rx_dir; u32 rx_ring_size; u32 rx_agg_ring_size; - u32 rx_copy_thresh; + u32 rx_copybreak; u32 rx_ring_mask; u32 rx_agg_ring_mask; int rx_nr_pages; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index f71cc8188b4e..201c3fcba04e 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -4319,6 +4319,49 @@ static int bnxt_get_eee(struct net_device *dev, struct ethtool_keee *edata) return 0; } +static int bnxt_set_tunable(struct net_device *dev, + const struct ethtool_tunable *tuna, + const void *data) +{ + struct bnxt *bp = netdev_priv(dev); + u32 rx_copybreak; + + switch (tuna->id) { + case ETHTOOL_RX_COPYBREAK: + rx_copybreak = *(u32 *)data; + if (rx_copybreak < BNXT_MIN_RX_COPYBREAK || + rx_copybreak > BNXT_MAX_RX_COPYBREAK) + return -EINVAL; + if (rx_copybreak != bp->rx_copybreak) { + bp->rx_copybreak = rx_copybreak; + if (netif_running(dev)) { + bnxt_close_nic(bp, false, false); + bnxt_set_ring_params(bp); + bnxt_open_nic(bp, false, false); + } + } + return 0; + default: + return -EOPNOTSUPP; + } +} + +static int bnxt_get_tunable(struct net_device *dev, + const struct ethtool_tunable *tuna, void *data) +{ + struct bnxt *bp = netdev_priv(dev); + + switch (tuna->id) { + case ETHTOOL_RX_COPYBREAK: + *(u32 *)data = bp->rx_copybreak; + break; + default: + return -EOPNOTSUPP; + } + + return 0; +} + static int bnxt_read_sfp_module_eeprom_info(struct bnxt *bp, u16 i2c_addr, u16 page_number, u8 bank, u16 start_addr, u16 data_length, @@ -4769,7 +4812,7 @@ static int bnxt_run_loopback(struct bnxt *bp) cpr = &rxr->bnapi->cp_ring; if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) cpr = rxr->rx_cpr; - pkt_size = min(bp->dev->mtu + ETH_HLEN, bp->rx_copy_thresh); + pkt_size = min(bp->dev->mtu + ETH_HLEN, bp->rx_copybreak); skb = netdev_alloc_skb(bp->dev, pkt_size); if (!skb) return -ENOMEM; @@ -5342,6 +5385,8 @@ const struct ethtool_ops bnxt_ethtool_ops = { .get_link_ext_stats = bnxt_get_link_ext_stats, .get_eee = bnxt_get_eee, .set_eee = bnxt_set_eee, + .get_tunable = bnxt_get_tunable, + .set_tunable = bnxt_set_tunable, .get_module_info = bnxt_get_module_info, .get_module_eeprom = bnxt_get_module_eeprom, .get_module_eeprom_by_page = bnxt_get_module_eeprom_by_page,
The bnxt_en driver supports rx-copybreak, but it couldn't be set by userspace. Only the default value(256) has worked. This patch makes the bnxt_en driver support following command. `ethtool --set-tunable <devname> rx-copybreak <value> ` and `ethtool --get-tunable <devname> rx-copybreak`. Signed-off-by: Taehee Yoo <ap420073@gmail.com> --- v2: - Define max/vim rx_copybreak value. drivers/net/ethernet/broadcom/bnxt/bnxt.c | 24 ++++++---- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 ++- .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 47 ++++++++++++++++++- 3 files changed, 66 insertions(+), 11 deletions(-)