Message ID | 20250214035037.650291-1-mohsin.bashr@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next,V2] eth: fbnic: Add ethtool support for IRQ coalescing | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net-next |
netdev/apply | fail | Patch does not apply to net-next-0 |
On Thu, Feb 13, 2025 at 07:50:37PM -0800, Mohsin Bashir wrote: > Add ethtool support to configure the IRQ coalescing behavior. Support > separate timers for Rx and Tx for time based coalescing. For frame based > configuration, currently we only support the Rx side. > > The hardware allows configuration of descriptor count instead of frame > count requiring conversion between the two. We assume 2 descriptors > per frame, one for the metadata and one for the data segment. > > When rx-frames are not configured, we set the RX descriptor count to > half the ring size as a fail safe. > > Default configuration: > ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" > rx-usecs: 30 > rx-frames: 0 > tx-usecs: 35 > > IRQ rate test: > With single iperf flow we monitor IRQ rate while changing the tx-usesc and > rx-usecs to high and low values. > > ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 > irq/sec 13k > irq/sec 14k > irq/sec 14k > > ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 > irq/sec 27k > irq/sec 28k > irq/sec 28k > > Validating the use of extack: > ethtool -C eth0 rx-frames 16384 > netlink error: fbnic: rx_frames is above device max > netlink error: Invalid argument Nice, thanks. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Andrew
On Thu, Feb 13, 2025 at 07:50:37PM -0800, Mohsin Bashir wrote: > Add ethtool support to configure the IRQ coalescing behavior. Support > separate timers for Rx and Tx for time based coalescing. For frame based > configuration, currently we only support the Rx side. > > The hardware allows configuration of descriptor count instead of frame > count requiring conversion between the two. We assume 2 descriptors > per frame, one for the metadata and one for the data segment. > > When rx-frames are not configured, we set the RX descriptor count to > half the ring size as a fail safe. > > Default configuration: > ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" > rx-usecs: 30 > rx-frames: 0 > tx-usecs: 35 > > IRQ rate test: > With single iperf flow we monitor IRQ rate while changing the tx-usesc and > rx-usecs to high and low values. > > ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 > irq/sec 13k > irq/sec 14k > irq/sec 14k > > ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 > irq/sec 27k > irq/sec 28k > irq/sec 28k > > Validating the use of extack: > ethtool -C eth0 rx-frames 16384 > netlink error: fbnic: rx_frames is above device max > netlink error: Invalid argument > > Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Hi Moshin, Unfortunately this does not seem to apply cleanly against current net-next. Could you rebase and repost? Thanks!
On 2/13/2025 7:50 PM, Mohsin Bashir wrote: > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > Add ethtool support to configure the IRQ coalescing behavior. Support > separate timers for Rx and Tx for time based coalescing. For frame based > configuration, currently we only support the Rx side. > > The hardware allows configuration of descriptor count instead of frame > count requiring conversion between the two. We assume 2 descriptors > per frame, one for the metadata and one for the data segment. > > When rx-frames are not configured, we set the RX descriptor count to > half the ring size as a fail safe. > > Default configuration: > ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" > rx-usecs: 30 > rx-frames: 0 > tx-usecs: 35 > > IRQ rate test: > With single iperf flow we monitor IRQ rate while changing the tx-usesc and > rx-usecs to high and low values. > > ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 > irq/sec 13k > irq/sec 14k > irq/sec 14k > > ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 > irq/sec 27k > irq/sec 28k > irq/sec 28k > > Validating the use of extack: > ethtool -C eth0 rx-frames 16384 > netlink error: fbnic: rx_frames is above device max > netlink error: Invalid argument > > Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> > --- > V2: > - Update fbnic_set_coalesce() to use extack to highlight incorrect config > - Simplify fbnic_config_rx_frames() > V1: https://lore.kernel.org/netdev/20250212234946.2536116-1-mohsin.bashr@gmail.com > --- > drivers/net/ethernet/meta/fbnic/fbnic.h | 3 + > .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 59 +++++++++++++++++++ > .../net/ethernet/meta/fbnic/fbnic_netdev.c | 4 ++ > .../net/ethernet/meta/fbnic/fbnic_netdev.h | 6 ++ > drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 53 ++++++++++++++--- > drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 3 + > 6 files changed, 120 insertions(+), 8 deletions(-) > LGTM. Reviewed-by: Brett Creeley <brett.creeley@amd.com> <snip>
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic.h b/drivers/net/ethernet/meta/fbnic/fbnic.h index 14751f16e125..548e882381ce 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic.h +++ b/drivers/net/ethernet/meta/fbnic/fbnic.h @@ -180,6 +180,9 @@ void fbnic_dbg_exit(void); void fbnic_csr_get_regs(struct fbnic_dev *fbd, u32 *data, u32 *regs_version); int fbnic_csr_regs_len(struct fbnic_dev *fbd); +void fbnic_config_txrx_usecs(struct fbnic_napi_vector *nv, u32 arm); +void fbnic_config_rx_frames(struct fbnic_napi_vector *nv); + enum fbnic_boards { fbnic_board_asic }; diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c index 20cd9f5f89e2..54c216148d12 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c @@ -135,6 +135,61 @@ static void fbnic_clone_free(struct fbnic_net *clone) kfree(clone); } +static int fbnic_get_coalesce(struct net_device *netdev, + struct ethtool_coalesce *ec, + struct kernel_ethtool_coalesce *kernel_coal, + struct netlink_ext_ack *extack) +{ + struct fbnic_net *fbn = netdev_priv(netdev); + + ec->tx_coalesce_usecs = fbn->tx_usecs; + ec->rx_coalesce_usecs = fbn->rx_usecs; + ec->rx_max_coalesced_frames = fbn->rx_max_frames; + + return 0; +} + +static int fbnic_set_coalesce(struct net_device *netdev, + struct ethtool_coalesce *ec, + struct kernel_ethtool_coalesce *kernel_coal, + struct netlink_ext_ack *extack) +{ + struct fbnic_net *fbn = netdev_priv(netdev); + + /* Verify against hardware limits */ + if (ec->rx_coalesce_usecs > FIELD_MAX(FBNIC_INTR_CQ_REARM_RCQ_TIMEOUT)) { + NL_SET_ERR_MSG_MOD(extack, "rx_usecs is above device max"); + return -EINVAL; + } + if (ec->tx_coalesce_usecs > FIELD_MAX(FBNIC_INTR_CQ_REARM_TCQ_TIMEOUT)) { + NL_SET_ERR_MSG_MOD(extack, "tx_usecs is above device max"); + return -EINVAL; + } + if (ec->rx_max_coalesced_frames > + FIELD_MAX(FBNIC_QUEUE_RIM_THRESHOLD_RCD_MASK) / + FBNIC_MIN_RXD_PER_FRAME) { + NL_SET_ERR_MSG_MOD(extack, "rx_frames is above device max"); + return -EINVAL; + } + + fbn->tx_usecs = ec->tx_coalesce_usecs; + fbn->rx_usecs = ec->rx_coalesce_usecs; + fbn->rx_max_frames = ec->rx_max_coalesced_frames; + + if (netif_running(netdev)) { + int i; + + for (i = 0; i < fbn->num_napi; i++) { + struct fbnic_napi_vector *nv = fbn->napi[i]; + + fbnic_config_txrx_usecs(nv, 0); + fbnic_config_rx_frames(nv); + } + } + + return 0; +} + static void fbnic_get_strings(struct net_device *dev, u32 sset, u8 *data) { int i; @@ -586,9 +641,13 @@ fbnic_get_eth_mac_stats(struct net_device *netdev, } static const struct ethtool_ops fbnic_ethtool_ops = { + .supported_coalesce_params = ETHTOOL_COALESCE_USECS | + ETHTOOL_COALESCE_RX_MAX_FRAMES, .get_drvinfo = fbnic_get_drvinfo, .get_regs_len = fbnic_get_regs_len, .get_regs = fbnic_get_regs, + .get_coalesce = fbnic_get_coalesce, + .set_coalesce = fbnic_set_coalesce, .get_strings = fbnic_get_strings, .get_ethtool_stats = fbnic_get_ethtool_stats, .get_sset_count = fbnic_get_sset_count, diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c index 1db57c42333e..8b6be6b60945 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c @@ -618,6 +618,10 @@ struct net_device *fbnic_netdev_alloc(struct fbnic_dev *fbd) fbn->ppq_size = FBNIC_PPQ_SIZE_DEFAULT; fbn->rcq_size = FBNIC_RCQ_SIZE_DEFAULT; + fbn->tx_usecs = FBNIC_TX_USECS_DEFAULT; + fbn->rx_usecs = FBNIC_RX_USECS_DEFAULT; + fbn->rx_max_frames = FBNIC_RX_FRAMES_DEFAULT; + default_queues = netif_get_num_default_rss_queues(); if (default_queues > fbd->max_num_queues) default_queues = fbd->max_num_queues; diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h index a392ac1cc4f2..46af92a8f781 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h @@ -12,6 +12,7 @@ #include "fbnic_txrx.h" #define FBNIC_MAX_NAPI_VECTORS 128u +#define FBNIC_MIN_RXD_PER_FRAME 2 struct fbnic_net { struct fbnic_ring *tx[FBNIC_MAX_TXQS]; @@ -27,6 +28,11 @@ struct fbnic_net { u32 ppq_size; u32 rcq_size; + u16 rx_usecs; + u16 tx_usecs; + + u32 rx_max_frames; + u16 num_napi; struct phylink *phylink; diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c index d4d7027df9a0..7e0bd11f05d2 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c @@ -2010,9 +2010,51 @@ static void fbnic_config_drop_mode_rcq(struct fbnic_napi_vector *nv, fbnic_ring_wr32(rcq, FBNIC_QUEUE_RDE_CTL0, rcq_ctl); } +static void fbnic_config_rim_threshold(struct fbnic_ring *rcq, u16 nv_idx, u32 rx_desc) +{ + u32 threshold; + + /* Set the threhsold to half the ring size if rx_frames + * is not configured + */ + threshold = rx_desc ? : rcq->size_mask / 2; + + fbnic_ring_wr32(rcq, FBNIC_QUEUE_RIM_CTL, nv_idx); + fbnic_ring_wr32(rcq, FBNIC_QUEUE_RIM_THRESHOLD, threshold); +} + +void fbnic_config_txrx_usecs(struct fbnic_napi_vector *nv, u32 arm) +{ + struct fbnic_net *fbn = netdev_priv(nv->napi.dev); + struct fbnic_dev *fbd = nv->fbd; + u32 val = arm; + + val |= FIELD_PREP(FBNIC_INTR_CQ_REARM_RCQ_TIMEOUT, fbn->rx_usecs) | + FBNIC_INTR_CQ_REARM_RCQ_TIMEOUT_UPD_EN; + val |= FIELD_PREP(FBNIC_INTR_CQ_REARM_TCQ_TIMEOUT, fbn->tx_usecs) | + FBNIC_INTR_CQ_REARM_TCQ_TIMEOUT_UPD_EN; + + fbnic_wr32(fbd, FBNIC_INTR_CQ_REARM(nv->v_idx), val); +} + +void fbnic_config_rx_frames(struct fbnic_napi_vector *nv) +{ + struct fbnic_net *fbn = netdev_priv(nv->napi.dev); + int i; + + for (i = nv->txt_count; i < nv->rxt_count + nv->txt_count; i++) { + struct fbnic_q_triad *qt = &nv->qt[i]; + + fbnic_config_rim_threshold(&qt->cmpl, nv->v_idx, + fbn->rx_max_frames * + FBNIC_MIN_RXD_PER_FRAME); + } +} + static void fbnic_enable_rcq(struct fbnic_napi_vector *nv, struct fbnic_ring *rcq) { + struct fbnic_net *fbn = netdev_priv(nv->napi.dev); u32 log_size = fls(rcq->size_mask); u32 rcq_ctl; @@ -2040,8 +2082,8 @@ static void fbnic_enable_rcq(struct fbnic_napi_vector *nv, fbnic_ring_wr32(rcq, FBNIC_QUEUE_RCQ_SIZE, log_size & 0xf); /* Store interrupt information for the completion queue */ - fbnic_ring_wr32(rcq, FBNIC_QUEUE_RIM_CTL, nv->v_idx); - fbnic_ring_wr32(rcq, FBNIC_QUEUE_RIM_THRESHOLD, rcq->size_mask / 2); + fbnic_config_rim_threshold(rcq, nv->v_idx, fbn->rx_max_frames * + FBNIC_MIN_RXD_PER_FRAME); fbnic_ring_wr32(rcq, FBNIC_QUEUE_RIM_MASK, 0); /* Enable queue */ @@ -2080,12 +2122,7 @@ void fbnic_enable(struct fbnic_net *fbn) static void fbnic_nv_irq_enable(struct fbnic_napi_vector *nv) { - struct fbnic_dev *fbd = nv->fbd; - u32 val; - - val = FBNIC_INTR_CQ_REARM_INTR_UNMASK; - - fbnic_wr32(fbd, FBNIC_INTR_CQ_REARM(nv->v_idx), val); + fbnic_config_txrx_usecs(nv, FBNIC_INTR_CQ_REARM_INTR_UNMASK); } void fbnic_napi_enable(struct fbnic_net *fbn) diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h index c2a94f31f71b..483e11e8bf39 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h @@ -31,6 +31,9 @@ struct fbnic_net; #define FBNIC_HPQ_SIZE_DEFAULT 256 #define FBNIC_PPQ_SIZE_DEFAULT 256 #define FBNIC_RCQ_SIZE_DEFAULT 1024 +#define FBNIC_TX_USECS_DEFAULT 35 +#define FBNIC_RX_USECS_DEFAULT 30 +#define FBNIC_RX_FRAMES_DEFAULT 0 #define FBNIC_RX_TROOM \ SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
Add ethtool support to configure the IRQ coalescing behavior. Support separate timers for Rx and Tx for time based coalescing. For frame based configuration, currently we only support the Rx side. The hardware allows configuration of descriptor count instead of frame count requiring conversion between the two. We assume 2 descriptors per frame, one for the metadata and one for the data segment. When rx-frames are not configured, we set the RX descriptor count to half the ring size as a fail safe. Default configuration: ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" rx-usecs: 30 rx-frames: 0 tx-usecs: 35 IRQ rate test: With single iperf flow we monitor IRQ rate while changing the tx-usesc and rx-usecs to high and low values. ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 irq/sec 13k irq/sec 14k irq/sec 14k ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 irq/sec 27k irq/sec 28k irq/sec 28k Validating the use of extack: ethtool -C eth0 rx-frames 16384 netlink error: fbnic: rx_frames is above device max netlink error: Invalid argument Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> --- V2: - Update fbnic_set_coalesce() to use extack to highlight incorrect config - Simplify fbnic_config_rx_frames() V1: https://lore.kernel.org/netdev/20250212234946.2536116-1-mohsin.bashr@gmail.com --- drivers/net/ethernet/meta/fbnic/fbnic.h | 3 + .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 59 +++++++++++++++++++ .../net/ethernet/meta/fbnic/fbnic_netdev.c | 4 ++ .../net/ethernet/meta/fbnic/fbnic_netdev.h | 6 ++ drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 53 ++++++++++++++--- drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 3 + 6 files changed, 120 insertions(+), 8 deletions(-)