Message ID | 20250305162132.1106080-7-aleksander.lobakin@intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | idpf: add XDP support | expand |
On Wed, Mar 05, 2025 at 05:21:22PM +0100, Alexander Lobakin wrote: > Currently, the maximum number of queues available for one vport is 16. > This is hardcoded, but then the function calculating the optimal number > of queues takes min(16, num_online_cpus()). > On order to be able to allocate more queues, which will be then used for nit: s/On/In > XDP, stop hardcoding 16 and rely on what the device gives us. Instead of > num_online_cpus(), which is considered suboptimal since at least 2013, > use netif_get_num_default_rss_queues() to still have free queues in the > pool. Should we update older drivers as well? > nr_cpu_ids number of Tx queues are needed only for lockless XDP sending, > the regular stack doesn't benefit from that anyhow. > On a 128-thread Xeon, this now gives me 32 regular Tx queues and leaves > 224 free for XDP (128 of which will handle XDP_TX, .ndo_xdp_xmit(), and > XSk xmit when enabled). > > Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> > --- > drivers/net/ethernet/intel/idpf/idpf_txrx.c | 8 +------- > drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 2 +- > 2 files changed, 2 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c > index c15833928ea1..2f221c0abad8 100644 > --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c > +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c > @@ -1234,13 +1234,7 @@ int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx, > num_req_tx_qs = vport_config->user_config.num_req_tx_qs; > num_req_rx_qs = vport_config->user_config.num_req_rx_qs; > } else { > - int num_cpus; > - > - /* Restrict num of queues to cpus online as a default > - * configuration to give best performance. User can always > - * override to a max number of queues via ethtool. > - */ > - num_cpus = num_online_cpus(); > + u32 num_cpus = netif_get_num_default_rss_queues(); > > dflt_splitq_txq_grps = min_t(int, max_q->max_txq, num_cpus); > dflt_singleq_txqs = min_t(int, max_q->max_txq, num_cpus); > diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c > index 3d2413b8684f..135af3cc243f 100644 > --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c > +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c > @@ -937,7 +937,7 @@ int idpf_vport_alloc_max_qs(struct idpf_adapter *adapter, > max_tx_q = le16_to_cpu(caps->max_tx_q) / default_vports; > if (adapter->num_alloc_vports < default_vports) { > max_q->max_rxq = min_t(u16, max_rx_q, IDPF_MAX_Q); > - max_q->max_txq = min_t(u16, max_tx_q, IDPF_MAX_Q); > + max_q->max_txq = min_t(u16, max_tx_q, IDPF_LARGE_MAX_Q); > } else { > max_q->max_rxq = IDPF_MIN_Q; > max_q->max_txq = IDPF_MIN_Q; > -- > 2.48.1 >
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Date: Fri, 7 Mar 2025 11:32:15 +0100 > On Wed, Mar 05, 2025 at 05:21:22PM +0100, Alexander Lobakin wrote: >> Currently, the maximum number of queues available for one vport is 16. >> This is hardcoded, but then the function calculating the optimal number >> of queues takes min(16, num_online_cpus()). >> On order to be able to allocate more queues, which will be then used for > > nit: s/On/In Also "use a saner limit", not "a use saner limit" in the subject =\ > >> XDP, stop hardcoding 16 and rely on what the device gives us. Instead of >> num_online_cpus(), which is considered suboptimal since at least 2013, >> use netif_get_num_default_rss_queues() to still have free queues in the >> pool. > > Should we update older drivers as well? That would be good. For idpf, this is particularly important since the current logic eats 128 Tx queues for skb traffic on my Xeon out of 256 available by default (per vport). On a 256-thread system, it would eat the whole limit, leaving nothing for XDP >_< ice doesn't have a per-port limit IIRC. > >> nr_cpu_ids number of Tx queues are needed only for lockless XDP sending, >> the regular stack doesn't benefit from that anyhow. >> On a 128-thread Xeon, this now gives me 32 regular Tx queues and leaves >> 224 free for XDP (128 of which will handle XDP_TX, .ndo_xdp_xmit(), and >> XSk xmit when enabled). Thanks, Olek
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c index c15833928ea1..2f221c0abad8 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c @@ -1234,13 +1234,7 @@ int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx, num_req_tx_qs = vport_config->user_config.num_req_tx_qs; num_req_rx_qs = vport_config->user_config.num_req_rx_qs; } else { - int num_cpus; - - /* Restrict num of queues to cpus online as a default - * configuration to give best performance. User can always - * override to a max number of queues via ethtool. - */ - num_cpus = num_online_cpus(); + u32 num_cpus = netif_get_num_default_rss_queues(); dflt_splitq_txq_grps = min_t(int, max_q->max_txq, num_cpus); dflt_singleq_txqs = min_t(int, max_q->max_txq, num_cpus); diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index 3d2413b8684f..135af3cc243f 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -937,7 +937,7 @@ int idpf_vport_alloc_max_qs(struct idpf_adapter *adapter, max_tx_q = le16_to_cpu(caps->max_tx_q) / default_vports; if (adapter->num_alloc_vports < default_vports) { max_q->max_rxq = min_t(u16, max_rx_q, IDPF_MAX_Q); - max_q->max_txq = min_t(u16, max_tx_q, IDPF_MAX_Q); + max_q->max_txq = min_t(u16, max_tx_q, IDPF_LARGE_MAX_Q); } else { max_q->max_rxq = IDPF_MIN_Q; max_q->max_txq = IDPF_MIN_Q;
Currently, the maximum number of queues available for one vport is 16. This is hardcoded, but then the function calculating the optimal number of queues takes min(16, num_online_cpus()). On order to be able to allocate more queues, which will be then used for XDP, stop hardcoding 16 and rely on what the device gives us. Instead of num_online_cpus(), which is considered suboptimal since at least 2013, use netif_get_num_default_rss_queues() to still have free queues in the pool. nr_cpu_ids number of Tx queues are needed only for lockless XDP sending, the regular stack doesn't benefit from that anyhow. On a 128-thread Xeon, this now gives me 32 regular Tx queues and leaves 224 free for XDP (128 of which will handle XDP_TX, .ndo_xdp_xmit(), and XSk xmit when enabled). Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> --- drivers/net/ethernet/intel/idpf/idpf_txrx.c | 8 +------- drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 2 +- 2 files changed, 2 insertions(+), 8 deletions(-)