diff mbox series

[net-next,RFC,2/2] virtio-net: free old xmit skbs only in NAPI

Message ID 20250122061600.16781-3-jasowang@redhat.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series don't clean packets in start_xmit in TX NAPI mode

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning CHECK: spaces preferred around that '+' (ctx:VxV)
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Jason Wang Jan. 22, 2025, 6:16 a.m. UTC
When NAPI mode is enabled, we try to free old transmitted packets
before sending a packet. This has several side effects:

- transmitted packets need to be freed before sending a new one, which
  introduces delay and increases the average packet transmit time.
- the TX lock is held longer, which causes more TX lock contention
  with the TX NAPI poller.

These issues are more noticeable with a fast backend such as
vhost-user/DPDK. This patch avoids them by not cleaning transmitted
packets in start_xmit() when TX NAPI is enabled. Notification is
disabled at the beginning of start_xmit(), but delayed notification
cannot be enabled after TX has been stopped; instead, delayed
notification is enabled only when the virtqueue needs to be kicked.

Performance numbers:

1) pktgen_sample03_burst_single_flow.sh (burst 256) + testpmd (rxonly)
   on the host:

- When pinning the TX IRQ to the pktgen VCPU: split virtqueue PPS
  increased by 62%, from 6.45 Mpps to 10.5 Mpps; packed virtqueue PPS
  increased by 60%, from 7.8 Mpps to 12.5 Mpps.
- When pinning the TX IRQ to a VCPU other than pktgen's: split
  virtqueue PPS increased by 25%, from 6.15 Mpps to 7.7 Mpps; packed
  virtqueue PPS increased by 50.6%, from 8.3 Mpps to 12.5 Mpps.

2) Netperf:

- Netperf in guest + vhost-net/TAP on the host doesn't show obvious
  differences.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 2f6c3dc68ba0..3d5c44546dc1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3271,15 +3271,10 @@  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	bool use_napi = sq->napi.weight;
 	bool kick;
 
-	/* Free up any pending old buffers before queueing new ones. */
-	do {
-		if (use_napi)
-			virtqueue_disable_cb(sq->vq);
-
+	if (!use_napi)
 		free_old_xmit(sq, txq, false);
-
-	} while (use_napi && !xmit_more &&
-	       unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
+	else
+		virtqueue_disable_cb(sq->vq);
 
 	/* timestamp packet in software */
 	skb_tx_timestamp(skb);
@@ -3305,7 +3300,18 @@  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		nf_reset_ct(skb);
 	}
 
-	check_sq_full_and_disable(vi, dev, sq);
+	if (tx_may_stop(vi, dev, sq) && !use_napi &&
+	    unlikely(virtqueue_enable_cb_delayed(sq->vq))) {
+		/* More just got used, free them then recheck. */
+		free_old_xmit(sq, txq, false);
+		if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
+			netif_start_subqueue(dev, qnum);
+			u64_stats_update_begin(&sq->stats.syncp);
+			u64_stats_inc(&sq->stats.wake);
+			u64_stats_update_end(&sq->stats.syncp);
+			virtqueue_disable_cb(sq->vq);
+		}
+	}
 
 	kick = use_napi ? __netdev_tx_sent_queue(txq, skb->len, xmit_more) :
 			  !xmit_more || netif_xmit_stopped(txq);
@@ -3317,6 +3323,9 @@  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 	}
 
+	if (use_napi && kick && unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
+		virtqueue_napi_schedule(&sq->napi, sq->vq);
+
 	return NETDEV_TX_OK;
 }