diff mbox series

[net,3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program

Message ID 20240919084104.661180-4-wei.fang@nxp.com (mailing list archive)
State New
Delegated to: Netdev Maintainers
Headers show
Series net: enetc: fix some issues of XDP | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 11 of 11 maintainers
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 17 this patch: 17
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-19--12-00 (tests: 763)

Commit Message

Wei Fang Sept. 19, 2024, 8:41 a.m. UTC
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
on LS1028A, it was found that if the command was re-run multiple times,
Rx could not receive the frames, and the result of xdo-bench showed
that the rx rate was 0.

root@ls1028ardb:~# ./xdp-bench tx eno0
Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
Summary                      2046 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s

By observing the Rx PIR and CIR registers, we found that CIR is always
equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
is full and can no longer accommodate other Rx frames. Therefore, it
is obvious that the RX BD ring has not been cleaned up.

Further analysis of the code revealed that the Rx BD ring will only
be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
Therefore, some debug logs were added to the driver and the current
values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
BD ring was full. The logs are as follows.

[  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
[  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
[  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110

From the results, we can see that the maximum value of xdp_tx_in_flight
has reached 2140. However, the size of the Rx BD ring is only 2048. This
is incredible, so checked the code again and found that the driver did
not reset xdp_tx_in_flight when installing or uninstalling bpf program,
resulting in xdp_tx_in_flight still retaining the value after the last
command was run.

Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Vladimir Oltean Sept. 19, 2024, 1:21 p.m. UTC | #1
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> on LS1028A, it was found that if the command was re-run multiple times,
> Rx could not receive the frames, and the result of xdo-bench showed
> that the rx rate was 0.
> 
> root@ls1028ardb:~# ./xdp-bench tx eno0
> Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> Summary                      2046 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> 
> By observing the Rx PIR and CIR registers, we found that CIR is always
> equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> is full and can no longer accommodate other Rx frames. Therefore, it
> is obvious that the RX BD ring has not been cleaned up.
> 
> Further analysis of the code revealed that the Rx BD ring will only
> be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> Therefore, some debug logs were added to the driver and the current
> values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
> BD ring was full. The logs are as follows.
> 
> [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> 
> From the results, we can see that the maximum value of xdp_tx_in_flight
> has reached 2140. However, the size of the Rx BD ring is only 2048. This
> is incredible, so checked the code again and found that the driver did
> not reset xdp_tx_in_flight when installing or uninstalling bpf program,
> resulting in xdp_tx_in_flight still retaining the value after the last
> command was run.
> 
> Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")

This does not explain why enetc_recycle_xdp_tx_buff(), which decreases
xdp_tx_in_flight, does not get called?

In patch 2/3 you wrote:

| Tx BD rings are disabled first in enetc_stop() and then
| wait for empty. This operation is not safe while the Tx BD ring
| is actively transmitting frames, and will cause the ring to not
| be empty and hardware exception. As described in the block guide
| of LS1028A NETC, software should only disable an active ring after
| all pending ring entries have been consumed (i.e. when PI = CI).
| Disabling a transmit ring that is actively processing BDs risks
| a HW-SW race hazard whereby a hardware resource becomes assigned
| to work on one or more ring entries only to have those entries be
| removed due to the ring becoming disabled. So the correct behavior
| is that the software stops putting frames on the Tx BD rings (this
| is what ENETC_TX_DOWN does), then waits for the Tx BD rings to be
| empty, and finally disables the Tx BD rings.

I'm surprised that after fixing that, this change would still be needed,
rather than xdp_tx_in_flight naturally dropping down to 0 when stopping
NAPI. Why doesn't that happen, and what happens to the pending XDP_TX
buffers?
diff mbox series

Patch

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 5830c046cb7d..3cff76923ab9 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2769,6 +2769,7 @@  static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 	for (i = 0; i < priv->num_rx_rings; i++) {
 		struct enetc_bdr *rx_ring = priv->rx_ring[i];
 
+		rx_ring->xdp.xdp_tx_in_flight = 0;
 		rx_ring->xdp.prog = prog;
 
 		if (prog)