
[net,3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program

Message ID 20240919084104.661180-4-wei.fang@nxp.com (mailing list archive)
State Superseded
Series net: enetc: fix some issues of XDP

Commit Message

Wei Fang Sept. 19, 2024, 8:41 a.m. UTC
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
on LS1028A, it was found that if the command was re-run multiple times,
Rx could not receive frames, and the output of xdp-bench showed
that the rx rate was 0.

root@ls1028ardb:~# ./xdp-bench tx eno0
Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
Summary                      2046 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s
Summary                         0 rx/s                  0 err,drop/s

By observing the Rx PIR and CIR registers, we found that CIR is always
equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
is full and can no longer accommodate other Rx frames. Therefore, it
is obvious that the RX BD ring has not been cleaned up.

Further analysis of the code revealed that the Rx BD ring will only
be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
Therefore, some debug logs were added to the driver and the current
values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
BD ring was full. The logs are as follows.

[  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
[  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
[  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110

From the results, we can see that the maximum value of xdp_tx_in_flight
has reached 2140. However, the size of the Rx BD ring is only 2048. This
should be impossible, so we checked the code again and found that the
driver does not reset xdp_tx_in_flight when installing or uninstalling a
bpf program, so xdp_tx_in_flight still retains the value left over from
the previous run of the command.

Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
 1 file changed, 1 insertion(+)
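
For readers unfamiliar with the driver, the Rx refill gating described in the
commit message can be pictured with the simplified sketch below. This is not
the actual enetc source; the structure layout and the refill helper are
approximations for illustration only.

/* Simplified sketch (not the real driver code): Rx descriptors are only
 * refilled when cleaned_cnt exceeds xdp_tx_in_flight, because that many
 * Rx buffers are still owned by XDP_TX frames awaiting Tx completion.
 */
static void rx_refill_sketch(struct enetc_bdr *rx_ring, int cleaned_cnt)
{
	/* If xdp_tx_in_flight is never reset across a bpf program update,
	 * it can exceed the ring size (2140 > 2048 in the logs above), the
	 * condition below never becomes true, the ring is never
	 * replenished, and the rx rate drops to 0.
	 */
	if (cleaned_cnt > rx_ring->xdp.xdp_tx_in_flight)
		refill_rx_ring_sketch(rx_ring, cleaned_cnt); /* placeholder helper */
}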

Comments

Vladimir Oltean Sept. 19, 2024, 1:21 p.m. UTC | #1
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> on LS1028A, it was found that if the command was re-run multiple times,
> Rx could not receive the frames, and the result of xdo-bench showed
> that the rx rate was 0.
> 
> root@ls1028ardb:~# ./xdp-bench tx eno0
> Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> Summary                      2046 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> 
> By observing the Rx PIR and CIR registers, we found that CIR is always
> equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> is full and can no longer accommodate other Rx frames. Therefore, it
> is obvious that the RX BD ring has not been cleaned up.
> 
> Further analysis of the code revealed that the Rx BD ring will only
> be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> Therefore, some debug logs were added to the driver and the current
> values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
> BD ring was full. The logs are as follows.
> 
> [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> 
> From the results, we can see that the maximum value of xdp_tx_in_flight
> has reached 2140. However, the size of the Rx BD ring is only 2048. This
> is incredible, so checked the code again and found that the driver did
> not reset xdp_tx_in_flight when installing or uninstalling bpf program,
> resulting in xdp_tx_in_flight still retaining the value after the last
> command was run.
> 
> Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")

This does not explain why enetc_recycle_xdp_tx_buff(), which decreases
xdp_tx_in_flight, does not get called?

In patch 2/3 you wrote:

| Tx BD rings are disabled first in enetc_stop() and then
| wait for empty. This operation is not safe while the Tx BD ring
| is actively transmitting frames, and will cause the ring to not
| be empty and hardware exception. As described in the block guide
| of LS1028A NETC, software should only disable an active ring after
| all pending ring entries have been consumed (i.e. when PI = CI).
| Disabling a transmit ring that is actively processing BDs risks
| a HW-SW race hazard whereby a hardware resource becomes assigned
| to work on one or more ring entries only to have those entries be
| removed due to the ring becoming disabled. So the correct behavior
| is that the software stops putting frames on the Tx BD rings (this
| is what ENETC_TX_DOWN does), then waits for the Tx BD rings to be
| empty, and finally disables the Tx BD rings.

I'm surprised that after fixing that, this change would still be needed,
rather than xdp_tx_in_flight naturally dropping down to 0 when stopping
NAPI. Why doesn't that happen, and what happens to the pending XDP_TX
buffers?
Wei Fang Sept. 20, 2024, 3:12 a.m. UTC | #2
> -----Original Message-----
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> Sent: September 19, 2024 21:22
> To: Wei Fang <wei.fang@nxp.com>
> Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>;
> ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org;
> john.fastabend@gmail.com; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org;
> imx@lists.linux.dev
> Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating
> bpf program
> 
> On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> > on LS1028A, it was found that if the command was re-run multiple
> > times, Rx could not receive the frames, and the result of xdo-bench
> > showed that the rx rate was 0.
> >
> > root@ls1028ardb:~# ./xdp-bench tx eno0
> > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> > Summary                      2046 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> >
> > By observing the Rx PIR and CIR registers, we found that CIR is always
> > equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> > is full and can no longer accommodate other Rx frames. Therefore, it
> > is obvious that the RX BD ring has not been cleaned up.
> >
> > Further analysis of the code revealed that the Rx BD ring will only be
> > cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> > Therefore, some debug logs were added to the driver and the current
> > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD
> > ring was full. The logs are as follows.
> >
> > [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> > [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> > [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> >
> > From the results, we can see that the maximum value of
> > xdp_tx_in_flight has reached 2140. However, the size of the Rx BD ring
> > is only 2048. This is incredible, so checked the code again and found
> > that the driver did not reset xdp_tx_in_flight when installing or
> > uninstalling bpf program, resulting in xdp_tx_in_flight still
> > retaining the value after the last command was run.
> >
> > Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under
> > enetc_reconfigure()")
> 
> This does not explain why enetc_recycle_xdp_tx_buff(), which decreases
> xdp_tx_in_flight, does not get called?
> 
> In patch 2/3 you wrote:
> 
> | Tx BD rings are disabled first in enetc_stop() and then wait for
> | empty. This operation is not safe while the Tx BD ring is actively
> | transmitting frames, and will cause the ring to not be empty and
> | hardware exception. As described in the block guide of LS1028A NETC,
> | software should only disable an active ring after all pending ring
> | entries have been consumed (i.e. when PI = CI).
> | Disabling a transmit ring that is actively processing BDs risks a
> | HW-SW race hazard whereby a hardware resource becomes assigned to work
> | on one or more ring entries only to have those entries be removed due
> | to the ring becoming disabled. So the correct behavior is that the
> | software stops putting frames on the Tx BD rings (this is what
> | ENETC_TX_DOWN does), then waits for the Tx BD rings to be empty, and
> | finally disables the Tx BD rings.
> 
> I'm surprised that after fixing that, this change would still be needed, rather
> than xdp_tx_in_flight naturally dropping down to 0 when stopping NAPI. Why
> doesn't that happen, and what happens to the pending XDP_TX buffers?

The reason is that the interrupt is disabled (disable_irq() is called in enetc_stop()),
so enetc_recycle_xdp_tx_buff() will not be called. Actually, all XDP_TX frames are
sent out and the XDP_TX buffers are freed by enetc_free_rxtx_rings(), so there is
no noticeable impact.

Another solution is to move disable_irq() to the end of enetc_stop(), so that
the IRQ stays active until Tx is finished. In this case, xdp_tx_in_flight will
naturally drop down to 0 as you expect.
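
To make the above easier to follow, here is a rough sketch of where
xdp_tx_in_flight moves; it is an approximation for illustration, not the
driver source, and the helper names are made up (only the fact that
enetc_recycle_xdp_tx_buff() decrements the counter comes from this thread).

/* Rx NAPI poll, XDP_TX verdict: the Rx buffers are queued on a Tx ring
 * and accounted as in flight.
 */
static void xdp_tx_submit_sketch(struct enetc_bdr *rx_ring, int nr_bds)
{
	rx_ring->xdp.xdp_tx_in_flight += nr_bds;
}

/* Tx completion, also driven by NAPI: enetc_recycle_xdp_tx_buff() returns
 * each buffer to the Rx ring and drops the counter.
 */
static void xdp_tx_recycle_sketch(struct enetc_bdr *rx_ring)
{
	rx_ring->xdp.xdp_tx_in_flight--;
}

/* With disable_irq() called early in enetc_stop(), NAPI no longer runs, so
 * the recycle path above never executes for the frames still on the Tx
 * ring, and the counter keeps its last value across the reconfiguration.
 */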
Maciej Fijalkowski Sept. 20, 2024, 1:04 p.m. UTC | #3
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> on LS1028A, it was found that if the command was re-run multiple times,
> Rx could not receive the frames, and the result of xdo-bench showed
> that the rx rate was 0.
> 
> root@ls1028ardb:~# ./xdp-bench tx eno0
> Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> Summary                      2046 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> 
> By observing the Rx PIR and CIR registers, we found that CIR is always
> equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> is full and can no longer accommodate other Rx frames. Therefore, it
> is obvious that the RX BD ring has not been cleaned up.
> 
> Further analysis of the code revealed that the Rx BD ring will only
> be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> Therefore, some debug logs were added to the driver and the current
> values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
> BD ring was full. The logs are as follows.
> 
> [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> 
> From the results, we can see that the maximum value of xdp_tx_in_flight
> has reached 2140. However, the size of the Rx BD ring is only 2048. This
> is incredible, so checked the code again and found that the driver did
> not reset xdp_tx_in_flight when installing or uninstalling bpf program,
> resulting in xdp_tx_in_flight still retaining the value after the last
> command was run.
> 
> Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> ---
>  drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 5830c046cb7d..3cff76923ab9 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -2769,6 +2769,7 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
>  	for (i = 0; i < priv->num_rx_rings; i++) {
>  		struct enetc_bdr *rx_ring = priv->rx_ring[i];
>  
> +		rx_ring->xdp.xdp_tx_in_flight = 0;

zero init is good but shouldn't you be draining these buffers when
removing XDP resources at least? what happens with DMA mappings that are
related to these cached buffers?

>  		rx_ring->xdp.prog = prog;
>  
>  		if (prog)
> -- 
> 2.34.1
> 
>
Wei Fang Sept. 20, 2024, 2:05 p.m. UTC | #4
> -----Original Message-----
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Sent: September 20, 2024 21:05
> To: Wei Fang <wei.fang@nxp.com>
> Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir
> Oltean <vladimir.oltean@nxp.com>; ast@kernel.org; daniel@iogearbox.net;
> hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org;
> imx@lists.linux.dev
> Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating
> bpf program
> 
> On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> > on LS1028A, it was found that if the command was re-run multiple
> > times, Rx could not receive the frames, and the result of xdo-bench
> > showed that the rx rate was 0.
> >
> > root@ls1028ardb:~# ./xdp-bench tx eno0
> > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> > Summary                      2046 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> > Summary                         0 rx/s                  0 err,drop/s
> >
> > By observing the Rx PIR and CIR registers, we found that CIR is always
> > equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> > is full and can no longer accommodate other Rx frames. Therefore, it
> > is obvious that the RX BD ring has not been cleaned up.
> >
> > Further analysis of the code revealed that the Rx BD ring will only be
> > cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> > Therefore, some debug logs were added to the driver and the current
> > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD
> > ring was full. The logs are as follows.
> >
> > [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> > [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> > [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> >
> > From the results, we can see that the maximum value of
> > xdp_tx_in_flight has reached 2140. However, the size of the Rx BD ring
> > is only 2048. This is incredible, so checked the code again and found
> > that the driver did not reset xdp_tx_in_flight when installing or
> > uninstalling bpf program, resulting in xdp_tx_in_flight still
> > retaining the value after the last command was run.
> >
> > Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under
> > enetc_reconfigure()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > ---
> >  drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> > index 5830c046cb7d..3cff76923ab9 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > @@ -2769,6 +2769,7 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
> >  	for (i = 0; i < priv->num_rx_rings; i++) {
> >  		struct enetc_bdr *rx_ring = priv->rx_ring[i];
> >
> > +		rx_ring->xdp.xdp_tx_in_flight = 0;
> 
> zero init is good but shouldn't you be draining these buffers when removing
> XDP resources at least? what happens with DMA mappings that are related to
> these cached buffers?
> 

All the buffers will be freed and DMA will be unmapped when the XDP program is
installed. I am thinking that another solution may be better, the one mentioned
in my other reply to Vladimir, so that xdp_tx_in_flight will naturally drop
to 0 and the Tx-related statistics will be more accurate.
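
For context, the cleanup path referred to here looks roughly like the sketch
below. It is an approximation (the exact unmap arguments are not shown in this
thread), and the final counter reset is the alternative placement that Vladimir
suggests later in the thread, added here only to show where it would sit.

/* Approximate sketch of the per-ring cleanup run when the XDP program is
 * swapped via enetc_reconfigure(): every cached Rx buffer, including those
 * parked on a Tx ring by XDP_TX, is unmapped and freed, so no DMA mapping
 * outlives the reconfiguration.
 */
static void free_rx_ring_sketch(struct enetc_bdr *rx_ring)
{
	int i;

	for (i = 0; i < rx_ring->bd_count; i++) {
		struct enetc_rx_swbd *rx_swbd = &rx_ring->rx_swbd[i];

		if (!rx_swbd->page)
			continue;

		dma_unmap_page(rx_ring->dev, rx_swbd->dma, PAGE_SIZE,
			       rx_swbd->dir); /* arguments approximated */
		__free_page(rx_swbd->page);
		rx_swbd->page = NULL;
	}

	/* Alternative fix discussed below: reset the software counter here,
	 * next to the buffers it accounts for.
	 */
	rx_ring->xdp.xdp_tx_in_flight = 0;
}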
Vladimir Oltean Sept. 20, 2024, 2:25 p.m. UTC | #5
On Fri, Sep 20, 2024 at 05:05:14PM +0300, Wei Fang wrote:
> > zero init is good but shouldn't you be draining these buffers when removing
> > XDP resources at least? what happens with DMA mappings that are related to
> > these cached buffers?
> > 
> 
> All the buffers will be freed and DMA will be unmapped when XDP program is
> installed.

There is still a problem with the patch you proposed here, which is that
enetc_reconfigure() has one more call site, from enetc_hwtstamp_set().
If enetc_free_rxtx_rings() is the one that gets rid of the stale
buffers, it should also be the one that resets xdp_tx_in_flight,
otherwise you will still leave the problem unsolved where XDP_TX can be
interrupted by a change in hwtstamping state, and the software "in flight"
counter gets out of sync with the ring state.

Also, I suspect that the blamed commit is wrong. The normal netdev close path
should be susceptible to this issue as well, not just enetc_reconfigure().
Maybe something like ff58fda09096 ("net: enetc: prioritize ability to go
down over packet processing"). That's when we started rushing the NAPI
poll routing to finish. I don't think it was possible, before that, to
close the netdev while there were XDP_TX frames pending to be recycled.

> I am thinking that another solution may be better, which is mentioned
> in another thread replying to Vladimir, so that xdp_tx_in_flight will naturally drop
> to 0, and the TX-related statistics will be more accurate.

Please give me some more time to analyze the flow after just your patch 2/3.
I have a draft reply, but I would still like to test some things.
Vladimir Oltean Sept. 20, 2024, 2:29 p.m. UTC | #6
On Fri, Sep 20, 2024 at 05:25:11PM +0300, Vladimir Oltean wrote:
> That's when we started rushing the NAPI poll routing to finish.

routine*
Wei Fang Sept. 23, 2024, 1:59 a.m. UTC | #7
> -----Original Message-----
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> Sent: September 20, 2024 22:25
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu
> Manoil <claudiu.manoil@nxp.com>; ast@kernel.org; daniel@iogearbox.net;
> hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org;
> imx@lists.linux.dev
> Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating
> bpf program
> 
> On Fri, Sep 20, 2024 at 05:05:14PM +0300, Wei Fang wrote:
> > > zero init is good but shouldn't you be draining these buffers when removing
> > > XDP resources at least? what happens with DMA mappings that are related to
> > > these cached buffers?
> > >
> >
> > All the buffers will be freed and DMA will be unmapped when XDP program is
> > installed.
> 
> There is still a problem with the patch you proposed here, which is that
> enetc_reconfigure() has one more call site, from enetc_hwtstamp_set().
> If enetc_free_rxtx_rings() is the one that gets rid of the stale
> buffers, it should also be the one that resets xdp_tx_in_flight,
> otherwise you will still leave the problem unsolved where XDP_TX can be
> interrupted by a change in hwtstamping state, and the software "in flight"
> counter gets out of sync with the ring state.

Yes, you are right. It's a potential issue if RX_TSTAMP is set when XDP is also
enabled.

> 
> Also, I suspect that the blamed commit is wrong. Also the normal netdev
> close path should be susceptible to this issue, not just enetc_reconfigure().
> Maybe something like ff58fda09096 ("net: enetc: prioritize ability to go
> down over packet processing"). 

Thanks for the reminder, I will change the blamed commit in the next version.

> That's when we started rushing the NAPI
> poll routing to finish. I don't think it was possible, before that, to
> close the netdev while there were XDP_TX frames pending to be recycled.
> 
> > I am thinking that another solution may be better, which is mentioned
> > in another thread replying to Vladimir, so that xdp_tx_in_flight will
> > naturally drop to 0, and the TX-related statistics will be more accurate.
> 
> Please give me some more time to analyze the flow after just your patch 2/3.
> I have a draft reply, but I would still like to test some things.

Okay, I have tested this solution (see changes below), and from what I observed,
the xdp_tx_in_flight can naturally drop to 0 in every test. So if there are no other
risks, the next version will use this solution.

@@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
        struct enetc_ndev_priv *priv = netdev_priv(ndev);
        int i;

-       enetc_setup_interrupts(priv);
-
-       enetc_enable_tx_bdrs(priv);
-
        for (i = 0; i < priv->bdr_int_num; i++) {
                int irq = pci_irq_vector(priv->si->pdev,
                                         ENETC_BDR_INT_BASE_IDX + i);
@@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
                enable_irq(irq);
        }

+       enetc_setup_interrupts(priv);
+
+       enetc_enable_tx_bdrs(priv);
+
        enetc_enable_rx_bdrs(priv);

        netif_tx_start_all_queues(ndev);
@@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)

        enetc_disable_rx_bdrs(priv);

+       enetc_wait_bdrs(priv);
+
+       enetc_disable_tx_bdrs(priv);
+
+       enetc_clear_interrupts(priv);
+
        for (i = 0; i < priv->bdr_int_num; i++) {
                int irq = pci_irq_vector(priv->si->pdev,
                                         ENETC_BDR_INT_BASE_IDX + i);
@@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
                napi_synchronize(&priv->int_vector[i]->napi);
                napi_disable(&priv->int_vector[i]->napi);
        }
-
-       enetc_wait_bdrs(priv);
-
-       enetc_disable_tx_bdrs(priv);
-
-       enetc_clear_interrupts(priv);
 }
 EXPORT_SYMBOL_GPL(enetc_stop);
Ratheesh Kannoth Sept. 23, 2024, 4:56 a.m. UTC | #8
On 2024-09-20 at 08:42:06, Wei Fang (wei.fang@nxp.com) wrote:
> enetc_recycle_xdp_tx_buff() will not be called. Actually all XDP_TX frames are
> sent out and XDP_TX buffers will be freed by enetc_free_rxtx_rings().
Why didn't you choose enetc_free_rxtx_rings() to reset the inflight count to 0?
Wei Fang Sept. 23, 2024, 5:48 a.m. UTC | #9
> -----Original Message-----
> From: Ratheesh Kannoth <rkannoth@marvell.com>
> Sent: September 23, 2024 12:56
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Vladimir Oltean <vladimir.oltean@nxp.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu
> Manoil <claudiu.manoil@nxp.com>; ast@kernel.org; daniel@iogearbox.net;
> hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org;
> imx@lists.linux.dev
> Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating
> bpf program
> 
> On 2024-09-20 at 08:42:06, Wei Fang (wei.fang@nxp.com) wrote:
> > enetc_recycle_xdp_tx_buff() will not be called. Actually all XDP_TX
> > frames are sent out and XDP_TX buffers will be freed by
> enetc_free_rxtx_rings().
> why didn't you choose enetc_free_rxtx_rings() to reset inflight count to 0 ?

IMO, enetc_reconfigure_xdp_cb() is a more appropriate place to reset
xdp_tx_in_flight than enetc_free_rxtx_rings().
Vladimir Oltean Sept. 27, 2024, 2:33 p.m. UTC | #10
Hi Wei,

On Mon, Sep 23, 2024 at 04:59:56AM +0300, Wei Fang wrote:
> Okay, I have tested this solution (see changes below), and from what I observed,
> the xdp_tx_in_flight can naturally drop to 0 in every test. So if there are no other
> risks, the next version will use this solution.
> 

Sorry for the delay. I have tested this variant and it works. Just one
thing below.

> @@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
>         struct enetc_ndev_priv *priv = netdev_priv(ndev);
>         int i;
> 
> -       enetc_setup_interrupts(priv);
> -
> -       enetc_enable_tx_bdrs(priv);
> -
>         for (i = 0; i < priv->bdr_int_num; i++) {
>                 int irq = pci_irq_vector(priv->si->pdev,
>                                          ENETC_BDR_INT_BASE_IDX + i);
> @@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
>                 enable_irq(irq);
>         }
> 
> +       enetc_setup_interrupts(priv);
> +
> +       enetc_enable_tx_bdrs(priv);
> +
>         enetc_enable_rx_bdrs(priv);
> 
>         netif_tx_start_all_queues(ndev);
> @@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
> 
>         enetc_disable_rx_bdrs(priv);
> 
> +       enetc_wait_bdrs(priv);
> +
> +       enetc_disable_tx_bdrs(priv);
> +
> +       enetc_clear_interrupts(priv);

Here, NAPI may still be scheduled. So if you clear interrupts, enetc_poll()
on another CPU might still have time to re-enable them. This makes the
call pointless.

Please move the enetc_clear_interrupts() call after the for() loop below
(AKA leave it where it is).

> +
>         for (i = 0; i < priv->bdr_int_num; i++) {
>                 int irq = pci_irq_vector(priv->si->pdev,
>                                          ENETC_BDR_INT_BASE_IDX + i);
> @@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
>                 napi_synchronize(&priv->int_vector[i]->napi);
>                 napi_disable(&priv->int_vector[i]->napi);
>         }
> -
> -       enetc_wait_bdrs(priv);
> -
> -       enetc_disable_tx_bdrs(priv);
> -
> -       enetc_clear_interrupts(priv);
>  }
>  EXPORT_SYMBOL_GPL(enetc_stop);

FWIW, there are at least 2 other valid ways of solving this problem. One
has already been mentioned (reset the counter in enetc_free_rx_ring()):

@@ -2014,6 +2015,8 @@ static void enetc_free_rx_ring(struct enetc_bdr *rx_ring)
 		__free_page(rx_swbd->page);
 		rx_swbd->page = NULL;
 	}
+
+	rx_ring->xdp.xdp_tx_in_flight = 0;
 }

 static void enetc_free_rxtx_rings(struct enetc_ndev_priv *priv)

And the other would be to keep rescheduling NAPI until there are no more
pending XDP_TX frames.

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3cff76923ab9..36520f8c49a6 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1689,6 +1689,7 @@ static int enetc_poll(struct napi_struct *napi, int budget)
 		work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
 	else
 		work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
-	if (work_done == budget)
+	if (work_done == budget || rx_ring->xdp.xdp_tx_in_flight)
 		complete = false;
 	if (work_done)

But I like your second proposal the best. It doesn't involve adding an
unnecessary extra test in the fast path.
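
Putting the pieces of this review together, the stop-path ordering being
converged on looks roughly like the sketch below. Helper names are taken from
the diffs in this thread; the exact code is Wei's proposal above plus the
request to keep enetc_clear_interrupts() after the NAPI loop, so treat this as
an illustration rather than the final patch.

/* Sketch of the intended enetc_stop() ordering (not a verbatim diff):
 * stop feeding the rings, let Tx drain, quiesce NAPI, and only then clear
 * interrupts so enetc_poll() cannot re-arm them concurrently.
 */
static void enetc_stop_order_sketch(struct enetc_ndev_priv *priv)
{
	int i;

	enetc_disable_rx_bdrs(priv);	/* no new Rx, so no new XDP_TX */

	enetc_wait_bdrs(priv);		/* wait for Tx rings to drain (PI == CI) */
	enetc_disable_tx_bdrs(priv);

	for (i = 0; i < priv->bdr_int_num; i++) {
		int irq = pci_irq_vector(priv->si->pdev,
					 ENETC_BDR_INT_BASE_IDX + i);

		disable_irq(irq);
		napi_synchronize(&priv->int_vector[i]->napi);
		napi_disable(&priv->int_vector[i]->napi);
	}

	/* Kept last, per the review: at this point no enetc_poll() can run
	 * and re-enable the interrupt sources.
	 */
	enetc_clear_interrupts(priv);
}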
Vladimir Oltean Sept. 27, 2024, 2:38 p.m. UTC | #11
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
> When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> on LS1028A, it was found that if the command was re-run multiple times,
> Rx could not receive the frames, and the result of xdo-bench showed
> that the rx rate was 0.
> 
> root@ls1028ardb:~# ./xdp-bench tx eno0
> Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> Summary                      2046 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> Summary                         0 rx/s                  0 err,drop/s
> 
> By observing the Rx PIR and CIR registers, we found that CIR is always
> equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring
> is full and can no longer accommodate other Rx frames. Therefore, it
> is obvious that the RX BD ring has not been cleaned up.

I wouldn't call "obvious" something that you had to go and put debug
prints for. There's nothing obvious about the manifestation of the issue.

> 
> Further analysis of the code revealed that the Rx BD ring will only
> be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> Therefore, some debug logs were added to the driver and the current
> values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
> BD ring was full. The logs are as follows.
> 
> [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
> 
> From the results, we can see that the maximum value of xdp_tx_in_flight
> has reached 2140. However, the size of the Rx BD ring is only 2048. This
> is incredible, so checked the code again and found that the driver did
> not reset xdp_tx_in_flight when installing or uninstalling bpf program,
> resulting in xdp_tx_in_flight still retaining the value after the last
> command was run.

When you resubmit, please be careful to readjust the commit message to
the new patch. Especially this paragraph, but also the title.

> 
> Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> ---
Wei Fang Sept. 29, 2024, 1:31 a.m. UTC | #12
> Hi Wei,
> 
> On Mon, Sep 23, 2024 at 04:59:56AM +0300, Wei Fang wrote:
> > Okay, I have tested this solution (see changes below), and from what I
> > observed, the xdp_tx_in_flight can naturally drop to 0 in every test. So
> > if there are no other risks, the next version will use this solution.
> >
> 
> Sorry for the delay. I have tested this variant and it works. Just one
> thing below.
> 
> > @@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
> >         struct enetc_ndev_priv *priv = netdev_priv(ndev);
> >         int i;
> >
> > -       enetc_setup_interrupts(priv);
> > -
> > -       enetc_enable_tx_bdrs(priv);
> > -
> >         for (i = 0; i < priv->bdr_int_num; i++) {
> >                 int irq = pci_irq_vector(priv->si->pdev,
> >                                          ENETC_BDR_INT_BASE_IDX + i);
> > @@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
> >                 enable_irq(irq);
> >         }
> >
> > +       enetc_setup_interrupts(priv);
> > +
> > +       enetc_enable_tx_bdrs(priv);
> > +
> >         enetc_enable_rx_bdrs(priv);
> >
> >         netif_tx_start_all_queues(ndev);
> > @@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
> >
> >         enetc_disable_rx_bdrs(priv);
> >
> > +       enetc_wait_bdrs(priv);
> > +
> > +       enetc_disable_tx_bdrs(priv);
> > +
> > +       enetc_clear_interrupts(priv);
> 
> Here, NAPI may still be scheduled. So if you clear interrupts, enetc_poll()
> on another CPU might still have time to re-enable them. This makes the
> call pointless.
> 
> Please move the enetc_clear_interrupts() call after the for() loop below
> (AKA leave it where it is).

Okay, I will, thanks.
> 
> > +
> >         for (i = 0; i < priv->bdr_int_num; i++) {
> >                 int irq = pci_irq_vector(priv->si->pdev,
> >                                          ENETC_BDR_INT_BASE_IDX + i);
> > @@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
> >                 napi_synchronize(&priv->int_vector[i]->napi);
> >                 napi_disable(&priv->int_vector[i]->napi);
> >         }
> > -
> > -       enetc_wait_bdrs(priv);
> > -
> > -       enetc_disable_tx_bdrs(priv);
> > -
> > -       enetc_clear_interrupts(priv);
> >  }
> >  EXPORT_SYMBOL_GPL(enetc_stop);
> 
> FWIW, there are at least 2 other valid ways of solving this problem. One
> has already been mentioned (reset the counter in enetc_free_rx_ring()):
> 
> @@ -2014,6 +2015,8 @@ static void enetc_free_rx_ring(struct enetc_bdr *rx_ring)
>  		__free_page(rx_swbd->page);
>  		rx_swbd->page = NULL;
>  	}
> +
> +	rx_ring->xdp.xdp_tx_in_flight = 0;
>  }
> 
>  static void enetc_free_rxtx_rings(struct enetc_ndev_priv *priv)
> 
> And the other would be to keep rescheduling NAPI until there are no more
> pending XDP_TX frames.
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 3cff76923ab9..36520f8c49a6 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -1689,6 +1689,7 @@ static int enetc_poll(struct napi_struct *napi, int budget)
>  		work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
>  	else
>  		work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
> -	if (work_done == budget)
> +	if (work_done == budget || rx_ring->xdp.xdp_tx_in_flight)
>  		complete = false;
>  	if (work_done)
> 
> But I like your second proposal the best. It doesn't involve adding an
> unnecessary extra test in the fast path.

Patch

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 5830c046cb7d..3cff76923ab9 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2769,6 +2769,7 @@  static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 	for (i = 0; i < priv->num_rx_rings; i++) {
 		struct enetc_bdr *rx_ring = priv->rx_ring[i];
 
+		rx_ring->xdp.xdp_tx_in_flight = 0;
 		rx_ring->xdp.prog = prog;
 
 		if (prog)