
[net-next,2/3] net: stmmac: improve TX timer arm logic

Message ID 20230922111247.497-2-ansuelsmth@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net-next,1/3] net: introduce napi_is_scheduled helper

Checks

Context Check Description
netdev/series_format warning Series does not have a cover letter
netdev/tree_selection success Clearly marked for net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1340 this patch: 1340
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 1363 this patch: 1363
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1363 this patch: 1363
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Christian Marangi Sept. 22, 2023, 11:12 a.m. UTC
The TX timer is currently armed many more times than necessary, which
causes a large performance regression on devices where rearming an
hrtimer is expensive.

The use of the TX timer is an old implementation that predates the napi
implementation and the interrupt enable/disable handling.

Because stmmac is very old code, the TX timer was never re-evaluated
against this newer implementation and was left in place, causing a
performance regression. The regression first appeared in kernel
version 4.19 with 8fce33317023 ("net: stmmac: Rework coalesce timer and
fix multi-queue races"), which reduced the timer to 1ms and caused it to
be armed 40 times more often than before.

Decreasing the timer made the problem more visible and caused a
regression in the order of 600-700mbps on some devices (the regression
was noticed on ipq806x).

The problem is that handling the hrtimer is expensive on some targets,
and recent kernels arm the timer far more often.
One proposed solution was to revert the hrtimer change and use
mod_timer, but that would still hide the real problem in the
current implementation.

To fix the regression, apply some additional logic and skip arming the
timer when not needed.

Arm the timer ONLY if a napi is not already scheduled. When a napi is
scheduled, running the timer is redundant since the same function
(stmmac_tx_clean) will run in the napi TX poll. Also try to cancel any
pending timer if a napi is scheduled, to prevent a redundant TX run.

With this new logic the original performance is restored while still
using the hrtimer.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
 .../net/ethernet/stmicro/stmmac/stmmac_main.c  | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

Comments

Vincent Whitchurch Sept. 29, 2023, 12:38 p.m. UTC | #1
On Fri, 2023-09-22 at 13:12 +0200, Christian Marangi wrote:
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 9201ed778ebc..14bf6fae6662 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2994,13 +2994,25 @@ static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue)
>  {
>  	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
>  	u32 tx_coal_timer = priv->tx_coal_timer[queue];
> +	struct stmmac_channel *ch;
> +	struct napi_struct *napi;
>  
> 
>  	if (!tx_coal_timer)
>  		return;
>  
> 
> -	hrtimer_start(&tx_q->txtimer,
> -		      STMMAC_COAL_TIMER(tx_coal_timer),
> -		      HRTIMER_MODE_REL);
> +	ch = &priv->channel[tx_q->queue_index];
> +	napi = tx_q->xsk_pool ? &ch->rxtx_napi : &ch->tx_napi;
> +
> +	/* Arm timer only if napi is not already scheduled.
> +	 * Try to cancel any timer if napi is scheduled, timer will be armed
> +	 * again in the next scheduled napi.
> +	 */
> +	if (unlikely(!napi_is_scheduled(napi)))
> +		hrtimer_start(&tx_q->txtimer,
> +			      STMMAC_COAL_TIMER(tx_coal_timer),
> +			      HRTIMER_MODE_REL);
> +	else
> +		hrtimer_try_to_cancel(&tx_q->txtimer);

When this function is called from within the napi poll function
(stmmac_tx_clean()), NAPI_STATE_SCHED will always be set and so after
this patch the "We still have pending packets, let's call for a new
scheduling" logic will never start the timer.  Was that really
intentional?
Christian Marangi Sept. 30, 2023, 12:04 p.m. UTC | #2
On Fri, Sep 29, 2023 at 12:38:48PM +0000, Vincent Whitchurch wrote:
> On Fri, 2023-09-22 at 13:12 +0200, Christian Marangi wrote:
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index 9201ed778ebc..14bf6fae6662 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -2994,13 +2994,25 @@ static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue)
> >  {
> >  	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
> >  	u32 tx_coal_timer = priv->tx_coal_timer[queue];
> > +	struct stmmac_channel *ch;
> > +	struct napi_struct *napi;
> >  
> > 
> >  	if (!tx_coal_timer)
> >  		return;
> >  
> > 
> > -	hrtimer_start(&tx_q->txtimer,
> > -		      STMMAC_COAL_TIMER(tx_coal_timer),
> > -		      HRTIMER_MODE_REL);
> > +	ch = &priv->channel[tx_q->queue_index];
> > +	napi = tx_q->xsk_pool ? &ch->rxtx_napi : &ch->tx_napi;
> > +
> > +	/* Arm timer only if napi is not already scheduled.
> > +	 * Try to cancel any timer if napi is scheduled, timer will be armed
> > +	 * again in the next scheduled napi.
> > +	 */
> > +	if (unlikely(!napi_is_scheduled(napi)))
> > +		hrtimer_start(&tx_q->txtimer,
> > +			      STMMAC_COAL_TIMER(tx_coal_timer),
> > +			      HRTIMER_MODE_REL);
> > +	else
> > +		hrtimer_try_to_cancel(&tx_q->txtimer);
> 
> When this function is called from within the napi poll function
> (stmmac_tx_clean()), NAPI_STATE_SCHED will always be set and so after
> this patch the "We still have pending packets, let's call for a new
> scheduling" logic will never start the timer.  Was that really
> intentional?
>

No, and understanding the code flow of napi and tx-coal is hard... (also,
problems with tx coal arise only in real-world scenarios and not with
synthetic tests like iperf).

I will shortly send a v2 of this that will just move the logic of arming
the TX timer outside napi call after DMA interrupt is enabled again.
Currently testing the new version on openwrt with ipq806x hoping
everything is good.

(same perf increase observed but no queue timeout)
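
A small user-space model may help make the concern from comment #1
concrete. It is only a sketch: the names below (model_napi, model_timer,
model_tx_timer_arm, model_tx_poll, model_selfcheck) are hypothetical
stand-ins, not kernel APIs, and the boolean `scheduled` merely imitates
NAPI_STATE_SCHED. It assumes, as described in that comment, that the
poll path calls the arm function while the napi is still marked
scheduled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real kernel/stmmac types. */
struct model_napi {
	bool scheduled;		/* imitates NAPI_STATE_SCHED */
};

struct model_timer {
	bool armed;		/* imitates a pending hrtimer */
	unsigned int arm_count;	/* how many times the timer was started */
};

/* Same decision as the patched stmmac_tx_timer_arm(), on the model types. */
static void model_tx_timer_arm(struct model_napi *napi, struct model_timer *t)
{
	if (!napi->scheduled) {
		t->armed = true;	/* imitates hrtimer_start() */
		t->arm_count++;
	} else {
		t->armed = false;	/* imitates hrtimer_try_to_cancel() */
	}
}

/*
 * Imitates a napi TX poll: the napi is marked scheduled for the whole
 * poll, and the "still have pending packets" branch asks for the timer.
 */
static void model_tx_poll(struct model_napi *napi, struct model_timer *t,
			  bool pending)
{
	napi->scheduled = true;
	if (pending)
		model_tx_timer_arm(napi, t);
	napi->scheduled = false;	/* poll finished */
}

/* Walks through the two call contexts and checks the resulting state. */
void model_selfcheck(void)
{
	struct model_napi napi = { .scheduled = false };
	struct model_timer timer = { .armed = false, .arm_count = 0 };

	/* Called with the napi idle: the timer is armed. */
	model_tx_timer_arm(&napi, &timer);
	assert(timer.armed && timer.arm_count == 1);

	/* Called from inside the poll: the timer is cancelled, never armed. */
	model_tx_poll(&napi, &timer, true);
	assert(!timer.armed && timer.arm_count == 1);
}
```

In this model, model_selfcheck() passes: the timer is armed from the
idle path, but a poll that still has pending packets leaves it disarmed,
which is the behaviour questioned in comment #1.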

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 9201ed778ebc..14bf6fae6662 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2994,13 +2994,25 @@  static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue)
 {
 	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
 	u32 tx_coal_timer = priv->tx_coal_timer[queue];
+	struct stmmac_channel *ch;
+	struct napi_struct *napi;
 
 	if (!tx_coal_timer)
 		return;
 
-	hrtimer_start(&tx_q->txtimer,
-		      STMMAC_COAL_TIMER(tx_coal_timer),
-		      HRTIMER_MODE_REL);
+	ch = &priv->channel[tx_q->queue_index];
+	napi = tx_q->xsk_pool ? &ch->rxtx_napi : &ch->tx_napi;
+
+	/* Arm timer only if napi is not already scheduled.
+	 * Try to cancel any timer if napi is scheduled, timer will be armed
+	 * again in the next scheduled napi.
+	 */
+	if (unlikely(!napi_is_scheduled(napi)))
+		hrtimer_start(&tx_q->txtimer,
+			      STMMAC_COAL_TIMER(tx_coal_timer),
+			      HRTIMER_MODE_REL);
+	else
+		hrtimer_try_to_cancel(&tx_q->txtimer);
 }
 
 /**