fec: high latency with imx8mm compared to imx6q

Message ID 1422776754.146013.1676652774408.JavaMail.zimbra@nod.at (mailing list archive)
State RFC

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 4 maintainers not CCed: davem@davemloft.net edumazet@google.com pabeni@redhat.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff fail author Signed-off-by missing
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: Do not use trace_printk() in production code (this can be ignored if built only with a debug config option)
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Richard Weinberger Feb. 17, 2023, 4:52 p.m. UTC
Hi!

I'm investigating latency issues on an imx8mm system after
migrating from imx6q.
A regression test showed massive latency increases when single/small packets
are exchanged.

A simple test using ping exhibits the problem.
Pinging the very same host from the imx8mm has a way higher RTT than from the imx6.

Ping, 100 packets each, from imx6q:
rtt min/avg/max/mdev = 0.689/0.851/1.027/0.088 ms

Ping, 100 packets each, from imx8mm:
rtt min/avg/max/mdev = 1.073/2.064/2.189/0.330 ms

You can see that the average RTT has more than doubled.
I see the same results with every imx8mm system I have got my hands on so far.
The kernel version does not matter either; I've also tried the NXP tree without success.

All reported numbers have been produced using vanilla Linux v6.2-rc8 with these boards:
PHYTEC phyBOARD-Mira Quad with an i.MX6Q, silicon rev 1.5
FSL i.MX8MM EVK board with an i.MX8MM, revision 1.0

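While digging into the fec ethernet driver I noticed that on the imx8mm sending
a packet takes an extremely long time.

I'm measuring the time between triggering transmission start,
arrival of the transmit done IRQ and NAPI done.
Don't get confused by the function names: gcc inlined aggressively, so the trace
shows the caller (e.g. fec_enet_start_xmit for fec_enet_txq_submit_skb and
fec_enet_rx_napi for fec_enet_tx_queue).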

imx6q:
   tst-104     [003] b..3.   217.340689: fec_enet_start_xmit: START skb: 8a68617d
   tst-104     [003] b..3.   217.340702: fec_enet_start_xmit: DONE skb: 8a68617d
<idle>-0       [000] d.h1.   217.340736: fec_enet_interrupt: 
<idle>-0       [000] d.h1.   217.340739: fec_enet_interrupt: scheduling napi
<idle>-0       [000] ..s1.   217.340774: fec_enet_rx_napi: TX DONE skb: 8a68617d

Time between submit and irq: 34us
Time between submit and tx done: 72us

imx8mm:
   tst-95      [000] b..2.   142.713409: fec_enet_start_xmit: START skb: 00000000ad10a62d
   tst-95      [000] b..2.   142.713417: fec_enet_start_xmit: DONE skb: 00000000ad10a62d
<idle>-0       [000] d.h1.   142.714428: fec_enet_interrupt: 
<idle>-0       [000] d.h1.   142.714430: fec_enet_interrupt: scheduling napi
<idle>-0       [000] ..s1.   142.714451: fec_enet_rx_napi: TX DONE skb: 00000000ad10a62d

Time between submit and irq: 1011us
Time between submit and tx done: 1034us 
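
The deltas above can be reproduced from the trace timestamps. Below is a minimal sketch in C, assuming the ftrace timestamps are in seconds with exactly six fractional digits, and taking "submit" to be the DONE line logged right after the transmit trigger. The helper names ts_to_usec and delta_usec are made up for illustration.

```c
#include <stdio.h>

/* Convert an ftrace timestamp such as "217.340702" to microseconds.
 * Assumes exactly six fractional digits; returns -1 on parse failure. */
static long long ts_to_usec(const char *ts)
{
	long long sec, usec;

	if (sscanf(ts, "%lld.%6lld", &sec, &usec) != 2)
		return -1;
	return sec * 1000000LL + usec;
}

/* Microseconds elapsed between two trace timestamps. */
static long long delta_usec(const char *from, const char *to)
{
	return ts_to_usec(to) - ts_to_usec(from);
}
```

For example, delta_usec("142.713417", "142.714428") yields the 1011 us between submit and IRQ on the imx8mm, while the same pair of events on the imx6q is 34 us apart.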

As you can see, imx8mm's fec needs more than a whole millisecond to send a single packet.
Please note I'm just talking about latency. Throughput is fine; when the transmitter is
kept busy it seems to be much faster.

Is this a known issue?
Does fec need further tweaking for the imx8mm?
Can it be that the ethernet controller is in a sleep mode and needs to wake up each time?

Thanks,
//richard

My debug patch:

Comments

David Laight Feb. 17, 2023, 8:49 p.m. UTC | #1
From: Richard Weinberger
> Sent: 17 February 2023 16:53
...
> I'm investigating into latency issues on an imx8mm system after
> migrating from imx6q.
> A regression test showed massive latency increases when single/small packets
> are exchanged.
> 
> A simple test using ping exhibits the problem.
> Pinging the very same host from the imx8mm has a way higher RTT than from the imx6.
> 
> Ping, 100 packets each, from imx6q:
> rtt min/avg/max/mdev = 0.689/0.851/1.027/0.088 ms
> 
> Ping, 100 packets each, from imx8mm:
> rtt min/avg/max/mdev = 1.073/2.064/2.189/0.330 ms
> 
> You can see that the average RTT has more than doubled.
...

Is it just interrupt latency caused by interrupt coalescing
to avoid excessive interrupts?

	David


Andrew Lunn Feb. 18, 2023, 1:04 a.m. UTC | #2
On Fri, Feb 17, 2023 at 08:49:23PM +0000, David Laight wrote:
> From: Richard Weinberger
> > Sent: 17 February 2023 16:53
> ...
> > I'm investigating into latency issues on an imx8mm system after
> > migrating from imx6q.
> > A regression test showed massive latency increases when single/small packets
> > are exchanged.
> > 
> > A simple test using ping exhibits the problem.
> > Pinging the very same host from the imx8mm has a way higher RTT than from the imx6.
> > 
> > Ping, 100 packets each, from imx6q:
> > rtt min/avg/max/mdev = 0.689/0.851/1.027/0.088 ms
> > 
> > Ping, 100 packets each, from imx8mm:
> > rtt min/avg/max/mdev = 1.073/2.064/2.189/0.330 ms
> > 
> > You can see that the average RTT has more than doubled.
> ...
> 
> Is it just interrupt latency caused by interrupt coalescing
> to avoid excessive interrupts?

Just adding to this, it appears imx6q does not have support for
changing the interrupt coalescing. imx8m does appear to support it. So
try playing with ethtool -c/-C.

    Andrew

Wei Fang Feb. 18, 2023, 1:27 a.m. UTC | #3
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: 18 February 2023 9:05
> To: David Laight <David.Laight@aculab.com>
> Cc: 'Richard Weinberger' <richard@nod.at>; netdev@vger.kernel.org; Wei Fang
> <wei.fang@nxp.com>; Shenwei Wang <shenwei.wang@nxp.com>; Clark Wang
> <xiaoning.wang@nxp.com>; dl-linux-imx <linux-imx@nxp.com>
> Subject: Re: high latency with imx8mm compared to imx6q
> 
> On Fri, Feb 17, 2023 at 08:49:23PM +0000, David Laight wrote:
> > From: Richard Weinberger
> > > Sent: 17 February 2023 16:53
> > ...
> > > I'm investigating into latency issues on an imx8mm system after
> > > migrating from imx6q.
> > > A regression test showed massive latency increases when single/small
> > > packets are exchanged.
> > >
> > > A simple test using ping exhibits the problem.
> > > Pinging the very same host from the imx8mm has a way higher RTT than
> from the imx6.
> > >
> > > Ping, 100 packets each, from imx6q:
> > > rtt min/avg/max/mdev = 0.689/0.851/1.027/0.088 ms
> > >
> > > Ping, 100 packets each, from imx8mm:
> > > rtt min/avg/max/mdev = 1.073/2.064/2.189/0.330 ms
> > >
> > > You can see that the average RTT has more than doubled.
> > ...
> >
> > Is it just interrupt latency caused by interrupt coalescing to avoid
> > excessive interrupts?
> 
> Just adding to this, it appears imx6q does not have support for changing the
> interrupt coalescing. imx8m does appear to support it. So try playing with
> ethtool -c/-C.
> 
Yes, I agree with Andrew: interrupt coalescing is enabled by default
on i.MX8MM platforms. Its purpose is to reduce the number of interrupts
generated by the MAC and thereby reduce the CPU load.
As Andrew said, you can turn down rx-usecs and tx-usecs and then try again.

Richard Weinberger Feb. 18, 2023, 9:42 a.m. UTC | #4
----- Original Message -----
> From: "wei fang" <wei.fang@nxp.com>
>> > Is it just interrupt latency caused by interrupt coalescing to avoid
>> > excessive interrupts?
>> 
>> Just adding to this, it appears imx6q does not have support for changing the
>> interrupt coalescing. imx8m does appear to support it. So try playing with
>> ethtool -c/-C.
>> 
> Yes, I agree with Andrew, the interrupt coalescence feature default to be
> enabled
> on i.MX8MM platforms. The purpose of the interrupt coalescing is to reduce the
> number of interrupts generated by the MAC so as to reduce the CPU loading.
> As Andrew said, you can turn down rx-usecs and tx-usecs, and then try again.

Hm, I thought my settings are fine (IOW no coalescing at all).
Coalesce parameters for eth0:
Adaptive RX: n/a  TX: n/a
stats-block-usecs: n/a
sample-interval: n/a
pkt-rate-low: n/a
pkt-rate-high: n/a

rx-usecs: 0
rx-frames: 0
rx-usecs-irq: n/a
rx-frames-irq: n/a

tx-usecs: 0
tx-frames: 0
tx-usecs-irq: n/a
tx-frames-irq: n/a

rx-usecs-low: n/a
rx-frame-low: n/a
tx-usecs-low: n/a
tx-frame-low: n/a

rx-usecs-high: n/a
rx-frame-high: n/a
tx-usecs-high: n/a


But I noticed something interesting this morning. When I set rx-usecs, tx-usecs,
rx-frames and tx-frames to 1, *sometimes* the RTT is good.

PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms

So coalescing plays a role, but it looks like the ethernet controller
does not always obey my settings.
I haven't looked at the configured registers so far; maybe ethtool does not set them
correctly.
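
To put a number on "sometimes", the nine samples above can be split at a threshold. This is only an illustrative sketch: the 1 ms cut-off is an assumption chosen because the imx6q stays below it, and count_slow is a made-up helper name.

```c
#include <stddef.h>

/* RTTs from the ping run above, converted to microseconds. */
static const unsigned int ping_rtt_us[] = {
	730, 356, 303, 2220, 2540, 354, 2220, 2540, 2530,
};

/* Count how many samples are at or above threshold_us. */
static size_t count_slow(const unsigned int *rtt_us, size_t n,
			 unsigned int threshold_us)
{
	size_t slow = 0;

	for (size_t i = 0; i < n; i++)
		if (rtt_us[i] >= threshold_us)
			slow++;
	return slow;
}
```

With a 1000 us threshold, 5 of the 9 replies count as slow and 4 as fast.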

Thanks,
//richard

Wei Fang Feb. 18, 2023, 11:52 a.m. UTC | #5
> -----Original Message-----
> From: Richard Weinberger <richard@nod.at>
> Sent: 18 February 2023 17:43
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Andrew Lunn <andrew@lunn.ch>; David Laight
> <David.Laight@aculab.com>; netdev <netdev@vger.kernel.org>; Shenwei
> Wang <shenwei.wang@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> dl-linux-imx <linux-imx@nxp.com>
> Subject: Re: high latency with imx8mm compared to imx6q
> 
> ----- Original Message -----
> > From: "wei fang" <wei.fang@nxp.com>
> >> > Is it just interrupt latency caused by interrupt coalescing to
> >> > avoid excessive interrupts?
> >>
> >> Just adding to this, it appears imx6q does not have support for
> >> changing the interrupt coalescing. imx8m does appear to support it.
> >> So try playing with ethtool -c/-C.
> >>
> > Yes, I agree with Andrew, the interrupt coalescence feature default to
> > be enabled on i.MX8MM platforms. The purpose of the interrupt
> > coalescing is to reduce the number of interrupts generated by the MAC
> > so as to reduce the CPU loading.
> > As Andrew said, you can turn down rx-usecs and tx-usecs, and then try again.
> 
> Hm, I thought my settings are fine (IOW no coalescing at all).
> Coalesce parameters for eth0:
> Adaptive RX: n/a  TX: n/a
> rx-usecs: 0
> rx-frames: 0
> tx-usecs: 0
> tx-frames: 0
> 
Unfortunately, the fec driver does not support setting rx-usecs/rx-frames/tx-usecs/tx-frames
to 0 to disable interrupt coalescing. 0 is an invalid parameter. :(

> 
> But I noticed something interesting this morning. When I set rx-usecs, tx-usecs,
> rx-frames and tx-frames to 1, *sometimes* the RTT is good.
> 
> PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
> 64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
> 64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
> 64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
> 64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
> 64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
> 64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
> 64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
> 64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
> 64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms
> 
> So coalescing plays a role but it looks like the ethernet controller does not
> always obey my settings.
> I didn't look into the configured registers so far, maybe ethtool does not set
> them correctly.
> 
That looks a bit weird. I used the same settings on my i.MX8ULP and didn't see this
issue. I'm not sure whether your network is stable, or whether other devices on the network
also have interrupt coalescing enabled with the relevant parameters set a bit high.

Richard Weinberger Feb. 18, 2023, 12:03 p.m. UTC | #6
----- Original Message -----
> From: "wei fang" <wei.fang@nxp.com>
>> Hm, I thought my settings are fine (IOW no coalescing at all).
>> Coalesce parameters for eth0:
>> Adaptive RX: n/a  TX: n/a
>> rx-usecs: 0
>> rx-frames: 0
>> tx-usecs: 0
>> tx-frames: 0
>> 
> Unfortunately, the fec driver does not support to set
> rx-usecs/rx-frames/tx-usecs/tx-frames
> to 0 to disable interrupt coalescing. 0 is an invalid parameters. :(

So setting all values to 1 is the most "no coalescing" setting I can get?
 
>> 
>> But I noticed something interesting this morning. When I set rx-usecs, tx-usecs,
>> rx-frames and tx-frames to 1, *sometimes* the RTT is good.
>> 
>> PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
>> 64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
>> 64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
>> 64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
>> 64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
>> 64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
>> 64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
>> 64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
>> 64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
>> 64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms
>> 
>> So coalescing plays a role but it looks like the ethernet controller does not
>> always obey my settings.
>> I didn't look into the configured registers so far, maybe ethtool does not set
>> them correctly.
>> 
> It look a bit weird. I did the same setting with my i.MX8ULP and didn't have
> this
> issue. I'm not sure whether you network is stable or network node devices also
> enable interrupt coalescing and the relevant parameters are set to a bit high.

I'm pretty sure my network is good; I've also tested in different locations.
And as I said, with the imx6q on the very same network everything works as expected.

So, with rx-usecs/rx-frames/tx-usecs/tx-frames set to 1, you see an RTT smaller than 1ms?

Thanks,
//richard

Wei Fang Feb. 18, 2023, 12:28 p.m. UTC | #7
> -----Original Message-----
> From: Richard Weinberger <richard@nod.at>
> Sent: 18 February 2023 20:03
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Andrew Lunn <andrew@lunn.ch>; David Laight
> <David.Laight@aculab.com>; netdev <netdev@vger.kernel.org>; Shenwei
> Wang <shenwei.wang@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> dl-linux-imx <linux-imx@nxp.com>
> Subject: Re: high latency with imx8mm compared to imx6q
> 
> ----- Original Message -----
> > From: "wei fang" <wei.fang@nxp.com>
> >> Hm, I thought my settings are fine (IOW no coalescing at all).
> >> Coalesce parameters for eth0:
> >> Adaptive RX: n/a  TX: n/a
> >> rx-usecs: 0
> >> rx-frames: 0
> >> tx-usecs: 0
> >> tx-frames: 0
> >>
> > Unfortunately, the fec driver does not support to set
> > rx-usecs/rx-frames/tx-usecs/tx-frames
> > to 0 to disable interrupt coalescing. 0 is an invalid parameters. :(
> 
> So setting all values to 1 is the most "no coalescing" setting i can get?
> 
With the ethtool command, the minimum you can set is 1.
But you can write the coalescing registers directly from your console:
ENET_RXICn[ICEN] (addr: base + F0h + (4 × n), where n = 0 to 2) and
ENET_TXICn[ICEN] (addr: base + 100h + (4 × n), where n = 0 to 2).
Set the ICEN bit (bit 31) to 0:
0 disable Interrupt coalescing.
1 disable Interrupt coalescing.
Alternatively, modify your fec driver. But remember, the interrupt coalescing feature
can only be disabled by setting the ICEN bit to 0; do not set the tx/rx usecs/frames
to 0.
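
The register arithmetic described here can be sketched in C. This is an illustration under assumptions, not driver code: the macro and function names are invented, the offsets are copied from this message as written (worth double-checking against the reference manual and the driver's fec.h, since the RX and TX banks are adjacent), and regs stands for an already mapped ENET register block whose base address is not shown. ICEN is treated as 0 = disabled, 1 = enabled.

```c
#include <stdint.h>

#define ENET_RXIC_BASE	0x0F0u			/* offset as given in this message */
#define ENET_TXIC_BASE	0x100u			/* offset as given in this message */
#define ENET_IC_QUEUES	3u			/* n = 0, 1, 2 */
#define ENET_IC_ICEN	(UINT32_C(1) << 31)	/* coalescing enable bit */

/* Byte offset of ENET_RXICn from the ENET base address. */
static uint32_t enet_rxic_offset(unsigned int n)
{
	return ENET_RXIC_BASE + 4u * n;
}

/* Byte offset of ENET_TXICn from the ENET base address. */
static uint32_t enet_txic_offset(unsigned int n)
{
	return ENET_TXIC_BASE + 4u * n;
}

/* Clear ICEN and leave the frame-count and timer fields untouched. */
static uint32_t enet_ic_disable(uint32_t val)
{
	return val & ~ENET_IC_ICEN;
}

/* Disable coalescing on all queues; regs is indexed in 32-bit words. */
static void enet_disable_coalescing(volatile uint32_t *regs)
{
	for (unsigned int n = 0; n < ENET_IC_QUEUES; n++) {
		uint32_t rx = enet_rxic_offset(n) / 4;
		uint32_t tx = enet_txic_offset(n) / 4;

		regs[rx] = enet_ic_disable(regs[rx]);
		regs[tx] = enet_ic_disable(regs[tx]);
	}
}
```

Because the loop clears ICEN in both banks for all three queues, it touches the same six registers regardless of which bank is RX and which is TX.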

> >>
> >> But I noticed something interesting this morning. When I set
> >> rx-usecs, tx-usecs, rx-frames and tx-frames to 1, *sometimes* the RTT is
> good.
> >>
> >> PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
> >> 64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
> >> 64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms
> >>
> >> So coalescing plays a role but it looks like the ethernet controller
> >> does not always obey my settings.
> >> I didn't look into the configured registers so far, maybe ethtool
> >> does not set them correctly.
> >>
> > It look a bit weird. I did the same setting with my i.MX8ULP and
> > didn't have this issue. I'm not sure whether you network is stable or
> > network node devices also enable interrupt coalescing and the relevant
> > parameters are set to a bit high.
> 
> I'm pretty sure my network is good, I've tested also different locations.
> And as I said, with the imx6q on the very same network everything works as
> expected.
> 
> So, with rx-usecs/rx-frames/tx-usecs/tx-frames set to 1, you see a RTT smaller
> than 1ms?
> 
Yes, but my platform is i.MX8ULP not i.MX8MM, I'll check i.MX8MM next Monday.

Wei Fang Feb. 18, 2023, 12:29 p.m. UTC | #8
> -----Original Message-----
> From: Wei Fang
> Sent: 18 February 2023 20:28
> To: 'Richard Weinberger' <richard@nod.at>
> Cc: Andrew Lunn <andrew@lunn.ch>; David Laight
> <David.Laight@aculab.com>; netdev <netdev@vger.kernel.org>; Shenwei
> Wang <shenwei.wang@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> dl-linux-imx <linux-imx@nxp.com>
> Subject: RE: high latency with imx8mm compared to imx6q
> 
> 
> > -----Original Message-----
> > From: Richard Weinberger <richard@nod.at>
> > Sent: 18 February 2023 20:03
> > To: Wei Fang <wei.fang@nxp.com>
> > Cc: Andrew Lunn <andrew@lunn.ch>; David Laight
> > <David.Laight@aculab.com>; netdev <netdev@vger.kernel.org>; Shenwei
> > Wang <shenwei.wang@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > dl-linux-imx <linux-imx@nxp.com>
> > Subject: Re: high latency with imx8mm compared to imx6q
> >
> > ----- Original Message -----
> > > From: "wei fang" <wei.fang@nxp.com>
> > >> Hm, I thought my settings are fine (IOW no coalescing at all).
> > >> Coalesce parameters for eth0:
> > >> Adaptive RX: n/a  TX: n/a
> > >> rx-usecs: 0
> > >> rx-frames: 0
> > >> tx-usecs: 0
> > >> tx-frames: 0
> > >>
> > > Unfortunately, the fec driver does not support to set
> > > rx-usecs/rx-frames/tx-usecs/tx-frames
> > > to 0 to disable interrupt coalescing. 0 is an invalid parameters. :(
> >
> > So setting all values to 1 is the most "no coalescing" setting i can get?
> >
> If you use the ethtool cmd, the minimum can only be set to 1.
> But you can set the coalescing registers directly on your console,
> ENET_RXICn[ICEN] (addr: base + F0h offset + (4d × n) where n=0,1,2) and
> ENET_TXICn[ICEN] (addr: base + 100h offset + (4d × n), where n=0d to 2d)
> set the ICEN bit (bit 31) to 0:
> 0 disable Interrupt coalescing.
> 1 disable Interrupt coalescing.
Sorry, correcting my typo:
1 enables interrupt coalescing.

> or modify you fec driver, but remember, the interrupt coalescing feature
> can only be disable by setting the ICEN bit to 0, do not set the tx/rx
> usecs/frames
> to 0.
> 
> > >>
> > >> But I noticed something interesting this morning. When I set
> > >> rx-usecs, tx-usecs, rx-frames and tx-frames to 1, *sometimes* the RTT is
> > good.
> > >>
> > >> PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
> > >> 64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
> > >> 64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms
> > >>
> > >> So coalescing plays a role but it looks like the ethernet controller
> > >> does not always obey my settings.
> > >> I didn't look into the configured registers so far, maybe ethtool
> > >> does not set them correctly.
> > >>
> > > It look a bit weird. I did the same setting with my i.MX8ULP and
> > > didn't have this issue. I'm not sure whether you network is stable or
> > > network node devices also enable interrupt coalescing and the relevant
> > > parameters are set to a bit high.
> >
> > I'm pretty sure my network is good, I've tested also different locations.
> > And as I said, with the imx6q on the very same network everything works as
> > expected.
> >
> > So, with rx-usecs/rx-frames/tx-usecs/tx-frames set to 1, you see a RTT smaller
> > than 1ms?
> >
> Yes, but my platform is i.MX8ULP not i.MX8MM, I'll check i.MX8MM next
> Monday.

Richard Weinberger Feb. 18, 2023, 1:20 p.m. UTC | #9
----- Original Message -----
> From: "wei fang" <wei.fang@nxp.com>
> If you use the ethtool cmd, the minimum can only be set to 1.
> But you can set the coalescing registers directly on your console,
> ENET_RXICn[ICEN] (addr: base + F0h offset + (4d × n) where n=0,1,2) and
> ENET_TXICn[ICEN] (addr: base + 100h offset + (4d × n), where n=0d to 2d)
> set the ICEN bit (bit 31) to 0:
> 0 disable Interrupt coalescing.
> 1 disable Interrupt coalescing.
> or modify you fec driver, but remember, the interrupt coalescing feature
> can only be disable by setting the ICEN bit to 0, do not set the tx/rx
> usecs/frames
> to 0.

Disabling interrupt coalescing seems to make things much better. :-)
 
>> >>
>> >> But I noticed something interesting this morning. When I set
>> >> rx-usecs, tx-usecs, rx-frames and tx-frames to 1, *sometimes* the RTT is
>> good.
>> >>
>> >> PING 192.168.0.52 (192.168.0.52) 56(84) bytes of data.
>> >> 64 bytes from 192.168.0.52: icmp_seq=1 ttl=64 time=0.730 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=2 ttl=64 time=0.356 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=3 ttl=64 time=0.303 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=4 ttl=64 time=2.22 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=5 ttl=64 time=2.54 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=6 ttl=64 time=0.354 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=7 ttl=64 time=2.22 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=8 ttl=64 time=2.54 ms
>> >> 64 bytes from 192.168.0.52: icmp_seq=9 ttl=64 time=2.53 ms
>> >>
>> >> So coalescing plays a role but it looks like the ethernet controller
>> >> does not always obey my settings.
>> >> I didn't look into the configured registers so far, maybe ethtool
>> >> does not set them correctly.
>> >>
>> > It look a bit weird. I did the same setting with my i.MX8ULP and
>> > didn't have this issue. I'm not sure whether you network is stable or
>> > network node devices also enable interrupt coalescing and the relevant
>> > parameters are set to a bit high.
>> 
>> I'm pretty sure my network is good, I've tested also different locations.
>> And as I said, with the imx6q on the very same network everything works as
>> expected.
>> 
>> So, with rx-usecs/rx-frames/tx-usecs/tx-frames set to 1, you see a RTT smaller
>> than 1ms?
>> 
> Yes, but my platform is i.MX8ULP not i.MX8MM, I'll check i.MX8MM next Monday.

Now I don't see the outliers anymore. Maybe the test from before was really wonky. :-S
Next week I'll do a bigger test on the testbed with interrupt coalescing
disabled at driver level.

Thanks a lot for all the great input so far!
//richard

Andrew Lunn Feb. 20, 2023, 12:11 a.m. UTC | #10
On Sat, Feb 18, 2023 at 02:20:53PM +0100, Richard Weinberger wrote:
> ----- Original Message -----
> > From: "wei fang" <wei.fang@nxp.com>
> > If you use the ethtool cmd, the minimum can only be set to 1.
> > But you can set the coalescing registers directly on your console,
> > ENET_RXICn[ICEN] (addr: base + F0h offset + (4d × n) where n=0,1,2) and
> > ENET_TXICn[ICEN] (addr: base + 100h offset + (4d × n), where n=0d to 2d)
> > set the ICEN bit (bit 31) to 0:
> > 0 disable Interrupt coalescing.
> > 1 disable Interrupt coalescing.
> > or modify you fec driver, but remember, the interrupt coalescing feature
> > can only be disable by setting the ICEN bit to 0, do not set the tx/rx
> > usecs/frames
> > to 0.
> 
> Disabling interrupt coalescing seems to make things much better. :-)

Another thing to consider: the FEC in imx8 gained support for EEE
(Energy-Efficient Ethernet). So if your link is otherwise idle, it could be
put into low power mode and take a little time to wake up. As in most MAC
drivers, EEE is broken in the FEC, but it could still be active. You might
want to put a printk() in fec_enet_eee_mode_set() and see if it is active.
I would not trust ethtool.

    Andrew
Patch

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 2341597408d1..7b0d43d76dea 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -565,6 +565,8 @@  static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
 	unsigned int index;
 	int entries_free;
 
+	trace_printk("START skb: %p\n", skb);
+
 	entries_free = fec_enet_get_free_txdesc_num(txq);
 	if (entries_free < MAX_SKB_FRAGS + 1) {
 		dev_kfree_skb_any(skb);
@@ -674,6 +676,7 @@  static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
 
 	/* Trigger transmission start */
 	writel(0, txq->bd.reg_desc_active);
+	trace_printk("DONE skb: %p\n", skb);
 
 	return 0;
 }
@@ -1431,6 +1434,7 @@  fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
 		} else {
 			ndev->stats.tx_packets++;
 			ndev->stats.tx_bytes += skb->len;
+			trace_printk("TX DONE skb: %p\n", skb);
 		}
 
 		/* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who
@@ -1809,12 +1813,15 @@  fec_enet_interrupt(int irq, void *dev_id)
 	struct fec_enet_private *fep = netdev_priv(ndev);
 	irqreturn_t ret = IRQ_NONE;
 
+	trace_printk("\n");
+
 	if (fec_enet_collect_events(fep) && fep->link) {
 		ret = IRQ_HANDLED;
 
 		if (napi_schedule_prep(&fep->napi)) {
 			/* Disable interrupts */
 			writel(0, fep->hwp + FEC_IMASK);
+			trace_printk("scheduling napi\n");
 			__napi_schedule(&fep->napi);
 		}
 	}