[net-next,v2,0/2] Enable 2.5Gbps speed for stmmac

Message ID 20210405112953.26008-1-michael.wei.hong.sit@intel.com (mailing list archive)

Message

Sit, Michael Wei Hong April 5, 2021, 11:29 a.m. UTC
This patchset enables 2.5Gbps speed mode for stmmac.
The link speed mode is detected and configured during the serdes power-up
sequence. For 2.5G, SGMII in-band AN is not used: we check the link speed
mode in the serdes and disable in-band AN accordingly.
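
As a rough illustration of the detection step, here is a minimal userspace C sketch. The register layout (`SERDES_LINK_MODE_*`), `struct link_cfg` and `serdes_detect_link_cfg()` are hypothetical names chosen for illustration; they are not taken from the patches or from any hardware documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bits [7:6] of a serdes status word hold the
 * link speed mode latched at power-up. Not the real register map. */
#define SERDES_LINK_MODE_MASK  0xC0u
#define SERDES_LINK_MODE_SHIFT 6
#define SERDES_LINK_MODE_1G    0u
#define SERDES_LINK_MODE_2G5   1u

struct link_cfg {
	int  max_speed;  /* highest link speed in Mbps */
	bool inband_an;  /* whether SGMII in-band AN is used */
};

/* Decode the link speed mode and derive the matching configuration. */
static struct link_cfg serdes_detect_link_cfg(uint32_t serdes_status)
{
	uint32_t mode = (serdes_status & SERDES_LINK_MODE_MASK) >>
			SERDES_LINK_MODE_SHIFT;
	struct link_cfg cfg = { .max_speed = 1000, .inband_an = true };

	if (mode == SERDES_LINK_MODE_2G5) {
		/* 2.5G mode: in-band AN is not used. */
		cfg.max_speed = 2500;
		cfg.inband_an = false;
	}
	return cfg;
}
```

With this layout, a status word of `0x40` would select 2.5G with in-band AN disabled, and `0x00` would leave the 1G SGMII defaults in place.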

Changes:
v1 -> v2
 patch 1/2
 -Remove MAC supported link speed masking

 patch 2/2
 -Add supported link speed masking in the PCS

iperf3 and ping tests were run at 2.5Gbps, and regression tests were run at
10/100/1000Mbps to check for regressions.

10Mbps
host@EHL$ ethtool -s enp0s30f4 duplex full speed 10
[  310.132264] intel-eth-pci 0000:00:1e.4 enp0s30f4: Link is Down
[  312.438102] intel-eth-pci 0000:00:1e.4 enp0s30f4: Link is Up - 10Mbps/Full - flow control off
[  312.447652] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s30f4: link becomes ready
host@EHL$ iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 60706 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.26 MBytes  10.6 Mbits/sec    0   29.7 KBytes
[  5]   1.00-2.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   2.00-3.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   3.00-4.00   sec  1.15 MBytes  9.68 Mbits/sec    0   29.7 KBytes
[  5]   4.00-5.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   5.00-6.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   6.00-7.00   sec  1.15 MBytes  9.68 Mbits/sec    0   29.7 KBytes
[  5]   7.00-8.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   8.00-9.00   sec  1.09 MBytes  9.17 Mbits/sec    0   29.7 KBytes
[  5]   9.00-10.00  sec  1.15 MBytes  9.68 Mbits/sec    0   29.7 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.3 MBytes  9.47 Mbits/sec    0             sender
[  5]   0.00-10.01  sec  11.1 MBytes  9.34 Mbits/sec                  receiver

iperf Done.
host@EHL$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.557 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.528 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.535 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.525 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.527 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.555 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.539 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.588 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.570 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.540 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9194ms
rtt min/avg/max/mdev = 0.525/0.546/0.588/0.019 ms
host@EHL$

100Mbps
host@EHL$ ethtool -s enp0s30f4 duplex full speed 100
[  204.178572] intel-eth-pci 0000:00:1e.4 enp0s30f4: Link is Down
[  207.990094] intel-eth-pci 0000:00:1e.4 enp0s30f4: Link is Up - 100Mbps/Full - flow control off
[  207.999744] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s30f4: link becomes ready
host@EHL$ iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 60702 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  11.6 MBytes  97.0 Mbits/sec    1    102 KBytes
[  5]   1.00-2.00   sec  10.9 MBytes  91.7 Mbits/sec    0    102 KBytes
[  5]   2.00-3.00   sec  10.8 MBytes  90.5 Mbits/sec    0    102 KBytes
[  5]   3.00-4.00   sec  11.0 MBytes  92.6 Mbits/sec    0    102 KBytes
[  5]   4.00-5.00   sec  10.8 MBytes  90.6 Mbits/sec    0    102 KBytes
[  5]   5.00-6.00   sec  11.0 MBytes  92.6 Mbits/sec    0    102 KBytes
[  5]   6.00-7.00   sec  11.0 MBytes  92.6 Mbits/sec    0    102 KBytes
[  5]   7.00-8.00   sec  10.8 MBytes  90.6 Mbits/sec    0    102 KBytes
[  5]   8.00-9.00   sec  11.0 MBytes  92.6 Mbits/sec    0    102 KBytes
[  5]   9.00-10.00  sec  11.0 MBytes  92.6 Mbits/sec    0    102 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   110 MBytes  92.3 Mbits/sec    1             sender
[  5]   0.00-10.00  sec   109 MBytes  91.8 Mbits/sec                  receiver

iperf Done.
host@EHL$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.331 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.322 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.315 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.315 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.295 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.300 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.307 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.294 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.292 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.297 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9215ms
rtt min/avg/max/mdev = 0.292/0.306/0.331/0.012 ms

1G speed
host@EHL$ iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 60698 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   114 MBytes   954 Mbits/sec    0    533 KBytes
[  5]   1.00-2.00   sec   112 MBytes   942 Mbits/sec    0    591 KBytes
[  5]   2.00-3.00   sec   113 MBytes   945 Mbits/sec    0    621 KBytes
[  5]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    621 KBytes
[  5]   4.00-5.00   sec   112 MBytes   942 Mbits/sec    0    764 KBytes
[  5]   5.00-6.00   sec   112 MBytes   944 Mbits/sec    0    764 KBytes
[  5]   6.00-7.00   sec   111 MBytes   933 Mbits/sec    0    803 KBytes
[  5]   7.00-8.00   sec   112 MBytes   944 Mbits/sec    0    803 KBytes
[  5]   8.00-9.00   sec   112 MBytes   944 Mbits/sec    0    843 KBytes
[  5]   9.00-10.00  sec   112 MBytes   944 Mbits/sec    0    843 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.10 GBytes   943 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.09 GBytes   940 Mbits/sec                  receiver

iperf Done.
host@EHL$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.299 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.277 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.277 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.286 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.330 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.276 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.296 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.272 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.276 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.274 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9196ms
rtt min/avg/max/mdev = 0.272/0.286/0.330/0.017 ms

2.5G speed
host@EHL$ iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 55160 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   175 MBytes  1.47 Gbits/sec   17    683 KBytes
[  5]   1.00-2.00   sec   202 MBytes  1.70 Gbits/sec    0    707 KBytes
[  5]   2.00-3.00   sec   204 MBytes  1.71 Gbits/sec    0    751 KBytes
[  5]   3.00-4.00   sec   204 MBytes  1.71 Gbits/sec    0    773 KBytes
[  5]   4.00-5.00   sec   202 MBytes  1.70 Gbits/sec    0    773 KBytes
[  5]   5.00-6.00   sec   204 MBytes  1.71 Gbits/sec    0    798 KBytes
[  5]   6.00-7.00   sec   204 MBytes  1.71 Gbits/sec    0    807 KBytes
[  5]   7.00-8.00   sec   204 MBytes  1.71 Gbits/sec    0    807 KBytes
[  5]   8.00-9.00   sec   204 MBytes  1.71 Gbits/sec    0    807 KBytes
[  5]   9.00-10.00  sec   202 MBytes  1.70 Gbits/sec    0    807 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.96 GBytes  1.68 Gbits/sec   17             sender
[  5]   0.00-10.00  sec  1.96 GBytes  1.68 Gbits/sec                  receiver

iperf Done.
host@EHL$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.671 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.300 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.300 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.291 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.296 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.301 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.328 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.306 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.299 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.293 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9251ms
rtt min/avg/max/mdev = 0.291/0.338/0.671/0.111 ms

Voon Weifeng (2):
  net: stmmac: enable 2.5Gbps link speed
  net: pcs: configure xpcs 2.5G speed mode

 .../net/ethernet/stmicro/stmmac/dwmac-intel.c | 44 ++++++++++++++++++-
 .../net/ethernet/stmicro/stmmac/dwmac-intel.h | 13 ++++++
 .../net/ethernet/stmicro/stmmac/dwmac4_core.c |  1 +
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 20 ++++++++-
 drivers/net/pcs/pcs-xpcs.c                    | 39 ++++++++++++++++
 include/linux/pcs/pcs-xpcs.h                  |  1 +
 include/linux/stmmac.h                        |  2 +
 7 files changed, 117 insertions(+), 3 deletions(-)

Comments

Andrew Lunn April 5, 2021, 1:11 p.m. UTC | #1
On Mon, Apr 05, 2021 at 07:29:51PM +0800, Michael Sit Wei Hong wrote:
> This patchset enables 2.5Gbps speed mode for stmmac.
> Link speed mode is detected and configured at serdes power up sequence.
> For 2.5G, we do not use SGMII in-band AN, we check the link speed mode
> in the serdes and disable the in-band AN accordingly.
> 
> Changes:
> v1 -> v2
>  patch 1/2
>  -Remove MAC supported link speed masking
> 
>  patch 2/2
>  -Add supported link speed masking in the PCS

So there is still some confusion here.

------------            --------
|MAC - PCS |---serdes---| PHY  |--- copper 
------------            --------


You have a MAC and a PCS in the stmmac IP block. That then has some
sort of SERDES interface, running 1000BaseX, SGMII, SGMII overclocked
at 2.5G or 2500BaseX. Connected to the SERDES you have a PHY which
converts to copper, giving you 2500BaseT.

You said earlier, that the PHY can only do 2500BaseT. So it should be
the PHY driver which sets supported to 2500BaseT and no other speeds.

You should think about when somebody uses this MAC with a different
PHY, one that can do the full range of 10/half through to 2.5G
full. What generally happens is that the PHY performs auto-neg to
determine the link speed. For 10M-1G speeds the PHY will configure its
SERDES interface to SGMII and phylink will ask the PCS to also be
configured to SGMII. If the PHY negotiates 2500BaseT, it will
configure its side of the SERDES to 2500BaseX or SGMII overclocked at
2.5G. Again, phylink will ask the PCS to match what the PHY is doing.

So, where exactly is the limitation in your hardware? PCS or PHY?

     Andrew
Sit, Michael Wei Hong April 5, 2021, 2:23 p.m. UTC | #2
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Monday, 5 April, 2021 9:11 PM
> To: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>
> Cc: peppe.cavallaro@st.com; alexandre.torgue@st.com;
> joabreu@synopsys.com; davem@davemloft.net;
> kuba@kernel.org; mcoquelin.stm32@gmail.com;
> linux@armlinux.org.uk; Voon, Weifeng
> <weifeng.voon@intel.com>; Ong, Boon Leong
> <boon.leong.ong@intel.com>; qiangqing.zhang@nxp.com; Wong,
> Vee Khee <vee.khee.wong@intel.com>; fugang.duan@nxp.com;
> Chuah, Kim Tatt <kim.tatt.chuah@intel.com>;
> netdev@vger.kernel.org; linux-stm32@st-md-
> mailman.stormreply.com; linux-arm-kernel@lists.infradead.org;
> linux-kernel@vger.kernel.org; hkallweit1@gmail.com
> Subject: Re: [PATCH net-next v2 0/2] Enable 2.5Gbps speed for
> stmmac
> 
> On Mon, Apr 05, 2021 at 07:29:51PM +0800, Michael Sit Wei Hong
> wrote:
> > This patchset enables 2.5Gbps speed mode for stmmac.
> > Link speed mode is detected and configured at serdes power
> up sequence.
> > For 2.5G, we do not use SGMII in-band AN, we check the link
> speed mode
> > in the serdes and disable the in-band AN accordingly.
> >
> > Changes:
> > v1 -> v2
> >  patch 1/2
> >  -Remove MAC supported link speed masking
> >
> >  patch 2/2
> >  -Add supported link speed masking in the PCS
> 
> So there is still some confusion here.
> 
> ------------            --------
> |MAC - PCS |---serdes---| PHY  |--- copper
> ------------            --------
> 
> 
> You have a MAC and a PCS in the stmmac IP block. That then has
> some
> sort of SERDES interface, running 1000BaseX, SGMII, SGMII
> overclocked
> at 2.5G or 2500BaseX. Connected to the SERDES you have a PHY
> which
> converts to copper, giving you 2500BaseT.
> 
> You said earlier, that the PHY can only do 2500BaseT. So it should
> be
> the PHY driver which sets supported to 2500BaseT and no other
> speeds.
> 
> You should think about when somebody uses this MAC with a
> different
> PHY, one that can do the full range of 10/half through to 2.5G
> full. What generally happens is that the PHY performs auto-neg to
> determine the link speed. For 10M-1G speeds the PHY will
> configure its
> SERDES interface to SGMII and phylink will ask the PCS to also be
> configured to SGMII. If the PHY negotiates 2500BaseT, it will
> configure its side of the SERDES to 2500BaseX or SGMII
> overclocked at
> 2.5G. Again, phylink will ask the PCS to match what the PHY is
> doing.
> 
> So, where exactly is the limitation in your hardware? PCS or PHY?
The limitation in the hardware is on the PCS side, which runs at either
SGMII 2.5G or SGMII 1G speed.
When running at SGMII 2.5G, we disable in-band AN and use 2.5G speed only.
> 
>      Andrew
Andrew Lunn April 5, 2021, 2:35 p.m. UTC | #3
> > You have a MAC and a PCS in the stmmac IP block. That then has
> > some
> > sort of SERDES interface, running 1000BaseX, SGMII, SGMII
> > overclocked
> > at 2.5G or 2500BaseX. Connected to the SERDES you have a PHY
> > which
> > converts to copper, giving you 2500BaseT.
> > 
> > You said earlier, that the PHY can only do 2500BaseT. So it should
> > be
> > the PHY driver which sets supported to 2500BaseT and no other
> > speeds.
> > 
> > You should think about when somebody uses this MAC with a
> > different
> > PHY, one that can do the full range of 10/half through to 2.5G
> > full. What generally happens is that the PHY performs auto-neg to
> > determine the link speed. For 10M-1G speeds the PHY will
> > configure its
> > SERDES interface to SGMII and phylink will ask the PCS to also be
> > configured to SGMII. If the PHY negotiates 2500BaseT, it will
> > configure its side of the SERDES to 2500BaseX or SGMII
> > overclocked at
> > 2.5G. Again, phylink will ask the PCS to match what the PHY is
> > doing.
> > 
> > So, where exactly is the limitation in your hardware? PCS or PHY?
> The limitation in the hardware is at the PCS side where it is either running
> in SGMII 2.5G or SGMII 1G speeds.
> When running on SGMII 2.5G speeds, we disable the in-band AN and use 2.5G speed only

So there is no actual limitation! The MAC should indicate it can do
10Half through to 2500BaseT. And you need to listen to PHYLINK and
swap the PCS between SGMII and overclocked SGMII when it requests.

PHYLINK will call stmmac_mac_config() and use state->interface to
decide how to configure the PCS to match what the PHY is doing.
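
A minimal sketch of that dispatch, in plain userspace C. `enum iface_mode` is a stand-in for the kernel's `phy_interface_t`, and `struct pcs_cfg` and `pcs_cfg_for_interface()` are hypothetical; the real stmmac and xpcs code is structured differently. Disabling in-band AN in the 2.5G case follows the cover letter.

```c
#include <stdbool.h>

/* Stand-in for the two interface modes discussed in this thread. */
enum iface_mode {
	IFACE_SGMII,
	IFACE_2500BASEX,  /* or SGMII overclocked at 2.5G */
};

struct pcs_cfg {
	bool overclocked;  /* run the PCS at the 2.5G clock rate */
	bool inband_an;    /* use SGMII in-band AN */
};

/* Choose the PCS configuration that matches what the PHY is doing. */
static struct pcs_cfg pcs_cfg_for_interface(enum iface_mode iface)
{
	struct pcs_cfg cfg = { .overclocked = false, .inband_an = true };

	switch (iface) {
	case IFACE_2500BASEX:
		cfg.overclocked = true;
		cfg.inband_an = false;
		break;
	case IFACE_SGMII:
	default:
		break;
	}
	return cfg;
}
```

The point of the design is that the MAC driver does not decide the mode by itself: it reacts to the interface mode it is handed each time the link is reconfigured.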

     Andrew
Voon, Weifeng April 6, 2021, 9:05 a.m. UTC | #4
> > > You have a MAC and a PCS in the stmmac IP block. That then has some
> > > sort of SERDES interface, running 1000BaseX, SGMII, SGMII
> > > overclocked at 2.5G or 2500BaseX. Connected to the SERDES you have
> > > a PHY which converts to copper, giving you 2500BaseT.
> > >
> > > You said earlier, that the PHY can only do 2500BaseT. So it should
> > > be the PHY driver which sets supported to 2500BaseT and no other
> > > speeds.
> > >
> > > You should think about when somebody uses this MAC with a different
> > > PHY, one that can do the full range of 10/half through to 2.5G full.
> > > What generally happens is that the PHY performs auto-neg to
> > > determine the link speed. For 10M-1G speeds the PHY will configure
> > > its SERDES interface to SGMII and phylink will ask the PCS to also
> > > be configured to SGMII. If the PHY negotiates 2500BaseT, it will
> > > configure its side of the SERDES to 2500BaseX or SGMII overclocked
> > > at 2.5G. Again, phylink will ask the PCS to match what the PHY is
> > > doing.
> > >
> > > So, where exactly is the limitation in your hardware? PCS or PHY?
> > The limitation in the hardware is at the PCS side where it is either
> > running in SGMII 2.5G or SGMII 1G speeds.
> > When running on SGMII 2.5G speeds, we disable the in-band AN and use
> > 2.5G speed only
> 
> So there is no actual limitation! The MAC should indicate it can do 10Half
> through to 2500BaseT. And you need to listen to PHYLINK and swap the PCS
> between SGMII to overclocked SGMII when it requests.
> 
> PHYLINK will call stmmac_mac_config() and use state->interface to decide
> how to configure the PCS to match what the PHY is doing.
> 
>      Andrew

The limitation is not in the MAC, PCS or the PHY. For Intel mgbe, the
2.5x overclocking needed to support 2.5G can only be configured in the
BIOS at boot time. The kernel driver has no access to modify the clock
rate for 1G/2.5G mode. The way to determine the current 1G/2.5G mode is
by reading a dedicated ad hoc register through the MDIO bus.
In short, after the system boots, it is in either 1G mode or 2.5G mode,
and this cannot be changed on the fly.
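
To make the constraint concrete, here is a minimal userspace C sketch, assuming the mode has already been read from the register. The `LM_*` bits, `enum boot_mode` and `supported_for_boot_mode()` are hypothetical and are not kernel API.

```c
#include <stdint.h>

/* Hypothetical link-mode bits, one per speed. */
#define LM_10   (1u << 0)
#define LM_100  (1u << 1)
#define LM_1000 (1u << 2)
#define LM_2500 (1u << 3)

/* Mode fixed by the BIOS at boot; the driver can only read it. */
enum boot_mode { BOOT_MODE_1G, BOOT_MODE_2G5 };

/* Speeds the platform can offer for the rest of this boot. */
static uint32_t supported_for_boot_mode(enum boot_mode mode)
{
	if (mode == BOOT_MODE_2G5)
		return LM_2500;              /* 2.5G speed only */
	return LM_10 | LM_100 | LM_1000;     /* regular SGMII speeds */
}
```

Because the mode cannot change at runtime, the mask only needs to be computed once, for example at probe time.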

Since the stmmac MAC can pair with any PCS and PHY, I still prefer that we tie
this platform-specific limitation to the MAC, as stmmac already handles
platform-specific config/limitations.

What are your thoughts?

Weifeng
Andrew Lunn April 6, 2021, 8:06 p.m. UTC | #5
> The limitation is not on the MAC, PCS or the PHY. For Intel mgbe, the
> overclocking of 2.5 times clock rate to support 2.5G is only able to be
> configured in the BIOS during boot time. Kernel driver has no access to 
> modify the clock rate for 1Gbps/2.5G mode. The way to determined the 
> current 1G/2.5G mode is by reading a dedicated adhoc register through mdio bus.
> In short, after the system boot up, it is either in 1G mode or 2.5G mode 
> which not able to be changed on the fly. 

Right. It would have been a lot easier if this was in the commit message
from the beginning. Please ensure the next version does say this.

> Since the stmmac MAC can pair with any PCS and PHY, I still prefer that we tie
> this platform specific limitation with the of MAC. As stmmac does handle platform
> specific config/limitation. 

So yes, this needs to be somewhere in the intel specific stmmac code,
with a nice comment explaining what is going on.

What PHY are you using? The Aquantia/Marvell multi-gige phy can do
rate adaptation. So you could fix the MAC-PHY link to 2500BaseX, and
let the PHY internally handle the different line speeds.

    Andrew
Voon, Weifeng April 7, 2021, 3:02 a.m. UTC | #6
> > The limitation is not on the MAC, PCS or the PHY. For Intel mgbe, the
> > overclocking of 2.5 times clock rate to support 2.5G is only able to
> > be configured in the BIOS during boot time. Kernel driver has no
> > access to modify the clock rate for 1Gbps/2.5G mode. The way to
> > determined the current 1G/2.5G mode is by reading a dedicated adhoc
> register through mdio bus.
> > In short, after the system boot up, it is either in 1G mode or 2.5G
> > mode which not able to be changed on the fly.
> 
> Right. It would of been a lot easier if this was in the commit message
> from the beginning. Please ensure the next version does say this.
> 
> > Since the stmmac MAC can pair with any PCS and PHY, I still prefer
> > that we tie this platform specific limitation with the of MAC. As
> > stmmac does handle platform specific config/limitation.
> 
> So yes, this needs to be somewhere in the intel specific stmmac code,
> with a nice comment explaining what is going on.
> 
> What PHY are you using? The Aquantia/Marvell multi-gige phy can do rate
> adaptation. So you could fix the MAC-PHY link to 2500BaseX, and let the
> PHY internally handle the different line speeds.
> 
Intel mgbe is flexible to pair with any PHY. Only the Aquantia/Marvell
multi-gige PHY can do rate adaptation, right? Hence, we still need to take
care of other PHYs.

Thanks for all the comments, will include them in v3. 

Weifeng
Andrew Lunn April 7, 2021, 12:44 p.m. UTC | #7
> Intel mgbe is flexible to pair with any PHY. Only Aquantia/Marvell
> multi-gige PHY can do rate adaption right?

The Marvell/Marvell multi-gige PHY can also do rate
adaptation. Marvell buying Aquantia made naming messy :-(
I should probably use part numbers.

> Hence, we still need to take care of others PHYs.

Yes, it just makes working around the broken design harder if you want
to get the most out of the hardware.

   Andrew
Russell King (Oracle) April 7, 2021, 1 p.m. UTC | #8
On Wed, Apr 07, 2021 at 02:44:39PM +0200, Andrew Lunn wrote:
> > Intel mgbe is flexible to pair with any PHY. Only Aquantia/Marvell
> > multi-gige PHY can do rate adaption right?
> 
> The Marvell/Marvell multi-gige PHY can also do rate
> adaptation. Marvell buying Aquantia made naming messy :-(
> I should probably use part numbers.
> 
> > Hence, we still need to take care of others PHYs.
> 
> Yes, it just makes working around the broken design harder if you want
> to get the most out of the hardware.

FYI, we really need to come up with a good solution to the rate
adaption issue. What we have today really is not good.

For example, take a MAC that supports only 2500base-X connected to a
PHY that does rate adaption from 2500base-X to media speed.

So, the PHY could be capable of 10, 100, 1G and 2.5G media speeds,
and would advertise those in its supported mask. The MAC however
would only report (via the validate callback) support for 2.5G speed
because that's all that 2500base-X supports.

What we really want when a rate adapting capable PHY is connected is
to ignore what ethtool link modes the MAC supports beyond "does it
support this interface type" and just use the PHY supported mask.
However, that's another property of the PHY that we need to know from
phylib, and it's not clear when that property should be made available.
As we know from Marvell PHYs, it depends on the configurable MAC_TYPE
setting, so could only be available once we've selected an interface
mode for the PHY. On the other hand, we might need to know what
interface mode(s) are available from the PHY and MAC to select an
appropriate mode.
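
The selection rule described above can be sketched as follows, in userspace C. The `LM_*` bits and `effective_supported()` are hypothetical and do not correspond to any existing phylink or phylib interface. How phylib would report `phy_rate_adapts`, and when, is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical link-mode bits, one per media speed. */
#define LM_10   (1u << 0)
#define LM_100  (1u << 1)
#define LM_1000 (1u << 2)
#define LM_2500 (1u << 3)

/*
 * Compute the link modes to expose to userspace.
 * mac_modes:          speeds the MAC reports for the chosen interface
 * phy_modes:          speeds the PHY supports on the media side
 * mac_supports_iface: MAC can drive the interface type the PHY uses
 * phy_rate_adapts:    PHY adapts the interface speed to the media speed
 */
static uint32_t effective_supported(uint32_t mac_modes, uint32_t phy_modes,
				    bool mac_supports_iface,
				    bool phy_rate_adapts)
{
	if (!mac_supports_iface)
		return 0;
	if (phy_rate_adapts)
		return phy_modes;         /* ignore the MAC speed mask */
	return mac_modes & phy_modes;     /* both ends must agree */
}
```

For a 2500base-X-only MAC (`LM_2500`) and a PHY supporting all four speeds, the result would be all four speeds when the PHY rate-adapts and only `LM_2500` when it does not.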

These are not easy problems to overcome; I have had some patches for some
time which allow some combination of MAC and PHY to advertise which
interface mode(s) they support, but I haven't been happy enough with
them to push them upstream - and it would be another phylink API change,
which means having to maintain the new and old code until everything
has been updated (thereby making stuff a lot more complex). After the
last round of phylink API updates and the hostility from people over
that, this is a big demotivating factor.