diff mbox series

[net-next,v3,10/47] net: phylink: Adjust link settings based on rate adaptation

Message ID 20220715215954.1449214-11-sean.anderson@seco.com (mailing list archive)
State New, archived
Headers show
Series net: dpaa: Convert to phylink | expand

Commit Message

Sean Anderson July 15, 2022, 9:59 p.m. UTC
If the phy is configured to use pause-based rate adaptation, ensure that
the link is full duplex with pause frame reception enabled. Note that these
settings may be overridden by ethtool.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
---

Changes in v3:
- New

 drivers/net/phy/phylink.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Andrew Lunn July 16, 2022, 8:17 p.m. UTC | #1
On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
> If the phy is configured to use pause-based rate adaptation, ensure that
> the link is full duplex with pause frame reception enabled. Note that these
> settings may be overridden by ethtool.
> 
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> ---
> 
> Changes in v3:
> - New
> 
>  drivers/net/phy/phylink.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 7fa21941878e..7f65413aa778 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
>  	pl->phy_state.speed = phy_interface_speed(phydev->interface,
>  						  phydev->speed);
>  	pl->phy_state.duplex = phydev->duplex;
> +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
> +		pl->phy_state.duplex = DUPLEX_FULL;
> +		rx_pause = true;
> +	}

I would not do this. If the requirements for rate adaptation are not
fulfilled, you should turn off rate adaptation.

A MAC which knows rate adaptation is going on can help out, by not
advertising 10Half, 100Half etc. Autoneg will then fail for modes
where rate adaptation does not work.

The MAC should also be declaring what sort of pause it supports, so
disable rate adaptation if it does not have async pause.

      Andrew
Sean Anderson July 16, 2022, 10:37 p.m. UTC | #2
On 7/16/22 4:17 PM, Andrew Lunn wrote:
> On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
>> If the phy is configured to use pause-based rate adaptation, ensure that
>> the link is full duplex with pause frame reception enabled. Note that these
>> settings may be overridden by ethtool.
>>
>> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
>> ---
>>
>> Changes in v3:
>> - New
>>
>>   drivers/net/phy/phylink.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
>> index 7fa21941878e..7f65413aa778 100644
>> --- a/drivers/net/phy/phylink.c
>> +++ b/drivers/net/phy/phylink.c
>> @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
>>   	pl->phy_state.speed = phy_interface_speed(phydev->interface,
>>   						  phydev->speed);
>>   	pl->phy_state.duplex = phydev->duplex;
>> +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
>> +		pl->phy_state.duplex = DUPLEX_FULL;
>> +		rx_pause = true;
>> +	}
> 
> I would not do this. If the requirements for rate adaptation are not
> fulfilled, you should turn off rate adaptation.
> 
> A MAC which knows rate adaptation is going on can help out, by not
> advertising 10Half, 100Half etc. Autoneg will then fail for modes
> where rate adaptation does not work.

OK, so maybe it is better to phylink_warn here. Something along the
lines of "phy using pause-based rate adaptation, but duplex is %s".

> The MAC should also be declaring what sort of pause it supports, so
> disable rate adaptation if it does not have async pause.

That's what we do in the previous patch.

The problem is that rx_pause and tx_pause are resolved based on our
advertisement and the link partner's advertisement. However, the link
partner may not support pause frames at all. In that case, we will get
rx_pause and tx_pause as false. However, we still want to enable rx_pause,
because we know that the phy will be emitting pause frames. And of course
the user can always force disable pause frames anyway through ethtool.

--Sean
Andrew Lunn July 17, 2022, 1:39 a.m. UTC | #3
> > I would not do this. If the requirements for rate adaptation are not
> > fulfilled, you should turn off rate adaptation.
> > 
> > A MAC which knows rate adaptation is going on can help out, by not
> > advertising 10Half, 100Half etc. Autoneg will then fail for modes
> > where rate adaptation does not work.
> 
> OK, so maybe it is better to phylink_warn here. Something along the
> lines of "phy using pause-based rate adaptation, but duplex is %s".

You say 1/2 duplex simply does not work with rate adaptation. So i
would actually return -EINVAL at the point the MAC indicates what
modes it supports if there is a 1/2 duplex mode in the list.

> 
> > The MAC should also be declaring what sort of pause it supports, so
> > disable rate adaptation if it does not have async pause.
> 
> That's what we do in the previous patch.
> 
> The problem is that rx_pause and tx_pause are resolved based on our
> advertisement and the link partner's advertisement. However, the link
> partner may not support pause frames at all. In that case, we will get
> rx_pause and tx_pause as false. However, we still want to enable rx_pause,
> because we know that the phy will be emitting pause frames. And of course
> the user can always force disable pause frames anyway through ethtool.

Right, so we need a table somewhere in the documentation listing the
different combinations and what should happen.

If the MAC does not support rx_pause, rate adaptation is turned off.
If the negotiation results in no rx_pause, force it on anyway with
Pause based adaptation. If ethtool turns pause off, turn off rate
adaptation.

Does 802.3 say anything about this?

We might also want to add an additional state to the ethtool get for
pause, to indicate rx_pause is enabled because of rate adaptation, not
because of autoneg.

       Andrew
Russell King (Oracle) July 18, 2022, 4:12 p.m. UTC | #4
On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
> If the phy is configured to use pause-based rate adaptation, ensure that
> the link is full duplex with pause frame reception enabled. Note that these
> settings may be overridden by ethtool.
> 
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> ---
> 
> Changes in v3:
> - New
> 
>  drivers/net/phy/phylink.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 7fa21941878e..7f65413aa778 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
>  	pl->phy_state.speed = phy_interface_speed(phydev->interface,
>  						  phydev->speed);
>  	pl->phy_state.duplex = phydev->duplex;
> +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
> +		pl->phy_state.duplex = DUPLEX_FULL;
> +		rx_pause = true;
> +	}

I really don't like this - as I've pointed out in my previous email, the
reporting in the kernel message log for "Link is Up" will be incorrect
if you force the phy_state here like this. If the media side link has
been negotiated to be half duplex, we should state that in the "Link is
Up" message.

It's only the PCS and MAC that care about this, so this should be dealt
with when calling into the PCS and MAC's link_up() method.

The problem we have are the legacy drivers (of which mv88e6xxx and
mtk_eth_soc are the only two I'm aware of) that make use of the
state->speed and state->duplex when configuring stuff. We could've been
down to just mv88e6xxx had the DSA and mv88e6xxx patches been sorted
out, but sadly that's now going to be some time off due to reviewer
failure.
Russell King (Oracle) July 18, 2022, 4:14 p.m. UTC | #5
On Sat, Jul 16, 2022 at 06:37:22PM -0400, Sean Anderson wrote:
> On 7/16/22 4:17 PM, Andrew Lunn wrote:
> > On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
> > > If the phy is configured to use pause-based rate adaptation, ensure that
> > > the link is full duplex with pause frame reception enabled. Note that these
> > > settings may be overridden by ethtool.
> > > 
> > > Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> > > ---
> > > 
> > > Changes in v3:
> > > - New
> > > 
> > >   drivers/net/phy/phylink.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> > > index 7fa21941878e..7f65413aa778 100644
> > > --- a/drivers/net/phy/phylink.c
> > > +++ b/drivers/net/phy/phylink.c
> > > @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
> > >   	pl->phy_state.speed = phy_interface_speed(phydev->interface,
> > >   						  phydev->speed);
> > >   	pl->phy_state.duplex = phydev->duplex;
> > > +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
> > > +		pl->phy_state.duplex = DUPLEX_FULL;
> > > +		rx_pause = true;
> > > +	}
> > 
> > I would not do this. If the requirements for rate adaptation are not
> > fulfilled, you should turn off rate adaptation.
> > 
> > A MAC which knows rate adaptation is going on can help out, by not
> > advertising 10Half, 100Half etc. Autoneg will then fail for modes
> > where rate adaptation does not work.
> 
> OK, so maybe it is better to phylink_warn here. Something along the
> lines of "phy using pause-based rate adaptation, but duplex is %s".
> 
> > The MAC should also be declaring what sort of pause it supports, so
> > disable rate adaptation if it does not have async pause.
> 
> That's what we do in the previous patch.
> 
> The problem is that rx_pause and tx_pause are resolved based on our
> advertisement and the link partner's advertisement. However, the link
> partner may not support pause frames at all. In that case, we will get
> rx_pause and tx_pause as false. However, we still want to enable rx_pause,
> because we know that the phy will be emitting pause frames. And of course
> the user can always force disable pause frames anyway through ethtool.

If you want the MAC to enable rx_pause, that ought to be handled
separately in the mac_link_up() method, IMHO.
Russell King (Oracle) July 18, 2022, 4:22 p.m. UTC | #6
On Sun, Jul 17, 2022 at 03:39:39AM +0200, Andrew Lunn wrote:
> > > I would not do this. If the requirements for rate adaptation are not
> > > fulfilled, you should turn off rate adaptation.
> > > 
> > > A MAC which knows rate adaptation is going on can help out, by not
> > > advertising 10Half, 100Half etc. Autoneg will then fail for modes
> > > where rate adaptation does not work.
> > 
> > OK, so maybe it is better to phylink_warn here. Something along the
> > lines of "phy using pause-based rate adaptation, but duplex is %s".
> 
> You say 1/2 duplex simply does not work with rate adaptation. So i
> would actually return -EINVAL at the point the MAC indicates what
> modes it supports if there is a 1/2 duplex mode in the list.

If we have a PHY that supports rate adaption using pause frames, which
implies a full duplex link between the PHY and MAC, one would hope that
someone isn't silly enough to integrate it with a half-duplex only MAC.

This ought to be handled while bringing up the PHY. If the PHY uses
pause frames but the MAC doesn't support full-duplex at the PHY
interface speed, then we should not allow the PHY to do rate adaption.
The easiest way to achieve that is to not allow the PHY to advertise
anything except the PHY interface speed on its media. If that means
there's nothing to advertise, then we fail.

> Right, so we need a table somewhere in the documentation listing the
> different combinations and what should happen.
> 
> If the MAC does not support rx_pause, rate adaptation is turned off.
> If the negotiation results in no rx_pause, force it on anyway with
> Pause based adaptation. If ethtool turns pause off, turn off rate
> adaptation.

That last bit is really awkward - what if the link partner is doing 100M
on the media because that's the fastest it's capable of, but our local
PHY is doing rate adaption to 1G, and we turn pause off, causing rate
adaption in the PHY to be turned off. We need to reconfigure the
advertisement to drop anything except the 1G speed and renegotiate the
link, which will cause the link to go down.

That's going to be really odd behaviour for a user to get their head
around.

> Does 802.3 say anything about this?

I think rate adaption is out of scope of 802.3.

> We might also want to add an additional state to the ethtool get for
> pause, to indicate rx_pause is enabled because of rate adaptation, not
> because of autoneg.

That may well be a much better approach; it lets the user see what is
going on and it becomes more understandable to the user IMHO.
Sean Anderson July 18, 2022, 4:29 p.m. UTC | #7
On 7/16/22 9:39 PM, Andrew Lunn wrote:
>> > I would not do this. If the requirements for rate adaptation are not
>> > fulfilled, you should turn off rate adaptation.
>> > 
>> > A MAC which knows rate adaptation is going on can help out, by not
>> > advertising 10Half, 100Half etc. Autoneg will then fail for modes
>> > where rate adaptation does not work.
>> 
>> OK, so maybe it is better to phylink_warn here. Something along the
>> lines of "phy using pause-based rate adaptation, but duplex is %s".
> 
> You say 1/2 duplex simply does not work with rate adaptation.

It doesn't work with pause-based rate adaptation. This is because we can't
enable pause frames in half duplex (see phy_get_pause). I don't know if this
is a technical limitation (or something else), but presumably there exists a
MAC out there which can't enable pause frames unless it's in full-duplex mode.

> So i
> would actually return -EINVAL at the point the MAC indicates what
> modes it supports if there is a 1/2 duplex mode in the list.

Well, half duplex is still valid if we are at the full line rate. This is more
of a sanity check on what we get back from the phy. That is, we should never
get anything but full duplex if the phy indicates that pause-based rate
adaptation is being performed. So maybe this should live in phy_read_status?

And of course, CRS-based adaptation requires half-duplex (or a MAC which
respects CRS in full-duplex mode).

>> 
>> > The MAC should also be declaring what sort of pause it supports, so
>> > disable rate adaptation if it does not have async pause.
>> 
>> That's what we do in the previous patch.
>> 
>> The problem is that rx_pause and tx_pause are resolved based on our
>> advertisement and the link partner's advertisement. However, the link
>> partner may not support pause frames at all. In that case, we will get
>> rx_pause and tx_pause as false. However, we still want to enable rx_pause,
>> because we know that the phy will be emitting pause frames. And of course
>> the user can always force disable pause frames anyway through ethtool.
> 
> Right, so we need a table somewhere in the documentation listing the
> different combinations and what should happen.

OK, so first here's table 28B-3 (e.g. linkmode_resolve_pause):

Local device  Link partner  Local resolution Partner resolution
============= ============= ================ ==================
PAUSE ASM_DIR PAUSE ASM_DIR Transmit Receive Transmit   Receive
===== ======= ===== ======= ======== ======= ========   =======
    0       0     X       X        N       N        N         N
    0       1     0       X        N       N        N         N
    0       1     1       0        N       N        N         N
    0       1     1       1        Y       N        N         Y
    1       0     0       X        N       N        N         N
    1       X     1       X        Y       Y        Y         Y
    1       1     0       0        N       N        N         N
    1       1     0       1        N       Y        Y         N

And now here's the same table, but assuming that we have a local phy
performing rate adaptation

Local device  Link partner  Local resolution Partner resolution
============= ============= ================ ==================
PAUSE ASM_DIR PAUSE ASM_DIR Transmit Receive Transmit   Receive
===== ======= ===== ======= ======== ======= ========   =======
    0       0     X       X        N       N        N         N # Broken
    0       1     0       X        N       N        N         N # Broken
    0       1     1       0        N       N        N         N # Broken
    0       1     1       1        Y       N        N         Y # Broken
    1       0     0       X        ?       ?        N         N # Semi-broken
    1       X     1       X        Y       Y        Y         Y
    1       1     0       0        N       Y        N         N
    1       1     0       1        N       Y        Y         N

The rows marked as "Broken" don't have local receive pause enabled.
These should never occur, since we can detect that the local MAC doesn't
support pause reception and disable advertisement of pause-based
rate-adapted modes.

On the row marked as "Semi-broken", the local MAC supports only
symmetric pause, and the link partner doesn't support pause. We're not
supposed to send pause frames, so we disable pause, but this breaks rate
adaptation. In this case, we could renegotiate with rate-adapted modes
disabled. Alternatively, we could just decline to advertise rate-adapted
modes for symmetric-pause MACs. This avoids the semi-broken line above,
but also prevents the line below from using rate adaptation.

> If the MAC does not support rx_pause, rate adaptation is turned off.
>> If the negotiation results in no rx_pause, force it on anyway with
> Pause based adaptation. If ethtool turns pause off, turn off rate
> adaptation.
> 
> Does 802.3 say anything about this?

Only IPG-based and CRS-based rate adaptation are defined in 802.3.

> We might also want to add an additional state to the ethtool get for
> pause, to indicate rx_pause is enabled because of rate adaptation, not
> because of autoneg.

Probably a good idea.

--Sean
Sean Anderson July 18, 2022, 4:45 p.m. UTC | #8
On 7/18/22 12:12 PM, Russell King (Oracle) wrote:
> On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
>> If the phy is configured to use pause-based rate adaptation, ensure that
>> the link is full duplex with pause frame reception enabled. Note that these
>> settings may be overridden by ethtool.
>> 
>> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
>> ---
>> 
>> Changes in v3:
>> - New
>> 
>>  drivers/net/phy/phylink.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
>> index 7fa21941878e..7f65413aa778 100644
>> --- a/drivers/net/phy/phylink.c
>> +++ b/drivers/net/phy/phylink.c
>> @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
>>  	pl->phy_state.speed = phy_interface_speed(phydev->interface,
>>  						  phydev->speed);
>>  	pl->phy_state.duplex = phydev->duplex;
>> +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
>> +		pl->phy_state.duplex = DUPLEX_FULL;

Just form context, as discussed with Andrew, this should never change
anything. That is, it could be replaced with

WARN_ON_ONCE(pl->phy_state.duplex != DUPLEX_FULL);

Since the phy should never report that it is using rate_adaptation
unless it is using full duplex.

>> +		rx_pause = true;
>> +	}
> 
> I really don't like this - as I've pointed out in my previous email, the
> reporting in the kernel message log for "Link is Up" will be incorrect
> if you force the phy_state here like this. If the media side link has
> been negotiated to be half duplex, we should state that in the "Link is
> Up" message.

So I guess the question here is whether there are phys which do duplex
adaptation. There definitely are phys which support a half-duplex
interface mode and a full duplex link mode (such as discussed in patch 08/47).
If it's important to get this right, I can do the same treatment for duplex
as I did for speed.

> It's only the PCS and MAC that care about this, so this should be dealt
> with when calling into the PCS and MAC's link_up() method.
> 
> The problem we have are the legacy drivers (of which mv88e6xxx and
> mtk_eth_soc are the only two I'm aware of) that make use of the
> state->speed and state->duplex when configuring stuff. We could've been
> down to just mv88e6xxx had the DSA and mv88e6xxx patches been sorted
> out, but sadly that's now going to be some time off due to reviewer
> failure.
> 

--Sean
Russell King (Oracle) July 18, 2022, 5:58 p.m. UTC | #9
On Mon, Jul 18, 2022 at 12:45:01PM -0400, Sean Anderson wrote:
> 
> 
> On 7/18/22 12:12 PM, Russell King (Oracle) wrote:
> > On Fri, Jul 15, 2022 at 05:59:17PM -0400, Sean Anderson wrote:
> >> If the phy is configured to use pause-based rate adaptation, ensure that
> >> the link is full duplex with pause frame reception enabled. Note that these
> >> settings may be overridden by ethtool.
> >> 
> >> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> >> ---
> >> 
> >> Changes in v3:
> >> - New
> >> 
> >>  drivers/net/phy/phylink.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >> 
> >> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> >> index 7fa21941878e..7f65413aa778 100644
> >> --- a/drivers/net/phy/phylink.c
> >> +++ b/drivers/net/phy/phylink.c
> >> @@ -1445,6 +1445,10 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
> >>  	pl->phy_state.speed = phy_interface_speed(phydev->interface,
> >>  						  phydev->speed);
> >>  	pl->phy_state.duplex = phydev->duplex;
> >> +	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
> >> +		pl->phy_state.duplex = DUPLEX_FULL;
> 
> Just form context, as discussed with Andrew, this should never change
> anything. That is, it could be replaced with
> 
> WARN_ON_ONCE(pl->phy_state.duplex != DUPLEX_FULL);
> 
> Since the phy should never report that it is using rate_adaptation
> unless it is using full duplex.

The "rate adaption" thing tends not to be a result of negotiation with
the link partner, but more a configuration issue. At least that is the
case with 88x3310 PHYs. There is no mention of any kind of restriction
on duplex when operating in rate adaption mode (whether it's the MACSEC
version that can generate pause frames, or the non-MACSEC that can't.)

> >> +		rx_pause = true;
> >> +	}
> > 
> > I really don't like this - as I've pointed out in my previous email, the
> > reporting in the kernel message log for "Link is Up" will be incorrect
> > if you force the phy_state here like this. If the media side link has
> > been negotiated to be half duplex, we should state that in the "Link is
> > Up" message.
> 
> So I guess the question here is whether there are phys which do duplex
> adaptation. There definitely are phys which support a half-duplex
> interface mode and a full duplex link mode (such as discussed in patch 08/47).
> If it's important to get this right, I can do the same treatment for duplex
> as I did for speed.

I guess it's something we don't know.

The sensible thing is not to add a WARN_ON() for the case, but to
restrict the PHY advertisement so the half-duplex case can't happen if
the host link is operating in a mode that requires rate adaption to
gain the other speeds.
diff mbox series

Patch

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 7fa21941878e..7f65413aa778 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1445,6 +1445,10 @@  static void phylink_phy_change(struct phy_device *phydev, bool up)
 	pl->phy_state.speed = phy_interface_speed(phydev->interface,
 						  phydev->speed);
 	pl->phy_state.duplex = phydev->duplex;
+	if (phydev->rate_adaptation == RATE_ADAPT_PAUSE) {
+		pl->phy_state.duplex = DUPLEX_FULL;
+		rx_pause = true;
+	}
 	pl->phy_state.pause = MLO_PAUSE_NONE;
 	if (tx_pause)
 		pl->phy_state.pause |= MLO_PAUSE_TX;