diff mbox series

net: phylink: Pass state to pcs_config

Message ID 20211214233450.1488736-1-sean.anderson@seco.com (mailing list archive)
State New, archived
Headers show
Series net: phylink: Pass state to pcs_config | expand

Commit Message

Sean Anderson Dec. 14, 2021, 11:34 p.m. UTC
Although most PCSs only need the interface and advertising to configure
themselves, there is an oddly named "permit_pause_to_mac" parameter
included as well, and only used by mvpp2. This parameter indicates
whether pause settings should be autonegotiated or not. mvpp2 needs this
because it cannot both set the pause mode manually and and advertise
pause support. That is, if you want to set the pause mode, you have to
advertise that you don't support flow control. We can't just
autonegotiate the pause mode and then set it manually, because if
the link goes down we will start advertising the wrong thing. So
instead, we have to set it up front during pcs_config. However, we can't
determine whether we are autonegotiating flow control based on our
advertisement (since we advertise flow control even when it is set
manually).

So we have had this strange additional argument tagging along which is
used by one driver (though soon to be one more since mvneta has the same
problem). We could stick MLO_PAUSE_AN in the "mode" parameter, since
that contains other autonegotiation configuration. However, there are a
lot of places in the codebase which do a direct comparison (e.g. mode ==
MLO_AN_FIXED), so it would be difficult to add an extra bit without
breaking things. But this whole time, mac_config has been getting the
whole state, and it has not suffered unduly. So just pass state and
eliminate these other parameters.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
---

 drivers/net/ethernet/cadence/macb_main.c      |  8 ++----
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 28 +++++++++----------
 .../microchip/lan966x/lan966x_phylink.c       | 10 +++----
 .../microchip/sparx5/sparx5_phylink.c         | 16 +++++------
 drivers/net/pcs/pcs-lynx.c                    | 15 +++++-----
 drivers/net/pcs/pcs-xpcs.c                    |  6 ++--
 drivers/net/phy/phylink.c                     |  9 ++----
 include/linux/phylink.h                       | 14 +++-------
 8 files changed, 42 insertions(+), 64 deletions(-)

Comments

Russell King (Oracle) Dec. 14, 2021, 11:45 p.m. UTC | #1
On Tue, Dec 14, 2021 at 06:34:50PM -0500, Sean Anderson wrote:
> Although most PCSs only need the interface and advertising to configure
> themselves, there is an oddly named "permit_pause_to_mac" parameter
> included as well, and only used by mvpp2. This parameter indicates
> whether pause settings should be autonegotiated or not. mvpp2 needs this
> because it cannot both set the pause mode manually and and advertise
> pause support. That is, if you want to set the pause mode, you have to
> advertise that you don't support flow control. We can't just
> autonegotiate the pause mode and then set it manually, because if
> the link goes down we will start advertising the wrong thing. So
> instead, we have to set it up front during pcs_config. However, we can't
> determine whether we are autonegotiating flow control based on our
> advertisement (since we advertise flow control even when it is set
> manually).
> 
> So we have had this strange additional argument tagging along which is
> used by one driver (though soon to be one more since mvneta has the same
> problem). We could stick MLO_PAUSE_AN in the "mode" parameter, since
> that contains other autonegotiation configuration. However, there are a
> lot of places in the codebase which do a direct comparison (e.g. mode ==
> MLO_AN_FIXED), so it would be difficult to add an extra bit without
> breaking things. But this whole time, mac_config has been getting the
> whole state, and it has not suffered unduly. So just pass state and
> eliminate these other parameters.

Please no. This is a major step backwards.

mac_config() suffers from the proiblem that people constantly
mis-understand what they can access in "state" and what they can't.
This patch introduces exactly the same problem but for a new API.

I really don't want to make that same mistake again, and this patch
is making that same mistake.

The reason mvpp2 and mvneta are different is because they have a
separate bit to allow the results of pause mode negotiation to be
forwarded to the MAC, and that bit needs to be turned off if the
pause autonegotiation is disabled (which is entirely different
from normal autonegotiation.)
Sean Anderson Dec. 15, 2021, 12:16 a.m. UTC | #2
On 12/14/21 6:45 PM, Russell King (Oracle) wrote:
> On Tue, Dec 14, 2021 at 06:34:50PM -0500, Sean Anderson wrote:
>> Although most PCSs only need the interface and advertising to configure
>> themselves, there is an oddly named "permit_pause_to_mac" parameter
>> included as well, and only used by mvpp2. This parameter indicates
>> whether pause settings should be autonegotiated or not. mvpp2 needs this
>> because it cannot both set the pause mode manually and and advertise
>> pause support. That is, if you want to set the pause mode, you have to
>> advertise that you don't support flow control. We can't just
>> autonegotiate the pause mode and then set it manually, because if
>> the link goes down we will start advertising the wrong thing. So
>> instead, we have to set it up front during pcs_config. However, we can't
>> determine whether we are autonegotiating flow control based on our
>> advertisement (since we advertise flow control even when it is set
>> manually).
>>
>> So we have had this strange additional argument tagging along which is
>> used by one driver (though soon to be one more since mvneta has the same
>> problem). We could stick MLO_PAUSE_AN in the "mode" parameter, since
>> that contains other autonegotiation configuration. However, there are a
>> lot of places in the codebase which do a direct comparison (e.g. mode ==
>> MLO_AN_FIXED), so it would be difficult to add an extra bit without
>> breaking things. But this whole time, mac_config has been getting the
>> whole state, and it has not suffered unduly. So just pass state and
>> eliminate these other parameters.
>
> Please no. This is a major step backwards.
>
> mac_config() suffers from the proiblem that people constantly
> mis-understand what they can access in "state" and what they can't.
> This patch introduces exactly the same problem but for a new API.
>
> I really don't want to make that same mistake again, and this patch
> is making that same mistake.
>
> The reason mvpp2 and mvneta are different is because they have a
> separate bit to allow the results of pause mode negotiation to be
> forwarded to the MAC, and that bit needs to be turned off if the
> pause autonegotiation is disabled

Ok, so let me clarify my understanding. Perhaps this can be eliminated
through a different approach.

When I read the datasheet for mvneta (which hopefully has the same
logic here, since I could not find a datasheet for an mvpp2 device), I
noticed that the Pause_Adv bit said

> It is valid only if flow control mode is defined by Auto-Negotiation
> (as defined by the <AnFcEn> bit).

Which I interpreted to mean that if AnFcEn was clear, then no flow
control was advertised. But perhaps it instead means that the logic is
something like

if (AnFcEn)
	Config_Reg.PAUSE = Pause_Adv;
else
	Config_Reg.PAUSE = SetFcEn;

which would mean that we can just clear AnFcEn in link_up if the
autonegotiated pause settings are different from the configured pause
settings.

> (which is entirely different from normal autonegotiation.)

AFAIK pause autonegotiation happens in the same autonegotiation word
transfer as e.g. duplex autonegotiation. So it is just a subset of the
other which is configurable separately in Linux.

--Sean
Russell King (Oracle) Dec. 15, 2021, 12:49 a.m. UTC | #3
On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
> Ok, so let me clarify my understanding. Perhaps this can be eliminated
> through a different approach.
> 
> When I read the datasheet for mvneta (which hopefully has the same
> logic here, since I could not find a datasheet for an mvpp2 device), I
> noticed that the Pause_Adv bit said
> 
> > It is valid only if flow control mode is defined by Auto-Negotiation
> > (as defined by the <AnFcEn> bit).
> 
> Which I interpreted to mean that if AnFcEn was clear, then no flow
> control was advertised. But perhaps it instead means that the logic is
> something like
> 
> if (AnFcEn)
> 	Config_Reg.PAUSE = Pause_Adv;
> else
> 	Config_Reg.PAUSE = SetFcEn;
> 
> which would mean that we can just clear AnFcEn in link_up if the
> autonegotiated pause settings are different from the configured pause
> settings.

Having actually played with this hardware quite a bit and observed what
it sends, what it implements for advertising is:

	Config_Reg.PAUSE = Pause_Adv;

Config_Reg gets sent over the 1000BASE-X link to the link partner, and
we receive Remote_Reg from the link partner.

Then, the hardware implements:

	if (AnFcEn)
		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
	else
		MAC_PAUSE = SetFcEn;

In otherwords, AnFcEn controls whether the result of autonegotiation
or the value of SetFcEn controls whether the MAC enables symmetric
pause mode.

Pause_Adv comes from the advertisement, and this is controlled from
ethtool -s and somewhat by ethtool -A.

AnFcEn is controlled purely and only by ethtool -A ... autoneg on|off.
You can't derive this from "state".

SetFcEn comes from ethtool -A ... tx and rx parameters (which must
both be on or both be off.)

Since we have no knowledge what Remote_Reg contains (it's not made
accessible by the hardware), it's impossible to derive whether the
autonegotiated pause settings are different from the configured pause
settings.
Russell King (Oracle) Dec. 15, 2021, 1:12 a.m. UTC | #4
On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
> On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
> > Ok, so let me clarify my understanding. Perhaps this can be eliminated
> > through a different approach.
> > 
> > When I read the datasheet for mvneta (which hopefully has the same
> > logic here, since I could not find a datasheet for an mvpp2 device), I
> > noticed that the Pause_Adv bit said
> > 
> > > It is valid only if flow control mode is defined by Auto-Negotiation
> > > (as defined by the <AnFcEn> bit).
> > 
> > Which I interpreted to mean that if AnFcEn was clear, then no flow
> > control was advertised. But perhaps it instead means that the logic is
> > something like
> > 
> > if (AnFcEn)
> > 	Config_Reg.PAUSE = Pause_Adv;
> > else
> > 	Config_Reg.PAUSE = SetFcEn;
> > 
> > which would mean that we can just clear AnFcEn in link_up if the
> > autonegotiated pause settings are different from the configured pause
> > settings.
> 
> Having actually played with this hardware quite a bit and observed what
> it sends, what it implements for advertising is:
> 
> 	Config_Reg.PAUSE = Pause_Adv;
> 
> Config_Reg gets sent over the 1000BASE-X link to the link partner, and
> we receive Remote_Reg from the link partner.
> 
> Then, the hardware implements:
> 
> 	if (AnFcEn)
> 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
> 	else
> 		MAC_PAUSE = SetFcEn;
> 
> In otherwords, AnFcEn controls whether the result of autonegotiation
> or the value of SetFcEn controls whether the MAC enables symmetric
> pause mode.

I should also note that in the Port Status register,

	TxFcEn = RxFcEn = MAC_PAUSE;

So, the status register bits follow SetFcEn when AnFcEn is disabled.

However, these bits are the only way to report the result of the
negotiation, which is why we use them to report back whether flow
control was enabled in mvneta_pcs_get_state(). These bits will be
ignored by phylink when ethtool -A has disabled pause negotiation,
and in that situation there is no way as I said to be able to read
the negotiation result.

permit_pause_to_mac exists precisely because of the limitions of this
hardware, and having it costs virtually nothing to other network
drivers... except a parameter that others ignore.

If we don't have permit_pause_to_mac in pcs_config() then we need to
have it passed to the link_up() methods instead. There is no other
option (other than breaking mvneta and mvpp2) here than to make the
state of ethtool -A ... autoneg on|off known to the hardware.
Sean Anderson Dec. 16, 2021, 5:02 p.m. UTC | #5
On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
> On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
>> On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
>> > Ok, so let me clarify my understanding. Perhaps this can be eliminated
>> > through a different approach.
>> >
>> > When I read the datasheet for mvneta (which hopefully has the same
>> > logic here, since I could not find a datasheet for an mvpp2 device), I
>> > noticed that the Pause_Adv bit said
>> >
>> > > It is valid only if flow control mode is defined by Auto-Negotiation
>> > > (as defined by the <AnFcEn> bit).
>> >
>> > Which I interpreted to mean that if AnFcEn was clear, then no flow
>> > control was advertised. But perhaps it instead means that the logic is
>> > something like
>> >
>> > if (AnFcEn)
>> > 	Config_Reg.PAUSE = Pause_Adv;
>> > else
>> > 	Config_Reg.PAUSE = SetFcEn;
>> >
>> > which would mean that we can just clear AnFcEn in link_up if the
>> > autonegotiated pause settings are different from the configured pause
>> > settings.
>>
>> Having actually played with this hardware quite a bit and observed what
>> it sends, what it implements for advertising is:
>>
>> 	Config_Reg.PAUSE = Pause_Adv;

So the above note from the datasheet about Pause_Adv not being valid is
incorrect?

>> Config_Reg gets sent over the 1000BASE-X link to the link partner, and
>> we receive Remote_Reg from the link partner.
>>
>> Then, the hardware implements:
>>
>> 	if (AnFcEn)
>> 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
>> 	else
>> 		MAC_PAUSE = SetFcEn;
>>
>> In otherwords, AnFcEn controls whether the result of autonegotiation
>> or the value of SetFcEn controls whether the MAC enables symmetric
>> pause mode.
>
> I should also note that in the Port Status register,
>
> 	TxFcEn = RxFcEn = MAC_PAUSE;
>
> So, the status register bits follow SetFcEn when AnFcEn is disabled.
>
> However, these bits are the only way to report the result of the
> negotiation, which is why we use them to report back whether flow
> control was enabled in mvneta_pcs_get_state(). These bits will be
> ignored by phylink when ethtool -A has disabled pause negotiation,
> and in that situation there is no way as I said to be able to read
> the negotiation result.

Ok, how about

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index b1cce4425296..9b41d8ee71fb 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
                          * manually controls the GMAC pause modes.
                          */
-                       if (permit_pause_to_mac)
-                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
+                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;

                         /* Configure advertisement bits */
                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
@@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
                 }
         } else {
                 if (!phylink_autoneg_inband(mode)) {
+                       bool cur_tx_pause, cur_rx_pause;
+                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
+
                         val = MVPP2_GMAC_FORCE_LINK_PASS;

                         if (speed == SPEED_1000 || speed == SPEED_2500)
@@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
                         if (duplex == DUPLEX_FULL)
                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;

+                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
+                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
+                       if (tx_pause == cur_tx_pause &&
+                           rx_pause == cur_rx_pause)
+                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
+
                         mvpp2_modify(port->base + MVPP2_GMAC_AUTONEG_CONFIG,
                                      MVPP2_GMAC_FORCE_LINK_DOWN |
                                      MVPP2_GMAC_FORCE_LINK_PASS |
                                      MVPP2_GMAC_CONFIG_MII_SPEED |
                                      MVPP2_GMAC_CONFIG_GMII_SPEED |
+                                    MVPP2_GMAC_FLOW_CTRL_AUTONEG |
                                      MVPP2_GMAC_CONFIG_FULL_DUPLEX, val);
                 }
---

When we have MLO_PAUSE_AN, this is the same as before. For the other
case, consider the scenario where someone disables pause
autonegotiation, and then plugs in the cable. Here, we get the
negotiated pause from pcs_get_state, but it is overridden by
phylink_apply_manual_flow in phylink_resolve. Then, link_up may clear
AnFcEn, depending on if the pause mode matches. If it matches, then
future link_up events will look like what just happened. If it doesn't,
then we will ignore the negotiated pause mode for future link_up events
(that is, the result of pcs_get_state will not be what was negotiated).
This is OK, because we discard the result of negotiation anyway. If
someone enables pause autonegotiation again, then
phylink_ethtool_set_pauseparam will call phylink_change_inband_advert,
which call phylink_mac_pcs_an_restart since AnFcEn will be set again.

The downside to the above approach is that when MLO_PAUSE_AN is cleared
and we have had to clear AnFcEn, calling pcs_config will always trigger
autonegotiation to restart, even for no-op changes which would not
otherwise trigger a restart. I think this scenario is unlikely enough
not to be a big deal. Lastly, this also depends on

	Config_Reg.PAUSE = Pause_Adv;

or at the very least, anything which is not

if (AnFcEn)
	Config_Reg.PAUSE = Pause_Adv;
else
	Config_Reg.PAUSE = 0;

> permit_pause_to_mac exists precisely because of the limitions of this
> hardware, and having it costs virtually nothing to other network
> drivers... except a parameter that others ignore.
>
> If we don't have permit_pause_to_mac in pcs_config() then we need to
> have it passed to the link_up() methods instead. There is no other
> option (other than breaking mvneta and mvpp2) here than to make the
> state of ethtool -A ... autoneg on|off known to the hardware.

Well, the original patch is primarily motivated by the terrible naming
and documentation regarding this parameter. I was only able to determine
the purpose of this parameter by reading the mvpp2 driver and consulting
the A370 datasheet. I think if it is possible to eliminate this
parameter (such as with the above patch), we should do so, since it will
make the API cleaner and easier to understand. Failing that, I will
submit a patch to improve the documentation (and perhaps rename the
parameter to something more descriptive).

--Sean
Russell King (Oracle) Dec. 16, 2021, 5:26 p.m. UTC | #6
On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
> On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
> > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
> > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
> > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
> > > > through a different approach.
> > > >
> > > > When I read the datasheet for mvneta (which hopefully has the same
> > > > logic here, since I could not find a datasheet for an mvpp2 device), I
> > > > noticed that the Pause_Adv bit said
> > > >
> > > > > It is valid only if flow control mode is defined by Auto-Negotiation
> > > > > (as defined by the <AnFcEn> bit).
> > > >
> > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
> > > > control was advertised. But perhaps it instead means that the logic is
> > > > something like
> > > >
> > > > if (AnFcEn)
> > > > 	Config_Reg.PAUSE = Pause_Adv;
> > > > else
> > > > 	Config_Reg.PAUSE = SetFcEn;
> > > >
> > > > which would mean that we can just clear AnFcEn in link_up if the
> > > > autonegotiated pause settings are different from the configured pause
> > > > settings.
> > > 
> > > Having actually played with this hardware quite a bit and observed what
> > > it sends, what it implements for advertising is:
> > > 
> > > 	Config_Reg.PAUSE = Pause_Adv;
> 
> So the above note from the datasheet about Pause_Adv not being valid is
> incorrect?
> 
> > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
> > > we receive Remote_Reg from the link partner.
> > > 
> > > Then, the hardware implements:
> > > 
> > > 	if (AnFcEn)
> > > 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
> > > 	else
> > > 		MAC_PAUSE = SetFcEn;
> > > 
> > > In otherwords, AnFcEn controls whether the result of autonegotiation
> > > or the value of SetFcEn controls whether the MAC enables symmetric
> > > pause mode.
> > 
> > I should also note that in the Port Status register,
> > 
> > 	TxFcEn = RxFcEn = MAC_PAUSE;
> > 
> > So, the status register bits follow SetFcEn when AnFcEn is disabled.
> > 
> > However, these bits are the only way to report the result of the
> > negotiation, which is why we use them to report back whether flow
> > control was enabled in mvneta_pcs_get_state(). These bits will be
> > ignored by phylink when ethtool -A has disabled pause negotiation,
> > and in that situation there is no way as I said to be able to read
> > the negotiation result.
> 
> Ok, how about
> 
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> index b1cce4425296..9b41d8ee71fb 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
>                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
>                          * manually controls the GMAC pause modes.
>                          */
> -                       if (permit_pause_to_mac)
> -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> 
>                         /* Configure advertisement bits */
>                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
> @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>                 }
>         } else {
>                 if (!phylink_autoneg_inband(mode)) {
> +                       bool cur_tx_pause, cur_rx_pause;
> +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
> +
>                         val = MVPP2_GMAC_FORCE_LINK_PASS;
> 
>                         if (speed == SPEED_1000 || speed == SPEED_2500)
> @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>                         if (duplex == DUPLEX_FULL)
>                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
> 
> +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
> +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;

I think you haven't understood everything I've said. These status bits
report what the MAC is doing. They do not reflect what was negotiated
_unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.

So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.

Let's say we apply this patch. tx/rx pause are negotiated and enabled.
So cur_tx_pause and cur_rx_pause are both true.

We change the pause settings, forcing tx pause only. This causes
pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
then link_up gets called with the differing settings. We clear
MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
but MVPP2_GMAC_STATUS0_RX_PAUSE clear.

The link goes down e.g. because the remote end has changed and comes
back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
true and rx_pause is false. These agree with the settings, so we
then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.

If the link goes down and up again, then this cycle repeats - the
status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.

And we will toggle between these two states.

Sorry, but this can't work.

> When we have MLO_PAUSE_AN, this is the same as before. For the other
> case, consider the scenario where someone disables pause
> autonegotiation, and then plugs in the cable. Here, we get the
> negotiated pause from pcs_get_state, but it is overridden by

In mvneta and mvpp2, pcs_get_state() can only read the current settings
that the PCS/MAC are currently using. There is no way to read purely
the results of negotiation with this hardware.

E.g., if you force speed to 100Mbps, then pcs_get_state() will tell you
that you're doing 100Mbps. If you force duplex, pcs_get_state() will
tell you what's being forced. If you force pause, pcs_get_state() will
tell you what pause settings are being forced.

Sadly, this is the way Marvell designed this hardware, and it sucks,
but we have to support it.

> > permit_pause_to_mac exists precisely because of the limitions of this
> > hardware, and having it costs virtually nothing to other network
> > drivers... except a parameter that others ignore.
> > 
> > If we don't have permit_pause_to_mac in pcs_config() then we need to
> > have it passed to the link_up() methods instead. There is no other
> > option (other than breaking mvneta and mvpp2) here than to make the
> > state of ethtool -A ... autoneg on|off known to the hardware.
> 
> Well, the original patch is primarily motivated by the terrible naming
> and documentation regarding this parameter. I was only able to determine
> the purpose of this parameter by reading the mvpp2 driver and consulting
> the A370 datasheet. I think if it is possible to eliminate this
> parameter (such as with the above patch), we should do so, since it will
> make the API cleaner and easier to understand. Failing that, I will
> submit a patch to improve the documentation (and perhaps rename the
> parameter to something more descriptive).

I don't like having it either, but I've thought about it for a long
time and haven't found any other acceptable solution.

To me, the parameter name describes _exactly_ what it's about. It's
about the PCS being permitted to forward the pause status to the MAC.
Hence "permit" "pause" "to" "mac" and the PCS context comes from the
fact that this is a PCS function. I really don't see what could be
clearer about the name... and I get the impression this is a bit of
a storm in a tea cup.
Sean Anderson Dec. 16, 2021, 5:51 p.m. UTC | #7
On 12/16/21 12:26 PM, Russell King (Oracle) wrote:
> On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
>> On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
>> > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
>> > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
>> > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
>> > > > through a different approach.
>> > > >
>> > > > When I read the datasheet for mvneta (which hopefully has the same
>> > > > logic here, since I could not find a datasheet for an mvpp2 device), I
>> > > > noticed that the Pause_Adv bit said
>> > > >
>> > > > > It is valid only if flow control mode is defined by Auto-Negotiation
>> > > > > (as defined by the <AnFcEn> bit).
>> > > >
>> > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
>> > > > control was advertised. But perhaps it instead means that the logic is
>> > > > something like
>> > > >
>> > > > if (AnFcEn)
>> > > > 	Config_Reg.PAUSE = Pause_Adv;
>> > > > else
>> > > > 	Config_Reg.PAUSE = SetFcEn;
>> > > >
>> > > > which would mean that we can just clear AnFcEn in link_up if the
>> > > > autonegotiated pause settings are different from the configured pause
>> > > > settings.
>> > >
>> > > Having actually played with this hardware quite a bit and observed what
>> > > it sends, what it implements for advertising is:
>> > >
>> > > 	Config_Reg.PAUSE = Pause_Adv;
>>
>> So the above note from the datasheet about Pause_Adv not being valid is
>> incorrect?
>>
>> > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
>> > > we receive Remote_Reg from the link partner.
>> > >
>> > > Then, the hardware implements:
>> > >
>> > > 	if (AnFcEn)
>> > > 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
>> > > 	else
>> > > 		MAC_PAUSE = SetFcEn;
>> > >
>> > > In otherwords, AnFcEn controls whether the result of autonegotiation
>> > > or the value of SetFcEn controls whether the MAC enables symmetric
>> > > pause mode.
>> >
>> > I should also note that in the Port Status register,
>> >
>> > 	TxFcEn = RxFcEn = MAC_PAUSE;
>> >
>> > So, the status register bits follow SetFcEn when AnFcEn is disabled.
>> >
>> > However, these bits are the only way to report the result of the
>> > negotiation, which is why we use them to report back whether flow
>> > control was enabled in mvneta_pcs_get_state(). These bits will be
>> > ignored by phylink when ethtool -A has disabled pause negotiation,
>> > and in that situation there is no way as I said to be able to read
>> > the negotiation result.
>>
>> Ok, how about
>>
>> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> index b1cce4425296..9b41d8ee71fb 100644
>> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
>>                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
>>                          * manually controls the GMAC pause modes.
>>                          */
>> -                       if (permit_pause_to_mac)
>> -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>> +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>>
>>                         /* Configure advertisement bits */
>>                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
>> @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>>                 }
>>         } else {
>>                 if (!phylink_autoneg_inband(mode)) {
>> +                       bool cur_tx_pause, cur_rx_pause;
>> +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
>> +
>>                         val = MVPP2_GMAC_FORCE_LINK_PASS;
>>
>>                         if (speed == SPEED_1000 || speed == SPEED_2500)
>> @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>>                         if (duplex == DUPLEX_FULL)
>>                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
>>
>> +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
>> +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
>
> I think you haven't understood everything I've said. These status bits
> report what the MAC is doing. They do not reflect what was negotiated
> _unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.
>
> So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
> MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.
>
> Let's say we apply this patch. tx/rx pause are negotiated and enabled.
> So cur_tx_pause and cur_rx_pause are both true.
>
> We change the pause settings, forcing tx pause only. This causes
> pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
> then link_up gets called with the differing settings. We clear
> MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
> have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
> but MVPP2_GMAC_STATUS0_RX_PAUSE clear.
>
> The link goes down e.g. because the remote end has changed and comes
> back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
> is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
> true and rx_pause is false. These agree with the settings, so we
> then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.
>
> If the link goes down and up again, then this cycle repeats - the
> status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
> MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
> MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
> back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.

The toggling is not really a problem, since we always correct the pause
mode as soon as we notice. The real issue would be if we don't notice
because the link went down and back up in between calls to
phylink_resolve. That could be fixed by verifying that the result of
pcs_get_state matches what was configured.

But perhaps the solution is to move this parameter to mac_link_up. That
would eliminate this toggling. And this parameter really is about the
MAC in the first case.

> And we will toggle between these two states.
>
> Sorry, but this can't work.
>
>> When we have MLO_PAUSE_AN, this is the same as before. For the other
>> case, consider the scenario where someone disables pause
>> autonegotiation, and then plugs in the cable. Here, we get the
>> negotiated pause from pcs_get_state, but it is overridden by
>
> In mvneta and mvpp2, pcs_get_state() can only read the current settings
> that the PCS/MAC are currently using. There is no way to read purely
> the results of negotiation with this hardware.
>
> E.g., if you force speed to 100Mbps, then pcs_get_state() will tell you
> that you're doing 100Mbps. If you force duplex, pcs_get_state() will
> tell you what's being forced. If you force pause, pcs_get_state() will
> tell you what pause settings are being forced.
>
> Sadly, this is the way Marvell designed this hardware, and it sucks,
> but we have to support it.
>
>> > permit_pause_to_mac exists precisely because of the limitions of this
>> > hardware, and having it costs virtually nothing to other network
>> > drivers... except a parameter that others ignore.
>> >
>> > If we don't have permit_pause_to_mac in pcs_config() then we need to
>> > have it passed to the link_up() methods instead. There is no other
>> > option (other than breaking mvneta and mvpp2) here than to make the
>> > state of ethtool -A ... autoneg on|off known to the hardware.
>>
>> Well, the original patch is primarily motivated by the terrible naming
>> and documentation regarding this parameter. I was only able to determine
>> the purpose of this parameter by reading the mvpp2 driver and consulting
>> the A370 datasheet. I think if it is possible to eliminate this
>> parameter (such as with the above patch), we should do so, since it will
>> make the API cleaner and easier to understand. Failing that, I will
>> submit a patch to improve the documentation (and perhaps rename the
>> parameter to something more descriptive).
>
> I don't like having it either, but I've thought about it for a long
> time and haven't found any other acceptable solution.
>
> To me, the parameter name describes _exactly_ what it's about. It's
> about the PCS being permitted to forward the pause status to the MAC.
> Hence "permit" "pause" "to" "mac" and the PCS context comes from the
> fact that this is a PCS function. I really don't see what could be
> clearer about the name... and I get the impression this is a bit of
> a storm in a tea cup.

Well first off, the PCS typically has no ability to delegate/permit
anything to the MAC. So it is already unclear what the verb is. Next,
since this is pcs_config, one would think this has something to do with
pause advertisement. But in fact it has nothing to do with it. In fact,
this parameter only has an effect once mac_link_up comes around. I
suggest something like use_autonegotiated_pause. This makes it clear
that this is about the result of autonegotation.

--Sean
Russell King (Oracle) Dec. 16, 2021, 6:05 p.m. UTC | #8
On Thu, Dec 16, 2021 at 12:51:33PM -0500, Sean Anderson wrote:
> On 12/16/21 12:26 PM, Russell King (Oracle) wrote:
> > On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
> > > On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
> > > > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
> > > > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
> > > > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
> > > > > > through a different approach.
> > > > > >
> > > > > > When I read the datasheet for mvneta (which hopefully has the same
> > > > > > logic here, since I could not find a datasheet for an mvpp2 device), I
> > > > > > noticed that the Pause_Adv bit said
> > > > > >
> > > > > > > It is valid only if flow control mode is defined by Auto-Negotiation
> > > > > > > (as defined by the <AnFcEn> bit).
> > > > > >
> > > > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
> > > > > > control was advertised. But perhaps it instead means that the logic is
> > > > > > something like
> > > > > >
> > > > > > if (AnFcEn)
> > > > > > 	Config_Reg.PAUSE = Pause_Adv;
> > > > > > else
> > > > > > 	Config_Reg.PAUSE = SetFcEn;
> > > > > >
> > > > > > which would mean that we can just clear AnFcEn in link_up if the
> > > > > > autonegotiated pause settings are different from the configured pause
> > > > > > settings.
> > > > >
> > > > > Having actually played with this hardware quite a bit and observed what
> > > > > it sends, what it implements for advertising is:
> > > > >
> > > > > 	Config_Reg.PAUSE = Pause_Adv;
> > > 
> > > So the above note from the datasheet about Pause_Adv not being valid is
> > > incorrect?
> > > 
> > > > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
> > > > > we receive Remote_Reg from the link partner.
> > > > >
> > > > > Then, the hardware implements:
> > > > >
> > > > > 	if (AnFcEn)
> > > > > 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
> > > > > 	else
> > > > > 		MAC_PAUSE = SetFcEn;
> > > > >
> > > > > In otherwords, AnFcEn controls whether the result of autonegotiation
> > > > > or the value of SetFcEn controls whether the MAC enables symmetric
> > > > > pause mode.
> > > >
> > > > I should also note that in the Port Status register,
> > > >
> > > > 	TxFcEn = RxFcEn = MAC_PAUSE;
> > > >
> > > > So, the status register bits follow SetFcEn when AnFcEn is disabled.
> > > >
> > > > However, these bits are the only way to report the result of the
> > > > negotiation, which is why we use them to report back whether flow
> > > > control was enabled in mvneta_pcs_get_state(). These bits will be
> > > > ignored by phylink when ethtool -A has disabled pause negotiation,
> > > > and in that situation there is no way as I said to be able to read
> > > > the negotiation result.
> > > 
> > > Ok, how about
> > > 
> > > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > index b1cce4425296..9b41d8ee71fb 100644
> > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
> > >                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
> > >                          * manually controls the GMAC pause modes.
> > >                          */
> > > -                       if (permit_pause_to_mac)
> > > -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> > > +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> > > 
> > >                         /* Configure advertisement bits */
> > >                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
> > > @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
> > >                 }
> > >         } else {
> > >                 if (!phylink_autoneg_inband(mode)) {
> > > +                       bool cur_tx_pause, cur_rx_pause;
> > > +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
> > > +
> > >                         val = MVPP2_GMAC_FORCE_LINK_PASS;
> > > 
> > >                         if (speed == SPEED_1000 || speed == SPEED_2500)
> > > @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
> > >                         if (duplex == DUPLEX_FULL)
> > >                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
> > > 
> > > +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
> > > +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
> > 
> > I think you haven't understood everything I've said. These status bits
> > report what the MAC is doing. They do not reflect what was negotiated
> > _unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.
> > 
> > So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
> > MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.
> > 
> > Let's say we apply this patch. tx/rx pause are negotiated and enabled.
> > So cur_tx_pause and cur_rx_pause are both true.
> > 
> > We change the pause settings, forcing tx pause only. This causes
> > pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
> > then link_up gets called with the differing settings. We clear
> > MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
> > have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
> > but MVPP2_GMAC_STATUS0_RX_PAUSE clear.
> > 
> > The link goes down e.g. because the remote end has changed and comes
> > back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
> > is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
> > true and rx_pause is false. These agree with the settings, so we
> > then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.
> > 
> > If the link goes down and up again, then this cycle repeats - the
> > status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
> > MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
> > MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
> > back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.
> 
> The toggling is not really a problem, since we always correct the pause
> mode as soon as we notice.

When do we "notice" ? We don't regularly poll on these platforms, we
rely on interrupts.

> The real issue would be if we don't notice
> because the link went down and back up in between calls to
> phylink_resolve.

Err no. If the link goes down and back up, one of the things the code
is structured to ensure is that phylink_resolve gets called, _and_ that
we will see a link-down-link-up.

The only time that isn't guaranteed is when using a polled driver where
the link state does not latched-fail (or where the hardware does
latch-fail but someone has decided "let's double-read the status
register to get the current state".)

> That could be fixed by verifying that the result of
> pcs_get_state matches what was configured.

So we need to introduce mvneta and mvpp2 specific code into phylink to
do that, because for everything else, what we get from pcs_get_state
is the _resolved_ information.

> But perhaps the solution is to move this parameter to mac_link_up. That
> would eliminate this toggling. And this parameter really is about the
> MAC in the first case.

Maybe, but we still have this parameter.

> > I don't like having it either, but I've thought about it for a long
> > time and haven't found any other acceptable solution.
> > 
> > To me, the parameter name describes _exactly_ what it's about. It's
> > about the PCS being permitted to forward the pause status to the MAC.
> > Hence "permit" "pause" "to" "mac" and the PCS context comes from the
> > fact that this is a PCS function. I really don't see what could be
> > clearer about the name... and I get the impression this is a bit of
> > a storm in a tea cup.
> 
> Well first off, the PCS typically has no ability to delegate/permit
> anything to the MAC. So it is already unclear what the verb is.

In the classical model of ethernet that is completely true - there is
no coupling that communicates the link parameters between the PCS and
MAC. mvpp2 and mvneta annoyingly do not implement the classical model
though.

> Next,
> since this is pcs_config, one would think this has something to do with
> pause advertisement. But in fact it has nothing to do with it. In fact,
> this parameter only has an effect once mac_link_up comes around. I
> suggest something like use_autonegotiated_pause. This makes it clear
> that this is about the result of autonegotation.

I could be devils advocate here and claim that
"use_autonegotiated_pause" is meaningless to a MAC because in the
classical model, the MAC doesn't have any way to know what the
negotiated pause was. So where does this "pause" come from.

So it's just the same problem, doesn't solve anything, and we just have
a different opinion on naming and where something should be.
Sean Anderson Dec. 16, 2021, 6:29 p.m. UTC | #9
On 12/16/21 1:05 PM, Russell King (Oracle) wrote:
> On Thu, Dec 16, 2021 at 12:51:33PM -0500, Sean Anderson wrote:
>> On 12/16/21 12:26 PM, Russell King (Oracle) wrote:
>> > On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
>> > > On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
>> > > > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
>> > > > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
>> > > > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
>> > > > > > through a different approach.
>> > > > > >
>> > > > > > When I read the datasheet for mvneta (which hopefully has the same
>> > > > > > logic here, since I could not find a datasheet for an mvpp2 device), I
>> > > > > > noticed that the Pause_Adv bit said
>> > > > > >
>> > > > > > > It is valid only if flow control mode is defined by Auto-Negotiation
>> > > > > > > (as defined by the <AnFcEn> bit).
>> > > > > >
>> > > > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
>> > > > > > control was advertised. But perhaps it instead means that the logic is
>> > > > > > something like
>> > > > > >
>> > > > > > if (AnFcEn)
>> > > > > > 	Config_Reg.PAUSE = Pause_Adv;
>> > > > > > else
>> > > > > > 	Config_Reg.PAUSE = SetFcEn;
>> > > > > >
>> > > > > > which would mean that we can just clear AnFcEn in link_up if the
>> > > > > > autonegotiated pause settings are different from the configured pause
>> > > > > > settings.
>> > > > >
>> > > > > Having actually played with this hardware quite a bit and observed what
>> > > > > it sends, what it implements for advertising is:
>> > > > >
>> > > > > 	Config_Reg.PAUSE = Pause_Adv;
>> > >
>> > > So the above note from the datasheet about Pause_Adv not being valid is
>> > > incorrect?
>> > >
>> > > > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
>> > > > > we receive Remote_Reg from the link partner.
>> > > > >
>> > > > > Then, the hardware implements:
>> > > > >
>> > > > > 	if (AnFcEn)
>> > > > > 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
>> > > > > 	else
>> > > > > 		MAC_PAUSE = SetFcEn;
>> > > > >
>> > > > > In otherwords, AnFcEn controls whether the result of autonegotiation
>> > > > > or the value of SetFcEn controls whether the MAC enables symmetric
>> > > > > pause mode.
>> > > >
>> > > > I should also note that in the Port Status register,
>> > > >
>> > > > 	TxFcEn = RxFcEn = MAC_PAUSE;
>> > > >
>> > > > So, the status register bits follow SetFcEn when AnFcEn is disabled.
>> > > >
>> > > > However, these bits are the only way to report the result of the
>> > > > negotiation, which is why we use them to report back whether flow
>> > > > control was enabled in mvneta_pcs_get_state(). These bits will be
>> > > > ignored by phylink when ethtool -A has disabled pause negotiation,
>> > > > and in that situation there is no way as I said to be able to read
>> > > > the negotiation result.
>> > >
>> > > Ok, how about
>> > >
>> > > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> > > index b1cce4425296..9b41d8ee71fb 100644
>> > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>> > > @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
>> > >                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
>> > >                          * manually controls the GMAC pause modes.
>> > >                          */
>> > > -                       if (permit_pause_to_mac)
>> > > -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>> > > +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>> > >
>> > >                         /* Configure advertisement bits */
>> > >                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
>> > > @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>> > >                 }
>> > >         } else {
>> > >                 if (!phylink_autoneg_inband(mode)) {
>> > > +                       bool cur_tx_pause, cur_rx_pause;
>> > > +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
>> > > +
>> > >                         val = MVPP2_GMAC_FORCE_LINK_PASS;
>> > >
>> > >                         if (speed == SPEED_1000 || speed == SPEED_2500)
>> > > @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>> > >                         if (duplex == DUPLEX_FULL)
>> > >                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
>> > >
>> > > +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
>> > > +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
>> >
>> > I think you haven't understood everything I've said. These status bits
>> > report what the MAC is doing. They do not reflect what was negotiated
>> > _unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.
>> >
>> > So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
>> > MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.
>> >
>> > Let's say we apply this patch. tx/rx pause are negotiated and enabled.
>> > So cur_tx_pause and cur_rx_pause are both true.
>> >
>> > We change the pause settings, forcing tx pause only. This causes
>> > pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
>> > then link_up gets called with the differing settings. We clear
>> > MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
>> > have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
>> > but MVPP2_GMAC_STATUS0_RX_PAUSE clear.
>> >
>> > The link goes down e.g. because the remote end has changed and comes
>> > back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
>> > is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
>> > true and rx_pause is false. These agree with the settings, so we
>> > then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.
>> >
>> > If the link goes down and up again, then this cycle repeats - the
>> > status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
>> > MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
>> > MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
>> > back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.
>>
>> The toggling is not really a problem, since we always correct the pause
>> mode as soon as we notice.
>
> When do we "notice" ? We don't regularly poll on these platforms, we
> rely on interrupts.

When the link comes back up again.

>> The real issue would be if we don't notice
>> because the link went down and back up in between calls to
>> phylink_resolve.
>
> Err no. If the link goes down and back up, one of the things the code
> is structured to ensure is that phylink_resolve gets called, _and_ that
> we will see a link-down-link-up.

Great. Then the above will work fine. Because we always set the pause
mode in mac_link_up, it's OK to have the pause be incorrect in the time
between when it comes up and when mac_link_up is called. The result of
the pause from get_state will not be what was negotiated, but that is OK
because we discard it anyway.

> The only time that isn't guaranteed is when using a polled driver where
> the link state does not latched-fail (or where the hardware does
> latch-fail but someone has decided "let's double-read the status
> register to get the current state".)

I wasn't aware of whether this platform used interrupts or polling.

>> That could be fixed by verifying that the result of
>> pcs_get_state matches what was configured.
>
> So we need to introduce mvneta and mvpp2 specific code into phylink to
> do that, because for everything else, what we get from pcs_get_state
> is the _resolved_ information.
>
>> But perhaps the solution is to move this parameter to mac_link_up. That
>> would eliminate this toggling. And this parameter really is about the
>> MAC in the first case.
>
> Maybe, but we still have this parameter.

I think it would be much better if it was in mac_link_up and not in
pcs_ops, because fundamentally it is configuration for the MAC and not
the PCS.

>> > I don't like having it either, but I've thought about it for a long
>> > time and haven't found any other acceptable solution.
>> >
>> > To me, the parameter name describes _exactly_ what it's about. It's
>> > about the PCS being permitted to forward the pause status to the MAC.
>> > Hence "permit" "pause" "to" "mac" and the PCS context comes from the
>> > fact that this is a PCS function. I really don't see what could be
>> > clearer about the name... and I get the impression this is a bit of
>> > a storm in a tea cup.
>>
>> Well first off, the PCS typically has no ability to delegate/permit
>> anything to the MAC. So it is already unclear what the verb is.
>
> In the classical model of ethernet that is completely true - there is
> no coupling that communicates the link parameters between the PCS and
> MAC. mvpp2 and mvneta annoyingly do not implement the classical model
> though.

Right. But the default perspective for any implementer reading through
the documentation will be of the classical model. So even if this
parameter is intended for use with a marvell model, it needs to be
framed in this classical model so that implementers can understand it.

>> Next,
>> since this is pcs_config, one would think this has something to do with
>> pause advertisement. But in fact it has nothing to do with it. In fact,
>> this parameter only has an effect once mac_link_up comes around. I
>> suggest something like use_autonegotiated_pause. This makes it clear
>> that this is about the result of autonegotation.
>
> I could be devils advocate here and claim that
> "use_autonegotiated_pause" is meaningless to a MAC because in the
> classical model, the MAC doesn't have any way to know what the
> negotiated pause was. So where does this "pause" come from.
>
> So it's just the same problem, doesn't solve anything, and we just have
> a different opinion on naming and where something should be.

I think my primary issue with the name is that it is named after its
purpose in the marvell hardware, and not after the user setting. That
is, we have something like

	if (pause_autonegotiation_enabled())
		permit_pause_to_mac()

So the generic interface should be named after the condition and not
the body.

If this parameter got moved to mac_link_up, I think the following would
be good:

@autonegotiated_pause: This indicates whether the pause settings are a
result of autonegotiation or whether they were manually configured. Some
MACs are tightly coupled to their PCSs and have a hardware
implementation of linkmode_resolve_pause() which sets their pause mode
based on the autonegotiated pause mode. For these MACs, disabling this
hardware implementation may inhibit their ability to determine the
autonegotiated pause mode, so it should only be disabled when the pause
mode was manually set. MACs which do not have this feature/limitation
should ignore this parameter.

--Sean
Russell King (Oracle) Dec. 16, 2021, 6:58 p.m. UTC | #10
On Thu, Dec 16, 2021 at 01:29:20PM -0500, Sean Anderson wrote:
> On 12/16/21 1:05 PM, Russell King (Oracle) wrote:
> > On Thu, Dec 16, 2021 at 12:51:33PM -0500, Sean Anderson wrote:
> > > On 12/16/21 12:26 PM, Russell King (Oracle) wrote:
> > > > On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
> > > > > On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
> > > > > > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
> > > > > > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
> > > > > > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
> > > > > > > > through a different approach.
> > > > > > > >
> > > > > > > > When I read the datasheet for mvneta (which hopefully has the same
> > > > > > > > logic here, since I could not find a datasheet for an mvpp2 device), I
> > > > > > > > noticed that the Pause_Adv bit said
> > > > > > > >
> > > > > > > > > It is valid only if flow control mode is defined by Auto-Negotiation
> > > > > > > > > (as defined by the <AnFcEn> bit).
> > > > > > > >
> > > > > > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
> > > > > > > > control was advertised. But perhaps it instead means that the logic is
> > > > > > > > something like
> > > > > > > >
> > > > > > > > if (AnFcEn)
> > > > > > > > 	Config_Reg.PAUSE = Pause_Adv;
> > > > > > > > else
> > > > > > > > 	Config_Reg.PAUSE = SetFcEn;
> > > > > > > >
> > > > > > > > which would mean that we can just clear AnFcEn in link_up if the
> > > > > > > > autonegotiated pause settings are different from the configured pause
> > > > > > > > settings.
> > > > > > >
> > > > > > > Having actually played with this hardware quite a bit and observed what
> > > > > > > it sends, what it implements for advertising is:
> > > > > > >
> > > > > > > 	Config_Reg.PAUSE = Pause_Adv;
> > > > >
> > > > > So the above note from the datasheet about Pause_Adv not being valid is
> > > > > incorrect?
> > > > >
> > > > > > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
> > > > > > > we receive Remote_Reg from the link partner.
> > > > > > >
> > > > > > > Then, the hardware implements:
> > > > > > >
> > > > > > > 	if (AnFcEn)
> > > > > > > 		MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
> > > > > > > 	else
> > > > > > > 		MAC_PAUSE = SetFcEn;
> > > > > > >
> > > > > > > In otherwords, AnFcEn controls whether the result of autonegotiation
> > > > > > > or the value of SetFcEn controls whether the MAC enables symmetric
> > > > > > > pause mode.
> > > > > >
> > > > > > I should also note that in the Port Status register,
> > > > > >
> > > > > > 	TxFcEn = RxFcEn = MAC_PAUSE;
> > > > > >
> > > > > > So, the status register bits follow SetFcEn when AnFcEn is disabled.
> > > > > >
> > > > > > However, these bits are the only way to report the result of the
> > > > > > negotiation, which is why we use them to report back whether flow
> > > > > > control was enabled in mvneta_pcs_get_state(). These bits will be
> > > > > > ignored by phylink when ethtool -A has disabled pause negotiation,
> > > > > > and in that situation there is no way as I said to be able to read
> > > > > > the negotiation result.
> > > > >
> > > > > Ok, how about
> > > > >
> > > > > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > > index b1cce4425296..9b41d8ee71fb 100644
> > > > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > > @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
> > > > >                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
> > > > >                          * manually controls the GMAC pause modes.
> > > > >                          */
> > > > > -                       if (permit_pause_to_mac)
> > > > > -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> > > > > +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
> > > > >
> > > > >                         /* Configure advertisement bits */
> > > > >                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
> > > > > @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
> > > > >                 }
> > > > >         } else {
> > > > >                 if (!phylink_autoneg_inband(mode)) {
> > > > > +                       bool cur_tx_pause, cur_rx_pause;
> > > > > +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
> > > > > +
> > > > >                         val = MVPP2_GMAC_FORCE_LINK_PASS;
> > > > >
> > > > >                         if (speed == SPEED_1000 || speed == SPEED_2500)
> > > > > @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
> > > > >                         if (duplex == DUPLEX_FULL)
> > > > >                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
> > > > >
> > > > > +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
> > > > > +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
> > > >
> > > > I think you haven't understood everything I've said. These status bits
> > > > report what the MAC is doing. They do not reflect what was negotiated
> > > > _unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.
> > > >
> > > > So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
> > > > MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.
> > > >
> > > > Let's say we apply this patch. tx/rx pause are negotiated and enabled.
> > > > So cur_tx_pause and cur_rx_pause are both true.
> > > >
> > > > We change the pause settings, forcing tx pause only. This causes
> > > > pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
> > > > then link_up gets called with the differing settings. We clear
> > > > MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
> > > > have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
> > > > but MVPP2_GMAC_STATUS0_RX_PAUSE clear.
> > > >
> > > > The link goes down e.g. because the remote end has changed and comes
> > > > back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
> > > > is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
> > > > true and rx_pause is false. These agree with the settings, so we
> > > > then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.
> > > >
> > > > If the link goes down and up again, then this cycle repeats - the
> > > > status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
> > > > MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
> > > > MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
> > > > back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.
> > > 
> > > The toggling is not really a problem, since we always correct the pause
> > > mode as soon as we notice.
> > 
> > When do we "notice" ? We don't regularly poll on these platforms, we
> > rely on interrupts.
> 
> When the link comes back up again.

So we end up with link-down-link-up and the settings are wrong.
The next time the link goes down and up, the settings are corrected
as I described. The next time the link goes down and up, the settings
are again wrong. etc. I don't consider this acceptable.

> > > The real issue would be if we don't notice
> > > because the link went down and back up in between calls to
> > > phylink_resolve.
> > 
> > Err no. If the link goes down and back up, one of the things the code
> > is structured to ensure is that phylink_resolve gets called, _and_ that
> > we will see a link-down-link-up.
> 
> Great. Then the above will work fine. Because we always set the pause
> mode in mac_link_up, it's OK to have the pause be incorrect in the time
> between when it comes up and when mac_link_up is called. The result of
> the pause from get_state will not be what was negotiated, but that is OK
> because we discard it anyway.

I can't tell whether you are intentionally misinterpreting to try and
get your way or not, but I've now said several times that your proposal
will _not_ work, yet you keep with some weird idea that I'm somehow
agreeing with you.

I'm not.

> I think my primary issue with the name is that it is named after its
> purpose in the marvell hardware, and not after the user setting. That
> is, we have something like
> 
> 	if (pause_autonegotiation_enabled())
> 		permit_pause_to_mac()
> 
> So the generic interface should be named after the condition and not
> the body.
> 
> If this parameter got moved to mac_link_up, I think the following would
> be good:
> 
> @autonegotiated_pause: This indicates whether the pause settings are a
> result of autonegotiation or whether they were manually configured. Some
> MACs are tightly coupled to their PCSs and have a hardware
> implementation of linkmode_resolve_pause() which sets their pause mode
> based on the autonegotiated pause mode. For these MACs, disabling this
> hardware implementation may inhibit their ability to determine the
> autonegotiated pause mode, so it should only be disabled when the pause
> mode was manually set. MACs which do not have this feature/limitation
> should ignore this parameter.

I might agree with that, but I haven't read it because I'm feeling way
too frazzled this evening to do so (not your fault, just an afternoon
of pressure.)
Sean Anderson Dec. 16, 2021, 7 p.m. UTC | #11
On 12/16/21 1:29 PM, Sean Anderson wrote:
>
>
> On 12/16/21 1:05 PM, Russell King (Oracle) wrote:
>> On Thu, Dec 16, 2021 at 12:51:33PM -0500, Sean Anderson wrote:
>>> On 12/16/21 12:26 PM, Russell King (Oracle) wrote:
>>> > On Thu, Dec 16, 2021 at 12:02:55PM -0500, Sean Anderson wrote:
>>> > > On 12/14/21 8:12 PM, Russell King (Oracle) wrote:
>>> > > > On Wed, Dec 15, 2021 at 12:49:14AM +0000, Russell King (Oracle) wrote:
>>> > > > > On Tue, Dec 14, 2021 at 07:16:53PM -0500, Sean Anderson wrote:
>>> > > > > > Ok, so let me clarify my understanding. Perhaps this can be eliminated
>>> > > > > > through a different approach.
>>> > > > > >
>>> > > > > > When I read the datasheet for mvneta (which hopefully has the same
>>> > > > > > logic here, since I could not find a datasheet for an mvpp2 device), I
>>> > > > > > noticed that the Pause_Adv bit said
>>> > > > > >
>>> > > > > > > It is valid only if flow control mode is defined by Auto-Negotiation
>>> > > > > > > (as defined by the <AnFcEn> bit).
>>> > > > > >
>>> > > > > > Which I interpreted to mean that if AnFcEn was clear, then no flow
>>> > > > > > control was advertised. But perhaps it instead means that the logic is
>>> > > > > > something like
>>> > > > > >
>>> > > > > > if (AnFcEn)
>>> > > > > >     Config_Reg.PAUSE = Pause_Adv;
>>> > > > > > else
>>> > > > > >     Config_Reg.PAUSE = SetFcEn;
>>> > > > > >
>>> > > > > > which would mean that we can just clear AnFcEn in link_up if the
>>> > > > > > autonegotiated pause settings are different from the configured pause
>>> > > > > > settings.
>>> > > > >
>>> > > > > Having actually played with this hardware quite a bit and observed what
>>> > > > > it sends, what it implements for advertising is:
>>> > > > >
>>> > > > >     Config_Reg.PAUSE = Pause_Adv;
>>> > >
>>> > > So the above note from the datasheet about Pause_Adv not being valid is
>>> > > incorrect?
>>> > >
>>> > > > > Config_Reg gets sent over the 1000BASE-X link to the link partner, and
>>> > > > > we receive Remote_Reg from the link partner.
>>> > > > >
>>> > > > > Then, the hardware implements:
>>> > > > >
>>> > > > >     if (AnFcEn)
>>> > > > >         MAC_PAUSE = Config_Reg.PAUSE & Remote_Reg.PAUSE;
>>> > > > >     else
>>> > > > >         MAC_PAUSE = SetFcEn;
>>> > > > >
>>> > > > > In otherwords, AnFcEn controls whether the result of autonegotiation
>>> > > > > or the value of SetFcEn controls whether the MAC enables symmetric
>>> > > > > pause mode.
>>> > > >
>>> > > > I should also note that in the Port Status register,
>>> > > >
>>> > > >     TxFcEn = RxFcEn = MAC_PAUSE;
>>> > > >
>>> > > > So, the status register bits follow SetFcEn when AnFcEn is disabled.
>>> > > >
>>> > > > However, these bits are the only way to report the result of the
>>> > > > negotiation, which is why we use them to report back whether flow
>>> > > > control was enabled in mvneta_pcs_get_state(). These bits will be
>>> > > > ignored by phylink when ethtool -A has disabled pause negotiation,
>>> > > > and in that situation there is no way as I said to be able to read
>>> > > > the negotiation result.
>>> > >
>>> > > Ok, how about
>>> > >
>>> > > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>>> > > index b1cce4425296..9b41d8ee71fb 100644
>>> > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>>> > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
>>> > > @@ -6226,8 +6226,7 @@ static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
>>> > >                          * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
>>> > >                          * manually controls the GMAC pause modes.
>>> > >                          */
>>> > > -                       if (permit_pause_to_mac)
>>> > > -                               val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>>> > > +                       val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
>>> > >
>>> > >                         /* Configure advertisement bits */
>>> > >                         mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
>>> > > @@ -6525,6 +6524,9 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>>> > >                 }
>>> > >         } else {
>>> > >                 if (!phylink_autoneg_inband(mode)) {
>>> > > +                       bool cur_tx_pause, cur_rx_pause;
>>> > > +                       u32 status0 = readl(port->base + MVPP2_GMAC_STATUS0);
>>> > > +
>>> > >                         val = MVPP2_GMAC_FORCE_LINK_PASS;
>>> > >
>>> > >                         if (speed == SPEED_1000 || speed == SPEED_2500)
>>> > > @@ -6535,11 +6537,18 @@ static void mvpp2_mac_link_up(struct phylink_config *config,
>>> > >                         if (duplex == DUPLEX_FULL)
>>> > >                                 val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
>>> > >
>>> > > +                       cur_tx_pause = status0 & MVPP2_GMAC_STATUS0_TX_PAUSE;
>>> > > +                       cur_rx_pause = status0 & MVPP2_GMAC_STATUS0_RX_PAUSE;
>>> >
>>> > I think you haven't understood everything I've said. These status bits
>>> > report what the MAC is doing. They do not reflect what was negotiated
>>> > _unless_ MVPP2_GMAC_FLOW_CTRL_AUTONEG was set.
>>> >
>>> > So, if we clear MVPP2_GMAC_FLOW_CTRL_AUTONEG, these bits will follow
>>> > MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN and MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN.
>>> >
>>> > Let's say we apply this patch. tx/rx pause are negotiated and enabled.
>>> > So cur_tx_pause and cur_rx_pause are both true.
>>> >
>>> > We change the pause settings, forcing tx pause only. This causes
>>> > pcs_config to be called which sets MVPP2_GMAC_FLOW_CTRL_AUTONEG, and
>>> > then link_up gets called with the differing settings. We clear
>>> > MVPP2_GMAC_FLOW_CTRL_AUTONEG and force the pause settings. We now
>>> > have the status register containing MVPP2_GMAC_STATUS0_TX_PAUSE set
>>> > but MVPP2_GMAC_STATUS0_RX_PAUSE clear.
>>> >
>>> > The link goes down e.g. because the remote end has changed and comes
>>> > back. We read the status register and see MVPP2_GMAC_STATUS0_TX_PAUSE
>>> > is set and MVPP2_GMAC_STATUS0_RX_PAUSE is still clear. tx_pause is
>>> > true and rx_pause is false. These agree with the settings, so we
>>> > then set MVPP2_GMAC_FLOW_CTRL_AUTONEG.
>>> >
>>> > If the link goes down and up again, then this cycle repeats - the
>>> > status register will now have both MVPP2_GMAC_STATUS0_TX_PAUSE and
>>> > MVPP2_GMAC_STATUS0_RX_PAUSE set, so we clear
>>> > MVPP2_GMAC_FLOW_CTRL_AUTONEG. If the link goes down/up again, we flip
>>> > back to re-enabling MVPP2_GMAC_FLOW_CTRL_AUTONEG.

Ok, so I think I see what you're getting at here. If we set
MVPP2_GMAC_FLOW_CTRL_AUTONEG after examining the pause mode, then the
pause mode will revert to what was negotiated. But since this platform
uses interrupts, we can clear it in link_up and set it in link_down.

  --Sean
diff mbox series

Patch

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index d4da9adf6777..2ae717f262e8 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -540,9 +540,7 @@  static void macb_usx_pcs_get_state(struct phylink_pcs *pcs,
 
 static int macb_usx_pcs_config(struct phylink_pcs *pcs,
 			       unsigned int mode,
-			       phy_interface_t interface,
-			       const unsigned long *advertising,
-			       bool permit_pause_to_mac)
+			       const struct phylink_link_state *state)
 {
 	struct macb *bp = container_of(pcs, struct macb, phylink_pcs);
 
@@ -565,9 +563,7 @@  static void macb_pcs_an_restart(struct phylink_pcs *pcs)
 
 static int macb_pcs_config(struct phylink_pcs *pcs,
 			   unsigned int mode,
-			   phy_interface_t interface,
-			   const unsigned long *advertising,
-			   bool permit_pause_to_mac)
+			   const struct phylink_link_state *state)
 {
 	return 0;
 }
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index b1cce4425296..62de173536bf 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -6146,9 +6146,7 @@  static void mvpp2_xlg_pcs_get_state(struct phylink_pcs *pcs,
 
 static int mvpp2_xlg_pcs_config(struct phylink_pcs *pcs,
 				unsigned int mode,
-				phy_interface_t interface,
-				const unsigned long *advertising,
-				bool permit_pause_to_mac)
+				const struct phylink_link_state *state)
 {
 	return 0;
 }
@@ -6194,9 +6192,7 @@  static void mvpp2_gmac_pcs_get_state(struct phylink_pcs *pcs,
 }
 
 static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
-				 phy_interface_t interface,
-				 const unsigned long *advertising,
-				 bool permit_pause_to_mac)
+				 const struct phylink_link_state *state)
 {
 	struct mvpp2_port *port = mvpp2_pcs_to_port(pcs);
 	u32 mask, val, an, old_an, changed;
@@ -6213,7 +6209,7 @@  static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 			MVPP2_GMAC_CONFIG_FULL_DUPLEX;
 		val = MVPP2_GMAC_IN_BAND_AUTONEG;
 
-		if (interface == PHY_INTERFACE_MODE_SGMII) {
+		if (state->interface == PHY_INTERFACE_MODE_SGMII) {
 			/* SGMII mode receives the speed and duplex from PHY */
 			val |= MVPP2_GMAC_AN_SPEED_EN |
 			       MVPP2_GMAC_AN_DUPLEX_EN;
@@ -6222,18 +6218,21 @@  static int mvpp2_gmac_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 			val |= MVPP2_GMAC_CONFIG_GMII_SPEED |
 			       MVPP2_GMAC_CONFIG_FULL_DUPLEX;
 
-			/* The FLOW_CTRL_AUTONEG bit selects either the hardware
-			 * automatically or the bits in MVPP22_GMAC_CTRL_4_REG
-			 * manually controls the GMAC pause modes.
+			/* The FLOW_CTRL_AUTONEG bit selects whether flow
+			 * control should come from autonegotiation or whether
+			 * it should be manually configured (by
+			 * MVPP22_GMAC_CTRL_4_REG). If it is cleared (to select
+			 * manual configuration) then flow control
+			 * advertisement will be ignored.
 			 */
-			if (permit_pause_to_mac)
+			if (state->pause & MLO_PAUSE_AN)
 				val |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
 
 			/* Configure advertisement bits */
 			mask |= MVPP2_GMAC_FC_ADV_EN | MVPP2_GMAC_FC_ADV_ASM_EN;
-			if (phylink_test(advertising, Pause))
+			if (phylink_test(state->advertising, Pause))
 				val |= MVPP2_GMAC_FC_ADV_EN;
-			if (phylink_test(advertising, Asym_Pause))
+			if (phylink_test(state->advertising, Asym_Pause))
 				val |= MVPP2_GMAC_FC_ADV_ASM_EN;
 		}
 	} else {
@@ -6632,8 +6631,7 @@  static void mvpp2_acpi_start(struct mvpp2_port *port)
 			   port->phy_interface);
 	mvpp2_mac_config(&port->phylink_config, MLO_AN_INBAND, &state);
 	port->phylink_pcs.ops->pcs_config(&port->phylink_pcs, MLO_AN_INBAND,
-					  port->phy_interface,
-					  state.advertising, false);
+					  &state);
 	mvpp2_mac_finish(&port->phylink_config, MLO_AN_INBAND,
 			 port->phy_interface);
 	mvpp2_mac_link_up(&port->phylink_config, NULL,
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_phylink.c b/drivers/net/ethernet/microchip/lan966x/lan966x_phylink.c
index b66a9aa00ea4..dc3dda2fad14 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_phylink.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_phylink.c
@@ -86,19 +86,17 @@  static void lan966x_pcs_get_state(struct phylink_pcs *pcs,
 
 static int lan966x_pcs_config(struct phylink_pcs *pcs,
 			      unsigned int mode,
-			      phy_interface_t interface,
-			      const unsigned long *advertising,
-			      bool permit_pause_to_mac)
+			      const struct phylink_link_state *state)
 {
 	struct lan966x_port *port = lan966x_pcs_to_port(pcs);
 	struct lan966x_port_config config;
 	int ret;
 
 	config = port->config;
-	config.portmode = interface;
+	config.portmode = state->interface;
 	config.inband = phylink_autoneg_inband(mode);
-	config.autoneg = phylink_test(advertising, Autoneg);
-	config.advertising = advertising;
+	config.autoneg = phylink_test(state->advertising, Autoneg);
+	config.advertising = state->advertising;
 
 	ret = lan966x_port_pcs_set(port, &config);
 	if (ret)
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_phylink.c b/drivers/net/ethernet/microchip/sparx5/sparx5_phylink.c
index 8ba33bc1a001..07500502d5be 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_phylink.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_phylink.c
@@ -84,9 +84,7 @@  static void sparx5_pcs_get_state(struct phylink_pcs *pcs,
 
 static int sparx5_pcs_config(struct phylink_pcs *pcs,
 			     unsigned int mode,
-			     phy_interface_t interface,
-			     const unsigned long *advertising,
-			     bool permit_pause_to_mac)
+			     const struct phylink_link_state *state)
 {
 	struct sparx5_port *port = sparx5_pcs_to_port(pcs);
 	struct sparx5_port_config conf;
@@ -94,16 +92,16 @@  static int sparx5_pcs_config(struct phylink_pcs *pcs,
 
 	conf = port->conf;
 	conf.power_down = false;
-	conf.portmode = interface;
+	conf.portmode = state->interface;
 	conf.inband = phylink_autoneg_inband(mode);
-	conf.autoneg = phylink_test(advertising, Autoneg);
+	conf.autoneg = phylink_test(state->advertising, Autoneg);
 	conf.pause_adv = 0;
-	if (phylink_test(advertising, Pause))
+	if (phylink_test(state->advertising, Pause))
 		conf.pause_adv |= ADVERTISE_1000XPAUSE;
-	if (phylink_test(advertising, Asym_Pause))
+	if (phylink_test(state->advertising, Asym_Pause))
 		conf.pause_adv |= ADVERTISE_1000XPSE_ASYM;
-	if (sparx5_is_baser(interface)) {
-		if (phylink_test(advertising, FIBRE))
+	if (sparx5_is_baser(state->interface)) {
+		if (phylink_test(state->advertising, FIBRE))
 			conf.media = PHY_MEDIA_SR;
 		else
 			conf.media = PHY_MEDIA_DAC;
diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
index af36cd647bf5..7b5351885b4d 100644
--- a/drivers/net/pcs/pcs-lynx.c
+++ b/drivers/net/pcs/pcs-lynx.c
@@ -182,18 +182,18 @@  static int lynx_pcs_config_usxgmii(struct mdio_device *pcs, unsigned int mode,
 }
 
 static int lynx_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
-			   phy_interface_t ifmode,
-			   const unsigned long *advertising,
-			   bool permit)
+			   const struct phylink_link_state *state)
 {
 	struct lynx_pcs *lynx = phylink_pcs_to_lynx(pcs);
 
-	switch (ifmode) {
+	switch (state->interface) {
 	case PHY_INTERFACE_MODE_1000BASEX:
-		return lynx_pcs_config_1000basex(lynx->mdio, mode, advertising);
+		return lynx_pcs_config_1000basex(lynx->mdio, mode,
+						 state->advertising);
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
-		return lynx_pcs_config_sgmii(lynx->mdio, mode, advertising);
+		return lynx_pcs_config_sgmii(lynx->mdio, mode,
+					     state->advertising);
 	case PHY_INTERFACE_MODE_2500BASEX:
 		if (phylink_autoneg_inband(mode)) {
 			dev_err(&lynx->mdio->dev,
@@ -202,7 +202,8 @@  static int lynx_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 		}
 		break;
 	case PHY_INTERFACE_MODE_USXGMII:
-		return lynx_pcs_config_usxgmii(lynx->mdio, mode, advertising);
+		return lynx_pcs_config_usxgmii(lynx->mdio, mode,
+					       state->advertising);
 	case PHY_INTERFACE_MODE_10GBASER:
 		/* Nothing to do here for 10GBASER */
 		break;
diff --git a/drivers/net/pcs/pcs-xpcs.c b/drivers/net/pcs/pcs-xpcs.c
index cd6742e6ba8b..5eceb84573cb 100644
--- a/drivers/net/pcs/pcs-xpcs.c
+++ b/drivers/net/pcs/pcs-xpcs.c
@@ -831,13 +831,11 @@  int xpcs_do_config(struct dw_xpcs *xpcs, phy_interface_t interface,
 EXPORT_SYMBOL_GPL(xpcs_do_config);
 
 static int xpcs_config(struct phylink_pcs *pcs, unsigned int mode,
-		       phy_interface_t interface,
-		       const unsigned long *advertising,
-		       bool permit_pause_to_mac)
+		       const struct phylink_link_state *state)
 {
 	struct dw_xpcs *xpcs = phylink_pcs_to_xpcs(pcs);
 
-	return xpcs_do_config(xpcs, interface, mode);
+	return xpcs_do_config(xpcs, state->interface, mode);
 }
 
 static int xpcs_get_state_c73(struct dw_xpcs *xpcs,
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 20df8af3e201..d47570365b93 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -768,10 +768,7 @@  static void phylink_major_config(struct phylink *pl, bool restart,
 
 	if (pl->pcs_ops) {
 		err = pl->pcs_ops->pcs_config(pl->pcs, pl->cur_link_an_mode,
-					      state->interface,
-					      state->advertising,
-					      !!(pl->link_config.pause &
-						 MLO_PAUSE_AN));
+					      state);
 		if (err < 0)
 			phylink_err(pl, "pcs_config failed: %pe\n",
 				    ERR_PTR(err));
@@ -821,9 +818,7 @@  static int phylink_change_inband_advert(struct phylink *pl)
 	 * the programmed advertisement has changed.
 	 */
 	ret = pl->pcs_ops->pcs_config(pl->pcs, pl->cur_link_an_mode,
-				      pl->link_config.interface,
-				      pl->link_config.advertising,
-				      !!(pl->link_config.pause & MLO_PAUSE_AN));
+				      &pl->link_config);
 	if (ret < 0)
 		return ret;
 
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index a2f266cc3442..d3d09c064596 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -408,9 +408,7 @@  struct phylink_pcs_ops {
 	void (*pcs_get_state)(struct phylink_pcs *pcs,
 			      struct phylink_link_state *state);
 	int (*pcs_config)(struct phylink_pcs *pcs, unsigned int mode,
-			  phy_interface_t interface,
-			  const unsigned long *advertising,
-			  bool permit_pause_to_mac);
+			  const struct phylink_link_state *state);
 	void (*pcs_an_restart)(struct phylink_pcs *pcs);
 	void (*pcs_link_up)(struct phylink_pcs *pcs, unsigned int mode,
 			    phy_interface_t interface, int speed, int duplex);
@@ -439,13 +437,10 @@  void pcs_get_state(struct phylink_pcs *pcs,
  * pcs_config() - Configure the PCS mode and advertisement
  * @pcs: a pointer to a &struct phylink_pcs.
  * @mode: one of %MLO_AN_FIXED, %MLO_AN_PHY, %MLO_AN_INBAND.
- * @interface: interface mode to be used
- * @advertising: adertisement ethtool link mode mask
- * @permit_pause_to_mac: permit forwarding pause resolution to MAC
+ * @state: the state to configure, containing the interface and advertisement
  *
  * Configure the PCS for the operating mode, the interface mode, and set
- * the advertisement mask. @permit_pause_to_mac indicates whether the
- * hardware may forward the pause mode resolution to the MAC.
+ * the advertisement mask.
  *
  * When operating in %MLO_AN_INBAND, inband should always be enabled,
  * otherwise inband should be disabled.
@@ -458,8 +453,7 @@  void pcs_get_state(struct phylink_pcs *pcs,
  * For most 10GBASE-R, there is no advertisement.
  */
 int pcs_config(struct phylink_pcs *pcs, unsigned int mode,
-	       phy_interface_t interface, const unsigned long *advertising,
-	       bool permit_pause_to_mac);
+	       const struct phylink_link_state *state);
 
 /**
  * pcs_an_restart() - restart 802.3z BaseX autonegotiation