diff mbox series

[v2,08/11] net: phylink: Adjust advertisement based on rate adaptation

Message ID 20220719235002.1944800-9-sean.anderson@seco.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series net: phy: Add support for rate adaptation

Checks

Context                         Check    Description
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              fail     Errors and warnings before: 98 this patch: 98
netdev/cc_maintainers           success  CCed 8 of 8 maintainers
netdev/build_clang              fail     Errors and warnings before: 22 this patch: 22
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  fail     Errors and warnings before: 98 this patch: 98
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 139 lines checked
netdev/kdoc                     success  Errors and warnings before: 1 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Sean Anderson July 19, 2022, 11:49 p.m. UTC
This adds support for adjusting the advertisement for pause-based rate
adaptation. This may result in a lossy link, since the final link settings
are not adjusted. Asymmetric pause support is necessary. It would be
possible for a MAC supporting only symmetric pause to use pause-based rate
adaptation, but only if pause reception was enabled as well.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
---

Changes in v2:
- Determine the interface speed and max mac speed directly instead of
  guessing based on the caps.

 drivers/net/phy/phylink.c | 87 +++++++++++++++++++++++++++++++++++++--
 include/linux/phylink.h   |  5 ++-
 2 files changed, 88 insertions(+), 4 deletions(-)

Comments

Russell King (Oracle) July 20, 2022, 7:08 a.m. UTC | #1
On Tue, Jul 19, 2022 at 07:49:58PM -0400, Sean Anderson wrote:
> +static int phylink_caps_to_speed(unsigned long caps)
> +{
> +	unsigned int max_cap = __fls(caps);
> +
> +	if (max_cap == __fls(MAC_10HD) || max_cap == __fls(MAC_10FD))
> +		return SPEED_10;
> +	if (max_cap == __fls(MAC_100HD) || max_cap == __fls(MAC_100FD))
> +		return SPEED_100;
> +	if (max_cap == __fls(MAC_1000HD) || max_cap == __fls(MAC_1000FD))
> +		return SPEED_1000;
> +	if (max_cap == __fls(MAC_2500FD))
> +		return SPEED_2500;
> +	if (max_cap == __fls(MAC_5000FD))
> +		return SPEED_5000;
> +	if (max_cap == __fls(MAC_10000FD))
> +		return SPEED_10000;
> +	if (max_cap == __fls(MAC_20000FD))
> +		return SPEED_20000;
> +	if (max_cap == __fls(MAC_25000FD))
> +		return SPEED_25000;
> +	if (max_cap == __fls(MAC_40000FD))
> +		return SPEED_40000;
> +	if (max_cap == __fls(MAC_50000FD))
> +		return SPEED_50000;
> +	if (max_cap == __fls(MAC_56000FD))
> +		return SPEED_56000;
> +	if (max_cap == __fls(MAC_100000FD))
> +		return SPEED_100000;
> +	if (max_cap == __fls(MAC_200000FD))
> +		return SPEED_200000;
> +	if (max_cap == __fls(MAC_400000FD))
> +		return SPEED_400000;
> +	return SPEED_UNKNOWN;
> +}

One of my recent patches introduced "phylink_caps_params" table into
the DSA code (which isn't merged) but it's about converting the caps
into the SPEED_* and DUPLEX_*. This is doing more or less the same
thing but with a priority for speed rather than duplex. The question
about whether it should be this way for the DSA case or whether speed
should take priority was totally ignored by all reviewers of the code
despite being explicitly asked.

Maybe this could be reused here rather than having similar code.
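
A minimal sketch of what such a shared table could look like, with speed
taking priority over duplex. This is not the unmerged DSA code:
`struct caps_param`, `caps_params` and `caps_to_speed_duplex()` are
hypothetical names, and the `MAC_*`, `SPEED_*` and `DUPLEX_*` definitions
are local stand-ins so the snippet builds outside the kernel. Only a
subset of speeds is listed.

```c
#include <stddef.h>

/* Local stand-ins for the kernel definitions (assumed values, for
 * illustration only).
 */
#define BIT(n)		(1UL << (n))
#define MAC_10HD	BIT(2)
#define MAC_10FD	BIT(3)
#define MAC_100HD	BIT(4)
#define MAC_100FD	BIT(5)
#define MAC_1000HD	BIT(6)
#define MAC_1000FD	BIT(7)
#define MAC_2500FD	BIT(8)
#define MAC_10000FD	BIT(10)

#define SPEED_UNKNOWN	(-1)
#define DUPLEX_HALF	0
#define DUPLEX_FULL	1
#define DUPLEX_UNKNOWN	0xff

/* Hypothetical table entry: one capability bit and what it means */
struct caps_param {
	unsigned long mask;
	int speed;	/* Mbit/s */
	int duplex;
};

/* Ordered fastest first, full duplex before half duplex at each speed,
 * so the first match has the highest speed.
 */
static const struct caps_param caps_params[] = {
	{ MAC_10000FD, 10000, DUPLEX_FULL },
	{ MAC_2500FD,   2500, DUPLEX_FULL },
	{ MAC_1000FD,   1000, DUPLEX_FULL },
	{ MAC_1000HD,   1000, DUPLEX_HALF },
	{ MAC_100FD,     100, DUPLEX_FULL },
	{ MAC_100HD,     100, DUPLEX_HALF },
	{ MAC_10FD,       10, DUPLEX_FULL },
	{ MAC_10HD,       10, DUPLEX_HALF },
};

/* Return the fastest speed present in @caps and store its duplex */
int caps_to_speed_duplex(unsigned long caps, int *duplex)
{
	size_t i;

	for (i = 0; i < sizeof(caps_params) / sizeof(caps_params[0]); i++) {
		if (caps & caps_params[i].mask) {
			*duplex = caps_params[i].duplex;
			return caps_params[i].speed;
		}
	}

	*duplex = DUPLEX_UNKNOWN;
	return SPEED_UNKNOWN;
}
```

Giving duplex priority instead would only require reordering the table
so that all full-duplex entries come before the half-duplex ones.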

> @@ -482,7 +529,39 @@ unsigned long phylink_get_capabilities(phy_interface_t interface,
>  		break;
>  	}
>  
> -	return caps & mac_capabilities;
> +	switch (rate_adaptation) {
> +	case RATE_ADAPT_NONE:
> +		break;
> +	case RATE_ADAPT_PAUSE: {
> +		/* The MAC must support asymmetric pause towards the local
> +		 * device for this. We could allow just symmetric pause, but
> +		 * then we might have to renegotiate if the link partner
> +		 * doesn't support pause.

Why do we need to renegotiate, and what would this achieve? The link
partner isn't going to say "oh yes I do support pause after all",
and in any case this function is working out what the capabilities
of the system is prior to bringing anything up.

All that we need to know here is whether the MAC supports receiving
pause frames from the PHY - if it doesn't, then the MAC is
incompatible with the PHY using rate adaption.

> +		 */
> +		if (!(mac_capabilities & MAC_SYM_PAUSE) ||
> +		    !(mac_capabilities & MAC_ASYM_PAUSE))
> +			break;
> +
> +		/* Can't adapt if the MAC doesn't support the interface's max
> +		 * speed
> +		 */
> +		if (state.speed != phylink_caps_to_speed(mac_capabilities))
> +			break;

I'm not sure this is the right way to check. If the MAC supports e.g.
10G, 1G, 100M and 10M, but we have a PHY operating in 1000base-X mode
to the PCS/MAC and is using rate adaption, then phylink_caps_to_speed()
will return 10G, but state.speed will be 1G.

Don't we instead want to check whether the MAC capabilities has the FD
bit corresponding to state.speed set?
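
One way to express that check, as a sketch: map the interface speed to
its full-duplex capability bit and test that bit in the MAC capabilities.
`mac_supports_fd_speed()` is a hypothetical helper, and the `MAC_*`
values are local stand-ins so the snippet builds outside the kernel.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel definitions (assumed values, for
 * illustration only).
 */
#define BIT(n)		(1UL << (n))
#define MAC_10FD	BIT(3)
#define MAC_100FD	BIT(5)
#define MAC_1000FD	BIT(7)
#define MAC_2500FD	BIT(8)
#define MAC_5000FD	BIT(9)
#define MAC_10000FD	BIT(10)

/* Hypothetical helper: does the MAC support full duplex at @speed
 * (Mbit/s)? Speeds not listed here are treated as unsupported.
 */
bool mac_supports_fd_speed(unsigned long mac_capabilities, int speed)
{
	unsigned long fd_cap;

	switch (speed) {
	case 10:
		fd_cap = MAC_10FD;
		break;
	case 100:
		fd_cap = MAC_100FD;
		break;
	case 1000:
		fd_cap = MAC_1000FD;
		break;
	case 2500:
		fd_cap = MAC_2500FD;
		break;
	case 5000:
		fd_cap = MAC_5000FD;
		break;
	case 10000:
		fd_cap = MAC_10000FD;
		break;
	default:
		return false;
	}

	return (mac_capabilities & fd_cap) != 0;
}
```

With a MAC advertising 10G, 1G, 100M and 10M and an interface running at
1G, this returns true, whereas comparing against the MAC's maximum speed
would reject the combination.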

> +
> +		adapted_caps = GENMASK(__fls(caps), __fls(MAC_10HD));
> +		/* We can't use pause frames in half-duplex mode */
> +		adapted_caps &= ~(MAC_1000HD | MAC_100HD | MAC_10HD);

Have you checked the PHY documentation to see what the behaviour is
in rate adaption mode with pause frames and it negotiates HD on the
media side? Does it handle the HD issue internally?

Sean Anderson July 21, 2022, 4:55 p.m. UTC | #2
On 7/20/22 3:08 AM, Russell King (Oracle) wrote:
> On Tue, Jul 19, 2022 at 07:49:58PM -0400, Sean Anderson wrote:
>> +static int phylink_caps_to_speed(unsigned long caps)
>> +{
>> +	unsigned int max_cap = __fls(caps);
>> +
>> +	if (max_cap == __fls(MAC_10HD) || max_cap == __fls(MAC_10FD))
>> +		return SPEED_10;
>> +	if (max_cap == __fls(MAC_100HD) || max_cap == __fls(MAC_100FD))
>> +		return SPEED_100;
>> +	if (max_cap == __fls(MAC_1000HD) || max_cap == __fls(MAC_1000FD))
>> +		return SPEED_1000;
>> +	if (max_cap == __fls(MAC_2500FD))
>> +		return SPEED_2500;
>> +	if (max_cap == __fls(MAC_5000FD))
>> +		return SPEED_5000;
>> +	if (max_cap == __fls(MAC_10000FD))
>> +		return SPEED_10000;
>> +	if (max_cap == __fls(MAC_20000FD))
>> +		return SPEED_20000;
>> +	if (max_cap == __fls(MAC_25000FD))
>> +		return SPEED_25000;
>> +	if (max_cap == __fls(MAC_40000FD))
>> +		return SPEED_40000;
>> +	if (max_cap == __fls(MAC_50000FD))
>> +		return SPEED_50000;
>> +	if (max_cap == __fls(MAC_56000FD))
>> +		return SPEED_56000;
>> +	if (max_cap == __fls(MAC_100000FD))
>> +		return SPEED_100000;
>> +	if (max_cap == __fls(MAC_200000FD))
>> +		return SPEED_200000;
>> +	if (max_cap == __fls(MAC_400000FD))
>> +		return SPEED_400000;
>> +	return SPEED_UNKNOWN;
>> +}
> 
> One of my recent patches introduced "phylink_caps_params" table into
> the DSA code (which isn't merged) but it's about converting the caps
> into the SPEED_* and DUPLEX_*. This is doing more or less the same
> thing but with a priority for speed rather than duplex. The question
> about whether it should be this way for the DSA case or whether speed
> should take priority was totally ignored by all reviewers of the code
> despite being explicitly asked.
> 
> Maybe this could be reused here rather than having similar code.

I'm in favor of that.

>> @@ -482,7 +529,39 @@ unsigned long phylink_get_capabilities(phy_interface_t interface,
>>  		break;
>>  	}
>>  
>> -	return caps & mac_capabilities;
>> +	switch (rate_adaptation) {
>> +	case RATE_ADAPT_NONE:
>> +		break;
>> +	case RATE_ADAPT_PAUSE: {
>> +		/* The MAC must support asymmetric pause towards the local
>> +		 * device for this. We could allow just symmetric pause, but
>> +		 * then we might have to renegotiate if the link partner
>> +		 * doesn't support pause.
> 
> Why do we need to renegotiate, and what would this achieve? The link
> partner isn't going to say "oh yes I do support pause after all",
> and in any case this function is working out what the capabilities
> of the system is prior to bringing anything up.
> 
> All that we need to know here is whether the MAC supports receiving
> pause frames from the PHY - if it doesn't, then the MAC is
> incompatible with the PHY using rate adaption.

AIUI, MAC_SYM_PAUSE and MAC_ASYM_PAUSE correspond to the PAUSE and
ASM_DIR bits used in autonegotiation. For reference, Table 28B-2 from
802.3 is:

PAUSE (A5) ASM_DIR (A6) Capability
========== ============ ================================================
         0            0 No PAUSE
         0            1 Asymmetric PAUSE toward link partner
         1            0 Symmetric PAUSE
         1            1 Both Symmetric PAUSE and Asymmetric PAUSE toward
                        local device

These correspond to the following valid values for MLO_PAUSE:

MAC_SYM_PAUSE MAC_ASYM_PAUSE Valid pause modes
============= ============== ==============================
            0              0 MLO_PAUSE_NONE
            0              1 MLO_PAUSE_NONE, MLO_PAUSE_TX
            1              0 MLO_PAUSE_NONE, MLO_PAUSE_TXRX
            1              1 MLO_PAUSE_NONE, MLO_PAUSE_RX,
                             MLO_PAUSE_TXRX

In order to support pause-based rate adaptation, we need MLO_PAUSE_RX to
be valid. This rules out the top two rows. In the bottom mode, we can
enable MLO_PAUSE_RX without MLO_PAUSE_TX. Whatever our link partner
supports, we can still enable it. For the third row, however, we can
only enable MLO_PAUSE_RX if we also enable MLO_PAUSE_TX. This can be a
problem if the link partner does not support pause frames (or the user
has disabled MLO_PAUSE_AN and MLO_PAUSE_TX). So if we were to enable
advertisement of pause-based, rate-adapted modes when only MAC_SYM_PAUSE
was present, then we might end up in a situation where we'd have to
renegotiate without those modes in order to get a valid link state. I
don't want to have to implement that, so for now we only advertise
pause-based, rate-adapted modes if we support MLO_PAUSE_RX without
MLO_PAUSE_TX.
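
The reasoning above can be sketched as a small lookup that mirrors the
second table. The `MODE_*` flags, `valid_pause_modes()` and
`can_adapt_with_pause()` are hypothetical names used only for
illustration, and the `MAC_*` values are local stand-ins so the snippet
builds outside the kernel.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel definitions (assumed values, for
 * illustration only).
 */
#define BIT(n)		(1UL << (n))
#define MAC_SYM_PAUSE	BIT(0)
#define MAC_ASYM_PAUSE	BIT(1)

/* Hypothetical flags, one per pause resolution in the table above */
#define MODE_NONE	BIT(0)
#define MODE_TX		BIT(1)
#define MODE_RX		BIT(2)
#define MODE_TXRX	BIT(3)

/* Return the set of pause modes the MAC can be configured for */
unsigned long valid_pause_modes(unsigned long mac_capabilities)
{
	bool sym = mac_capabilities & MAC_SYM_PAUSE;
	bool asym = mac_capabilities & MAC_ASYM_PAUSE;

	if (sym && asym)
		return MODE_NONE | MODE_RX | MODE_TXRX;
	if (sym)
		return MODE_NONE | MODE_TXRX;
	if (asym)
		return MODE_NONE | MODE_TX;
	return MODE_NONE;
}

/* Pause-based rate adaptation is only advertised when RX pause can be
 * enabled on its own, independent of what the link partner resolves to.
 */
bool can_adapt_with_pause(unsigned long mac_capabilities)
{
	return (valid_pause_modes(mac_capabilities) & MODE_RX) != 0;
}
```

This reduces to requiring both MAC_SYM_PAUSE and MAC_ASYM_PAUSE, which is
the condition the patch tests.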

>> +		 */
>> +		if (!(mac_capabilities & MAC_SYM_PAUSE) ||
>> +		    !(mac_capabilities & MAC_ASYM_PAUSE))
>> +			break;
>> +
>> +		/* Can't adapt if the MAC doesn't support the interface's max
>> +		 * speed
>> +		 */
>> +		if (state.speed != phylink_caps_to_speed(mac_capabilities))
>> +			break;
> 
> I'm not sure this is the right way to check. If the MAC supports e.g.
> 10G, 1G, 100M and 10M, but we have a PHY operating in 1000base-X mode
> to the PCS/MAC and is using rate adaption, then phylink_caps_to_speed()
> will return 10G, but state.speed will be 1G.
> 
> Don't we instead want to check whether the MAC capabilities has the FD
> bit corresponding to state.speed set?

Yes, that seems correct.

>> +
>> +		adapted_caps = GENMASK(__fls(caps), __fls(MAC_10HD));
>> +		/* We can't use pause frames in half-duplex mode */
>> +		adapted_caps &= ~(MAC_1000HD | MAC_100HD | MAC_10HD);
> 
> Have you checked the PHY documentation to see what the behaviour is
> in rate adaption mode with pause frames and it negotiates HD on the
> media side? Does it handle the HD issue internally?

It's not documented. This is just conservative. Presumably, there exists
(or could exist) a duplex-adapting phy, but I don't know if I have one.

--Sean

Russell King (Oracle) July 21, 2022, 6:04 p.m. UTC | #3
On Thu, Jul 21, 2022 at 12:55:16PM -0400, Sean Anderson wrote:
> On 7/20/22 3:08 AM, Russell King (Oracle) wrote:
> > On Tue, Jul 19, 2022 at 07:49:58PM -0400, Sean Anderson wrote:
> >> @@ -482,7 +529,39 @@ unsigned long phylink_get_capabilities(phy_interface_t interface,
> >>  		break;
> >>  	}
> >>  
> >> -	return caps & mac_capabilities;
> >> +	switch (rate_adaptation) {
> >> +	case RATE_ADAPT_NONE:
> >> +		break;
> >> +	case RATE_ADAPT_PAUSE: {
> >> +		/* The MAC must support asymmetric pause towards the local
> >> +		 * device for this. We could allow just symmetric pause, but
> >> +		 * then we might have to renegotiate if the link partner
> >> +		 * doesn't support pause.
> > 
> > Why do we need to renegotiate, and what would this achieve? The link
> > partner isn't going to say "oh yes I do support pause after all",
> > and in any case this function is working out what the capabilities
> > of the system is prior to bringing anything up.
> > 
> > All that we need to know here is whether the MAC supports receiving
> > pause frames from the PHY - if it doesn't, then the MAC is
> > incompatible with the PHY using rate adaption.
> 
> AIUI, MAC_SYM_PAUSE and MAC_ASYM_PAUSE correspond to the PAUSE and
> ASM_DIR bits used in autonegotiation. For reference, Table 28B-2 from
> 802.3 is:
> 
> PAUSE (A5) ASM_DIR (A6) Capability
> ========== ============ ================================================
>          0            0 No PAUSE
>          0            1 Asymmetric PAUSE toward link partner
>          1            0 Symmetric PAUSE
>          1            1 Both Symmetric PAUSE and Asymmetric PAUSE toward
>                         local device
> 
> These correspond to the following valid values for MLO_PAUSE:
> 
> MAC_SYM_PAUSE MAC_ASYM_PAUSE Valid pause modes
> ============= ============== ==============================
>             0              0 MLO_PAUSE_NONE
>             0              1 MLO_PAUSE_NONE, MLO_PAUSE_TX
>             1              0 MLO_PAUSE_NONE, MLO_PAUSE_TXRX
>             1              1 MLO_PAUSE_NONE, MLO_PAUSE_RX,
>                              MLO_PAUSE_TXRX
> 
> In order to support pause-based rate adaptation, we need MLO_PAUSE_RX to
> be valid. This rules out the top two rows. In the bottom mode, we can
> enable MLO_PAUSE_RX without MLO_PAUSE_TX. Whatever our link partner
> supports, we can still enable it. For the third row, however, we can
> only enable MLO_PAUSE_RX if we also enable MLO_PAUSE_TX. This can be a
> problem if the link partner does not support pause frames (or the user
> has disabled MLO_PAUSE_AN and MLO_PAUSE_TX). So if we were to enable
> advertisement of pause-based, rate-adapted modes when only MAC_SYM_PAUSE
> was present, then we might end up in a situation where we'd have to
> renegotiate without those modes in order to get a valid link state. I
> don't want to have to implement that, so for now we only advertise
> pause-based, rate-adapted modes if we support MLO_PAUSE_RX without
> MLO_PAUSE_TX.

Ah, I see. Yes, I agree that we shouldn't do that, and only allow rate
adaption in pause mode to be used if we can enable RX pause without TX
pause on our local MAC.

> > Have you checked the PHY documentation to see what the behaviour is
> > in rate adaption mode with pause frames and it negotiates HD on the
> > media side? Does it handle the HD issue internally?
> 
> It's not documented. This is just conservative. Presumably, there exists
> (or could exist) a duplex-adapting phy, but I don't know if I have one.

I guess it would depend on the structure of the PHY - whether the PHY
is structured similar to a two port switch internally, having a MAC
facing the host and another MAC facing the media side. (I believe this
is exactly how the MACSEC versions of the 88x3310 are structured.)

If you don't have that kind of structure, then I would guess that doing
duplex adaption could be problematical.

Andrew Lunn July 21, 2022, 6:36 p.m. UTC | #4
> I guess it would depend on the structure of the PHY - whether the PHY
> is structured similar to a two port switch internally, having a MAC
> facing the host and another MAC facing the media side. (I believe this
> is exactly how the MACSEC versions of the 88x3310 are structured.)
> 
> If you don't have that kind of structure, then I would guess that doing
> duplex adaption could be problematical.

If you don't have that sort of structure, i think rate adaptation
would have problems in general. Pause is not very fine grained. You
need to somehow buffer packets because what comes from the MAC is
likely to be bursty. And when that buffer overflows, you want to be
selective about what you throw away. You want ARP, OSPF and other
signalling packets to have priority, and user data gets
tossed. Otherwise your network collapses.

	Andrew

Dave Taht July 21, 2022, 7:02 p.m. UTC | #5
On Thu, Jul 21, 2022 at 11:42 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I guess it would depend on the structure of the PHY - whether the PHY
> > is structured similar to a two port switch internally, having a MAC
> > facing the host and another MAC facing the media side. (I believe this
> > is exactly how the MACSEC versions of the 88x3310 are structured.)
> >
> > If you don't have that kind of structure, then I would guess that doing
> > duplex adaption could be problematical.
>
> If you don't have that sort of structure, i think rate adaptation
> would have problems in general. Pause is not very fine grained. You
> need to somehow buffer packets because what comes from the MAC is
> likely to be bursty. And when that buffer overflows, you want to be
> selective about what you throw away. You want ARP, OSPF and other
> signalling packets to have priority, and user data gets
> tossed. Otherwise your network collapses.

Most of our protocols (like arp) are tolerant of some loss and taking
extraordinary measures to preserve "signalling" packets shouldn't be
necessary.

That said, how much buffering can exist here? How much latency between
emission, receipt, and response to a pause will there be?

>
>         Andrew

Sean Anderson July 21, 2022, 7:24 p.m. UTC | #6
On 7/21/22 2:36 PM, Andrew Lunn wrote:
>> I guess it would depend on the structure of the PHY - whether the PHY
>> is structured similar to a two port switch internally, having a MAC
>> facing the host and another MAC facing the media side. (I believe this
>> is exactly how the MACSEC versions of the 88x3310 are structured.)
>> 
>> If you don't have that kind of structure, then I would guess that doing
>> duplex adaption could be problematical.
> 
> If you don't have that sort of structure, i think rate adaptation
> would have problems in general. Pause is not very fine grained. You
> need to somehow buffer packets because what comes from the MAC is
> likely to be bursty. And when that buffer overflows, you want to be
> selective about what you throw away. You want ARP, OSPF and other
> signalling packets to have priority, and user data gets
> tossed. Otherwise your network collapses.

I performed some experiments using iperf to attempt to determine how
things worked. On one end, I had a LS1046ARDB with an AQR113C connected
via 10GBASE-R at address 10.0.0.1. On the other end, I had a custom
board supporting 100BASE-TX at address 10.0.0.2. I ran tests in both
directions at once. In a regular link (both sides using MII), I would
expect around 90Mbit/sec in both directions for full duplex, and around
40Mbit/s in both directions for half duplex (or less?). I would also
expect no retries (since collisions should be handled by the MAC and not
TCP).

Direction Duplex   Interval            Transfer      Bandwidth       Write/Err  Retries
========= ======== =================== ============= =============== ========== =======
1->2      Full     0.0000-10.0185 sec  111 MBytes    92.8 Mbits/sec  888/0      0
2->1      Full     0.0000-10.0865 sec  99.1 MBytes   82.4 Mbits/sec  794/0      0
1->2      Half     0.0000-10.0098 sec  36.6 MBytes   30.7 Mbits/sec  294/0      1241
2->1      Half     0.0000-10.1764 sec  1.13 MBytes   927 Kbits/sec   10/0       155

From the second line, it appears that the 100BASE-TX device wasn't able
to saturate the link. I'm not sure why that is (it doesn't match
earlier test results I had), but it is reproducible with other iperf
servers (so I don't think it's due to the rate adaptation).

For the last two lines (half duplex), we can see that the rate adapting
sender is much faster, and has many more retries. This suggests to me
that the rate adapting phy is not performing collision avoidance, causing
high packet loss and starving the 100BASE-TX device.

Clearly half duplex data transfer works, but I don't know if it is
functioning properly.

--Sean

Russell King (Oracle) July 21, 2022, 9:06 p.m. UTC | #7
On Thu, Jul 21, 2022 at 08:36:03PM +0200, Andrew Lunn wrote:
> > I guess it would depend on the structure of the PHY - whether the PHY
> > is structured similar to a two port switch internally, having a MAC
> > facing the host and another MAC facing the media side. (I believe this
> > is exactly how the MACSEC versions of the 88x3310 are structured.)
> > 
> > If you don't have that kind of structure, then I would guess that doing
> > duplex adaption could be problematical.
> 
> If you don't have that sort of structure, i think rate adaptation
> would have problems in general. Pause is not very fine grained. You
> need to somehow buffer packets because what comes from the MAC is
> likely to be bursty. And when that buffer overflows, you want to be
> selective about what you throw away. You want ARP, OSPF and other
> signalling packets to have priority, and user data gets
> tossed. Otherwise your network collapses.

I don't think rate adaption is that intelligent - it's all about slowing
the MAC down to the speed of the media. From what I remember looking at
pause frames, they can specify how long to delay further transmission
by the receiver, and I would expect this to be set according to the
media speed for setups that use pause packets.
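
For a sense of scale, a sketch of the arithmetic: the pause_time field in
a PAUSE frame is 16 bits wide and counts quanta of 512 bit times at the
speed of the link the frame is sent on. `pause_quanta_to_ns()` is a
hypothetical helper, and which speed applies in a rate-adapting setup
(assumed here to be the host-side interface speed) is an assumption, not
something taken from PHY documentation.

```c
#include <stdint.h>

/* One pause quantum is 512 bit times at the speed of the link */
#define PAUSE_QUANTUM_BITS	512ULL

/* Hypothetical helper: pause duration in nanoseconds requested by a
 * pause frame carrying @quanta on a link running at @speed_mbps
 * (Mbit/s, must be non-zero).
 */
uint64_t pause_quanta_to_ns(uint16_t quanta, unsigned int speed_mbps)
{
	/* bits / (Mbit/s) yields microseconds; scale by 1000 for ns */
	return (uint64_t)quanta * PAUSE_QUANTUM_BITS * 1000ULL / speed_mbps;
}
```

On a 10 Gbit/s interface one quantum is 51.2 ns, so even the maximum of
65535 quanta pauses the MAC for only about 3.36 ms. A PHY throttling a
fast host interface down to a slow media speed would have to keep sending
pause frames for as long as its buffer stays full.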

For those which don't, then that's a whole different ball game, because
they tend not to have MACs, and then you're probably down to the
capabilities of nothing more than a FIFO in the PHY.

Patch

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 619ef553476f..f61040c93f3c 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -398,18 +398,65 @@  void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps)
 }
 EXPORT_SYMBOL_GPL(phylink_caps_to_linkmodes);
 
+static int phylink_caps_to_speed(unsigned long caps)
+{
+	unsigned int max_cap = __fls(caps);
+
+	if (max_cap == __fls(MAC_10HD) || max_cap == __fls(MAC_10FD))
+		return SPEED_10;
+	if (max_cap == __fls(MAC_100HD) || max_cap == __fls(MAC_100FD))
+		return SPEED_100;
+	if (max_cap == __fls(MAC_1000HD) || max_cap == __fls(MAC_1000FD))
+		return SPEED_1000;
+	if (max_cap == __fls(MAC_2500FD))
+		return SPEED_2500;
+	if (max_cap == __fls(MAC_5000FD))
+		return SPEED_5000;
+	if (max_cap == __fls(MAC_10000FD))
+		return SPEED_10000;
+	if (max_cap == __fls(MAC_20000FD))
+		return SPEED_20000;
+	if (max_cap == __fls(MAC_25000FD))
+		return SPEED_25000;
+	if (max_cap == __fls(MAC_40000FD))
+		return SPEED_40000;
+	if (max_cap == __fls(MAC_50000FD))
+		return SPEED_50000;
+	if (max_cap == __fls(MAC_56000FD))
+		return SPEED_56000;
+	if (max_cap == __fls(MAC_100000FD))
+		return SPEED_100000;
+	if (max_cap == __fls(MAC_200000FD))
+		return SPEED_200000;
+	if (max_cap == __fls(MAC_400000FD))
+		return SPEED_400000;
+	return SPEED_UNKNOWN;
+}
+
 /**
  * phylink_get_capabilities() - get capabilities for a given MAC
  * @interface: phy interface mode defined by &typedef phy_interface_t
  * @mac_capabilities: bitmask of MAC capabilities
+ * @rate_adaptation: type of rate adaptation being performed
  *
  * Get the MAC capabilities that are supported by the @interface mode and
  * @mac_capabilities.
  */
 unsigned long phylink_get_capabilities(phy_interface_t interface,
-				       unsigned long mac_capabilities)
+				       unsigned long mac_capabilities,
+				       int rate_adaptation)
 {
 	unsigned long caps = MAC_SYM_PAUSE | MAC_ASYM_PAUSE;
+	unsigned long adapted_caps = 0;
+	struct phylink_link_state state = {
+		.interface = interface,
+		.rate_adaptation = rate_adaptation,
+		.link_speed = SPEED_UNKNOWN,
+		.link_duplex = DUPLEX_UNKNOWN,
+	};
+
+	/* Look up the interface speed */
+	phylink_state_fill_speed_duplex(&state);
 
 	switch (interface) {
 	case PHY_INTERFACE_MODE_USXGMII:
@@ -482,7 +529,39 @@  unsigned long phylink_get_capabilities(phy_interface_t interface,
 		break;
 	}
 
-	return caps & mac_capabilities;
+	switch (rate_adaptation) {
+	case RATE_ADAPT_NONE:
+		break;
+	case RATE_ADAPT_PAUSE: {
+		/* The MAC must support asymmetric pause towards the local
+		 * device for this. We could allow just symmetric pause, but
+		 * then we might have to renegotiate if the link partner
+		 * doesn't support pause.
+		 */
+		if (!(mac_capabilities & MAC_SYM_PAUSE) ||
+		    !(mac_capabilities & MAC_ASYM_PAUSE))
+			break;
+
+		/* Can't adapt if the MAC doesn't support the interface's max
+		 * speed
+		 */
+		if (state.speed != phylink_caps_to_speed(mac_capabilities))
+			break;
+
+		adapted_caps = GENMASK(__fls(caps), __fls(MAC_10HD));
+		/* We can't use pause frames in half-duplex mode */
+		adapted_caps &= ~(MAC_1000HD | MAC_100HD | MAC_10HD);
+		break;
+	}
+	case RATE_ADAPT_CRS:
+		/* TODO */
+		break;
+	case RATE_ADAPT_OPEN_LOOP:
+		/* TODO */
+		break;
+	}
+
+	return (caps & mac_capabilities) | adapted_caps;
 }
 EXPORT_SYMBOL_GPL(phylink_get_capabilities);
 
@@ -506,7 +585,8 @@  void phylink_generic_validate(struct phylink_config *config,
 	phylink_set_port_modes(mask);
 	phylink_set(mask, Autoneg);
 	caps = phylink_get_capabilities(state->interface,
-					config->mac_capabilities);
+					config->mac_capabilities,
+					state->rate_adaptation);
 	phylink_caps_to_linkmodes(mask, caps);
 
 	linkmode_and(supported, supported, mask);
@@ -1529,6 +1609,7 @@  static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 		config.interface = PHY_INTERFACE_MODE_NA;
 	else
 		config.interface = interface;
+	config.rate_adaptation = phy_get_rate_adaptation(phy, config.interface);
 
 	ret = phylink_validate(pl, supported, &config);
 	if (ret) {
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index f32b97f27ddc..73e1aa73e30f 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -70,6 +70,8 @@  static inline bool phylink_autoneg_inband(unsigned int mode)
  * @link: true if the link is up.
  * @an_enabled: true if autonegotiation is enabled/desired.
  * @an_complete: true if autonegotiation has completed.
+ * @rate_adaptation: method of throttling @interface_speed to @speed, one of
+ *   RATE_ADAPT_* constants.
  */
 struct phylink_link_state {
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
@@ -531,7 +533,8 @@  void pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
 
 void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps);
 unsigned long phylink_get_capabilities(phy_interface_t interface,
-				       unsigned long mac_capabilities);
+				       unsigned long mac_capabilities,
+				       int rate_adaptation);
 void phylink_generic_validate(struct phylink_config *config,
 			      unsigned long *supported,
 			      struct phylink_link_state *state);