[RFC,net-next,05/10] Documentation: networking: ethtool-netlink: Add link extended state

Message ID 20200607145945.30559-6-amitc@mellanox.com
State New
Series
  • Add extended state

Commit Message

Amit Cohen June 7, 2020, 2:59 p.m. UTC
Add link extended state attributes.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
---
 Documentation/networking/ethtool-netlink.rst | 56 ++++++++++++++++++--
 1 file changed, 52 insertions(+), 4 deletions(-)

Comments

Andrew Lunn June 7, 2020, 4:47 p.m. UTC | #1
> +Link extended states:
> +
> +  ============================    =============================================
> +  ``Autoneg failure``             Failure during auto negotiation mechanism

I think you need to define 'failure' here.

Linux PHYs don't have this state. auto-neg is either ongoing, or has
completed. There is no time limit for auto-neg. If there is no link
partner, auto-neg does not fail, it just continues until there is a
link partner which responds and negotiation completes.

Looking at the state diagrams in 802.3 clause 28, what do you consider
as failure?

	 Andrew
Florian Fainelli June 7, 2020, 7:11 p.m. UTC | #2
On 6/7/2020 7:59 AM, Amit Cohen wrote:
> Add link extended state attributes.
> 
> Signed-off-by: Amit Cohen <amitc@mellanox.com>
> Reviewed-by: Petr Machata <petrm@mellanox.com>
> Reviewed-by: Jiri Pirko <jiri@mellanox.com>

If you need to resubmit, I would swap the order of patches #4 and #5
such that the documentation comes first.

[snip]

>  
> +Link extended states:
> +
> +  ============================    =============================================
> +  ``Autoneg failure``             Failure during auto negotiation mechanism
> +
> +  ``Link training failure``       Failure during link training
> +
> +  ``Link logical mismatch``       Logical mismatch in physical coding sublayer
> +                                  or forward error correction sublayer
> +
> +  ``Bad signal integrity``        Signal integrity issues
> +
> +  ``No cable``                    No cable connected
> +
> +  ``Cable issue``                 Failure is related to cable,
> +                                  e.g., unsupported cable
> +
> +  ``EEPROM issue``                Failure is related to EEPROM, e.g., failure
> +                                  during reading or parsing the data
> +
> +  ``Calibration failure``         Failure during calibration algorithm
> +
> +  ``Power budget exceeded``       The hardware is not able to provide the
> +                                  power required from cable or module
> +
> +  ``Overheat``                    The module is overheated
> +  ============================    =============================================
> +
> +Many of the substates are obvious, or terms that someone working in the
> +particular area will be familiar with. The following table summarizes some
> +that are not:

Not sure this comment is helping that much, how about documenting each
of the sub-states currently defined, even if this is just paraphrasing
their own name? Being able to quickly go to the documentation rather
than looking at the header is appreciable.

Thank you!

> +
> +Link extended substates:
> +
> +  ============================    =============================================
> +  ``Unsupported rate``            The system attempted to operate the cable at
> +                                  a rate that is not formally supported, which
> +                                  led to signal integrity issues

Do you have examples? Would you consider a 4-pair copper cable for
Gigabit that has a damaged pair and would downshift somehow fall in that
category?
Amit Cohen June 8, 2020, 10:02 a.m. UTC | #3
Andrew Lunn <andrew@lunn.ch> writes:

>> +Link extended states:
>> +
>> +  ============================    =============================================
>> +  ``Autoneg failure``             Failure during auto negotiation mechanism
>
>I think you need to define 'failure' here.
>
>Linux PHYs don't have this state. auto-neg is either ongoing, or has completed. There is no time limit for auto-neg. If there is no link partner, auto-neg does not fail, it just continues until there is a link partner which responds and negotiation completes.
>
>Looking at the state diagrams in 802.3 clause 28, what do you consider as failure?
>

Ok, you're right. What about renaming this state to "Autoneg issue" and then as ext_substate you can use something like "Autoneg ongoing"? 

>	 Andrew
Andrew Lunn June 8, 2020, 1:47 p.m. UTC | #4
On Mon, Jun 08, 2020 at 10:02:04AM +0000, Amit Cohen wrote:
> Andrew Lunn <andrew@lunn.ch> writes:
> 
> >> +Link extended states:
> >> +
> >> +  ============================    =============================================
> >> +  ``Autoneg failure``             Failure during auto negotiation mechanism
> >
> >I think you need to define 'failure' here.
> >
> >Linux PHYs don't have this state. auto-neg is either ongoing, or has completed. There is no time limit for auto-neg. If there is no link partner, auto-neg does not fail, it just continues until there is a link partner which responds and negotiation completes.
> >
> >Looking at the state diagrams in 802.3 clause 28, what do you consider as failure?
> >
> 
> Ok, you're right. What about renaming this state to "Autoneg issue" and then as ext_substate you can use something like "Autoneg ongoing"? 

Hi Amit

I'm not sure 'issue' is correct here. Just because it has not
completed does not mean there is an issue. It takes around 1.5 seconds
anyway, best case. And if there is no link partner, it is not supposed
to complete. So I would suggest just ``Autoneg``.

	  Andrew
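
The point made above (autonegotiation is either ongoing or complete, with no time limit and no failure outcome) can be put as a small model. This is only a sketch: the enum and both functions are hypothetical names invented for illustration, not kernel or phylib APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration; these names are not kernel APIs. */
enum aneg_status {
	ANEG_ONGOING,	/* no link partner has completed negotiation yet */
	ANEG_COMPLETE,	/* negotiation finished with a link partner */
};

/*
 * There is deliberately no "failed" value and no elapsed-time input:
 * without a responding link partner the status stays ANEG_ONGOING.
 */
enum aneg_status aneg_poll(bool partner_completed)
{
	return partner_completed ? ANEG_COMPLETE : ANEG_ONGOING;
}

/*
 * Label suggested in the reply above for the in-progress case;
 * NULL once negotiation is complete, since nothing needs reporting.
 */
const char *aneg_ext_state_name(enum aneg_status st)
{
	return st == ANEG_ONGOING ? "Autoneg" : NULL;
}
```

Under this model a caller that never sees a link partner keeps getting "Autoneg", which describes where the link is without claiming that anything failed.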
Amit Cohen June 8, 2020, 3:59 p.m. UTC | #5
On 07-Jun-20 22:11, Florian Fainelli wrote:
> 
> 
> On 6/7/2020 7:59 AM, Amit Cohen wrote:
>> Add link extended state attributes.
>>
>> Signed-off-by: Amit Cohen <amitc@mellanox.com>
>> Reviewed-by: Petr Machata <petrm@mellanox.com>
>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
> 
> If you need to resubmit, I would swap the order of patches #4 and #5
> such that the documentation comes first.
> 
> [snip]

ok

> 
>>  
>> +Link extended states:
>> +
>> +  ============================    =============================================
>> +  ``Autoneg failure``             Failure during auto negotiation mechanism
>> +
>> +  ``Link training failure``       Failure during link training
>> +
>> +  ``Link logical mismatch``       Logical mismatch in physical coding sublayer
>> +                                  or forward error correction sublayer
>> +
>> +  ``Bad signal integrity``        Signal integrity issues
>> +
>> +  ``No cable``                    No cable connected
>> +
>> +  ``Cable issue``                 Failure is related to cable,
>> +                                  e.g., unsupported cable
>> +
>> +  ``EEPROM issue``                Failure is related to EEPROM, e.g., failure
>> +                                  during reading or parsing the data
>> +
>> +  ``Calibration failure``         Failure during calibration algorithm
>> +
>> +  ``Power budget exceeded``       The hardware is not able to provide the
>> +                                  power required from cable or module
>> +
>> +  ``Overheat``                    The module is overheated
>> +  ============================    =============================================
>> +
>> +Many of the substates are obvious, or terms that someone working in the
>> +particular area will be familiar with. The following table summarizes some
>> +that are not:
> 
> Not sure this comment is helping that much, how about documenting each
> of the sub-states currently defined, even if this is just paraphrasing
> their own name? Being able to quickly go to the documentation rather
> than looking at the header is appreciable.
> 
> Thank you!

np, I'll add.
> 
>> +
>> +Link extended substates:
>> +
>> +  ============================    =============================================
>> +  ``Unsupported rate``            The system attempted to operate the cable at
>> +                                  a rate that is not formally supported, which
>> +                                  led to signal integrity issues
> 
> Do you have examples? Would you consider a 4-pair copper cable for
> Gigabit that has a damaged pair and would downshift somehow fall in that
> category?
> 

For example, this statement might appear when 100G optics (not copper) are used at a 40G rate and see a high BER (when using Parallel Detect).
In this situation we recommend checking the configuration of the other end to understand why it is configured for a lower speed.

Regarding your example, if the link stays at the same speed and has a high BER, you should expect a different BAD SI statement.
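
To make the state/substate relationship discussed here concrete, the following is a minimal sketch of how a consumer might turn the two values into the names used in the documentation table of this revision. Every identifier and numeric value below is an assumption made for illustration; the documentation patch does not name any enum constants. It also assumes, based on the description and the reply above, that "Unsupported rate" belongs under "Bad signal integrity".

```c
#include <stddef.h>

/* Hypothetical identifiers and values; the patch shown here names none. */
enum ext_state {
	EXT_STATE_AUTONEG_FAILURE,
	EXT_STATE_LINK_TRAINING_FAILURE,
	EXT_STATE_LINK_LOGICAL_MISMATCH,
	EXT_STATE_BAD_SIGNAL_INTEGRITY,
	EXT_STATE_NO_CABLE,
	EXT_STATE_CABLE_ISSUE,
	EXT_STATE_EEPROM_ISSUE,
	EXT_STATE_CALIBRATION_FAILURE,
	EXT_STATE_POWER_BUDGET_EXCEEDED,
	EXT_STATE_OVERHEAT,
	EXT_STATE_COUNT,	/* number of states, not a state itself */
};

/* Assumed substate value, scoped to "Bad signal integrity". */
enum bsi_substate {
	BSI_SUBSTATE_UNSUPPORTED_RATE = 1,
};

/* Names copied from the "Link extended states" table in this revision. */
static const char *const ext_state_names[EXT_STATE_COUNT] = {
	[EXT_STATE_AUTONEG_FAILURE]       = "Autoneg failure",
	[EXT_STATE_LINK_TRAINING_FAILURE] = "Link training failure",
	[EXT_STATE_LINK_LOGICAL_MISMATCH] = "Link logical mismatch",
	[EXT_STATE_BAD_SIGNAL_INTEGRITY]  = "Bad signal integrity",
	[EXT_STATE_NO_CABLE]              = "No cable",
	[EXT_STATE_CABLE_ISSUE]           = "Cable issue",
	[EXT_STATE_EEPROM_ISSUE]          = "EEPROM issue",
	[EXT_STATE_CALIBRATION_FAILURE]   = "Calibration failure",
	[EXT_STATE_POWER_BUDGET_EXCEEDED] = "Power budget exceeded",
	[EXT_STATE_OVERHEAT]              = "Overheat",
};

/* Returns the documented name, or NULL for a value outside the table. */
const char *ext_state_name(unsigned int state)
{
	return state < EXT_STATE_COUNT ? ext_state_names[state] : NULL;
}

/* A substate is only meaningful together with its parent state. */
const char *ext_substate_name(unsigned int state, unsigned int substate)
{
	if (state == EXT_STATE_BAD_SIGNAL_INTEGRITY &&
	    substate == BSI_SUBSTATE_UNSUPPORTED_RATE)
		return "Unsupported rate";
	return NULL;
}
```

Keying the substate lookup on the parent state means the same substate number could be reused under another state without ambiguity, which is one possible reading of the two-level scheme rather than something the patch states.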

Patch

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 7e651ea33eab..4e4570ebbc4d 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -441,10 +441,11 @@  supports.
 LINKSTATE_GET
 =============
 
-Requests link state information. At the moment, only link up/down flag (as
-provided by ``ETHTOOL_GLINK`` ioctl command) is provided but some future
-extensions are planned (e.g. link down reason). This request does not have any
-attributes.
+Requests link state information. Link up/down flag (as provided by
+``ETHTOOL_GLINK`` ioctl command) is provided. Optionally, extended state might
+be provided as well. In general, extended state describes the reasons why a port
+is down, or why it operates in some non-obvious mode. This request does not have
+any attributes.
 
 Request contents:
 
@@ -459,16 +460,63 @@  Kernel response contents:
   ``ETHTOOL_A_LINKSTATE_LINK``          bool    link state (up/down)
   ``ETHTOOL_A_LINKSTATE_SQI``           u32     Current Signal Quality Index
   ``ETHTOOL_A_LINKSTATE_SQI_MAX``       u32     Max support SQI value
+  ``ETHTOOL_A_LINKSTATE_EXT_STATE``     u8      link extended state
+  ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``  u8      link extended substate
   ====================================  ======  ============================
 
 For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` returns
 carrier flag provided by ``netif_carrier_ok()`` but there are drivers which
 define their own handler.
 
+``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE`` are
+optional values. ethtool core can provide either both
+``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``,
+or only ``ETHTOOL_A_LINKSTATE_EXT_STATE``, or none of them.
+
 ``LINKSTATE_GET`` allows dump requests (kernel returns reply messages for all
 devices supporting the request).
 
 
+Link extended states:
+
+  ============================    =============================================
+  ``Autoneg failure``             Failure during auto negotiation mechanism
+
+  ``Link training failure``       Failure during link training
+
+  ``Link logical mismatch``       Logical mismatch in physical coding sublayer
+                                  or forward error correction sublayer
+
+  ``Bad signal integrity``        Signal integrity issues
+
+  ``No cable``                    No cable connected
+
+  ``Cable issue``                 Failure is related to cable,
+                                  e.g., unsupported cable
+
+  ``EEPROM issue``                Failure is related to EEPROM, e.g., failure
+                                  during reading or parsing the data
+
+  ``Calibration failure``         Failure during calibration algorithm
+
+  ``Power budget exceeded``       The hardware is not able to provide the
+                                  power required from cable or module
+
+  ``Overheat``                    The module is overheated
+  ============================    =============================================
+
+Many of the substates are obvious, or terms that someone working in the
+particular area will be familiar with. The following table summarizes some
+that are not:
+
+Link extended substates:
+
+  ============================    =============================================
+  ``Unsupported rate``            The system attempted to operate the cable at
+                                  a rate that is not formally supported, which
+                                  led to signal integrity issues
+  ============================    =============================================
+
 DEBUG_GET
 =========
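
The LINKSTATE_GET text in the patch says the ethtool core may provide both ``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``, only the state, or neither. A minimal sketch of how a reply consumer might classify those cases follows. The struct, its field names, and the enum are hypothetical and are not part of the patch or of any existing ethtool code; treating a substate without a state as invalid is an inference from the three combinations listed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parsed-reply holder; field names are illustrative only. */
struct linkstate_reply {
	bool link;		/* ETHTOOL_A_LINKSTATE_LINK */
	bool has_ext_state;	/* was ETHTOOL_A_LINKSTATE_EXT_STATE present? */
	uint8_t ext_state;	/* valid only if has_ext_state */
	bool has_ext_substate;	/* was ETHTOOL_A_LINKSTATE_EXT_SUBSTATE present? */
	uint8_t ext_substate;	/* valid only if has_ext_substate */
};

enum ext_info {
	EXT_INFO_NONE,			/* neither attribute present */
	EXT_INFO_STATE_ONLY,		/* state without substate */
	EXT_INFO_STATE_AND_SUBSTATE,	/* both present */
	EXT_INFO_INVALID,		/* substate without state: not listed */
};

/* Map attribute presence onto the combinations the text allows. */
enum ext_info classify_ext_info(const struct linkstate_reply *r)
{
	if (r->has_ext_substate && !r->has_ext_state)
		return EXT_INFO_INVALID;
	if (!r->has_ext_state)
		return EXT_INFO_NONE;
	return r->has_ext_substate ? EXT_INFO_STATE_AND_SUBSTATE
				   : EXT_INFO_STATE_ONLY;
}
```

Tracking presence separately from the value matters here because both attributes are u8 and zero may itself be a meaningful value.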