diff mbox series

[v1] net: sfp: add quirks for ODI DFP-34X-2C2

Message ID TY3P286MB2611C0FA24318AA397DB689B985A2@TY3P286MB2611.JPNP286.PROD.OUTLOOK.COM (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [v1] net: sfp: add quirks for ODI DFP-34X-2C2 | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 940 this patch: 940
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 957 this patch: 957
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 957 this patch: 957
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-02-27--03-00 (tests: 1456)

Commit Message

Qu Shengyu Feb. 26, 2024, 1:23 p.m. UTC
ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
capabilities in the EEPROM.
So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.

Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
---
 drivers/net/phy/sfp.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Russell King (Oracle) Feb. 26, 2024, 2:04 p.m. UTC | #1
On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
> ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
> capabilities in the EEPROM.
> So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.

This was previously submitted by Sergio Palumbo, and comes in two
different forms - an OEM version and non-OEM. There was extensive
discussion about this, and the result is that I'm not accepting this
quirk for this module.

The reason is that the module _defaults_ to 1000base-X and requires
manual reconfiguration by the user to operate at 2500base-X.
Unfortunately, there is no way for the kernel to know whether that
reconfiguration has occurred.

The addition of this quirk makes the kernel select 2500base-X when
the module is plugged in to a host that supports both 2500base-X
and 1000base-X, resulting in the link with the module never coming
up. (2500base-X is 1000base-X clocked 2.5x faster, and there is
nothing in the line protocol that identifies this.)

Consequently, adding this quirk makes modules in their default
configuration not link with the host, and thus be inaccessible.

Therefore, for the best user experience (in terms of having a working
module when it turns up at the doorstep) the only option is to refuse
this quirk.

Now, what I really don't like is that I had a lengthy discussion over
this with Sergio, and it seems from a mainline developer point of view
that "oh, Sergio wasn't successful in getting this merged, someone else
can have a go".

It _isn't_ the person who is the problem - it is the principle and the
confusion this will cause to users who receive modules in their default
configuration (1000base-X), and then plug them in with this quirk in
place, and the kernel selects 2500base-X resulting in no link and *no*
access to the module.
Qu Shengyu Feb. 26, 2024, 2:16 p.m. UTC | #2
Hi Russell,

> On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
>> ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
>> capabilities in the EEPROM.
>> So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
> This was previously submitted by Sergio Palumbo, and comes in two
> different forms - an OEM version and non-OEM. There was extensive
> discussion about this, and the result is that I'm not accepting this
> quirk for this module.
>
> The reason is that the module _defaults_ to 1000base-X and requires
> manual reconfiguration by the user to operate at 2500base-X.
> Unfortunately, there is no way for the kernel to know whether that
> reconfiguration has occurred.
No, In the firmware of this stick, the speed rate is configured to auto
negotiation rather than fixed 1000base-X. I tried 3 different versions 
of the firmware and gets the same result. Also, user can plug in and out 
the optic fiber for five times in 30 seconds to trigger a force factory 
reset.This function is also enabled by default. This would also reset 
the speed configuration to auto negotiation but still keeps LOID config. 
Best regards, Shengyu
> The addition of this quirk makes the kernel select 2500base-X when
> the module is plugged in to a host that supports both 2500base-X
> and 1000base-X, resulting in the link with the module never coming
> up. (2500base-X is 1000base-X clocked 2.5x faster, and there is
> nothing in the line protocol that identifies this.)
>
> Consequently, adding this quirk makes modules in their default
> configuration not link with the host, and thus be inaccessible.
>
> Therefore, for the best user experience (in terms of having a working
> module when it turns up at the doorstep) the only option is to refuse
> this quirk.
>
> Now, what I really don't like is that I had a lengthy discussion over
> this with Sergio, and it seems from a mainline developer point of view
> that "oh, Sergio wasn't successful in getting this merged, someone else
> can have a go".
>
> It _isn't_ the person who is the problem - it is the principle and the
> confusion this will cause to users who receive modules in their default
> configuration (1000base-X), and then plug them in with this quirk in
> place, and the kernel selects 2500base-X resulting in no link and *no*
> access to the module.
>
Russell King (Oracle) Feb. 26, 2024, 2:39 p.m. UTC | #3
On Mon, Feb 26, 2024 at 10:16:46PM +0800, Shengyu Qu wrote:
> Hi Russell,
> 
> > On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
> > > ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
> > > capabilities in the EEPROM.
> > > So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
> > This was previously submitted by Sergio Palumbo, and comes in two
> > different forms - an OEM version and non-OEM. There was extensive
> > discussion about this, and the result is that I'm not accepting this
> > quirk for this module.
> > 
> > The reason is that the module _defaults_ to 1000base-X and requires
> > manual reconfiguration by the user to operate at 2500base-X.
> > Unfortunately, there is no way for the kernel to know whether that
> > reconfiguration has occurred.
> No, In the firmware of this stick, the speed rate is configured to auto
> negotiation rather than fixed 1000base-X.

How does this "auto negotiation" work?

I mean *exactly* how does it work? How does it know whether the host is
operating at 1000base-X or 2500base-X?

There is *no* inband protocol to allow this to be negotiated.
Qu Shengyu Feb. 26, 2024, 2:51 p.m. UTC | #4
Hi Russell,

在 2024/2/26 22:39, Russell King (Oracle) 写道:
> On Mon, Feb 26, 2024 at 10:16:46PM +0800, Shengyu Qu wrote:
>> Hi Russell,
>>
>>> On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
>>>> ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
>>>> capabilities in the EEPROM.
>>>> So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
>>> This was previously submitted by Sergio Palumbo, and comes in two
>>> different forms - an OEM version and non-OEM. There was extensive
>>> discussion about this, and the result is that I'm not accepting this
>>> quirk for this module.
>>>
>>> The reason is that the module _defaults_ to 1000base-X and requires
>>> manual reconfiguration by the user to operate at 2500base-X.
>>> Unfortunately, there is no way for the kernel to know whether that
>>> reconfiguration has occurred.
>> No, In the firmware of this stick, the speed rate is configured to auto
>> negotiation rather than fixed 1000base-X.
> 
> How does this "auto negotiation" work?
> 
> I mean *exactly* how does it work? How does it know whether the host is
> operating at 1000base-X or 2500base-X?
> 
> There is *no* inband protocol to allow this to be negotiated.
> 
Well, that seems some kind weird trick implemented in that chip's SDK 
(maybe hardware?). It would automatically detect the speed rate that 
host uses and switch to that rate. The system log of the stick shows that.

Best regards,
Shengyu
Andrew Lunn Feb. 26, 2024, 3:06 p.m. UTC | #5
> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
> hardware?). It would automatically detect the speed rate that host uses and
> switch to that rate. The system log of the stick shows that.

Please could you show some examples from the system log.

If this patch is accepted, these details need to be in the commit
message, to explain how this actually works, when it should not
actually work....

       Andrew
Qu Shengyu Feb. 26, 2024, 3:20 p.m. UTC | #6
Hi Andrew,

When telnet connected to the stick(It's using busybox), "cat /proc/kmsg"
would report something like "<4>change mode to 0", that 0 is the actual
speed rate codename it changes to. You could check this [1] for more
information.

Sorry but I can't record my kmsg output now, since enable connection to
my stick would make the network to disconnect for all users and my
roommates are using it.

Best regards,
Shengyu

[1] https://github.com/Anime4000/RTL960x/blob/main/Docs/FLASH_GETSET_INFO.md#lan_sds_mode

在 2024/2/26 23:06, Andrew Lunn 写道:
>> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
>> hardware?). It would automatically detect the speed rate that host uses and
>> switch to that rate. The system log of the stick shows that.
> 
> Please could you show some examples from the system log.
> 
> If this patch is accepted, these details need to be in the commit
> message, to explain how this actually works, when it should not
> actually work....
> 
>         Andrew
Russell King (Oracle) Feb. 26, 2024, 3:42 p.m. UTC | #7
On Mon, Feb 26, 2024 at 10:51:36PM +0800, Shengyu Qu wrote:
> Hi Russell,
> 
> 在 2024/2/26 22:39, Russell King (Oracle) 写道:
> > On Mon, Feb 26, 2024 at 10:16:46PM +0800, Shengyu Qu wrote:
> > > Hi Russell,
> > > 
> > > > On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
> > > > > ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
> > > > > capabilities in the EEPROM.
> > > > > So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
> > > > This was previously submitted by Sergio Palumbo, and comes in two
> > > > different forms - an OEM version and non-OEM. There was extensive
> > > > discussion about this, and the result is that I'm not accepting this
> > > > quirk for this module.
> > > > 
> > > > The reason is that the module _defaults_ to 1000base-X and requires
> > > > manual reconfiguration by the user to operate at 2500base-X.
> > > > Unfortunately, there is no way for the kernel to know whether that
> > > > reconfiguration has occurred.
> > > No, In the firmware of this stick, the speed rate is configured to auto
> > > negotiation rather than fixed 1000base-X.
> > 
> > How does this "auto negotiation" work?
> > 
> > I mean *exactly* how does it work? How does it know whether the host is
> > operating at 1000base-X or 2500base-X?
> > 
> > There is *no* inband protocol to allow this to be negotiated.
> > 
> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
> hardware?). It would automatically detect the speed rate that host uses and
> switch to that rate. The system log of the stick shows that.

This sounds racy - between the SFP detecting the speed of the host and
the kernel code reconfiguring the interface. More details please...
Qu Shengyu Feb. 26, 2024, 3:42 p.m. UTC | #8
Hi Andrew,

So I should describe in commit message that user should set the stick to
2.5g/automatic mode and include the link in previous mail? Is that enough?

Best regards,
Shengyu

在 2024/2/26 23:06, Andrew Lunn 写道:
>> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
>> hardware?). It would automatically detect the speed rate that host uses and
>> switch to that rate. The system log of the stick shows that.
> 
> Please could you show some examples from the system log.
> 
> If this patch is accepted, these details need to be in the commit
> message, to explain how this actually works, when it should not
> actually work....
> 
>         Andrew
Russell King (Oracle) Feb. 26, 2024, 3:50 p.m. UTC | #9
On Mon, Feb 26, 2024 at 11:20:47PM +0800, Shengyu Qu wrote:
> Hi Andrew,
> 
> When telnet connected to the stick(It's using busybox), "cat /proc/kmsg"
> would report something like "<4>change mode to 0", that 0 is the actual
> speed rate codename it changes to. You could check this [1] for more
> information.
> 
> Sorry but I can't record my kmsg output now, since enable connection to
> my stick would make the network to disconnect for all users and my
> roommates are using it.
> 
> Best regards,
> Shengyu
> 
> [1] https://github.com/Anime4000/RTL960x/blob/main/Docs/FLASH_GETSET_INFO.md#lan_sds_mode

I'm aware of that link, and sadly the information in that table can at
best be described as "confused".

For example, under "Behaviour" it mentions "1GbaseT/100baseT". How
exactly does a SFP module exhibit baseT behaviour (which is twisted
pair copper ethernet.) Hint: it can't and it doesn't. "Mode" being
listed as "TP" is also misleading and wrong. The ethtool value of
0x20 is bit 5, which is 1000baseT_Full which has nothing to do with
a SFP (and won't be reported as even a supported mode for this SFP.)
Bit 41 would make sense (1000baseX), which is listed against mode 1.

I'm afraid looking at that URL just adds to more confusion.
Qu Shengyu Feb. 26, 2024, 3:51 p.m. UTC | #10
Hi Russell,

Sadly I don't have more information. I think we could only get that from
Realtek guys(This stick is using RTL9601D). What we can get now is all
inside that github repo. Realtek didn't release SDK source code to
public for this chip. :(

Best regards,
Shengyu

在 2024/2/26 23:42, Russell King (Oracle) 写道:
> On Mon, Feb 26, 2024 at 10:51:36PM +0800, Shengyu Qu wrote:
>> Hi Russell,
>>
>> 在 2024/2/26 22:39, Russell King (Oracle) 写道:
>>> On Mon, Feb 26, 2024 at 10:16:46PM +0800, Shengyu Qu wrote:
>>>> Hi Russell,
>>>>
>>>>> On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
>>>>>> ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
>>>>>> capabilities in the EEPROM.
>>>>>> So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
>>>>> This was previously submitted by Sergio Palumbo, and comes in two
>>>>> different forms - an OEM version and non-OEM. There was extensive
>>>>> discussion about this, and the result is that I'm not accepting this
>>>>> quirk for this module.
>>>>>
>>>>> The reason is that the module _defaults_ to 1000base-X and requires
>>>>> manual reconfiguration by the user to operate at 2500base-X.
>>>>> Unfortunately, there is no way for the kernel to know whether that
>>>>> reconfiguration has occurred.
>>>> No, In the firmware of this stick, the speed rate is configured to auto
>>>> negotiation rather than fixed 1000base-X.
>>>
>>> How does this "auto negotiation" work?
>>>
>>> I mean *exactly* how does it work? How does it know whether the host is
>>> operating at 1000base-X or 2500base-X?
>>>
>>> There is *no* inband protocol to allow this to be negotiated.
>>>
>> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
>> hardware?). It would automatically detect the speed rate that host uses and
>> switch to that rate. The system log of the stick shows that.
> 
> This sounds racy - between the SFP detecting the speed of the host and
> the kernel code reconfiguring the interface. More details please...
>
Andrew Lunn Feb. 26, 2024, 3:52 p.m. UTC | #11
On Mon, Feb 26, 2024 at 11:42:49PM +0800, Shengyu Qu wrote:
> Hi Andrew,
> 
> So I should describe in commit message that user should set the stick to
> 2.5g/automatic mode and include the link in previous mail? Is that enough?

As Russell pointed out, this sounds racy. We need lots of details in
the commit message to convince us this is safe, and is not going to
cause a regression. If it does break and cause a regression, having
those details will also help us decide what to do, other than just
revert the change.

The further the devices get from well defined standardised behaviour,
the more details are needed.

       Andrew
Qu Shengyu Feb. 28, 2024, 10:42 a.m. UTC | #12
Hi Russell,

I got contact with someone that has access to this chip's datasheet.
He tolds me that according to the datasheet, this chip
"supports a rate adaptor feature, which is a Realtek proprietary SERDES
mechanism."
" When the chip's SERDES operates in Rate adaptor mode, the SERDES speed
is fixed at 2.5Gbps and the Ethernet speed cannot be higher than the
SERDES speed.
The SOC need not follow the Ethernet link speed, and cannot change the
SERDES speed in this mode. There is a data flow control mechanism to
ensure correct data transmission."

Is it enough to put this into the commit information?

Best regards,
Shengyu


在 2024/2/26 23:42, Russell King (Oracle) 写道:
> On Mon, Feb 26, 2024 at 10:51:36PM +0800, Shengyu Qu wrote:
>> Hi Russell,
>>
>> 在 2024/2/26 22:39, Russell King (Oracle) 写道:
>>> On Mon, Feb 26, 2024 at 10:16:46PM +0800, Shengyu Qu wrote:
>>>> Hi Russell,
>>>>
>>>>> On Mon, Feb 26, 2024 at 09:23:46PM +0800, Shengyu Qu wrote:
>>>>>> ODI DFP-34X-2C2 is capable of 2500base-X, but incorrectly report its
>>>>>> capabilities in the EEPROM.
>>>>>> So use sfp_quirk_2500basex for this module to allow 2500Base-X mode.
>>>>> This was previously submitted by Sergio Palumbo, and comes in two
>>>>> different forms - an OEM version and non-OEM. There was extensive
>>>>> discussion about this, and the result is that I'm not accepting this
>>>>> quirk for this module.
>>>>>
>>>>> The reason is that the module _defaults_ to 1000base-X and requires
>>>>> manual reconfiguration by the user to operate at 2500base-X.
>>>>> Unfortunately, there is no way for the kernel to know whether that
>>>>> reconfiguration has occurred.
>>>> No, In the firmware of this stick, the speed rate is configured to auto
>>>> negotiation rather than fixed 1000base-X.
>>>
>>> How does this "auto negotiation" work?
>>>
>>> I mean *exactly* how does it work? How does it know whether the host is
>>> operating at 1000base-X or 2500base-X?
>>>
>>> There is *no* inband protocol to allow this to be negotiated.
>>>
>> Well, that seems some kind weird trick implemented in that chip's SDK (maybe
>> hardware?). It would automatically detect the speed rate that host uses and
>> switch to that rate. The system log of the stick shows that.
> 
> This sounds racy - between the SFP detecting the speed of the host and
> the kernel code reconfiguring the interface. More details please...
>
Russell King (Oracle) Feb. 28, 2024, 11:15 a.m. UTC | #13
On Wed, Feb 28, 2024 at 06:42:55PM +0800, Shengyu Qu wrote:
> Hi Russell,
> 
> I got contact with someone that has access to this chip's datasheet.
> He tolds me that according to the datasheet, this chip
> "supports a rate adaptor feature, which is a Realtek proprietary SERDES
> mechanism."
> " When the chip's SERDES operates in Rate adaptor mode, the SERDES speed
> is fixed at 2.5Gbps and the Ethernet speed cannot be higher than the
> SERDES speed.
> The SOC need not follow the Ethernet link speed, and cannot change the
> SERDES speed in this mode. There is a data flow control mechanism to
> ensure correct data transmission."
> 
> Is it enough to put this into the commit information?

I think you need to read that again and understand it. Specifically
"the SERDES speed is fixed at 2.5Gbps". When operating in this mode
(rate adapter mode) this would mean the SERDES (which is what talks
to the host) is fixed at 2500base-X and is fundamentally incompatible
and incapable of linking with a host operating at 1000base-X.

"Rate adapter" is a term that is used by manufacturers to mean that
a e.g. PHY is able to operate at several different speeds on its
media side while operating at a specific speed on its host side.
E.g. a PHY that presents at 10GBASE-R host interface but supports
media at 10M, 100M, 1G, 2.5G and 5G speeds. Typically, it controls
the rate at which the host sends packets by sending PAUSE frames to
the host.

This feature is irrelevant to the abilities of the module to link at
either 1000base-X or 2500base-X with the host.

Thanks.
diff mbox series

Patch

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index f75c9eb3958e..2021cb4ff2f6 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -495,6 +495,13 @@  static const struct sfp_quirk sfp_quirks[] = {
 	// 2500MBd NRZ in their EEPROM
 	SFP_QUIRK_M("Lantech", "8330-262D-E", sfp_quirk_2500basex),
 
+	// ODI DFP-34X-2C2 can operate at 2500base-X, but incorrectly report 1300MBd
+	// NRZ in the EEPROM.
+	// Besides, In early batches, vendor id is set to OEM, but that is fixed in
+	// newer batches.
+	SFP_QUIRK_M("ODI", "DFP-34X-2C2", sfp_quirk_2500basex),
+	SFP_QUIRK_M("OEM", "DFP-34X-2C2", sfp_quirk_2500basex),
+
 	SFP_QUIRK_M("UBNT", "UF-INSTANT", sfp_quirk_ubnt_uf_instant),
 
 	// Walsun HXSX-ATR[CI]-1 don't identify as copper, and use the