diff mbox series

[1/7] dsa: marvell: Provide per device information about max frame size

Message ID 20230309125421.3900962-2-lukma@denx.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series dsa: marvell: Add support for mv88e6071 and 6020 switches | expand

Checks

Context Check Description
netdev/series_format warning Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 20 this patch: 20
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 20 this patch: 20
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 229 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Lukasz Majewski March 9, 2023, 12:54 p.m. UTC
Different Marvell DSA switches support different size of max frame
bytes to be sent. This value corresponds to the memory allocated
in switch to store single frame.

For example mv88e6185 supports max 1632 bytes, which is now in-driver
standard value. On the other hand - mv88e6250 supports 2048 bytes.
To be more interesting - devices supporting jumbo frames - use yet
another value (10240 bytes)

As this value is internal and may be different for each switch IC,
new entry in struct mv88e6xxx_info has been added to store it.

This commit doesn't change the code functionality - it just provides
the max frame size value explicitly - up till now it has been
assigned depending on the callback provided by the switch driver
(e.g. .set_max_frame_size, .port_set_jumbo_size).

Signed-off-by: Lukasz Majewski <lukma@denx.de>

---
Changes for v2:
- Define max_frame_size with default value of 1632 bytes,
- Set proper value for the mv88e6250 switch SoC (linkstreet) family

Changes for v3:
- Add default value for 1632B of the max frame size (to avoid problems
  with not defined values)

Changes for v4:
- Rework the mv88e6xxx_get_max_mtu() by using per device defined
  max_frame_size value

- Add WARN_ON_ONCE() when max_frame_size is not defined

- Add description for the new 'max_frame_size' member of mv88e6xxx_info

Changes for v5:
- Move some code fragments (like get_mtu callback changes) to separate
  patches
---
 drivers/net/dsa/mv88e6xxx/chip.c | 31 +++++++++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/chip.h |  6 ++++++
 2 files changed, 37 insertions(+)

Comments

Vladimir Oltean March 10, 2023, 12:02 p.m. UTC | #1
On Thu, Mar 09, 2023 at 01:54:15PM +0100, Lukasz Majewski wrote:
> Different Marvell DSA switches support different size of max frame
> bytes to be sent. This value corresponds to the memory allocated
> in switch to store single frame.
> 
> For example mv88e6185 supports max 1632 bytes, which is now in-driver
> standard value.

What is the criterion based on which 1632 is the "in-driver standard value"?

> On the other hand - mv88e6250 supports 2048 bytes.

What you mean to suggest here is that, using the current classification
from mv88e6xxx_get_max_mtu(), mv88e6250 falls into the "none of the above"
bucket, for which the driver returns 1522 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN // 1492.
But it truly supports a maximum frame length of 2048, per your research.

The problem is that I needed to spend 30 minutes to understand this, and
the true motivation for this patch.

> To be more interesting - devices supporting jumbo frames - use yet
> another value (10240 bytes)

What's interesting about this?

> 
> As this value is internal and may be different for each switch IC,
> new entry in struct mv88e6xxx_info has been added to store it.

You need to provide a justification for why the existing code structure
is not good enough.

> 
> This commit doesn't change the code functionality - it just provides
> the max frame size value explicitly - up till now it has been
> assigned depending on the callback provided by the switch driver
> (e.g. .set_max_frame_size, .port_set_jumbo_size).
> 
> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> 
> ---
> Changes for v2:
> - Define max_frame_size with default value of 1632 bytes,
> - Set proper value for the mv88e6250 switch SoC (linkstreet) family
> 
> Changes for v3:
> - Add default value for 1632B of the max frame size (to avoid problems
>   with not defined values)
> 
> Changes for v4:
> - Rework the mv88e6xxx_get_max_mtu() by using per device defined
>   max_frame_size value
> 
> - Add WARN_ON_ONCE() when max_frame_size is not defined
> 
> - Add description for the new 'max_frame_size' member of mv88e6xxx_info
> 
> Changes for v5:
> - Move some code fragments (like get_mtu callback changes) to separate
>   patches

you have change log up to v5, but your subject prefix is [PATCH 1/7]
which implies v1?

> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 31 +++++++++++++++++++++++++++++++
>  drivers/net/dsa/mv88e6xxx/chip.h |  6 ++++++
>  2 files changed, 37 insertions(+)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 0a5d6c7bb128..c097a0b19ba6 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c

It would be good if the commit message contained the procedure based on
which you had made these changes - and preferably they were mechanical.
Having a small C program written would be absolutely ideal.
This is so that reviewers wouldn't have to do it in parallel...

My analysis has determined the following 3 categories:

static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port)
{
	struct mv88e6xxx_chip *chip = ds->priv;

	if (chip->info->ops->port_set_jumbo_size)
		return 10240 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 10210
	else if (chip->info->ops->set_max_frame_size)
		return 1632 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 1602
	return 1522 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 1492
}

By ops:

port_set_jumbo_size:
static const struct mv88e6xxx_ops mv88e6131_ops = {
static const struct mv88e6xxx_ops mv88e6141_ops = {
static const struct mv88e6xxx_ops mv88e6171_ops = {
static const struct mv88e6xxx_ops mv88e6172_ops = {
static const struct mv88e6xxx_ops mv88e6175_ops = {
static const struct mv88e6xxx_ops mv88e6176_ops = {
static const struct mv88e6xxx_ops mv88e6190_ops = {
static const struct mv88e6xxx_ops mv88e6190x_ops = {
static const struct mv88e6xxx_ops mv88e6240_ops = {
static const struct mv88e6xxx_ops mv88e6320_ops = {
static const struct mv88e6xxx_ops mv88e6321_ops = {
static const struct mv88e6xxx_ops mv88e6341_ops = {
static const struct mv88e6xxx_ops mv88e6350_ops = {
static const struct mv88e6xxx_ops mv88e6351_ops = {
static const struct mv88e6xxx_ops mv88e6352_ops = {
static const struct mv88e6xxx_ops mv88e6390_ops = {
static const struct mv88e6xxx_ops mv88e6390x_ops = {
static const struct mv88e6xxx_ops mv88e6393x_ops = {

set_max_frame_size:
static const struct mv88e6xxx_ops mv88e6085_ops = {
static const struct mv88e6xxx_ops mv88e6095_ops = {
static const struct mv88e6xxx_ops mv88e6097_ops = {
static const struct mv88e6xxx_ops mv88e6123_ops = {
static const struct mv88e6xxx_ops mv88e6161_ops = {
static const struct mv88e6xxx_ops mv88e6185_ops = {

none of the above:
static const struct mv88e6xxx_ops mv88e6165_ops = {
static const struct mv88e6xxx_ops mv88e6191_ops = {
static const struct mv88e6xxx_ops mv88e6250_ops = {
static const struct mv88e6xxx_ops mv88e6290_ops = {


By info:

port_set_jumbo_size (10240):
	[MV88E6131] = {
	[MV88E6141] = {
	[MV88E6171] = {
	[MV88E6172] = {
	[MV88E6175] = {
	[MV88E6176] = {
	[MV88E6190] = {
	[MV88E6190X] = {
	[MV88E6240] = {
	[MV88E6320] = {
	[MV88E6321] = {
	[MV88E6341] = {
	[MV88E6350] = {
	[MV88E6351] = {
	[MV88E6352] = {
	[MV88E6390] = {
	[MV88E6390X] = {
	[MV88E6191X] = {
	[MV88E6193X] = {
	[MV88E6393X] = {

set_max_frame_size (1632):
	[MV88E6085] = {
	[MV88E6095] = {
	[MV88E6097] = {
	[MV88E6123] = {
	[MV88E6161] = {
	[MV88E6185] = {

none of the above (1522):
	[MV88E6165] = {
	[MV88E6191] = {
	[MV88E6220] = {
	[MV88E6250] = {
	[MV88E6290] = {


Whereas your analysis seems to have determined this:

port_set_jumbo_size (10240):
	[MV88E6131] = {
	[MV88E6141] = {
	[MV88E6171] = {
	[MV88E6172] = {
	[MV88E6175] = {
	[MV88E6176] = {
	[MV88E6190] = {
	[MV88E6240] = {
	[MV88E6320] = {
	[MV88E6321] = {
	[MV88E6341] = {
	[MV88E6350] = {
	[MV88E6351] = {
	[MV88E6352] = {
	[MV88E6390] = {
	[MV88E6390X] = {
	[MV88E6393X] = {

set_max_frame_size (1632):
	[MV88E6095] = {
	[MV88E6097] = {
	[MV88E6123] = {
	[MV88E6161] = {
	[MV88E6165] = {
	[MV88E6185] = {

none of the above (1522):
	[MV88E6085] = {
	[MV88E6190X] = {
	[MV88E6191] = {
	[MV88E6191X] = {
	[MV88E6193X] = {
	[MV88E6290] = {

what's up with these?! (no max_frame_size)
	[MV88E6220] = {
	[MV88E6250] = {


So our analysis differs for:

MV88E6190X (I say 10240, you say 1522)
MV88E6191X (I say 10240, you say 1522)
MV88E6193X (I say 10240, you say 1522)
MV88E6085 (I say 1632, you say 1522)
MV88E6165 (I say 1522, you say 1632)
MV88E6220 (I say 1522, not clear what you say)
MV88E6250 (I say 1522, not clear what you say)

Double-checking with the code, I believe my analysis to be the correct one...


I have also noticed that you have not acted upon my previous review comment:
https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@denx.de/

| 1522 - 30 = 1492.
| 
| I don't believe that there are switches which don't support the standard
| MTU of 1500 ?!
| 
| >  		.port_base_addr = 0x10,
| >  		.phy_base_addr = 0x0,
| >  		.global1_addr = 0x1b,
| 
| Note that I see this behavior isn't new. But I've simulated it, and it
| will produce the following messages on probe:
| 
| [    7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
| [    7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 0
| [    7.588585] mscc_felix 0000:00:00.5 swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
| [    7.600433] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 1
| [    7.752613] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
| [    7.764457] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 2
| [    7.900771] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
| [    7.912501] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 3
| 
| I wonder, shouldn't we first fix that, and apply this patch set afterwards?

I guess I will have to fix this now, since you haven't done it.

> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
> index da6e1339f809..e2b88f1f8376 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -132,6 +132,12 @@ struct mv88e6xxx_info {
>  	unsigned int num_gpio;
>  	unsigned int max_vid;
>  	unsigned int max_sid;
> +
> +	/* Max Frame Size.
> +	 * This value corresponds to the memory allocated in switch internal
> +	 * memory to store single frame.
> +	 */

What is the source of this definition?

I'm asking because I know of other switches where the internal memory
allocation scheme has nothing to do with the frame size. Instead, there
are SRAM cells of fixed and small size (say 60 octets) chained together.

> +	unsigned int max_frame_size;
>  	unsigned int port_base_addr;
>  	unsigned int phy_base_addr;
>  	unsigned int global1_addr;
> -- 
> 2.20.1
>
Russell King (Oracle) March 10, 2023, 12:25 p.m. UTC | #2
On Fri, Mar 10, 2023 at 02:02:35PM +0200, Vladimir Oltean wrote:
> By ops:
> 
> port_set_jumbo_size:
> static const struct mv88e6xxx_ops mv88e6131_ops = {
> static const struct mv88e6xxx_ops mv88e6141_ops = {
> static const struct mv88e6xxx_ops mv88e6171_ops = {
> static const struct mv88e6xxx_ops mv88e6172_ops = {
> static const struct mv88e6xxx_ops mv88e6175_ops = {
> static const struct mv88e6xxx_ops mv88e6176_ops = {
> static const struct mv88e6xxx_ops mv88e6190_ops = {
> static const struct mv88e6xxx_ops mv88e6190x_ops = {
> static const struct mv88e6xxx_ops mv88e6240_ops = {
> static const struct mv88e6xxx_ops mv88e6320_ops = {
> static const struct mv88e6xxx_ops mv88e6321_ops = {
> static const struct mv88e6xxx_ops mv88e6341_ops = {
> static const struct mv88e6xxx_ops mv88e6350_ops = {
> static const struct mv88e6xxx_ops mv88e6351_ops = {
> static const struct mv88e6xxx_ops mv88e6352_ops = {
> static const struct mv88e6xxx_ops mv88e6390_ops = {
> static const struct mv88e6xxx_ops mv88e6390x_ops = {
> static const struct mv88e6xxx_ops mv88e6393x_ops = {
> 
> set_max_frame_size:
> static const struct mv88e6xxx_ops mv88e6085_ops = {
> static const struct mv88e6xxx_ops mv88e6095_ops = {
> static const struct mv88e6xxx_ops mv88e6097_ops = {
> static const struct mv88e6xxx_ops mv88e6123_ops = {
> static const struct mv88e6xxx_ops mv88e6161_ops = {
> static const struct mv88e6xxx_ops mv88e6185_ops = {
> 
> none of the above:
> static const struct mv88e6xxx_ops mv88e6165_ops = {
> static const struct mv88e6xxx_ops mv88e6191_ops = {
> static const struct mv88e6xxx_ops mv88e6250_ops = {
> static const struct mv88e6xxx_ops mv88e6290_ops = {
> 
> 
> By info:
> 
> port_set_jumbo_size (10240):
> 	[MV88E6131] = {
> 	[MV88E6141] = {
> 	[MV88E6171] = {
> 	[MV88E6172] = {
> 	[MV88E6175] = {
> 	[MV88E6176] = {
> 	[MV88E6190] = {
> 	[MV88E6190X] = {
> 	[MV88E6240] = {
> 	[MV88E6320] = {
> 	[MV88E6321] = {
> 	[MV88E6341] = {
> 	[MV88E6350] = {
> 	[MV88E6351] = {
> 	[MV88E6352] = {
> 	[MV88E6390] = {
> 	[MV88E6390X] = {
> 	[MV88E6191X] = {
> 	[MV88E6193X] = {
> 	[MV88E6393X] = {
> 
> set_max_frame_size (1632):
> 	[MV88E6085] = {
> 	[MV88E6095] = {
> 	[MV88E6097] = {
> 	[MV88E6123] = {
> 	[MV88E6161] = {
> 	[MV88E6185] = {
> 
> none of the above (1522):
> 	[MV88E6165] = {
> 	[MV88E6191] = {
> 	[MV88E6220] = {
> 	[MV88E6250] = {
> 	[MV88E6290] = {
> 
> 
> Whereas your analysis seems to have determined this:
> 
> port_set_jumbo_size (10240):
> 	[MV88E6131] = {
> 	[MV88E6141] = {
> 	[MV88E6171] = {
> 	[MV88E6172] = {
> 	[MV88E6175] = {
> 	[MV88E6176] = {
> 	[MV88E6190] = {
> 	[MV88E6240] = {
> 	[MV88E6320] = {
> 	[MV88E6321] = {
> 	[MV88E6341] = {
> 	[MV88E6350] = {
> 	[MV88E6351] = {
> 	[MV88E6352] = {
> 	[MV88E6390] = {
> 	[MV88E6390X] = {
> 	[MV88E6393X] = {
> 
> set_max_frame_size (1632):
> 	[MV88E6095] = {
> 	[MV88E6097] = {
> 	[MV88E6123] = {
> 	[MV88E6161] = {
> 	[MV88E6165] = {
> 	[MV88E6185] = {
> 
> none of the above (1522):
> 	[MV88E6085] = {
> 	[MV88E6190X] = {
> 	[MV88E6191] = {
> 	[MV88E6191X] = {
> 	[MV88E6193X] = {
> 	[MV88E6290] = {
> 
> what's up with these?! (no max_frame_size)
> 	[MV88E6220] = {
> 	[MV88E6250] = {
> 
> 
> So our analysis differs for:
> 
> MV88E6190X (I say 10240, you say 1522)
> MV88E6191X (I say 10240, you say 1522)
> MV88E6193X (I say 10240, you say 1522)
> MV88E6085 (I say 1632, you say 1522)
> MV88E6165 (I say 1522, you say 1632)
> MV88E6220 (I say 1522, not clear what you say)
> MV88E6250 (I say 1522, not clear what you say)
> 
> Double-checking with the code, I believe my analysis to be the correct one...

This is similar analysis to what I did for a previous patch set, and
came to the conclusion that we need code in the driver to validate
that the addition of these values is in fact correct. See my previous
reviews and my recommendations on how to structure these patch sets,
so we as reviewers don't _have_ to go to this level of verification.

> I have also noticed that you have not acted upon my previous review comment:
> https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@denx.de/
> 
> | 1522 - 30 = 1492.
> | 
> | I don't believe that there are switches which don't support the standard
> | MTU of 1500 ?!
> | 
> | >  		.port_base_addr = 0x10,
> | >  		.phy_base_addr = 0x0,
> | >  		.global1_addr = 0x1b,
> | 
> | Note that I see this behavior isn't new. But I've simulated it, and it
> | will produce the following messages on probe:
> | 
> | [    7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> | [    7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 0
> | [    7.588585] mscc_felix 0000:00:00.5 swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> | [    7.600433] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 1
> | [    7.752613] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> | [    7.764457] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 2
> | [    7.900771] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> | [    7.912501] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 3
> | 
> | I wonder, shouldn't we first fix that, and apply this patch set afterwards?
> 
> I guess I will have to fix this now, since you haven't done it.

I'm sorry, but why is this Lukasz's problem to fix? If it's broken today
when using mv88e6xxx with this PHY, and Lukasz doesn't have this PHY,
why does Lukasz have to solve this?

> > +
> > +	/* Max Frame Size.
> > +	 * This value corresponds to the memory allocated in switch internal
> > +	 * memory to store single frame.
> > +	 */
> 
> What is the source of this definition?
> 
> I'm asking because I know of other switches where the internal memory
> allocation scheme has nothing to do with the frame size. Instead, there
> are SRAM cells of fixed and small size (say 60 octets) chained together.

The switch documentation only really talks about maximum frame sizes
that the switch can handle, with a few bits that configure what the
maximum frame size is. We also know how large the SRAM is, but how
the SRAM is allocated to packets is for Marvell engineers to know
and not us mere mortals.

So, the base definition for this is the information provided in the
switch documentation.
Vladimir Oltean March 10, 2023, 1:04 p.m. UTC | #3
On Fri, Mar 10, 2023 at 12:25:59PM +0000, Russell King (Oracle) wrote:
> This is similar analysis to what I did for a previous patch set, and
> came to the conclusion that we need code in the driver to validate
> that the addition of these values is in fact correct. See my previous
> reviews and my recommendations on how to structure these patch sets,
> so we as reviewers don't _have_ to go to this level of verification.

Ok, I haven't read other patches or comments except for 1/7 yet.

> > I guess I will have to fix this now, since you haven't done it.
> 
> I'm sorry, but why is this Lukasz's problem to fix? If it's broken today
> when using mv88e6xxx with this PHY, and Lukasz doesn't have this PHY,
> why does Lukasz have to solve this?

Well, in principle no one has to solve it. It would be good to not move
around broken code if we know it's broken, is what I'm saying. This is
because eventually, someone who *is* affected *will* want to fix it, and
that fix will conflict with the refactoring. Lukasz would have had the
interest in fixing it because he's touching that code. Again, I will do
this when I find the time.

> > > +	/* Max Frame Size.
> > > +	 * This value corresponds to the memory allocated in switch internal
> > > +	 * memory to store single frame.
> > > +	 */
> > 
> > What is the source of this definition?
> > 
> > I'm asking because I know of other switches where the internal memory
> > allocation scheme has nothing to do with the frame size. Instead, there
> > are SRAM cells of fixed and small size (say 60 octets) chained together.
> 
> The switch documentation only really talks about maximum frame sizes
> that the switch can handle, with a few bits that configure what the
> maximum frame size is. We also know how large the SRAM is, but how
> the SRAM is allocated to packets is for Marvell engineers to know
> and not us mere mortals.
> 
> So, the base definition for this is the information provided in the
> switch documentation.

Agree with the "mere mortals" comment. I was trying to suggest that the
given definition tries to make it appear that we know more about the
switch internal memory allocation than we really do. "This value
corresponds to the memory allocated in switch internal memory to store
single frame" -> how do we know that it corresponds?
Vladimir Oltean March 10, 2023, 1:06 p.m. UTC | #4
On Fri, Mar 10, 2023 at 03:04:46PM +0200, Vladimir Oltean wrote:
> > I'm sorry, but why is this Lukasz's problem to fix? If it's broken today
> > when using mv88e6xxx with this PHY, and Lukasz doesn't have this PHY,
> > why does Lukasz have to solve this?
> 
> Well, in principle no one has to solve it. It would be good to not move
> around broken code if we know it's broken, is what I'm saying. This is
> because eventually, someone who *is* affected *will* want to fix it, and
> that fix will conflict with the refactoring. Lukasz would have had the
> interest in fixing it because he's touching that code. Again, I will do
> this when I find the time.

also: what PHY? :-/
Lukasz Majewski March 10, 2023, 1:17 p.m. UTC | #5
Hi Vladimir,

> On Thu, Mar 09, 2023 at 01:54:15PM +0100, Lukasz Majewski wrote:
> > Different Marvell DSA switches support different size of max frame
> > bytes to be sent. This value corresponds to the memory allocated
> > in switch to store single frame.
> > 
> > For example mv88e6185 supports max 1632 bytes, which is now
> > in-driver standard value.  
> 
> What is the criterion based on which 1632 is the "in-driver standard
> value"?

It comes from the documentation I suppose. Moreover, this might be the
the "first" used value when set_max_mtu callback was introduced.

> 
> > On the other hand - mv88e6250 supports 2048 bytes.  
> 
> What you mean to suggest here is that, using the current
> classification from mv88e6xxx_get_max_mtu(), mv88e6250 falls into the
> "none of the above" bucket, for which the driver returns 1522 -
> VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN // 1492. But it truly
> supports a maximum frame length of 2048, per your research.
> 

And this cannot be easily fixed.

I could just provide callback to .set_max_frame_size in mv88e6250_ops
and the mv88e6250 would use 1632 bytes which would prevent errors.

However, when I dig deeper - it turned out that this value is larger.
And hence I wanted to do it in "right way".

> The problem is that I needed to spend 30 minutes to understand this,
> and the true motivation for this patch.
> 

The motivation is correctly stated in the commit message. There is a
bug - mv88e6250 and friends use 2048 max frame size value.

> > To be more interesting - devices supporting jumbo frames - use yet
> > another value (10240 bytes)  
> 
> What's interesting about this?
> 
> > 
> > As this value is internal and may be different for each switch IC,
> > new entry in struct mv88e6xxx_info has been added to store it.  
> 
> You need to provide a justification for why the existing code
> structure is not good enough.

It is clearly stated in patch 3/7 where the existing scheme is replaced
with this one.

> 
> > 
> > This commit doesn't change the code functionality - it just provides
> > the max frame size value explicitly - up till now it has been
> > assigned depending on the callback provided by the switch driver
> > (e.g. .set_max_frame_size, .port_set_jumbo_size).
> > 
> > Signed-off-by: Lukasz Majewski <lukma@denx.de>
> > 
> > ---
> > Changes for v2:
> > - Define max_frame_size with default value of 1632 bytes,
> > - Set proper value for the mv88e6250 switch SoC (linkstreet) family
> > 
> > Changes for v3:
> > - Add default value for 1632B of the max frame size (to avoid
> > problems with not defined values)
> > 
> > Changes for v4:
> > - Rework the mv88e6xxx_get_max_mtu() by using per device defined
> >   max_frame_size value
> > 
> > - Add WARN_ON_ONCE() when max_frame_size is not defined
> > 
> > - Add description for the new 'max_frame_size' member of
> > mv88e6xxx_info
> > 
> > Changes for v5:
> > - Move some code fragments (like get_mtu callback changes) to
> > separate patches  
> 
> you have change log up to v5, but your subject prefix is [PATCH 1/7]
> which implies v1?

My mistake - I've forgotten to add proper subject-prefix parameter.

> 
> > ---
> >  drivers/net/dsa/mv88e6xxx/chip.c | 31
> > +++++++++++++++++++++++++++++++ drivers/net/dsa/mv88e6xxx/chip.h |
> > 6 ++++++ 2 files changed, 37 insertions(+)
> > 
> > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c
> > b/drivers/net/dsa/mv88e6xxx/chip.c index 0a5d6c7bb128..c097a0b19ba6
> > 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c
> > +++ b/drivers/net/dsa/mv88e6xxx/chip.c  
> 
> It would be good if the commit message contained the procedure based
> on which you had made these changes - and preferably they were
> mechanical. Having a small C program written would be absolutely
> ideal. This is so that reviewers wouldn't have to do it in parallel...
> 
> My analysis has determined the following 3 categories:
> 
> static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port)
> {
> 	struct mv88e6xxx_chip *chip = ds->priv;
> 
> 	if (chip->info->ops->port_set_jumbo_size)
> 		return 10240 - VLAN_ETH_HLEN - EDSA_HLEN -
> ETH_FCS_LEN; // 10210 else if (chip->info->ops->set_max_frame_size)
> 		return 1632 - VLAN_ETH_HLEN - EDSA_HLEN -
> ETH_FCS_LEN; // 1602 return 1522 - VLAN_ETH_HLEN - EDSA_HLEN -
> ETH_FCS_LEN; // 1492 }
> 
> By ops:
> 
> port_set_jumbo_size:
> static const struct mv88e6xxx_ops mv88e6131_ops = {
> static const struct mv88e6xxx_ops mv88e6141_ops = {
> static const struct mv88e6xxx_ops mv88e6171_ops = {
> static const struct mv88e6xxx_ops mv88e6172_ops = {
> static const struct mv88e6xxx_ops mv88e6175_ops = {
> static const struct mv88e6xxx_ops mv88e6176_ops = {
> static const struct mv88e6xxx_ops mv88e6190_ops = {
> static const struct mv88e6xxx_ops mv88e6190x_ops = {
> static const struct mv88e6xxx_ops mv88e6240_ops = {
> static const struct mv88e6xxx_ops mv88e6320_ops = {
> static const struct mv88e6xxx_ops mv88e6321_ops = {
> static const struct mv88e6xxx_ops mv88e6341_ops = {
> static const struct mv88e6xxx_ops mv88e6350_ops = {
> static const struct mv88e6xxx_ops mv88e6351_ops = {
> static const struct mv88e6xxx_ops mv88e6352_ops = {
> static const struct mv88e6xxx_ops mv88e6390_ops = {
> static const struct mv88e6xxx_ops mv88e6390x_ops = {
> static const struct mv88e6xxx_ops mv88e6393x_ops = {
> 
> set_max_frame_size:
> static const struct mv88e6xxx_ops mv88e6085_ops = {
> static const struct mv88e6xxx_ops mv88e6095_ops = {
> static const struct mv88e6xxx_ops mv88e6097_ops = {
> static const struct mv88e6xxx_ops mv88e6123_ops = {
> static const struct mv88e6xxx_ops mv88e6161_ops = {
> static const struct mv88e6xxx_ops mv88e6185_ops = {
> 
> none of the above:
> static const struct mv88e6xxx_ops mv88e6165_ops = {
> static const struct mv88e6xxx_ops mv88e6191_ops = {
> static const struct mv88e6xxx_ops mv88e6250_ops = {
> static const struct mv88e6xxx_ops mv88e6290_ops = {
> 
> 
> By info:
> 
> port_set_jumbo_size (10240):
> 	[MV88E6131] = {
> 	[MV88E6141] = {
> 	[MV88E6171] = {
> 	[MV88E6172] = {
> 	[MV88E6175] = {
> 	[MV88E6176] = {
> 	[MV88E6190] = {
> 	[MV88E6190X] = {
> 	[MV88E6240] = {
> 	[MV88E6320] = {
> 	[MV88E6321] = {
> 	[MV88E6341] = {
> 	[MV88E6350] = {
> 	[MV88E6351] = {
> 	[MV88E6352] = {
> 	[MV88E6390] = {
> 	[MV88E6390X] = {
> 	[MV88E6191X] = {
> 	[MV88E6193X] = {
> 	[MV88E6393X] = {
> 
> set_max_frame_size (1632):
> 	[MV88E6085] = {
> 	[MV88E6095] = {
> 	[MV88E6097] = {
> 	[MV88E6123] = {
> 	[MV88E6161] = {
> 	[MV88E6185] = {
> 
> none of the above (1522):
> 	[MV88E6165] = {
> 	[MV88E6191] = {
> 	[MV88E6220] = {
> 	[MV88E6250] = {
> 	[MV88E6290] = {
> 
> 
> Whereas your analysis seems to have determined this:
> 
> port_set_jumbo_size (10240):
> 	[MV88E6131] = {
> 	[MV88E6141] = {
> 	[MV88E6171] = {
> 	[MV88E6172] = {
> 	[MV88E6175] = {
> 	[MV88E6176] = {
> 	[MV88E6190] = {
> 	[MV88E6240] = {
> 	[MV88E6320] = {
> 	[MV88E6321] = {
> 	[MV88E6341] = {
> 	[MV88E6350] = {
> 	[MV88E6351] = {
> 	[MV88E6352] = {
> 	[MV88E6390] = {
> 	[MV88E6390X] = {
> 	[MV88E6393X] = {
> 
> set_max_frame_size (1632):
> 	[MV88E6095] = {
> 	[MV88E6097] = {
> 	[MV88E6123] = {
> 	[MV88E6161] = {
> 	[MV88E6165] = {
> 	[MV88E6185] = {
> 
> none of the above (1522):
> 	[MV88E6085] = {
> 	[MV88E6190X] = {
> 	[MV88E6191] = {
> 	[MV88E6191X] = {
> 	[MV88E6193X] = {
> 	[MV88E6290] = {
> 
> what's up with these?! (no max_frame_size)
> 	[MV88E6220] = {
> 	[MV88E6250] = {
> 
> 
> So our analysis differs for:
> 
> MV88E6190X (I say 10240, you say 1522)
> MV88E6191X (I say 10240, you say 1522)
> MV88E6193X (I say 10240, you say 1522)
> MV88E6085 (I say 1632, you say 1522)
> MV88E6165 (I say 1522, you say 1632)
> MV88E6220 (I say 1522, not clear what you say)
> MV88E6250 (I say 1522, not clear what you say)
> 
> Double-checking with the code, I believe my analysis to be the
> correct one...
> 

This is explained and provided in the following commits as advised by
Russel.

> 
> I have also noticed that you have not acted upon my previous review
> comment:
> https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@denx.de/
> 
> | 1522 - 30 = 1492.
> | 
> | I don't believe that there are switches which don't support the
> standard | MTU of 1500 ?!
> | 
> | >  		.port_base_addr = 0x10,
> | >  		.phy_base_addr = 0x0,
> | >  		.global1_addr = 0x1b,
> | 
> | Note that I see this behavior isn't new. But I've simulated it, and
> it | will produce the following messages on probe:
> | 
> | [    7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY
> [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) | [
>   7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU
> to 1500 on port 0 | [    7.588585] mscc_felix 0000:00:00.5 swp1
> (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514
> SyncE] (irq=POLL) | [    7.600433] mscc_felix 0000:00:00.5: nonfatal
> error -34 setting MTU to 1500 on port 1 | [    7.752613] mscc_felix
> 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver
> [Microsemi GE VSC8514 SyncE] (irq=POLL) | [    7.764457] mscc_felix
> 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 2 | [
> 7.900771] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY
> [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) | [
>   7.912501] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU
> to 1500 on port 3 | | I wonder, shouldn't we first fix that, and
> apply this patch set afterwards?
> 
> I guess I will have to fix this now, since you haven't done it.
> 

I do agree with Russel's reply here.

Moreover, as 6250 and 6220 also have max frame size equal to 2048
bytes, this would be fixed in v6 after getting the "validation"
function run.

This is the "patch 4" in the comment sent by Russel (to fix stuff which
is already broken, but it has been visible after running the validation
code):

https://lists.openwall.net/netdev/2023/03/09/233

> > diff --git a/drivers/net/dsa/mv88e6xxx/chip.h
> > b/drivers/net/dsa/mv88e6xxx/chip.h index da6e1339f809..e2b88f1f8376
> > 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h
> > +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> > @@ -132,6 +132,12 @@ struct mv88e6xxx_info {
> >  	unsigned int num_gpio;
> >  	unsigned int max_vid;
> >  	unsigned int max_sid;
> > +
> > +	/* Max Frame Size.
> > +	 * This value corresponds to the memory allocated in
> > switch internal
> > +	 * memory to store single frame.
> > +	 */  
> 
> What is the source of this definition?
> 
> I'm asking because I know of other switches where the internal memory
> allocation scheme has nothing to do with the frame size. Instead,
> there are SRAM cells of fixed and small size (say 60 octets) chained
> together.
> 
> > +	unsigned int max_frame_size;
> >  	unsigned int port_base_addr;
> >  	unsigned int phy_base_addr;
> >  	unsigned int global1_addr;
> > -- 
> > 2.20.1
> >   




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Vladimir Oltean March 10, 2023, 1:36 p.m. UTC | #6
On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> > > For example mv88e6185 supports max 1632 bytes, which is now
> > > in-driver standard value.  
> > 
> > What is the criterion based on which 1632 is the "in-driver standard
> > value"?
> 
> It comes from the documentation I suppose. Moreover, this might be the
> the "first" used value when set_max_mtu callback was introduced.

I'm not playing dumb, I just don't understand what is meant by
"in-driver standard value". Is it the return value of mv88e6xxx_get_max_mtu()
for the MV88E6185 chip? Because I understood it to be somehow the value
returned by default, for chips which don't have a way to change the MTU
(because of the "standard" word).

> > > On the other hand - mv88e6250 supports 2048 bytes.  
> > 
> > What you mean to suggest here is that, using the current
> > classification from mv88e6xxx_get_max_mtu(), mv88e6250 falls into the
> > "none of the above" bucket, for which the driver returns 1522 -
> > VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN // 1492. But it truly
> > supports a maximum frame length of 2048, per your research.
> > 
> 
> And this cannot be easily fixed.
> 
> I could just provide callback to .set_max_frame_size in mv88e6250_ops
> and the mv88e6250 would use 1632 bytes which would prevent errors.
> 
> However, when I dig deeper - it turned out that this value is larger.
> And hence I wanted to do it in "right way".

Correct, I'm not debating this. I'm just saying, as a reader of this
patch set in linear order, that the justification is not obvious.

> > I have also noticed that you have not acted upon my previous review
> > comment:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@denx.de/
> > 
> > | 1522 - 30 = 1492.
> > | 
> > | I don't believe that there are switches which don't support the
> > standard | MTU of 1500 ?!
> > | 
> > | >  		.port_base_addr = 0x10,
> > | >  		.phy_base_addr = 0x0,
> > | >  		.global1_addr = 0x1b,
> > | 
> > | Note that I see this behavior isn't new. But I've simulated it, and
> > it | will produce the following messages on probe:
> > | 
> > | [    7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY
> > [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) | [
> >   7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU
> > to 1500 on port 0 | [    7.588585] mscc_felix 0000:00:00.5 swp1
> > (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514
> > SyncE] (irq=POLL) | [    7.600433] mscc_felix 0000:00:00.5: nonfatal
> > error -34 setting MTU to 1500 on port 1 | [    7.752613] mscc_felix
> > 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver
> > [Microsemi GE VSC8514 SyncE] (irq=POLL) | [    7.764457] mscc_felix
> > 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 2 | [
> > 7.900771] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY
> > [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL) | [
> >   7.912501] mscc_felix 0000:00:00.5: nonfatal error -34 setting MTU
> > to 1500 on port 3 | | I wonder, shouldn't we first fix that, and
> > apply this patch set afterwards?
> > 
> > I guess I will have to fix this now, since you haven't done it.
> 
> I do agree with Russel's reply here.

It's possible that Russell might have slightly misunderstood my quoted
reply here, because he said something about a PHY.

> Moreover, as 6250 and 6220 also have max frame size equal to 2048
> bytes, this would be fixed in v6 after getting the "validation"
> function run.

The problem with this kind of fix is that it should go to the "net" tree;
it removes a user-visible warning that could have been avoided.

OTOH, the kind of "fix" for 6250 and 6220 is different. It is
sub-optimal use of hardware capabilities. That classifies as net-next
material.

The 2 go to different kernel branches.

> This is the "patch 4" in the comment sent by Russel (to fix stuff which
> is already broken, but it has been visible after running the validation
> code):
> 
> https://lists.openwall.net/netdev/2023/03/09/233

I will get there.. eventually.
Russell King (Oracle) March 10, 2023, 2:04 p.m. UTC | #7
On Fri, Mar 10, 2023 at 02:02:35PM +0200, Vladimir Oltean wrote:
> It would be good if the commit message contained the procedure based on
> which you had made these changes - and preferably they were mechanical.
> Having a small C program written would be absolutely ideal.
> This is so that reviewers wouldn't have to do it in parallel...
> 
> My analysis has determined the following 3 categories:
> 
> static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port)
> {
> 	struct mv88e6xxx_chip *chip = ds->priv;
> 
> 	if (chip->info->ops->port_set_jumbo_size)
> 		return 10240 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 10210
> 	else if (chip->info->ops->set_max_frame_size)
> 		return 1632 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 1602
> 	return 1522 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; // 1492

The question concerning the 1492 MTU (and the others above) is
something that does need to be addressed, but I don't believe it
should be part of this patch series.

In order to properly address this, we need to do a bit of research.
Originally, the driver calculated the MTU by taking the frame size
(1522, 1632 or 10240) and subtracting VLAN_ETH_HLEN and ETH_FCS_LEN.

This would mean the frame sizes were 1500, 1610 and 10218. However,
as a result of:

commit b9c587fed61cf88bd45822c3159644445f6d5aa6
Author: Andrew Lunn <andrew@lunn.ch>
Date:   Sun Sep 26 19:41:26 2021 +0200

    dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports

This was changed to include the EDSA_HLEN of 8 bytes. The question
is why - and that's a question for Andrew.

The frame size check is not well described looking at the 6176
functional specification. It takes about using an adjusted frame
size in the paragraph that talks about ingress headers, but then
it only takes about adjusting by two bytes which are sent before
the DA, only if MV88E6XXX_PORT_CTL0_HEADER is set (which we don't
touch).

Against the bits that control the maximum frame size, it does state
that "the definition of frame size is counting the frame bytes from
MAC-DA through Layer2 CRC of the frame".

No mention is made whether the EDSA header is included or not, the
assumption was that it wasn't prior to the commit above, but it
would appear that caused a problem, so the EDSA header was added.

Now, obviously, on external ports (those which don't use the EDSA
header) the EDSA header doesn't restrict the size of packets sent
or received on that port. However, the header does exist on the
CPU port - and the obvious question is, does the max frame size
apply, and if so does it apply with the EDSA header included or
excluded. We don't know from the documentation.

DSA ports (those between switches) don't use the EDSA header, but
instead use the DSA header which is four bytes long. Again, whether
that is included in the maximum frame size is unspecified.

Maybe Andrew has some input here as he made the above commit and
can remember why it was necessary.

However, to me, it seems to be rather absurd as it would mean that
on a device that only supports 1522 maximum packet size, the CPU
port using an EDSA header would be incapable of sending or
receiving a packet containing 1500 bytes of payload, VLAN header
and ethernet header, because as soon as the EDSA header is added
we're over the 1522 limit - and that would basically mean the
switch can't be used in a normal ethernet network to switch
such packets.
Lukasz Majewski March 10, 2023, 2:10 p.m. UTC | #8
Hi Vladimir,

> On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> > > > For example mv88e6185 supports max 1632 bytes, which is now
> > > > in-driver standard value.    
> > > 
> > > What is the criterion based on which 1632 is the "in-driver
> > > standard value"?  
> > 
> > It comes from the documentation I suppose. Moreover, this might be
> > the the "first" used value when set_max_mtu callback was
> > introduced.  
> 
> I'm not playing dumb, I just don't understand what is meant by
> "in-driver standard value". Is it the return value of
> mv88e6xxx_get_max_mtu() for the MV88E6185 chip?

The 1632 is a value from added early switch IC to this driver.

Then the get_max_mtu function was extended to support jumbo frames.

And the extension was based on having the .set_max_frame_size defined
in *_ops structure.

> Because I understood
> it to be somehow the value returned by default, for chips which don't
> have a way to change the MTU (because of the "standard" word).
> 

Probably the "standard" shall be replaced above - it might be
misleading. "Default" would be better.

> > > > On the other hand - mv88e6250 supports 2048 bytes.    
> > > 
> > > What you mean to suggest here is that, using the current
> > > classification from mv88e6xxx_get_max_mtu(), mv88e6250 falls into
> > > the "none of the above" bucket, for which the driver returns 1522
> > > - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN // 1492. But it truly
> > > supports a maximum frame length of 2048, per your research.
> > >   
> > 
> > And this cannot be easily fixed.
> > 
> > I could just provide callback to .set_max_frame_size in
> > mv88e6250_ops and the mv88e6250 would use 1632 bytes which would
> > prevent errors.
> > 
> > However, when I dig deeper - it turned out that this value is
> > larger. And hence I wanted to do it in "right way".  
> 
> Correct, I'm not debating this. I'm just saying, as a reader of this
> patch set in linear order, that the justification is not obvious.
> 
> > > I have also noticed that you have not acted upon my previous
> > > review comment:
> > > https://patchwork.kernel.org/project/netdevbpf/patch/20230106101651.1137755-1-lukma@denx.de/
> > > 
> > > | 1522 - 30 = 1492.
> > > | 
> > > | I don't believe that there are switches which don't support the
> > > standard | MTU of 1500 ?!
> > > | 
> > > | >  		.port_base_addr = 0x10,
> > > | >  		.phy_base_addr = 0x0,
> > > | >  		.global1_addr = 0x1b,
> > > | 
> > > | Note that I see this behavior isn't new. But I've simulated it,
> > > and it | will produce the following messages on probe:
> > > | 
> > > | [    7.425752] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY
> > > [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> > > | [ 7.437516] mscc_felix 0000:00:00.5: nonfatal error -34 setting
> > > MTU to 1500 on port 0 | [    7.588585] mscc_felix 0000:00:00.5
> > > swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE
> > > VSC8514 SyncE] (irq=POLL) | [    7.600433] mscc_felix
> > > 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 1 |
> > > [    7.752613] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY
> > > [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
> > > | [    7.764457] mscc_felix 0000:00:00.5: nonfatal error -34
> > > setting MTU to 1500 on port 2 | [ 7.900771] mscc_felix
> > > 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver
> > > [Microsemi GE VSC8514 SyncE] (irq=POLL) | [ 7.912501] mscc_felix
> > > 0000:00:00.5: nonfatal error -34 setting MTU to 1500 on port 3 |
> > > | I wonder, shouldn't we first fix that, and apply this patch set
> > > afterwards?
> > > 
> > > I guess I will have to fix this now, since you haven't done it.  
> > 
> > I do agree with Russel's reply here.  
> 
> It's possible that Russell might have slightly misunderstood my quoted
> reply here, because he said something about a PHY.
> 
> > Moreover, as 6250 and 6220 also have max frame size equal to 2048
> > bytes, this would be fixed in v6 after getting the "validation"
> > function run.  
> 
> The problem with this kind of fix is that it should go to the "net"
> tree; it removes a user-visible warning that could have been avoided.
> 
> OTOH, the kind of "fix" for 6250 and 6220 is different. It is
> sub-optimal use of hardware capabilities. That classifies as net-next
> material.
> 
> The 2 go to different kernel branches.
> 

As I said - v6 fixes it in the way which Russel proposed. I also do
like this approach, so to avoid "wasting effort" I would opt for having
this fix in this patchset.

(And I hope that we will finish work on it before MW closes).

> > This is the "patch 4" in the comment sent by Russel (to fix stuff
> > which is already broken, but it has been visible after running the
> > validation code):
> > 
> > https://lists.openwall.net/netdev/2023/03/09/233  
> 
> I will get there.. eventually.




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Vladimir Oltean March 10, 2023, 3:45 p.m. UTC | #9
On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> This is the "patch 4" in the comment sent by Russel (to fix stuff which
> is already broken, but it has been visible after running the validation
> code):
> 
> https://lists.openwall.net/netdev/2023/03/09/233

Ok, so nope, what I was talking about here (MTU 1492) is *not* what you
have discussed with Russell in patch 4.

What I was talking about is this:
https://patchwork.kernel.org/project/netdevbpf/patch/20230309125421.3900962-2-lukma@denx.de/#25245979
and Russell now seems to agree with me that it should be addressed
separately, and prior to the extra development work done here.

It looks like it will also need a bit of assistance from Andrew to
untangle whether EDSA_HLEN should be included in the max_mtu calculations
for some switch families only, rather than for all.
Andrew Lunn March 10, 2023, 4:12 p.m. UTC | #10
On Fri, Mar 10, 2023 at 05:45:11PM +0200, Vladimir Oltean wrote:
> On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> > This is the "patch 4" in the comment sent by Russel (to fix stuff which
> > is already broken, but it has been visible after running the validation
> > code):
> > 
> > https://lists.openwall.net/netdev/2023/03/09/233
> 
> Ok, so nope, what I was talking about here (MTU 1492) is *not* what you
> have discussed with Russell in patch 4.
> 
> What I was talking about is this:
> https://patchwork.kernel.org/project/netdevbpf/patch/20230309125421.3900962-2-lukma@denx.de/#25245979
> and Russell now seems to agree with me that it should be addressed
> separately, and prior to the extra development work done here.
> 
> It looks like it will also need a bit of assistance from Andrew to
> untangle whether EDSA_HLEN should be included in the max_mtu calculations
> for some switch families only, rather than for all.

That is hard to say. From other peoples experiments, it seems like
some families don't impose the frame length check on DSA and CPU
ports. Hence they accept frames with DSA or EDSA headers, even if that
puts them over the theoretical port max. Other families do seem to
perform check on DSA and CPU ports. I don't think the datasheets say
anything about this.

The safe thing to do is:

Assume all switches can accept the standard minimum MTU + DSA/EDSA
headers on their CPU ports.

For those switches which accept frames bigger than the standard,
assume the DSA/EDSA header is counted as part of the limit, and so
adjust the calculation.

This might short change a few switches 4/8 bytes, but that is better
than being broken because frames are dropped.

     Andrew
Lukasz Majewski March 12, 2023, 3:55 p.m. UTC | #11
Hi Vladimir,

> On Fri, Mar 10, 2023 at 02:17:19PM +0100, Lukasz Majewski wrote:
> > This is the "patch 4" in the comment sent by Russel (to fix stuff
> > which is already broken, but it has been visible after running the
> > validation code):
> > 
> > https://lists.openwall.net/netdev/2023/03/09/233  
> 
> Ok, so nope, what I was talking about here (MTU 1492) is *not* what
> you have discussed with Russell in patch 4.

The patch 4 would be related to mv88e6220 and 6250 only. It would
provide correct size of MTU.

> 
> What I was talking about is this:
> https://patchwork.kernel.org/project/netdevbpf/patch/20230309125421.3900962-2-lukma@denx.de/#25245979
> and Russell now seems to agree with me that it should be addressed
> separately,

Ok.

> and prior to the extra development work done here.
> 

Why? Up till mine patch set was introduced the problem was unnoticed.
Could this be fixed after it is applied?

> It looks like it will also need a bit of assistance from Andrew to
> untangle whether EDSA_HLEN should be included in the max_mtu
> calculations for some switch families only, rather than for all.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Vladimir Oltean March 13, 2023, 3:01 p.m. UTC | #12
On Sun, Mar 12, 2023 at 04:55:50PM +0100, Lukasz Majewski wrote:
> > What I was talking about is this:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20230309125421.3900962-2-lukma@denx.de/#25245979
> > and Russell now seems to agree with me that it should be addressed
> > separately,
> 
> Ok.
> 
> > and prior to the extra development work done here.
> 
> Why? Up till mine patch set was introduced the problem was unnoticed.
> Could this be fixed after it is applied?

I have already explained why, here:

| in principle no one has to solve it. It would be good to not move
| around broken code if we know it's broken, is what I'm saying. This is
| because eventually, someone who *is* affected *will* want to fix it, and
| that fix will conflict with the refactoring.

Translated/rephrased.

The 1492 max_mtu issue for MV88E6165, MV88E6191, MV88E6220, MV88E6250
and MV88E6290 was introduced, according to my code analysis, by commit
b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when setting MTU
for DSA and CPU ports").

That patch, having a Fixes: tag of 1baf0fac10fb ("net: dsa: mv88e6xxx:
Use chip-wide max frame size for MTU"), was backported to all stable
kernel trees which included commit 1baf0fac10fb.

Running "git tag --contains 1baf0fac10fb", I see that it was first
included in kernel tag v5.9. I deduce that commit b9c587fed61c ("dsa:
mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU
ports") was also backported to all stable branches more recent than the
v5.9 tag.

Consulting https://www.kernel.org/ and https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/,
I can see that the branches linux-6.2.y, linux-6.1.y, linux-5.15.y and
linux-5.10.y are still maintained by the linux-stable repository, and
contain commit b9c587fed61c (either directly or through backports).

As hinted at by Documentation/process/maintainer-netdev.rst but perhaps
insufficiently clarified, bug fixes to code maintained by netdev go to
the "main" branch of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git,
a git tree which tracks the main branch of Linus Torvalds (which today
is in the v6.3 release candidates). From there, patches are
automatically backported by the linux-stable maintainers.

The problem with you making changes to code which was pointed out as
incorrect is that these changes will land in net-next.git, in kernel
v6.4 at the earliest.

Assuming somebody else will fix the 1492 MTU issue, there are 2 distinct
moments when that can occur, relative to when the net-next pull request
is sent to Linus Torvalds' main branch:

1. before the net-next pull request. In that case, one of the netdev
   maintainers will have to manually resolve the conflict between one of
   net.git or Linus Torvalds' git tree.

2. after the net-next pull request was accepted. What will happen is
   that net.git will merge with Linus Torvalds' changes, and it will no
   longer contain the same code as on branches 6.2, 6.1, 5.15 and 5.10,
   but rather, the code with your changes. But it is always net.git that
   someone has to develop against when submitting the 1492 MTU change.
   That fix cannot apply any further than the net-next pull request,
   which is the v6.4-rc1 tag at the earliest.

   So, for the bug fix for 1492 MTU to reach the stable branches which
   are still maintained by then, 2 strategies will be taken into
   consideration:

   - the conflicting changes (yours) are also backported along with the
     real bug fix. This is because linux-stable has a preference to not
     diverge from the code in the main branch, and would rather backport
     more than less, to achieve that. But your patch set is quite noisy.
     It touches the entire mv88e6xxx_table[] of switches. It is hard to
     predict how well this chain of dependencies will get backported all
     the way down to linux-5.10.y. If any switch family was added to
     this table since v5.10, then the backporting of your changes would
     also conflict with that addition.

   - if the linux-stable team gives up with the backporting, an email
     will be sent letting the people know, and a manually created series
     of backports can be submitted for direct inclusion into the stable
     trees.

As you can see, the complexity of fixing code in stable that has been
changed in mainline is quite a bit higher than fixing it before changing it.
Also, the result is not as clean, if you add third-party backports into
the equation. For example, someone takes a linux-6.1.y stable kernel
from the future (with the 1492 MTU issue solved) and wants to backport
v6.4 material, which includes your changes. It will conflict, because
there is no linear history. The only way to achieve linear history is to
fix the buggy code before changing it.

To your point that it's not you who chose the 1492 MTU bug but rather
that it chose you, I'm not trying to minimize that, except that I need
to point out that things like this are to be expected when you are
working on a project where you aren't the only contributor.

Since we are already so deep in process-related explanations, I don't
know how aware you are of what fixing the 1492 MTU bug entails.
One would have to prepare a patch that limits the max_mtu to ETH_DATA_LEN
for the switch families where MTU changing is not possible. Once that
gets reviewed and accepted, one would need to wait for no longer than
the next Thursday (when the net.git pull request is sent, and net.git is
merged back into net-next.git, for history linearization), then work on
net-next.git can resume.
diff mbox series

Patch

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 0a5d6c7bb128..c097a0b19ba6 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -5626,6 +5626,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5648,6 +5649,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_ports = 11,
 		.num_internal_phys = 0,
 		.max_vid = 4095,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5669,6 +5671,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 8,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5693,6 +5696,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5716,6 +5720,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_ports = 8,
 		.num_internal_phys = 0,
 		.max_vid = 4095,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5738,6 +5743,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 11,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x10,
 		.global1_addr = 0x1b,
@@ -5762,6 +5768,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5787,6 +5794,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 0,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5811,6 +5819,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5836,6 +5845,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 15,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5860,6 +5870,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5885,6 +5896,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 15,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5908,6 +5920,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_ports = 10,
 		.num_internal_phys = 0,
 		.max_vid = 4095,
+		.max_frame_size = 1632,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5931,6 +5944,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 16,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5955,6 +5969,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 16,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -5978,6 +5993,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 9,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6001,6 +6017,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 9,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6024,6 +6041,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 9,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6051,6 +6069,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 2,
 		.invalid_port_mask = BIT(2) | BIT(3) | BIT(4),
 		.max_vid = 4095,
+		.max_frame_size = 2048,
 		.port_base_addr = 0x08,
 		.phy_base_addr = 0x00,
 		.global1_addr = 0x0f,
@@ -6075,6 +6094,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 15,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6098,6 +6118,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_ports = 7,
 		.num_internal_phys = 5,
 		.max_vid = 4095,
+		.max_frame_size = 2048,
 		.port_base_addr = 0x08,
 		.phy_base_addr = 0x00,
 		.global1_addr = 0x0f,
@@ -6121,6 +6142,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 16,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 1522,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6145,6 +6167,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.num_gpio = 15,
 		.max_vid = 4095,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6170,6 +6193,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.num_gpio = 15,
 		.max_vid = 4095,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6195,6 +6219,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 11,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x10,
 		.global1_addr = 0x1b,
@@ -6220,6 +6245,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6244,6 +6270,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 5,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6269,6 +6296,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 15,
 		.max_vid = 4095,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x10,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6294,6 +6322,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 16,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6319,6 +6348,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_gpio = 16,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
@@ -6343,6 +6373,7 @@  static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.num_internal_phys = 9,
 		.max_vid = 8191,
 		.max_sid = 63,
+		.max_frame_size = 10240,
 		.port_base_addr = 0x0,
 		.phy_base_addr = 0x0,
 		.global1_addr = 0x1b,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index da6e1339f809..e2b88f1f8376 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -132,6 +132,12 @@  struct mv88e6xxx_info {
 	unsigned int num_gpio;
 	unsigned int max_vid;
 	unsigned int max_sid;
+
+	/* Max Frame Size.
+	 * This value corresponds to the memory allocated in switch internal
+	 * memory to store single frame.
+	 */
+	unsigned int max_frame_size;
 	unsigned int port_base_addr;
 	unsigned int phy_base_addr;
 	unsigned int global1_addr;