diff mbox series

[RFC,net-next,04/10] net: dsa: mv88e6xxx: Add all hosts mc addr to ATU

Message ID 20240402001137.2980589-5-Joseph.Huang@garmin.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series MC Flood disable and snooping | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 943 this patch: 943
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 954 this patch: 954
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 954 this patch: 954
netdev/checkpatch warning WARNING: line length of 89 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Joseph Huang April 2, 2024, 12:11 a.m. UTC
Add local network all hosts multicast address (224.0.0.1) and link-local
all nodes multicast address (ff02::1) to the ATU so that IGMP/MLD
Queries can be forwarded even when multicast flooding is disabled on a
port.

Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
---
 drivers/net/dsa/mv88e6xxx/Kconfig | 12 ++++++++
 drivers/net/dsa/mv88e6xxx/chip.c  | 47 +++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

Comments

Vladimir Oltean April 2, 2024, 6:08 p.m. UTC | #1
On Mon, Apr 01, 2024 at 08:11:03PM -0400, Joseph Huang wrote:
> Add local network all hosts multicast address (224.0.0.1) and link-local
> all nodes multicast address (ff02::1) to the ATU so that IGMP/MLD
> Queries can be forwarded even when multicast flooding is disabled on a
> port.
> 
> Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
> ---
>  drivers/net/dsa/mv88e6xxx/Kconfig | 12 ++++++++
>  drivers/net/dsa/mv88e6xxx/chip.c  | 47 +++++++++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/Kconfig b/drivers/net/dsa/mv88e6xxx/Kconfig
> index e3181d5471df..ef7798bf50d7 100644
> --- a/drivers/net/dsa/mv88e6xxx/Kconfig
> +++ b/drivers/net/dsa/mv88e6xxx/Kconfig
> @@ -17,3 +17,15 @@ config NET_DSA_MV88E6XXX_PTP
>  	help
>  	  Say Y to enable PTP hardware timestamping on Marvell 88E6xxx switch
>  	  chips that support it.
> +
> +config NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS
> +	bool "Always flood local all hosts multicast packets"
> +	depends on NET_DSA_MV88E6XXX
> +	help
> +	  When set to Y, always flood multicast packets destined for
> +	  224.0.0.1 (Local Network All Hosts multicast address) and
> +	  ff02::1 (Link-Local All Nodes multicast address), even when
> +	  multicast flooding is disabled for a port.  This is so that
> +	  multicast snooping can continue to function even when
> +	  multicast flooding is disabled.
> +	  If in doubt, say N.

In its current form, this will never be accepted. The mainline Linux
kernel is not a purpose-built project like a bootloader (and also like
some other uses of the Linux kernel). It gets picked up by
distributions, and the same kernel image is supposed to be used on
multiple platforms. So, customizing behavior at compilation time is not
a viable option. If there is any behavior that needs to be different on
some platforms than on others for some reason (this needs a
justification in its own right), it is handled through dedicated user
space API. When others say "hide custom behavior X behind an option", a
compile-time option is not what they mean.

As for the substance of the change itself, I am far from an authority
on multicast, I think you should try to push for something else, which
should be more palatable for everybody.

Some switches I've been working with have explicit flooding controls for:
- L2 multicast
- IPv4 control multicast (224.0.0.x)
- IPv4 data multicast
- IPv6 control multicast (FF02::/16)
- IPv6 data multicast

Whereas the bridge has a single "mcast_flood" control.

It seems adequate to attempt adding more netlink attributes to control
all of the above, individually. Then you could describe your use case,
in a standard way to the kernel, as "ip link set swp0 type bridge_slave
mcast_ipv4_data_flood off mcast_ipv4_ctrl_flood on". For compatibility,
"mcast_flood" could still be interpreted as a global enable for all
multicast.

The trouble seems to be actually offloading this config to Marvell DSA,
they don't seem to have a classifier that distinguishes between kinds of
multicast traffic.

How many link-local IPv4 and IPv6 addresses are there in actual use?
Would it be feasible to actually add ATU addresses for all of them, like
you did for 224.0.0.1 and ff02::1, and say that the port destinations of
_those_ are the mcast_ipv4_ctrl_flood ports?

> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 9ed1821184ec..fddcb596c421 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -2488,6 +2488,41 @@ static int mv88e6xxx_broadcast_setup(struct mv88e6xxx_chip *chip, u16 vid)
>  	return 0;
>  }
>  
> +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
> +static int mv88e6xxx_port_add_multicast(struct mv88e6xxx_chip *chip, int port,
> +					u16 vid)
> +{
> +	u8 state = MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC;
> +	const char multicast[][ETH_ALEN] = {
> +		{ 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 },
> +		{ 0x33, 0x33, 0x00, 0x00, 0x00, 0x01 }
> +	};
> +	int i, err;
> +
> +	for (i = 0; i < ARRAY_SIZE(multicast); i++) {
> +		err = mv88e6xxx_port_db_load_purge(chip, port, multicast[i], vid, state);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}
> +
> +static int mv88e6xxx_multicast_setup(struct mv88e6xxx_chip *chip, u16 vid)
> +{
> +	int port;
> +	int err;
> +
> +	for (port = 0; port < mv88e6xxx_num_ports(chip); port++) {
> +		err = mv88e6xxx_port_add_multicast(chip, port, vid);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}
> +#endif
> +
>  struct mv88e6xxx_port_broadcast_sync_ctx {
>  	int port;
>  	bool flood;
> @@ -2572,6 +2607,12 @@ static int mv88e6xxx_port_vlan_join(struct mv88e6xxx_chip *chip, int port,
>  		err = mv88e6xxx_broadcast_setup(chip, vlan.vid);
>  		if (err)
>  			return err;
> +
> +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
> +		err = mv88e6xxx_multicast_setup(chip, vlan.vid);
> +		if (err)
> +			return err;
> +#endif
>  	} else if (vlan.member[port] != member) {
>  		vlan.member[port] = member;
>  
> @@ -3930,6 +3971,12 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
>  	if (err)
>  		goto unlock;
>  
> +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
> +	err = mv88e6xxx_multicast_setup(chip, 0);

This is the danger of developing on an old kernel and then just blindly
forward-porting. It will compile and smile and look pretty, but it won't
work. You need to request your board provider to do something and give
you access to mainline code.

In newer kernels, VID 0 is MV88E6XXX_VID_STANDALONE, aka a completely
separate address database compared to what the bridge is in charge of.
So, applied to the mainline kernel as of today, your change does nothing
useful, because when under a VLAN-unaware bridge, packets now get
classified to MV88E6XXX_VID_BRIDGED (4095).

For that matter, the port database (MV88E6XXX_VID_STANDALONE) should be
controlled through dev_mc_add(), and "ip maddr" shows you the link-local
multicast addresses offloaded to the device's RX filter.

mv88e6xxx today does not pass the requirements for dsa_switch_supports_mc_filtering(),
so dev_mc_add() does not actually program anything into hardware, but if
it did, it would have been the port database (VID 0), and this is what
makes your change wrong. You can read more about address databases in
Documentation/networking/dsa/dsa.rst.

> +	if (err)
> +		goto unlock;
> +#endif
> +
>  	err = mv88e6xxx_pot_setup(chip);
>  	if (err)
>  		goto unlock;
> -- 
> 2.17.1
>
diff mbox series

Patch

diff --git a/drivers/net/dsa/mv88e6xxx/Kconfig b/drivers/net/dsa/mv88e6xxx/Kconfig
index e3181d5471df..ef7798bf50d7 100644
--- a/drivers/net/dsa/mv88e6xxx/Kconfig
+++ b/drivers/net/dsa/mv88e6xxx/Kconfig
@@ -17,3 +17,15 @@  config NET_DSA_MV88E6XXX_PTP
 	help
 	  Say Y to enable PTP hardware timestamping on Marvell 88E6xxx switch
 	  chips that support it.
+
+config NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS
+	bool "Always flood local all hosts multicast packets"
+	depends on NET_DSA_MV88E6XXX
+	help
+	  When set to Y, always flood multicast packets destined for
+	  224.0.0.1 (Local Network All Hosts multicast address) and
+	  ff02::1 (Link-Local All Nodes multicast address), even when
+	  multicast flooding is disabled for a port.  This is so that
+	  multicast snooping can continue to function even when
+	  multicast flooding is disabled.
+	  If in doubt, say N.
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 9ed1821184ec..fddcb596c421 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2488,6 +2488,41 @@  static int mv88e6xxx_broadcast_setup(struct mv88e6xxx_chip *chip, u16 vid)
 	return 0;
 }
 
+#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
+static int mv88e6xxx_port_add_multicast(struct mv88e6xxx_chip *chip, int port,
+					u16 vid)
+{
+	u8 state = MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC;
+	const char multicast[][ETH_ALEN] = {
+		{ 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 },
+		{ 0x33, 0x33, 0x00, 0x00, 0x00, 0x01 }
+	};
+	int i, err;
+
+	for (i = 0; i < ARRAY_SIZE(multicast); i++) {
+		err = mv88e6xxx_port_db_load_purge(chip, port, multicast[i], vid, state);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int mv88e6xxx_multicast_setup(struct mv88e6xxx_chip *chip, u16 vid)
+{
+	int port;
+	int err;
+
+	for (port = 0; port < mv88e6xxx_num_ports(chip); port++) {
+		err = mv88e6xxx_port_add_multicast(chip, port, vid);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+#endif
+
 struct mv88e6xxx_port_broadcast_sync_ctx {
 	int port;
 	bool flood;
@@ -2572,6 +2607,12 @@  static int mv88e6xxx_port_vlan_join(struct mv88e6xxx_chip *chip, int port,
 		err = mv88e6xxx_broadcast_setup(chip, vlan.vid);
 		if (err)
 			return err;
+
+#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
+		err = mv88e6xxx_multicast_setup(chip, vlan.vid);
+		if (err)
+			return err;
+#endif
 	} else if (vlan.member[port] != member) {
 		vlan.member[port] = member;
 
@@ -3930,6 +3971,12 @@  static int mv88e6xxx_setup(struct dsa_switch *ds)
 	if (err)
 		goto unlock;
 
+#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX_ALWAYS_FLOOD_LOCAL_ALL_HOSTS_ADDRESS)
+	err = mv88e6xxx_multicast_setup(chip, 0);
+	if (err)
+		goto unlock;
+#endif
+
 	err = mv88e6xxx_pot_setup(chip);
 	if (err)
 		goto unlock;