diff mbox series

[RFC,v2,net-next,03/16] net: dsa: inherit the actual bridge port flags at join time

Message ID 20210318231829.3892920-4-olteanv@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Better support for sandwiched LAGs with bridge and DSA | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count fail Series longer than 15 patches
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit fail Errors and warnings before: 0 this patch: 7
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 162 lines checked
netdev/build_allmodconfig_warn fail Errors and warnings before: 0 this patch: 7
netdev/header_inline success Link

Commit Message

Vladimir Oltean March 18, 2021, 11:18 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

DSA currently assumes that the bridge port starts off with this
constellation of bridge port flags:

- learning on
- unicast flooding on
- multicast flooding on
- broadcast flooding on

just by virtue of code copy-pasta from the bridge layer (new_nbp).
This was a simple enough strategy thus far, because the 'bridge join'
moment always coincided with the 'bridge port creation' moment.

But with sandwiched interfaces, such as:

 br0
  |
bond0
  |
 swp0

it may happen that the user has had time to change the bridge port flags
of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
assume that the bridge port flags are those determined by new_nbp, when
in fact this can happen:

ip link add br0 type bridge
ip link add bond0 type bond
ip link set bond0 master br0
ip link set bond0 type bridge_slave learning off
ip link set swp0 master br0

Now swp0 has learning enabled, bond0 has learning disabled. Not nice.

Fix this by "dumpster diving" through the actual bridge port flags with
br_port_flag_is_set, at bridge join time.

We use this opportunity to split dsa_port_change_brport_flags into two
distinct functions called dsa_port_inherit_brport_flags and
dsa_port_clear_brport_flags, now that the implementation for the two
cases is no longer similar.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/port.c | 123 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 82 insertions(+), 41 deletions(-)

Comments

Florian Fainelli March 19, 2021, 10:08 p.m. UTC | #1
On 3/18/2021 4:18 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> DSA currently assumes that the bridge port starts off with this
> constellation of bridge port flags:
> 
> - learning on
> - unicast flooding on
> - multicast flooding on
> - broadcast flooding on
> 
> just by virtue of code copy-pasta from the bridge layer (new_nbp).
> This was a simple enough strategy thus far, because the 'bridge join'
> moment always coincided with the 'bridge port creation' moment.
> 
> But with sandwiched interfaces, such as:
> 
>  br0
>   |
> bond0
>   |
>  swp0
> 
> it may happen that the user has had time to change the bridge port flags
> of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
> assume that the bridge port flags are those determined by new_nbp, when
> in fact this can happen:
> 
> ip link add br0 type bridge
> ip link add bond0 type bond
> ip link set bond0 master br0
> ip link set bond0 type bridge_slave learning off
> ip link set swp0 master br0
> 
> Now swp0 has learning enabled, bond0 has learning disabled. Not nice.
> 
> Fix this by "dumpster diving" through the actual bridge port flags with
> br_port_flag_is_set, at bridge join time.
> 
> We use this opportunity to split dsa_port_change_brport_flags into two
> distinct functions called dsa_port_inherit_brport_flags and
> dsa_port_clear_brport_flags, now that the implementation for the two
> cases is no longer similar.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  net/dsa/port.c | 123 ++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 82 insertions(+), 41 deletions(-)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index fcbe5b1545b8..346c50467810 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp)
>  	rtnl_unlock();
>  }
>  
> -static void dsa_port_change_brport_flags(struct dsa_port *dp,
> -					 bool bridge_offload)
> +static void dsa_port_clear_brport_flags(struct dsa_port *dp,
> +					struct netlink_ext_ack *extack)
>  {
>  	struct switchdev_brport_flags flags;
> -	int flag;
>  
> -	flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
> -	if (bridge_offload)
> -		flags.val = flags.mask;
> -	else
> -		flags.val = flags.mask & ~BR_LEARNING;
> +	flags.mask = BR_LEARNING;
> +	flags.val = 0;
> +	dsa_port_bridge_flags(dp, flags, extack);

Would not you want to use the same for_each_set_bit() loop that
dsa_port_change_br_flags() uses, that would be a tad more compact.
Vladimir Oltean March 20, 2021, 10:05 a.m. UTC | #2
On Fri, Mar 19, 2021 at 03:08:46PM -0700, Florian Fainelli wrote:
> 
> 
> On 3/18/2021 4:18 PM, Vladimir Oltean wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > 
> > DSA currently assumes that the bridge port starts off with this
> > constellation of bridge port flags:
> > 
> > - learning on
> > - unicast flooding on
> > - multicast flooding on
> > - broadcast flooding on
> > 
> > just by virtue of code copy-pasta from the bridge layer (new_nbp).
> > This was a simple enough strategy thus far, because the 'bridge join'
> > moment always coincided with the 'bridge port creation' moment.
> > 
> > But with sandwiched interfaces, such as:
> > 
> >  br0
> >   |
> > bond0
> >   |
> >  swp0
> > 
> > it may happen that the user has had time to change the bridge port flags
> > of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
> > assume that the bridge port flags are those determined by new_nbp, when
> > in fact this can happen:
> > 
> > ip link add br0 type bridge
> > ip link add bond0 type bond
> > ip link set bond0 master br0
> > ip link set bond0 type bridge_slave learning off
> > ip link set swp0 master br0
> > 
> > Now swp0 has learning enabled, bond0 has learning disabled. Not nice.
> > 
> > Fix this by "dumpster diving" through the actual bridge port flags with
> > br_port_flag_is_set, at bridge join time.
> > 
> > We use this opportunity to split dsa_port_change_brport_flags into two
> > distinct functions called dsa_port_inherit_brport_flags and
> > dsa_port_clear_brport_flags, now that the implementation for the two
> > cases is no longer similar.
> > 
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> >  net/dsa/port.c | 123 ++++++++++++++++++++++++++++++++-----------------
> >  1 file changed, 82 insertions(+), 41 deletions(-)
> > 
> > diff --git a/net/dsa/port.c b/net/dsa/port.c
> > index fcbe5b1545b8..346c50467810 100644
> > --- a/net/dsa/port.c
> > +++ b/net/dsa/port.c
> > @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp)
> >  	rtnl_unlock();
> >  }
> >  
> > -static void dsa_port_change_brport_flags(struct dsa_port *dp,
> > -					 bool bridge_offload)
> > +static void dsa_port_clear_brport_flags(struct dsa_port *dp,
> > +					struct netlink_ext_ack *extack)
> >  {
> >  	struct switchdev_brport_flags flags;
> > -	int flag;
> >  
> > -	flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
> > -	if (bridge_offload)
> > -		flags.val = flags.mask;
> > -	else
> > -		flags.val = flags.mask & ~BR_LEARNING;
> > +	flags.mask = BR_LEARNING;
> > +	flags.val = 0;
> > +	dsa_port_bridge_flags(dp, flags, extack);
> 
> Would not you want to use the same for_each_set_bit() loop that
> dsa_port_change_br_flags() uses, that would be a tad more compact.
> -- 
> Florian

The reworded version has an equal number of lines, but at least it
catches errors now:

static void dsa_port_clear_brport_flags(struct dsa_port *dp,
					struct netlink_ext_ack *extack)
{
	const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
				   BR_BCAST_FLOOD;
	int flag, err;

	for_each_set_bit(flag, &mask, 32) {
		struct switchdev_brport_flags flags = {0};

		flags.mask = BIT(flag);
		flags.val = val & BIT(flag);

		err = dsa_port_bridge_flags(dp, flags, extack);
		if (err && err != -EOPNOTSUPP)
			dev_err(dp->ds->dev,
				"failed to clear bridge port flag %d: %d (%pe)\n",
				flag, err, ERR_PTR(err));
	}
}
diff mbox series

Patch

diff --git a/net/dsa/port.c b/net/dsa/port.c
index fcbe5b1545b8..346c50467810 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -122,26 +122,82 @@  void dsa_port_disable(struct dsa_port *dp)
 	rtnl_unlock();
 }
 
-static void dsa_port_change_brport_flags(struct dsa_port *dp,
-					 bool bridge_offload)
+static void dsa_port_clear_brport_flags(struct dsa_port *dp,
+					struct netlink_ext_ack *extack)
 {
 	struct switchdev_brport_flags flags;
-	int flag;
 
-	flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
-	if (bridge_offload)
-		flags.val = flags.mask;
-	else
-		flags.val = flags.mask & ~BR_LEARNING;
+	flags.mask = BR_LEARNING;
+	flags.val = 0;
+	dsa_port_bridge_flags(dp, flags, extack);
+
+	flags.mask = BR_FLOOD;
+	flags.val = BR_FLOOD;
+	dsa_port_bridge_flags(dp, flags, extack);
+
+	flags.mask = BR_MCAST_FLOOD;
+	flags.val = BR_MCAST_FLOOD;
+	dsa_port_bridge_flags(dp, flags, extack);
+
+	flags.mask = BR_BCAST_FLOOD;
+	flags.val = BR_BCAST_FLOOD;
+	dsa_port_bridge_flags(dp, flags, extack);
+}
+
+static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
+					 struct netlink_ext_ack *extack)
+{
+	const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
+				   BR_BCAST_FLOOD;
+	struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
+	int flag, err;
+
+	for_each_set_bit(flag, &mask, 32) {
+		struct switchdev_brport_flags flags = {0};
 
-	for_each_set_bit(flag, &flags.mask, 32) {
-		struct switchdev_brport_flags tmp;
+		flags.mask = BIT(flag);
 
-		tmp.val = flags.val & BIT(flag);
-		tmp.mask = BIT(flag);
+		if (br_port_flag_is_set(brport_dev, BIT(flag)))
+			flags.val = BIT(flag);
 
-		dsa_port_bridge_flags(dp, tmp, NULL);
+		err = dsa_port_bridge_flags(dp, flags, extack);
+		if (err && err != -EOPNOTSUPP)
+			return err;
 	}
+
+	return 0;
+}
+
+static int dsa_port_switchdev_sync(struct dsa_port *dp,
+				   struct netlink_ext_ack *extack)
+{
+	int err;
+
+	err = dsa_port_inherit_brport_flags(dp, extack);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+/* Configure the port for standalone mode (no address learning, flood
+ * everything, BR_STATE_FORWARDING, etc).
+ * The bridge only emits SWITCHDEV_ATTR_ID_PORT_* events when the user
+ * requests it through netlink or sysfs, but not automatically at port
+ * join or leave, so we need to handle resetting the brport flags ourselves.
+ * But we even prefer it that way, because otherwise, some setups might never
+ * get the notification they need, for example, when a port leaves a LAG that
+ * offloads the bridge, it becomes standalone, but as far as the bridge is
+ * concerned, no port ever left.
+ */
+static void dsa_port_switchdev_unsync(struct dsa_port *dp)
+{
+	dsa_port_clear_brport_flags(dp, NULL);
+
+	/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
+	 * so allow it to be in BR_STATE_FORWARDING to be kept functional
+	 */
+	dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
 }
 
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
@@ -155,24 +211,25 @@  int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
 	};
 	int err;
 
-	/* Notify the port driver to set its configurable flags in a way that
-	 * matches the initial settings of a bridge port.
-	 */
-	dsa_port_change_brport_flags(dp, true);
-
 	/* Here the interface is already bridged. Reflect the current
 	 * configuration so that drivers can program their chips accordingly.
 	 */
 	dp->bridge_dev = br;
 
 	err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_JOIN, &info);
+	if (err)
+		goto out_rollback;
 
-	/* The bridging is rolled back on error */
-	if (err) {
-		dsa_port_change_brport_flags(dp, false);
-		dp->bridge_dev = NULL;
-	}
+	err = dsa_port_switchdev_sync(dp, extack);
+	if (err)
+		goto out_rollback_unbridge;
 
+	return 0;
+
+out_rollback_unbridge:
+	dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info);
+out_rollback:
+	dp->bridge_dev = NULL;
 	return err;
 }
 
@@ -186,6 +243,8 @@  void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
 	};
 	int err;
 
+	dsa_port_switchdev_unsync(dp);
+
 	/* Here the port is already unbridged. Reflect the current configuration
 	 * so that drivers can program their chips accordingly.
 	 */
@@ -194,24 +253,6 @@  void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
 	err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info);
 	if (err)
 		pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n");
-
-	/* Configure the port for standalone mode (no address learning,
-	 * flood everything).
-	 * The bridge only emits SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS events
-	 * when the user requests it through netlink or sysfs, but not
-	 * automatically at port join or leave, so we need to handle resetting
-	 * the brport flags ourselves. But we even prefer it that way, because
-	 * otherwise, some setups might never get the notification they need,
-	 * for example, when a port leaves a LAG that offloads the bridge,
-	 * it becomes standalone, but as far as the bridge is concerned, no
-	 * port ever left.
-	 */
-	dsa_port_change_brport_flags(dp, false);
-
-	/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
-	 * so allow it to be in BR_STATE_FORWARDING to be kept functional
-	 */
-	dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
 }
 
 int dsa_port_lag_change(struct dsa_port *dp,