Message ID | 20210318231829.3892920-4-olteanv@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Better support for sandwiched LAGs with bridge and DSA | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | fail | Series longer than 15 patches |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 7 of 7 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | fail | Errors and warnings before: 0 this patch: 7 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 162 lines checked |
netdev/build_allmodconfig_warn | fail | Errors and warnings before: 0 this patch: 7 |
netdev/header_inline | success | Link |
On 3/18/2021 4:18 PM, Vladimir Oltean wrote: > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > DSA currently assumes that the bridge port starts off with this > constellation of bridge port flags: > > - learning on > - unicast flooding on > - multicast flooding on > - broadcast flooding on > > just by virtue of code copy-pasta from the bridge layer (new_nbp). > This was a simple enough strategy thus far, because the 'bridge join' > moment always coincided with the 'bridge port creation' moment. > > But with sandwiched interfaces, such as: > > br0 > | > bond0 > | > swp0 > > it may happen that the user has had time to change the bridge port flags > of bond0 before enslaving swp0 to it. In that case, swp0 will falsely > assume that the bridge port flags are those determined by new_nbp, when > in fact this can happen: > > ip link add br0 type bridge > ip link add bond0 type bond > ip link set bond0 master br0 > ip link set bond0 type bridge_slave learning off > ip link set swp0 master br0 > > Now swp0 has learning enabled, bond0 has learning disabled. Not nice. > > Fix this by "dumpster diving" through the actual bridge port flags with > br_port_flag_is_set, at bridge join time. > > We use this opportunity to split dsa_port_change_brport_flags into two > distinct functions called dsa_port_inherit_brport_flags and > dsa_port_clear_brport_flags, now that the implementation for the two > cases is no longer similar. > > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > --- > net/dsa/port.c | 123 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 82 insertions(+), 41 deletions(-) > > diff --git a/net/dsa/port.c b/net/dsa/port.c > index fcbe5b1545b8..346c50467810 100644 > --- a/net/dsa/port.c > +++ b/net/dsa/port.c > @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp) > rtnl_unlock(); > } > > -static void dsa_port_change_brport_flags(struct dsa_port *dp, > - bool bridge_offload) > +static void dsa_port_clear_brport_flags(struct dsa_port *dp, > + struct netlink_ext_ack *extack) > { > struct switchdev_brport_flags flags; > - int flag; > > - flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; > - if (bridge_offload) > - flags.val = flags.mask; > - else > - flags.val = flags.mask & ~BR_LEARNING; > + flags.mask = BR_LEARNING; > + flags.val = 0; > + dsa_port_bridge_flags(dp, flags, extack); Would not you want to use the same for_each_set_bit() loop that dsa_port_change_br_flags() uses, that would be a tad more compact.
On Fri, Mar 19, 2021 at 03:08:46PM -0700, Florian Fainelli wrote: > > > On 3/18/2021 4:18 PM, Vladimir Oltean wrote: > > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > > > DSA currently assumes that the bridge port starts off with this > > constellation of bridge port flags: > > > > - learning on > > - unicast flooding on > > - multicast flooding on > > - broadcast flooding on > > > > just by virtue of code copy-pasta from the bridge layer (new_nbp). > > This was a simple enough strategy thus far, because the 'bridge join' > > moment always coincided with the 'bridge port creation' moment. > > > > But with sandwiched interfaces, such as: > > > > br0 > > | > > bond0 > > | > > swp0 > > > > it may happen that the user has had time to change the bridge port flags > > of bond0 before enslaving swp0 to it. In that case, swp0 will falsely > > assume that the bridge port flags are those determined by new_nbp, when > > in fact this can happen: > > > > ip link add br0 type bridge > > ip link add bond0 type bond > > ip link set bond0 master br0 > > ip link set bond0 type bridge_slave learning off > > ip link set swp0 master br0 > > > > Now swp0 has learning enabled, bond0 has learning disabled. Not nice. > > > > Fix this by "dumpster diving" through the actual bridge port flags with > > br_port_flag_is_set, at bridge join time. > > > > We use this opportunity to split dsa_port_change_brport_flags into two > > distinct functions called dsa_port_inherit_brport_flags and > > dsa_port_clear_brport_flags, now that the implementation for the two > > cases is no longer similar. > > > > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > > --- > > net/dsa/port.c | 123 ++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 82 insertions(+), 41 deletions(-) > > > > diff --git a/net/dsa/port.c b/net/dsa/port.c > > index fcbe5b1545b8..346c50467810 100644 > > --- a/net/dsa/port.c > > +++ b/net/dsa/port.c > > @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp) > > rtnl_unlock(); > > } > > > > -static void dsa_port_change_brport_flags(struct dsa_port *dp, > > - bool bridge_offload) > > +static void dsa_port_clear_brport_flags(struct dsa_port *dp, > > + struct netlink_ext_ack *extack) > > { > > struct switchdev_brport_flags flags; > > - int flag; > > > > - flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; > > - if (bridge_offload) > > - flags.val = flags.mask; > > - else > > - flags.val = flags.mask & ~BR_LEARNING; > > + flags.mask = BR_LEARNING; > > + flags.val = 0; > > + dsa_port_bridge_flags(dp, flags, extack); > > Would not you want to use the same for_each_set_bit() loop that > dsa_port_change_br_flags() uses, that would be a tad more compact. > -- > Florian The reworded version has an equal number of lines, but at least it catches errors now: static void dsa_port_clear_brport_flags(struct dsa_port *dp, struct netlink_ext_ack *extack) { const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; int flag, err; for_each_set_bit(flag, &mask, 32) { struct switchdev_brport_flags flags = {0}; flags.mask = BIT(flag); flags.val = val & BIT(flag); err = dsa_port_bridge_flags(dp, flags, extack); if (err && err != -EOPNOTSUPP) dev_err(dp->ds->dev, "failed to clear bridge port flag %d: %d (%pe)\n", flag, err, ERR_PTR(err)); } }
diff --git a/net/dsa/port.c b/net/dsa/port.c index fcbe5b1545b8..346c50467810 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp) rtnl_unlock(); } -static void dsa_port_change_brport_flags(struct dsa_port *dp, - bool bridge_offload) +static void dsa_port_clear_brport_flags(struct dsa_port *dp, + struct netlink_ext_ack *extack) { struct switchdev_brport_flags flags; - int flag; - flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; - if (bridge_offload) - flags.val = flags.mask; - else - flags.val = flags.mask & ~BR_LEARNING; + flags.mask = BR_LEARNING; + flags.val = 0; + dsa_port_bridge_flags(dp, flags, extack); + + flags.mask = BR_FLOOD; + flags.val = BR_FLOOD; + dsa_port_bridge_flags(dp, flags, extack); + + flags.mask = BR_MCAST_FLOOD; + flags.val = BR_MCAST_FLOOD; + dsa_port_bridge_flags(dp, flags, extack); + + flags.mask = BR_BCAST_FLOOD; + flags.val = BR_BCAST_FLOOD; + dsa_port_bridge_flags(dp, flags, extack); +} + +static int dsa_port_inherit_brport_flags(struct dsa_port *dp, + struct netlink_ext_ack *extack) +{ + const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | + BR_BCAST_FLOOD; + struct net_device *brport_dev = dsa_port_to_bridge_port(dp); + int flag, err; + + for_each_set_bit(flag, &mask, 32) { + struct switchdev_brport_flags flags = {0}; - for_each_set_bit(flag, &flags.mask, 32) { - struct switchdev_brport_flags tmp; + flags.mask = BIT(flag); - tmp.val = flags.val & BIT(flag); - tmp.mask = BIT(flag); + if (br_port_flag_is_set(brport_dev, BIT(flag))) + flags.val = BIT(flag); - dsa_port_bridge_flags(dp, tmp, NULL); + err = dsa_port_bridge_flags(dp, flags, extack); + if (err && err != -EOPNOTSUPP) + return err; } + + return 0; +} + +static int dsa_port_switchdev_sync(struct dsa_port *dp, + struct netlink_ext_ack *extack) +{ + int err; + + err = dsa_port_inherit_brport_flags(dp, extack); + if (err) + return err; + + return 0; +} + +/* Configure the port for standalone mode (no address learning, flood + * everything, BR_STATE_FORWARDING, etc). + * The bridge only emits SWITCHDEV_ATTR_ID_PORT_* events when the user + * requests it through netlink or sysfs, but not automatically at port + * join or leave, so we need to handle resetting the brport flags ourselves. + * But we even prefer it that way, because otherwise, some setups might never + * get the notification they need, for example, when a port leaves a LAG that + * offloads the bridge, it becomes standalone, but as far as the bridge is + * concerned, no port ever left. + */ +static void dsa_port_switchdev_unsync(struct dsa_port *dp) +{ + dsa_port_clear_brport_flags(dp, NULL); + + /* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer, + * so allow it to be in BR_STATE_FORWARDING to be kept functional + */ + dsa_port_set_state_now(dp, BR_STATE_FORWARDING); } int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br, @@ -155,24 +211,25 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br, }; int err; - /* Notify the port driver to set its configurable flags in a way that - * matches the initial settings of a bridge port. - */ - dsa_port_change_brport_flags(dp, true); - /* Here the interface is already bridged. Reflect the current * configuration so that drivers can program their chips accordingly. */ dp->bridge_dev = br; err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_JOIN, &info); + if (err) + goto out_rollback; - /* The bridging is rolled back on error */ - if (err) { - dsa_port_change_brport_flags(dp, false); - dp->bridge_dev = NULL; - } + err = dsa_port_switchdev_sync(dp, extack); + if (err) + goto out_rollback_unbridge; + return 0; + +out_rollback_unbridge: + dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info); +out_rollback: + dp->bridge_dev = NULL; return err; } @@ -186,6 +243,8 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br) }; int err; + dsa_port_switchdev_unsync(dp); + /* Here the port is already unbridged. Reflect the current configuration * so that drivers can program their chips accordingly. */ @@ -194,24 +253,6 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br) err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info); if (err) pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n"); - - /* Configure the port for standalone mode (no address learning, - * flood everything). - * The bridge only emits SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS events - * when the user requests it through netlink or sysfs, but not - * automatically at port join or leave, so we need to handle resetting - * the brport flags ourselves. But we even prefer it that way, because - * otherwise, some setups might never get the notification they need, - * for example, when a port leaves a LAG that offloads the bridge, - * it becomes standalone, but as far as the bridge is concerned, no - * port ever left. - */ - dsa_port_change_brport_flags(dp, false); - - /* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer, - * so allow it to be in BR_STATE_FORWARDING to be kept functional - */ - dsa_port_set_state_now(dp, BR_STATE_FORWARDING); } int dsa_port_lag_change(struct dsa_port *dp,