diff mbox series

[net-next,6/6] net: bridge: avoid uselessly making offloaded ports promiscuous

Message ID 20220408200337.718067-7-vladimir.oltean@nxp.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series Disable host flooding for DSA ports under a bridge | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 2 maintainers not CCed: bridge@lists.linux-foundation.org razor@blackwall.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 73 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vladimir Oltean April 8, 2022, 8:03 p.m. UTC
The bridge driver's intention by making ports promiscuous is to turn off
their RX filters such that these ports receive packets with any MAC DA.

A quick survey of the kernel drivers that call
switchdev_bridge_port_offload() shows that either these do not implement
ndo_change_rx_flags() at all, or they explicitly ignore changes to
IFF_PROMISC (am65_cpsw_slave_set_promisc, cpsw_set_promiscious,
ocelot_set_rx_mode).

This makes sense, because hardware that is purpose-built to do L2
forwarding generally already knows it should accept any MAC DA on its
ports.

That is not to say that IFF_PROMISC makes no sense for switchdev drivers.
For example, DSA has the concept of multiple address databases (this is
achieved by effectively partitioning the FDB: reserve a database - FID -
for each port operating as standalone, a FID for each VLAN-unaware
bridge, a FID for each bridge VLAN). The address database of a
standalone port is managed through the standard dev->uc and dev->uc
lists and is used to filter towards the hosts the addresses required for
local termination. The bridge-related address databases are managed
using switchdev (SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE).

IFF_PROMISC is intrinsically connected to dev->uc and dev->mc (see the
implementation of __dev_set_rx_mode which puts the interface in
promiscuous mode if the unicast list isn't empty but the device doesn't
support IFF_UNICAST_FLT), and therefore to what DSA implements as the
standalone port address database (there, an entry in dev->uc means
"forward it to CPU", the absence of it means "drop it", and promiscuity
means "put the CPU in the flood mask of packets with unknown MAC DA").

Whereas there is no IFF_PROMISC equivalent to the FDB entries notified
through switchdev (therefore to the bridge-related address databases),
because none is needed.

In this model, the bridge driver, which is only trying to secure its
reception of packets, is in fact overstepping, because it manages
something which is outside of its competence: the host flooding of the
standalone port database, when in fact that database will not be the one
used by packets handled by the bridging service.

In turn, this prevents further optimizations from being applied in
particular to DSA, and in general to any switchdev driver. A desirable
goal is to eliminate host flooding of packets which are known to be
unnecessary and only dropped later in software [1].

In an ideal world with ideal hardware:
(a) flooding would be controlled per FID rather than per port
(b) egress flooding towards a certain port can be controlled
    independently depending on the actual port ingress port, rather than
    globally, regardless of ingress port

When (a) does not hold true, the bridge will force the port to keep host
flooding enabled, even if this is not otherwise needed (there is no
station behind a "foreign interface" that requires software forwarding;
the only packets sent by the accelerator to the CPU are for termination
purposes).

When (b) does not hold true, it means that a 4-port switch where 1 port
is standalone and 3 are bridged (again with no foreign interface) will
have host flooding enabled for all 4 ports (including the standalone
port, because the bridge is keeping host flooding enabled, and all ports
are serviced by the same CPU port).

Since DSA is a framework and not just a driver for a single device,
these nonidealities do hold true, and the bridge unnecessarily setting
IFF_PROMISC on its ports is a real roadblock towards disabling host
flooding in practical scenarios.

The proposed solution is to make the bridge driver stop touching port
promiscuity for offloaded switchdev ports, and let them manage
promiscuity by themselves as they see fit. It can achieve this by
looking at net_bridge_port :: offload_count, which is updated
voluntarily by switchdev drivers using switchdev_bridge_port_offload().

br_manage_promisc() is already called by nbp_update_port_count() on a
port join/leave, and the implicit assumption is that
switchdev_bridge_port_offload() has already been called by that time
(from netdev_master_upper_dev_link).

[1] https://www.youtube.com/watch?v=B1HhxEcU7Jg

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_if.c | 63 ++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 24 deletions(-)

Comments

kernel test robot April 9, 2022, 10:17 a.m. UTC | #1
Hi Vladimir,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Vladimir-Oltean/Disable-host-flooding-for-DSA-ports-under-a-bridge/20220409-041556
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git e8bd70250a821edb541c3abe1eacdad9f8dc7adf
config: x86_64-randconfig-a003 (https://download.01.org/0day-ci/archive/20220409/202204091856.0PBgeBSa-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 893e1c18b98d8bbc7b8d7d22cc2c348f65c72ad9)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/2b24e24c704fa129c753dbc8fb764c95f4a3562c
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Vladimir-Oltean/Disable-host-flooding-for-DSA-ports-under-a-bridge/20220409-041556
        git checkout 2b24e24c704fa129c753dbc8fb764c95f4a3562c
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash net/bridge/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> net/bridge/br_if.c:145:10: error: no member named 'offload_count' in 'struct net_bridge_port'
                   if (p->offload_count) {
                       ~  ^
   1 error generated.


vim +145 net/bridge/br_if.c

   129	
   130	/* When a port is added or removed or when certain port flags
   131	 * change, this function is called to automatically manage
   132	 * promiscuity setting of all the bridge ports.  We are always called
   133	 * under RTNL so can skip using rcu primitives.
   134	 */
   135	void br_manage_promisc(struct net_bridge *br)
   136	{
   137		struct net_bridge_port *p;
   138	
   139		list_for_each_entry(p, &br->port_list, list) {
   140			/* Offloaded ports have a separate address database for
   141			 * forwarding, which is managed through switchdev and not
   142			 * through dev_uc_add(), so the promiscuous concept makes no
   143			 * sense for them. Avoid updating promiscuity in that case.
   144			 */
 > 145			if (p->offload_count) {
   146				br_port_clear_promisc(p);
   147				continue;
   148			}
   149	
   150			/* If bridge is promiscuous, unconditionally place all ports
   151			 * in promiscuous mode too. This allows the bridge device to
   152			 * locally receive all unknown traffic.
   153			 */
   154			if (br->dev->flags & IFF_PROMISC) {
   155				br_port_set_promisc(p);
   156				continue;
   157			}
   158	
   159			/* If vlan filtering is disabled, place all ports in
   160			 * promiscuous mode.
   161			 */
   162			if (!br_vlan_enabled(br->dev)) {
   163				br_port_set_promisc(p);
   164				continue;
   165			}
   166	
   167			/* If the number of auto-ports is <= 1, then all other ports
   168			 * will have their output configuration statically specified
   169			 * through fdbs. Since ingress on the auto-port becomes
   170			 * forwarding/egress to other ports and egress configuration is
   171			 * statically known, we can say that ingress configuration of
   172			 * the auto-port is also statically known.
   173			 * This lets us disable promiscuous mode and write this config
   174			 * to hw.
   175			 */
   176			if (br->auto_cnt == 0 ||
   177			    (br->auto_cnt == 1 && br_auto_port(p)))
   178				br_port_clear_promisc(p);
   179			else
   180				br_port_set_promisc(p);
   181		}
   182	}
   183
Vladimir Oltean April 9, 2022, 11:04 a.m. UTC | #2
On Sat, Apr 09, 2022 at 06:17:15PM +0800, kernel test robot wrote:
> >> net/bridge/br_if.c:145:10: error: no member named 'offload_count' in 'struct net_bridge_port'
>                    if (p->offload_count) {
>                        ~  ^
>    130	/* When a port is added or removed or when certain port flags
>    131	 * change, this function is called to automatically manage
>    132	 * promiscuity setting of all the bridge ports.  We are always called
>    133	 * under RTNL so can skip using rcu primitives.
>    134	 */
>    135	void br_manage_promisc(struct net_bridge *br)
>    136	{
>    137		struct net_bridge_port *p;
>    138	
>    139		list_for_each_entry(p, &br->port_list, list) {
>    140			/* Offloaded ports have a separate address database for
>    141			 * forwarding, which is managed through switchdev and not
>    142			 * through dev_uc_add(), so the promiscuous concept makes no
>    143			 * sense for them. Avoid updating promiscuity in that case.
>    144			 */
>  > 145			if (p->offload_count) {

Good point. Please imagine there's a static inline nbp_get_offload_count()
that returns 0 when CONFIG_NET_SWITCHDEV=n. I'll address this for v2.

>    146				br_port_clear_promisc(p);
>    147				continue;
>    148			}
>    149	
>    150			/* If bridge is promiscuous, unconditionally place all ports
>    151			 * in promiscuous mode too. This allows the bridge device to
>    152			 * locally receive all unknown traffic.
>    153			 */
>    154			if (br->dev->flags & IFF_PROMISC) {
>    155				br_port_set_promisc(p);
>    156				continue;
>    157			}
>    158	
>    159			/* If vlan filtering is disabled, place all ports in
>    160			 * promiscuous mode.
>    161			 */
>    162			if (!br_vlan_enabled(br->dev)) {
>    163				br_port_set_promisc(p);
>    164				continue;
>    165			}
>    166	
>    167			/* If the number of auto-ports is <= 1, then all other ports
>    168			 * will have their output configuration statically specified
>    169			 * through fdbs. Since ingress on the auto-port becomes
>    170			 * forwarding/egress to other ports and egress configuration is
>    171			 * statically known, we can say that ingress configuration of
>    172			 * the auto-port is also statically known.
>    173			 * This lets us disable promiscuous mode and write this config
>    174			 * to hw.
>    175			 */
>    176			if (br->auto_cnt == 0 ||
>    177			    (br->auto_cnt == 1 && br_auto_port(p)))
>    178				br_port_clear_promisc(p);
>    179			else
>    180				br_port_set_promisc(p);
>    181		}
>    182	}
>    183
kernel test robot April 14, 2022, 1:53 p.m. UTC | #3
Hi Vladimir,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Vladimir-Oltean/Disable-host-flooding-for-DSA-ports-under-a-bridge/20220409-041556
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git e8bd70250a821edb541c3abe1eacdad9f8dc7adf
config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20220414/202204142133.6pDDI4xD-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-19) 11.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/2b24e24c704fa129c753dbc8fb764c95f4a3562c
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Vladimir-Oltean/Disable-host-flooding-for-DSA-ports-under-a-bridge/20220409-041556
        git checkout 2b24e24c704fa129c753dbc8fb764c95f4a3562c
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash net/bridge/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net/bridge/br_if.c: In function 'br_manage_promisc':
>> net/bridge/br_if.c:145:22: error: 'struct net_bridge_port' has no member named 'offload_count'
     145 |                 if (p->offload_count) {
         |                      ^~


vim +145 net/bridge/br_if.c

   129	
   130	/* When a port is added or removed or when certain port flags
   131	 * change, this function is called to automatically manage
   132	 * promiscuity setting of all the bridge ports.  We are always called
   133	 * under RTNL so can skip using rcu primitives.
   134	 */
   135	void br_manage_promisc(struct net_bridge *br)
   136	{
   137		struct net_bridge_port *p;
   138	
   139		list_for_each_entry(p, &br->port_list, list) {
   140			/* Offloaded ports have a separate address database for
   141			 * forwarding, which is managed through switchdev and not
   142			 * through dev_uc_add(), so the promiscuous concept makes no
   143			 * sense for them. Avoid updating promiscuity in that case.
   144			 */
 > 145			if (p->offload_count) {
   146				br_port_clear_promisc(p);
   147				continue;
   148			}
   149	
   150			/* If bridge is promiscuous, unconditionally place all ports
   151			 * in promiscuous mode too. This allows the bridge device to
   152			 * locally receive all unknown traffic.
   153			 */
   154			if (br->dev->flags & IFF_PROMISC) {
   155				br_port_set_promisc(p);
   156				continue;
   157			}
   158	
   159			/* If vlan filtering is disabled, place all ports in
   160			 * promiscuous mode.
   161			 */
   162			if (!br_vlan_enabled(br->dev)) {
   163				br_port_set_promisc(p);
   164				continue;
   165			}
   166	
   167			/* If the number of auto-ports is <= 1, then all other ports
   168			 * will have their output configuration statically specified
   169			 * through fdbs. Since ingress on the auto-port becomes
   170			 * forwarding/egress to other ports and egress configuration is
   171			 * statically known, we can say that ingress configuration of
   172			 * the auto-port is also statically known.
   173			 * This lets us disable promiscuous mode and write this config
   174			 * to hw.
   175			 */
   176			if (br->auto_cnt == 0 ||
   177			    (br->auto_cnt == 1 && br_auto_port(p)))
   178				br_port_clear_promisc(p);
   179			else
   180				br_port_set_promisc(p);
   181		}
   182	}
   183
diff mbox series

Patch

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 55f47cadb114..6ac5313e1cb8 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -135,34 +135,49 @@  static void br_port_clear_promisc(struct net_bridge_port *p)
 void br_manage_promisc(struct net_bridge *br)
 {
 	struct net_bridge_port *p;
-	bool set_all = false;
-
-	/* If vlan filtering is disabled or bridge interface is placed
-	 * into promiscuous mode, place all ports in promiscuous mode.
-	 */
-	if ((br->dev->flags & IFF_PROMISC) || !br_vlan_enabled(br->dev))
-		set_all = true;
 
 	list_for_each_entry(p, &br->port_list, list) {
-		if (set_all) {
+		/* Offloaded ports have a separate address database for
+		 * forwarding, which is managed through switchdev and not
+		 * through dev_uc_add(), so the promiscuous concept makes no
+		 * sense for them. Avoid updating promiscuity in that case.
+		 */
+		if (p->offload_count) {
+			br_port_clear_promisc(p);
+			continue;
+		}
+
+		/* If bridge is promiscuous, unconditionally place all ports
+		 * in promiscuous mode too. This allows the bridge device to
+		 * locally receive all unknown traffic.
+		 */
+		if (br->dev->flags & IFF_PROMISC) {
+			br_port_set_promisc(p);
+			continue;
+		}
+
+		/* If vlan filtering is disabled, place all ports in
+		 * promiscuous mode.
+		 */
+		if (!br_vlan_enabled(br->dev)) {
 			br_port_set_promisc(p);
-		} else {
-			/* If the number of auto-ports is <= 1, then all other
-			 * ports will have their output configuration
-			 * statically specified through fdbs.  Since ingress
-			 * on the auto-port becomes forwarding/egress to other
-			 * ports and egress configuration is statically known,
-			 * we can say that ingress configuration of the
-			 * auto-port is also statically known.
-			 * This lets us disable promiscuous mode and write
-			 * this config to hw.
-			 */
-			if (br->auto_cnt == 0 ||
-			    (br->auto_cnt == 1 && br_auto_port(p)))
-				br_port_clear_promisc(p);
-			else
-				br_port_set_promisc(p);
+			continue;
 		}
+
+		/* If the number of auto-ports is <= 1, then all other ports
+		 * will have their output configuration statically specified
+		 * through fdbs. Since ingress on the auto-port becomes
+		 * forwarding/egress to other ports and egress configuration is
+		 * statically known, we can say that ingress configuration of
+		 * the auto-port is also statically known.
+		 * This lets us disable promiscuous mode and write this config
+		 * to hw.
+		 */
+		if (br->auto_cnt == 0 ||
+		    (br->auto_cnt == 1 && br_auto_port(p)))
+			br_port_clear_promisc(p);
+		else
+			br_port_set_promisc(p);
 	}
 }