[2/4] bond_alb: don't rewrite bridged non-local MACs

Message ID 20210518210849.1673577-3-jarod@redhat.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [1/4] bonding: add pure source-mac-based tx hashing option

Checks

Context                         Check    Description
netdev/cover_letter             warning  Series does not have a cover letter
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cc_maintainers           success  CCed 6 of 6 maintainers
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              fail     Errors and warnings before: 0 this patch: 2
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 35 lines checked
netdev/build_allmodconfig_warn  fail     Errors and warnings before: 0 this patch: 2
netdev/header_inline            success  Link

Commit Message

Jarod Wilson May 18, 2021, 9:08 p.m. UTC
With a virtual machine behind a bridge on top of a bond, outgoing traffic
should retain the VM's source MAC. That works fine most of the time, until
doing a failover, and then the MAC gets rewritten to the bond slave's MAC,
and the return traffic gets dropped. If we don't rewrite the MAC there, we
don't lose any traffic.

Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Thomas Davis <tadavis@lbl.gov>
Cc: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/bonding/bond_alb.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)
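
The behaviour described in the commit message can be sketched in user space as follows. This is only an illustrative model, not the kernel code: the `model_*` types and functions are hypothetical stand-ins, with `mode_is_alb` standing in for the `BOND_MODE(bond)` check, `is_bridge_port` for `netif_is_bridge_port(bond->dev)`, and `model_slave_has_mac()` for the slave MAC lookup used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN   6
#define MAX_SLAVES 4

/* Hypothetical user-space stand-ins; these are not the kernel structures. */
struct model_slave {
	unsigned char dev_addr[ETH_ALEN];
};

struct model_bond {
	bool mode_is_alb;     /* stands in for BOND_MODE(bond) == BOND_MODE_ALB */
	bool is_bridge_port;  /* stands in for netif_is_bridge_port(bond->dev) */
	size_t nslaves;
	size_t curr_active;   /* index of the current active slave */
	struct model_slave slaves[MAX_SLAVES];
};

/* True if any slave of the bond owns this MAC address. */
static bool model_slave_has_mac(const struct model_bond *bond,
				const unsigned char *mac)
{
	for (size_t i = 0; i < bond->nslaves; i++) {
		if (memcmp(bond->slaves[i].dev_addr, mac, ETH_ALEN) == 0)
			return true;
	}
	return false;
}

/* True if the source MAC looks like it arrived via a bridge, i.e. the
 * bond is in alb mode, is itself a bridge port, and no slave owns it.
 */
static bool model_bridged_mac(const struct model_bond *bond,
			      const unsigned char *src)
{
	if (!bond->mode_is_alb)
		return false;
	if (!bond->is_bridge_port)
		return false;
	if (model_slave_has_mac(bond, src))
		return false;
	return true;
}

/* Mirrors the patched condition: rewrite the source MAC only when
 * transmitting on a non-active slave and the MAC is not a bridged one.
 * Returns true if the source MAC was rewritten.
 */
static bool model_alb_xmit_fixup(const struct model_bond *bond,
				 size_t tx_slave, unsigned char *src)
{
	assert(bond != NULL && src != NULL && tx_slave < bond->nslaves);

	if (tx_slave != bond->curr_active && !model_bridged_mac(bond, src)) {
		memcpy(src, bond->slaves[tx_slave].dev_addr, ETH_ALEN);
		return true;
	}
	return false;
}
```

In this model, a frame carrying a VM's MAC and sent through a non-active slave keeps its source address, while a frame carrying one of the bond's own slave MACs is still rewritten to the transmitting slave's address.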

Comments

Jay Vosburgh May 19, 2021, 10:31 p.m. UTC | #1
Jarod Wilson <jarod@redhat.com> wrote:

>With a virtual machine behind a bridge on top of a bond, outgoing traffic
>should retain the VM's source MAC. That works fine most of the time, until
>doing a failover, and then the MAC gets rewritten to the bond slave's MAC,
>and the return traffic gets dropped. If we don't rewrite the MAC there, we
>don't lose any traffic.

	Please have the log message specify that this applies only to
balance-alb mode.  Also, the usual nomenclature for bonding patches is
"[PATCH] bonding:"; for this case, I'd suggest adding "balance-alb:"
right afterwards to make clear that it's only for alb mode.  I didn't
really spot the "bond_alb" tag for what it was on first read, and it is
the only indication, other than the patch itself, that this change is
specific to alb mode.

>Cc: Jay Vosburgh <j.vosburgh@gmail.com>
>Cc: Veaceslav Falico <vfalico@gmail.com>
>Cc: Andy Gospodarek <andy@greyhouse.net>
>Cc: "David S. Miller" <davem@davemloft.net>
>Cc: Jakub Kicinski <kuba@kernel.org>
>Cc: Thomas Davis <tadavis@lbl.gov>
>Cc: netdev@vger.kernel.org
>Signed-off-by: Jarod Wilson <jarod@redhat.com>
>---
> drivers/net/bonding/bond_alb.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index 3455f2cc13f2..ce8257c7cbea 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -1302,6 +1302,26 @@ void bond_alb_deinitialize(struct bonding *bond)
> 		rlb_deinitialize(bond);
> }
> 
>+static bool bond_alb_bridged_mac(struct bonding *bond, struct ethhdr *eth_data)
>+{
>+	struct list_head *iter;
>+	struct slave *slave;
>+
>+	if (BOND_MODE(bond) != BOND_MODE_ALB)
>+		return false;
>+
>+	/* Don't modify source MACs that do not originate locally
>+	 * (e.g., arrive via a bridge).
>+	 */
>+	if (!netif_is_bridge_port(bond->dev))
>+		return false;

	I believe this logic will fail if the plumbing is, e.g., bond ->
vlan -> bridge, as netif_is_bridge_port() would not return true for the
bond in that case.

	Making this reliable is tricky at best, and it may be impossible
to get right for all possible cases.  As such, I think the comment above
should reflect the limited scope of what is actually being checked here
(i.e., that the bond itself is directly a bridge port).

	-J

>+
>+	if (bond_slave_has_mac_rx(bond, eth_data->h_source))
>+		return false;
>+
>+	return true;
>+}
>+
> static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
> 				    struct slave *tx_slave)
> {
>@@ -1316,7 +1336,8 @@ static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
> 	}
> 
> 	if (tx_slave && bond_slave_can_tx(tx_slave)) {
>-		if (tx_slave != rcu_access_pointer(bond->curr_active_slave)) {
>+		if (tx_slave != rcu_access_pointer(bond->curr_active_slave) &&
>+		    !bond_alb_bridged_mac(bond, eth_data)) {
> 			ether_addr_copy(eth_data->h_source,
> 					tx_slave->dev->dev_addr);
> 		}
>-- 
>2.30.2
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
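
The bond -> vlan -> bridge concern raised in the review can be illustrated with a small user-space model. This is a sketch under stated assumptions, not a proposed fix: `struct model_dev` and both helper functions are hypothetical, each device is given a single `upper` pointer for simplicity (real devices can have several upper devices), and `is_bridge_port` stands in for what `netif_is_bridge_port()` would report for that device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device in a stacked setup. */
struct model_dev {
	const char *name;
	bool is_bridge_port;            /* stand-in for netif_is_bridge_port(dev) */
	const struct model_dev *upper;  /* single upper device; a simplification */
};

/* The check as the patch performs it: only the device itself is tested. */
static bool direct_bridge_port(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->is_bridge_port;
}

/* Illustrative alternative: also look at every device stacked above.
 * Shown only to contrast with the direct check; the review notes that
 * getting this right for all topologies may not be possible.
 */
static bool any_upper_is_bridge_port(const struct model_dev *dev)
{
	for (; dev != NULL; dev = dev->upper) {
		if (dev->is_bridge_port)
			return true;
	}
	return false;
}
```

With a stack of `bond0` under `bond0.100` under `br0`, where only the VLAN device is the bridge port, `direct_bridge_port()` on `bond0` returns false even though bridged traffic reaches the bond, which is the case the review points out.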
Patch

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 3455f2cc13f2..ce8257c7cbea 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1302,6 +1302,26 @@  void bond_alb_deinitialize(struct bonding *bond)
 		rlb_deinitialize(bond);
 }
 
+static bool bond_alb_bridged_mac(struct bonding *bond, struct ethhdr *eth_data)
+{
+	struct list_head *iter;
+	struct slave *slave;
+
+	if (BOND_MODE(bond) != BOND_MODE_ALB)
+		return false;
+
+	/* Don't modify source MACs that do not originate locally
+	 * (e.g., arrive via a bridge).
+	 */
+	if (!netif_is_bridge_port(bond->dev))
+		return false;
+
+	if (bond_slave_has_mac_rx(bond, eth_data->h_source))
+		return false;
+
+	return true;
+}
+
 static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
 				    struct slave *tx_slave)
 {
@@ -1316,7 +1336,8 @@  static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
 	}
 
 	if (tx_slave && bond_slave_can_tx(tx_slave)) {
-		if (tx_slave != rcu_access_pointer(bond->curr_active_slave)) {
+		if (tx_slave != rcu_access_pointer(bond->curr_active_slave) &&
+		    !bond_alb_bridged_mac(bond, eth_data)) {
 			ether_addr_copy(eth_data->h_source,
 					tx_slave->dev->dev_addr);
 		}