Message ID | 20241210141245.327886-3-daniel@iogearbox.net (mailing list archive) |
---|---|
State | Accepted |
Commit | 77b11c8bf3a228d1c63464534c2dcc8d9c8bf7ff |
Delegated to: | Netdev Maintainers |
Series | [net,1/5] net, team, bonding: Add netdev_base_features helper |
On 12/10/24 16:12, Daniel Borkmann wrote:
> Drivers like mlx5 expose NIC's vlan_features such as
> NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM which are
> later not propagated when the underlying devices are bonded and
> a vlan device created on top of the bond.
>
> Right now, the more cumbersome workaround for this is to create
> the vlan on top of the mlx5 and then enslave the vlan devices
> to a bond.
>
> To fix this, add NETIF_F_GSO_ENCAP_ALL to BOND_VLAN_FEATURES
> such that bond_compute_features() can probe and propagate the
> vlan_features from the slave devices up to the vlan device.
>
> Given the following bond:
>
>   # ethtool -i enp2s0f{0,1}np{0,1}
>   driver: mlx5_core
>   [...]
>
>   # ethtool -k enp2s0f0np0 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: on
>   rx-udp-gro-forwarding: off
>
>   # ethtool -k enp2s0f1np1 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: on
>   rx-udp-gro-forwarding: off
>
>   # ethtool -k bond0 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> Before:
>
>   # ethtool -k bond0.100 | grep udp
>   tx-udp_tnl-segmentation: off [requested on]
>   tx-udp_tnl-csum-segmentation: off [requested on]
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> After:
>
>   # ethtool -k bond0.100 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> Various users have run into this reporting performance issues when
> configuring Cilium in vxlan tunneling mode and having the combination
> of bond & vlan for the core devices connecting the Kubernetes cluster
> to the outside world.
>
> Fixes: a9b3ace44c7d ("bonding: fix vlan_features computing")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Nikolay Aleksandrov <razor@blackwall.org>
> Cc: Ido Schimmel <idosch@idosch.org>
> Cc: Jiri Pirko <jiri@nvidia.com>
> ---
>  drivers/net/bonding/bond_main.c | 1 +
>  1 file changed, 1 insertion(+)
>

Indeed, I've tested a similar change a year ago to get the expected performance.

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
On Tue, Dec 10, 2024 at 03:12:43PM +0100, Daniel Borkmann wrote:
> Drivers like mlx5 expose NIC's vlan_features such as
> NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM which are
> later not propagated when the underlying devices are bonded and
> a vlan device created on top of the bond.
>
> Right now, the more cumbersome workaround for this is to create
> the vlan on top of the mlx5 and then enslave the vlan devices
> to a bond.
>
> To fix this, add NETIF_F_GSO_ENCAP_ALL to BOND_VLAN_FEATURES
> such that bond_compute_features() can probe and propagate the
> vlan_features from the slave devices up to the vlan device.
>
> Given the following bond:
>
>   # ethtool -i enp2s0f{0,1}np{0,1}
>   driver: mlx5_core
>   [...]
>
>   # ethtool -k enp2s0f0np0 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: on
>   rx-udp-gro-forwarding: off
>
>   # ethtool -k enp2s0f1np1 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: on
>   rx-udp-gro-forwarding: off
>
>   # ethtool -k bond0 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> Before:
>
>   # ethtool -k bond0.100 | grep udp
>   tx-udp_tnl-segmentation: off [requested on]
>   tx-udp_tnl-csum-segmentation: off [requested on]
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> After:
>
>   # ethtool -k bond0.100 | grep udp
>   tx-udp_tnl-segmentation: on
>   tx-udp_tnl-csum-segmentation: on
>   tx-udp-segmentation: on
>   rx-udp_tunnel-port-offload: off [fixed]
>   rx-udp-gro-forwarding: off
>
> Various users have run into this reporting performance issues when
> configuring Cilium in vxlan tunneling mode and having the combination
> of bond & vlan for the core devices connecting the Kubernetes cluster
> to the outside world.
>
> Fixes: a9b3ace44c7d ("bonding: fix vlan_features computing")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Nikolay Aleksandrov <razor@blackwall.org>
> Cc: Ido Schimmel <idosch@idosch.org>
> Cc: Jiri Pirko <jiri@nvidia.com>
> ---
>  drivers/net/bonding/bond_main.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 320dd71392ef..7b78c2bada81 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1534,6 +1534,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
>
>  #define BOND_VLAN_FEATURES  (NETIF_F_HW_CSUM | NETIF_F_SG | \
>                               NETIF_F_FRAGLIST | NETIF_F_GSO_SOFTWARE | \
> +                             NETIF_F_GSO_ENCAP_ALL | \
>                               NETIF_F_HIGHDMA | NETIF_F_LRO)
>
>  #define BOND_ENC_FEATURES   (NETIF_F_HW_CSUM | NETIF_F_SG | \
> --
> 2.43.0
>

Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 320dd71392ef..7b78c2bada81 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1534,6 +1534,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
 
 #define BOND_VLAN_FEATURES  (NETIF_F_HW_CSUM | NETIF_F_SG | \
                              NETIF_F_FRAGLIST | NETIF_F_GSO_SOFTWARE | \
+                             NETIF_F_GSO_ENCAP_ALL | \
                              NETIF_F_HIGHDMA | NETIF_F_LRO)
 
 #define BOND_ENC_FEATURES   (NETIF_F_HW_CSUM | NETIF_F_SG | \
Drivers like mlx5 expose NIC's vlan_features such as
NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM which are
later not propagated when the underlying devices are bonded and
a vlan device created on top of the bond.

Right now, the more cumbersome workaround for this is to create
the vlan on top of the mlx5 and then enslave the vlan devices
to a bond.

To fix this, add NETIF_F_GSO_ENCAP_ALL to BOND_VLAN_FEATURES
such that bond_compute_features() can probe and propagate the
vlan_features from the slave devices up to the vlan device.

Given the following bond:

  # ethtool -i enp2s0f{0,1}np{0,1}
  driver: mlx5_core
  [...]

  # ethtool -k enp2s0f0np0 | grep udp
  tx-udp_tnl-segmentation: on
  tx-udp_tnl-csum-segmentation: on
  tx-udp-segmentation: on
  rx-udp_tunnel-port-offload: on
  rx-udp-gro-forwarding: off

  # ethtool -k enp2s0f1np1 | grep udp
  tx-udp_tnl-segmentation: on
  tx-udp_tnl-csum-segmentation: on
  tx-udp-segmentation: on
  rx-udp_tunnel-port-offload: on
  rx-udp-gro-forwarding: off

  # ethtool -k bond0 | grep udp
  tx-udp_tnl-segmentation: on
  tx-udp_tnl-csum-segmentation: on
  tx-udp-segmentation: on
  rx-udp_tunnel-port-offload: off [fixed]
  rx-udp-gro-forwarding: off

Before:

  # ethtool -k bond0.100 | grep udp
  tx-udp_tnl-segmentation: off [requested on]
  tx-udp_tnl-csum-segmentation: off [requested on]
  tx-udp-segmentation: on
  rx-udp_tunnel-port-offload: off [fixed]
  rx-udp-gro-forwarding: off

After:

  # ethtool -k bond0.100 | grep udp
  tx-udp_tnl-segmentation: on
  tx-udp_tnl-csum-segmentation: on
  tx-udp-segmentation: on
  rx-udp_tunnel-port-offload: off [fixed]
  rx-udp-gro-forwarding: off

Various users have run into this reporting performance issues when
configuring Cilium in vxlan tunneling mode and having the combination
of bond & vlan for the core devices connecting the Kubernetes cluster
to the outside world.

Fixes: a9b3ace44c7d ("bonding: fix vlan_features computing")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Ido Schimmel <idosch@idosch.org>
Cc: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/bonding/bond_main.c | 1 +
 1 file changed, 1 insertion(+)
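
The effect of widening BOND_VLAN_FEATURES can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the bit values, the MASK_BEFORE/MASK_AFTER names, and compute_vlan_features() are hypothetical, and the model reduces the computation to "keep a feature only if the mask allows it and every slave advertises it". The kernel's real feature computation applies additional rules that are not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration; NOT the kernel's values. */
#define F_SG                   (1u << 0)
#define F_HW_CSUM              (1u << 1)
#define F_GSO_UDP_TUNNEL       (1u << 2)
#define F_GSO_UDP_TUNNEL_CSUM  (1u << 3)

/* Stand-in for the tunnel GSO group; only the two bits named in the patch. */
#define F_GSO_ENCAP_ALL  (F_GSO_UDP_TUNNEL | F_GSO_UDP_TUNNEL_CSUM)

/* Simplified masks standing in for BOND_VLAN_FEATURES before/after the patch. */
#define MASK_BEFORE  (F_SG | F_HW_CSUM)
#define MASK_AFTER   (MASK_BEFORE | F_GSO_ENCAP_ALL)

/*
 * Simplified model of the propagation: start from the mask and drop any
 * feature that at least one slave does not advertise in its vlan_features.
 * A feature outside the mask can never reach the result, regardless of
 * what the slaves support.
 */
static uint32_t compute_vlan_features(const uint32_t *slave_vlan_features,
                                      size_t n_slaves, uint32_t mask)
{
    uint32_t features = mask;

    assert(slave_vlan_features != NULL || n_slaves == 0);

    for (size_t i = 0; i < n_slaves; i++)
        features &= slave_vlan_features[i];

    return features;
}
```

In this model, two slaves that both advertise the tunnel GSO bits yield a result without those bits under MASK_BEFORE and with them under MASK_AFTER, which mirrors the before/after ethtool output for bond0.100 in the commit message.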