Message ID:    1738729316-25922-1-git-send-email-shradhagupta@linux.microsoft.com
State:         New
Delegated to:  Netdev Maintainers
Series:        Enable Big TCP for MANA devices
On Tue, 4 Feb 2025 20:21:55 -0800 Shradha Gupta <shradhagupta@linux.microsoft.com> wrote:

> On Azure, increasing a VF's TCP segment size up to GSO_MAX_SIZE is not
> possible without allowing the same for the netvsc NIC (as the NICs are
> bonded together). For bonded NICs, the minimum of the maximum segment
> sizes of the members is propagated in the stack.
>
> Therefore, we use netif_set_tso_max_size() to set the max segment size
> to the VF's segment size for netvsc too, when the data path is switched
> over to the VF.
>
> Tested on an Azure environment with Accelerated Networking enabled and
> disabled.
>
> Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>

Since the data path can change at any time (e.g. hot remove of the VF),
how does the TCP stack react to the GSO max size changing underneath it?
Is it like a path MTU change, where some packets are lost until TCP
retries and has to rediscover?
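For context on the "min of the members" behavior referenced above: the
bonding driver recomputes its aggregate TSO limits whenever membership
changes. The sketch below is modeled on bond_compute_features() in
drivers/net/bonding/bond_main.c, simplified to the TSO fields only;
example_compute_tso_limits() is a made-up name, and netvsc's transparent
VF bonding is a separate mechanism that applies the same minimum rule.

	#include <net/bonding.h>

	/* Simplified sketch of the min-propagation: the aggregate device
	 * can never advertise a larger TSO limit than its most
	 * restrictive member.
	 */
	static void example_compute_tso_limits(struct bonding *bond)
	{
		unsigned int tso_max_size = TSO_MAX_SIZE;
		u16 tso_max_segs = TSO_MAX_SEGS;
		struct list_head *iter;
		struct slave *slave;

		/* Walk the members and keep the most restrictive limits */
		bond_for_each_slave(bond, slave, iter) {
			tso_max_size = min(tso_max_size, slave->dev->tso_max_size);
			tso_max_segs = min(tso_max_segs, slave->dev->tso_max_segs);
		}

		/* Publish the clamped limits on the aggregate device */
		netif_set_tso_max_size(bond->dev, tso_max_size);
		netif_set_tso_max_segs(bond->dev, tso_max_segs);
	}

This is why raising only the VF's limit has no effect: the bond's
effective limit stays pinned at netvsc's value until netvsc is raised too.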
On Tue, Feb 04, 2025 at 09:56:43PM -0800, Stephen Hemminger wrote:
> On Tue, 4 Feb 2025 20:21:55 -0800
> Shradha Gupta <shradhagupta@linux.microsoft.com> wrote:
>
> > On Azure, increasing a VF's TCP segment size up to GSO_MAX_SIZE is not
> > possible without allowing the same for the netvsc NIC (as the NICs are
> > bonded together). For bonded NICs, the minimum of the maximum segment
> > sizes of the members is propagated in the stack.
> >
> > Therefore, we use netif_set_tso_max_size() to set the max segment size
> > to the VF's segment size for netvsc too, when the data path is
> > switched over to the VF.
> >
> > Tested on an Azure environment with Accelerated Networking enabled and
> > disabled.
> >
> > Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> > Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
>
> Since the data path can change at any time (e.g. hot remove of the VF),
> how does the TCP stack react to the GSO max size changing underneath it?
> Is it like a path MTU change, where some packets are lost until TCP
> retries and has to rediscover?

Hi Stephen,

Upon removal of the VF, a change of the max segment size (recalculated as
the minimum across the new list of bond members) is triggered. This causes
subsequent skb allocations on the NIC's xmit path to stay under the new
limit. If, during this transition, an skb built against the older, larger
segment size still reaches the netvsc stack, the netvsc transport silently
segments it down to the size the hardware supports
(netvsc_dev->netvsc_gso_max_size).

Regards,
Shradha
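The software-segmentation fallback described in the reply follows a
standard kernel pattern. The sketch below is illustrative only and is not
the actual netvsc code: it shows a driver resegmenting a GSO skb that
exceeds the device's current limit using the stock skb_gso_segment()
helper; example_xmit_oversized() and example_send_one() are made-up names,
and example_send_one() stands in for the driver's real per-packet send
path.

	#include <linux/err.h>
	#include <linux/netdevice.h>
	#include <linux/skbuff.h>

	/* Hypothetical lower-level send helper, not shown here. */
	static netdev_tx_t example_send_one(struct sk_buff *skb,
					    struct net_device *ndev);

	/* Illustrative only: resegment a GSO skb that was built against
	 * an older, larger limit so each piece fits what the hardware
	 * still supports.
	 */
	static netdev_tx_t example_xmit_oversized(struct sk_buff *skb,
						  struct net_device *ndev,
						  u32 hw_gso_max)
	{
		struct sk_buff *segs, *seg, *next;

		if (!skb_is_gso(skb) || skb->len <= hw_gso_max)
			return example_send_one(skb, ndev);

		/* Mask out GSO features so the helper emits plain
		 * (non-GSO) frames.
		 */
		segs = skb_gso_segment(skb, ndev->features & ~NETIF_F_GSO_MASK);
		if (IS_ERR_OR_NULL(segs)) {
			dev_kfree_skb_any(skb);
			return NETDEV_TX_OK;
		}

		dev_consume_skb_any(skb);	/* original skb superseded */

		skb_list_walk_safe(segs, seg, next) {
			skb_mark_not_on_list(seg);
			example_send_one(seg, ndev);
		}

		return NETDEV_TX_OK;
	}

Because the oversized skb is segmented rather than dropped, the window
where old-sized skbs are in flight costs some CPU, not lost packets, which
is why no path-MTU-style rediscovery is needed.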
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index e690b95b1bbb..def41067ea3f 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -1166,6 +1166,8 @@ struct netvsc_device {
 	u32 max_chn;
 	u32 num_chn;
 
+	u32 netvsc_gso_max_size;
+
 	atomic_t open_chn;
 	struct work_struct subchan_work;
 	wait_queue_head_t subchan_open;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d6c4abfc3a28..4696939f08a0 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2461,6 +2461,21 @@ static int netvsc_vf_changed(struct net_device *vf_netdev, unsigned long event)
 	} else {
 		netdev_info(ndev, "Data path switched %s VF: %s\n",
 			    vf_is_up ? "to" : "from", vf_netdev->name);
+
+		/* In Azure, when accelerated networking is enabled, other NICs
+		 * like MANA, MLX, are configured as a bonded nic with
+		 * netvsc (failover) NIC. For bonded NICs, the min of the max
+		 * segment size of the members is propagated in the stack.
+		 * In order to allow these NICs (MANA/MLX) to use up to
+		 * GSO_MAX_SIZE segment size, we need to allow the netvsc NIC
+		 * to also support this in the guest.
+		 * This value is only increased for the netvsc NIC when the
+		 * data path is switched over to the VF.
+		 */
+		if (vf_is_up)
+			netif_set_tso_max_size(ndev, vf_netdev->tso_max_size);
+		else
+			netif_set_tso_max_size(ndev, netvsc_dev->netvsc_gso_max_size);
 	}
 
 	return NOTIFY_OK;
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index c0ceeef4fcd8..82747dfacd70 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1356,9 +1356,10 @@ static int rndis_netdev_set_hwcaps(struct rndis_device *rndis_device,
 	struct net_device_context *net_device_ctx = netdev_priv(net);
 	struct ndis_offload hwcaps;
 	struct ndis_offload_params offloads;
-	unsigned int gso_max_size = GSO_LEGACY_MAX_SIZE;
 	int ret;
 
+	nvdev->netvsc_gso_max_size = GSO_LEGACY_MAX_SIZE;
+
 	/* Find HW offload capabilities */
 	ret = rndis_query_hwcaps(rndis_device, nvdev, &hwcaps);
 	if (ret != 0)
@@ -1390,8 +1391,8 @@ static int rndis_netdev_set_hwcaps(struct rndis_device *rndis_device,
 			offloads.lso_v2_ipv4 = NDIS_OFFLOAD_PARAMETERS_LSOV2_ENABLED;
 			net->hw_features |= NETIF_F_TSO;
 
-			if (hwcaps.lsov2.ip4_maxsz < gso_max_size)
-				gso_max_size = hwcaps.lsov2.ip4_maxsz;
+			if (hwcaps.lsov2.ip4_maxsz < nvdev->netvsc_gso_max_size)
+				nvdev->netvsc_gso_max_size = hwcaps.lsov2.ip4_maxsz;
 		}
 
 		if (hwcaps.csum.ip4_txcsum & NDIS_TXCSUM_CAP_UDP4) {
@@ -1411,8 +1412,8 @@ static int rndis_netdev_set_hwcaps(struct rndis_device *rndis_device,
 			offloads.lso_v2_ipv6 = NDIS_OFFLOAD_PARAMETERS_LSOV2_ENABLED;
 			net->hw_features |= NETIF_F_TSO6;
 
-			if (hwcaps.lsov2.ip6_maxsz < gso_max_size)
-				gso_max_size = hwcaps.lsov2.ip6_maxsz;
+			if (hwcaps.lsov2.ip6_maxsz < nvdev->netvsc_gso_max_size)
+				nvdev->netvsc_gso_max_size = hwcaps.lsov2.ip6_maxsz;
 		}
 
 		if (hwcaps.csum.ip6_txcsum & NDIS_TXCSUM_CAP_UDP6) {
@@ -1438,7 +1439,7 @@ static int rndis_netdev_set_hwcaps(struct rndis_device *rndis_device,
 	 */
 	net->features &= ~NETVSC_SUPPORTED_HW_FEATURES | net->hw_features;
 
-	netif_set_tso_max_size(net, gso_max_size);
+	netif_set_tso_max_size(net, nvdev->netvsc_gso_max_size);
 
 	ret = rndis_filter_set_offload_params(net, nvdev, &offloads);
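As a usage note on the VF side of this series: a driver opts into Big TCP
by raising its TSO ceiling above the 64K legacy default. A minimal sketch,
assuming a hypothetical probe path (example_enable_big_tcp() is a made-up
name; netif_set_tso_max_size(), GSO_MAX_SIZE and GSO_LEGACY_MAX_SIZE are
the real kernel symbols):

	#include <linux/netdevice.h>

	/* Minimal sketch: raise the device's TSO ceiling above the
	 * legacy 64K (GSO_LEGACY_MAX_SIZE) default so the stack may
	 * build larger (Big TCP) packets for this device.
	 */
	static void example_enable_big_tcp(struct net_device *ndev)
	{
		netif_set_tso_max_size(ndev, GSO_MAX_SIZE);
	}

The per-device GSO/GRO sizes can then be raised at runtime, e.g.
"ip link set dev eth0 gso_max_size 131072 gro_max_size 131072"; with this
patch, the netvsc NIC no longer clamps the bond back to 64K while the data
path is on the VF.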