[1/1] ice: WiP support for BIG TCP packets

Message ID 20230109161833.223510-1-pawel.chmielewski@intel.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series [1/1] ice: WiP support for BIG TCP packets

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 7 maintainers not CCed: edumazet@google.com davem@davemloft.net kuba@kernel.org pabeni@redhat.com anthony.l.nguyen@intel.com intel-wired-lan@lists.osuosl.org jesse.brandeburg@intel.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch fail ERROR: Macros with complex values should be enclosed in parentheses
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Pawel Chmielewski Jan. 9, 2023, 4:18 p.m. UTC
This patch is a proof of concept for testing the BIG TCP feature in the ice driver.
Please see the letter below.

Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
---
Hi All,
I'm writing to the list, as you may be able to provide me with some feedback.
I want to enable the BIG TCP feature in the Intel ice driver, but I think
I'm missing something.
In the code itself, I've set 128k as the maximum TSO size for the netdev,
and added code to strip the HBH (Hop-by-Hop jumbo) option from the header.
For testing purposes, gso_max_size & gro_max_size were set to 128k and
the MTU to 9000.
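The settings were applied roughly as follows (a sketch; the interface name
eth0 is a placeholder for the actual port under test):

ip link set dev eth0 mtu 9000
ip link set dev eth0 gso_max_size 131072
ip link set dev eth0 gro_max_size 131072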
I've assumed that the ice TSO offload will do the rest of the job.
However, while running netperf TCP_RR and TCP_STREAM tests,
I saw that only up to ~20% of the transmitted test packets had
the specified size.
The remaining packets arrive from the stack already split.

I've been running the following test cases:
netperf -t TCP_RR -H 2001:db8:0:f101::1  -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
netperf -l-1 -t TCP_STREAM -H 2001:db8:0:f101::1  -- -m 128K -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
I suspected a shrinking TCP window size, but sniffing with tcpdump showed
a rather large window scaling factor (usually 128x).
Apart from netperf, I also tried a simple IPv6 user-space application
(with the SO_SNDBUF option set to 192k and TCP_WINDOW_CLAMP to 96k), with
similar results.
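The socket setup in that application looked roughly like this (a sketch;
fd is an already-connected IPv6 TCP socket, error checking is omitted, and
it needs <sys/socket.h>, <netinet/in.h> and <netinet/tcp.h>):

	int sndbuf = 192 * 1024;	/* 192k socket send buffer */
	int clamp = 96 * 1024;		/* clamp advertised window to 96k */

	setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));
	setsockopt(fd, IPPROTO_TCP, TCP_WINDOW_CLAMP, &clamp, sizeof(clamp));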

I'd be very grateful for any feedback or suggestions.

Pawel
---
 drivers/net/ethernet/intel/ice/ice_main.c | 4 ++++
 drivers/net/ethernet/intel/ice/ice_txrx.c | 9 +++++++++
 2 files changed, 13 insertions(+)

Comments

Alexander Duyck Jan. 9, 2023, 6:22 p.m. UTC | #1
On Mon, 2023-01-09 at 17:18 +0100, Pawel Chmielewski wrote:
> [...]
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 086f0b3ab68d..7e0ac483cad9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -23,6 +23,9 @@
>  #define FDIR_DESC_RXDID 0x40
>  #define ICE_FDIR_CLEAN_DELAY 10
>  
> +#define HBH_HDR_SIZE sizeof(struct hop_jumbo_hdr)
> +#define HBH_OFFSET ETH_HLEN + sizeof(struct ipv6hdr)
> +
>  /**
>   * ice_prgm_fdir_fltr - Program a Flow Director filter
>   * @vsi: VSI to send dummy packet
> @@ -2300,6 +2303,12 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring)
>  
>  	ice_trace(xmit_frame_ring, tx_ring, skb);
>  
> +	if (ipv6_has_hopopt_jumbo(skb)) {
> +		memmove(skb->data + HBH_HDR_SIZE, skb->data, HBH_OFFSET);
> +		__skb_pull(skb, HBH_HDR_SIZE);
> +		skb_reset_mac_header(skb);
> +	}
> +
>  	count = ice_xmit_desc_count(skb);
>  	if (ice_chk_linearize(skb, count)) {
>  		if (__skb_linearize(skb))

Your removal code here is forgetting to handle the network header: you
reset the MAC header after the pull, but skb->network_header still points
at the old location, so your frames end up with mangled header offsets.

You might be better off using ipv6_hopopt_jumbo_remove() rather than
open-coding the removal yourself.
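
Something along these lines (an untested sketch) in ice_xmit_frame_ring()
would do it. The helper is a no-op when no jumbo option is present, it
adjusts the mac and network header offsets as well as the IPv6 nexthdr
field after the pull, and it only fails if skb_cow_head() does. The sketch
assumes the out_drop label that ice_xmit_frame_ring() already uses for its
other error paths:

	/* Strip the HBH jumbo option, if present; the helper fixes up
	 * the header offsets and nexthdr for us.
	 */
	if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
		goto out_drop;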

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2b23b4714a26..4e657820e55d 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -48,6 +48,8 @@  static DEFINE_IDA(ice_aux_ida);
 DEFINE_STATIC_KEY_FALSE(ice_xdp_locking_key);
 EXPORT_SYMBOL(ice_xdp_locking_key);
 
+#define ICE_MAX_TSO_SIZE 131072
+
 /**
  * ice_hw_to_dev - Get device pointer from the hardware structure
  * @hw: pointer to the device HW structure
@@ -3422,6 +3424,8 @@  static void ice_set_netdev_features(struct net_device *netdev)
 	 * be changed at runtime
 	 */
 	netdev->hw_features |= NETIF_F_RXFCS;
+
+	netif_set_tso_max_size(netdev, ICE_MAX_TSO_SIZE);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 086f0b3ab68d..7e0ac483cad9 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -23,6 +23,9 @@ 
 #define FDIR_DESC_RXDID 0x40
 #define ICE_FDIR_CLEAN_DELAY 10
 
+#define HBH_HDR_SIZE sizeof(struct hop_jumbo_hdr)
+#define HBH_OFFSET ETH_HLEN + sizeof(struct ipv6hdr)
+
 /**
  * ice_prgm_fdir_fltr - Program a Flow Director filter
  * @vsi: VSI to send dummy packet
@@ -2300,6 +2303,12 @@  ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring)
 
 	ice_trace(xmit_frame_ring, tx_ring, skb);
 
+	if (ipv6_has_hopopt_jumbo(skb)) {
+		memmove(skb->data + HBH_HDR_SIZE, skb->data, HBH_OFFSET);
+		__skb_pull(skb, HBH_HDR_SIZE);
+		skb_reset_mac_header(skb);
+	}
+
 	count = ice_xmit_desc_count(skb);
 	if (ice_chk_linearize(skb, count)) {
 		if (__skb_linearize(skb))