[net-next,5/7] ibmvnic: Introduce send sub-crq direct

Message ID	20240801212340.132607-6-nnac123@linux.ibm.com (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers	show Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E5A7314A0A4 for <netdev@vger.kernel.org>; Thu, 1 Aug 2024 21:23:53 +0000 (UTC) From: Nick Child <nnac123@linux.ibm.com> To: netdev@vger.kernel.org Cc: bjking1@linux.ibm.com, haren@linux.ibm.com, ricklind@us.ibm.com, Nick Child <nnac123@linux.ibm.com> Subject: [PATCH net-next 5/7] ibmvnic: Introduce send sub-crq direct Date: Thu, 1 Aug 2024 16:23:38 -0500 Message-ID: <20240801212340.132607-6-nnac123@linux.ibm.com> In-Reply-To: <20240801212340.132607-1-nnac123@linux.ibm.com> References: <20240801212340.132607-1-nnac123@linux.ibm.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	ibmvnic RR performance improvements \| expand [net-next,0/7] ibmvnic RR performance improvements [net-next,1/7] ibmvnic: Only replenish rx pool when resources are getting low [net-next,2/7] ibmvnic: Use header len helper functions on tx [net-next,3/7] ibmvnic: Reduce memcpys in tx descriptor generation [net-next,4/7] ibmvnic: Remove duplicate memory barriers in tx [net-next,5/7] ibmvnic: Introduce send sub-crq direct [net-next,6/7] ibmvnic: Only record tx completed bytes once per handler [net-next,7/7] ibmvnic: Perform tx CSO during send scrq direct

Message ID

20240801212340.132607-6-nnac123@linux.ibm.com (mailing list archive)

State

Changes Requested

Delegated to:

Netdev Maintainers

Headers

From: Nick Child <nnac123@linux.ibm.com>
To: netdev@vger.kernel.org
Cc: bjking1@linux.ibm.com, haren@linux.ibm.com, ricklind@us.ibm.com,
        Nick Child <nnac123@linux.ibm.com>
Subject: [PATCH net-next 5/7] ibmvnic: Introduce send sub-crq direct
Date: Thu,  1 Aug 2024 16:23:38 -0500
Message-ID: <20240801212340.132607-6-nnac123@linux.ibm.com>
In-Reply-To: <20240801212340.132607-1-nnac123@linux.ibm.com>
References: <20240801212340.132607-1-nnac123@linux.ibm.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

ibmvnic RR performance improvements | expand

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 7 this patch: 7
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	warning	10 maintainers not CCed: ricklind@linux.ibm.com edumazet@google.com kuba@kernel.org mpe@ellerman.id.au christophe.leroy@csgroup.eu linuxppc-dev@lists.ozlabs.org naveen@kernel.org npiggin@gmail.com pabeni@redhat.com tlfalcon@linux.ibm.com
netdev/build_clang	success	Errors and warnings before: 7 this patch: 7
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 7 this patch: 7
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 109 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 5 this patch: 5
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-08-05--12-00 (tests: 706)

Context

Check

Description

netdev/series_format

success

Posting correctly formatted

netdev/tree_selection

success

Clearly marked for net-next

netdev/ynl

success

Generated files up to date; no warnings/errors; no diff in generated;

netdev/fixes_present

success

Fixes tag not required for -next series

netdev/header_inline

success

No static functions without inline keyword in header files

netdev/build_32bit

success

Errors and warnings before: 7 this patch: 7

netdev/build_tools

success

No tools touched, skip

netdev/cc_maintainers

warning

10 maintainers not CCed: ricklind@linux.ibm.com edumazet@google.com kuba@kernel.org mpe@ellerman.id.au christophe.leroy@csgroup.eu linuxppc-dev@lists.ozlabs.org naveen@kernel.org npiggin@gmail.com pabeni@redhat.com tlfalcon@linux.ibm.com

netdev/build_clang

success

Errors and warnings before: 7 this patch: 7

netdev/verify_signedoff

success

Signed-off-by tag matches author and committer

netdev/deprecated_api

success

None detected

netdev/check_selftest

success

No net selftest shell script

netdev/verify_fixes

success

No Fixes tag

netdev/build_allmodconfig_warn

success

Errors and warnings before: 7 this patch: 7

netdev/checkpatch

success

total: 0 errors, 0 warnings, 0 checks, 109 lines checked

netdev/build_clang_rust

success

No Rust files in patch. Skipping build

netdev/kdoc

success

Errors and warnings before: 5 this patch: 5

netdev/source_inline

success

Was 0 now: 0

netdev/contest

success

net-next-2024-08-05--12-00 (tests: 706)

Commit Message

Nick Child Aug. 1, 2024, 9:23 p.m. UTC

Firmware supports two hcalls to send a sub-crq request:
H_SEND_SUB_CRQ_INDIRECT and H_SEND_SUB_CRQ. The indirect hcall allows
for submission of batched messages while the other hcall is limited to
only one message. This protocol is defined in PAPR section 17.2.3.3.

Previously, the ibmvnic xmit function only used the indirect hcall. This
allowed the driver to batch it's skbs. A single skb can occupy a few
entries per hcall depending on if FW requires skb header information or
not. The FW only needs header information if the packet is segmented.

By this logic, if an skb is not GSO then it can fit in one sub-crq
message and therefore is a candidate for H_SEND_SUB_CRQ.
Batching skb transmission is only useful when there are more packets
coming down the line (ie netdev_xmit_more is true).

As it turns out, H_SEND_SUB_CRQ induces less latency than
H_SEND_SUB_CRQ_INDIRECT. Therefore, use H_SEND_SUB_CRQ where
appropriate.

Small latency gains seen when doing TCP_RR_150 (request/response
workload). Ftrace results (graph-time=1):
  Previous:
     ibmvnic_xmit = 29618270.83 us / 8860058.0 hits = AVG 3.34
     ibmvnic_tx_scrq_flush = 21972231.02 us / 6553972.0 hits = AVG 3.35
  Now:
     ibmvnic_xmit = 22153350.96 us / 8438942.0 hits = AVG 2.63
     ibmvnic_tx_scrq_flush = 15858922.4 us / 6244076.0 hits = AVG 2.54

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 52 ++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 533e79a0c6ac..c9aa276507bb 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -117,6 +117,7 @@  static void free_long_term_buff(struct ibmvnic_adapter *adapter,
 				struct ibmvnic_long_term_buff *ltb);
 static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter);
 static void flush_reset_queue(struct ibmvnic_adapter *adapter);
+static void print_subcrq_error(struct device *dev, int rc, const char *func);
 
 struct ibmvnic_stat {
 	char name[ETH_GSTRING_LEN];
@@ -2331,8 +2332,29 @@  static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
 	}
 }
 
+static int send_subcrq_direct(struct ibmvnic_adapter *adapter,
+			      u64 remote_handle, u64 *entry)
+{
+	unsigned int ua = adapter->vdev->unit_address;
+	struct device *dev = &adapter->vdev->dev;
+	int rc;
+
+	/* Make sure the hypervisor sees the complete request */
+	dma_wmb();
+	rc = plpar_hcall_norets(H_SEND_SUB_CRQ, ua,
+				cpu_to_be64(remote_handle),
+				cpu_to_be64(entry[0]), cpu_to_be64(entry[1]),
+				cpu_to_be64(entry[2]), cpu_to_be64(entry[3]));
+
+	if (rc)
+		print_subcrq_error(dev, rc, __func__);
+
+	return rc;
+}
+
 static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
-				 struct ibmvnic_sub_crq_queue *tx_scrq)
+				 struct ibmvnic_sub_crq_queue *tx_scrq,
+				 bool indirect)
 {
 	struct ibmvnic_ind_xmit_queue *ind_bufp;
 	u64 dma_addr;
@@ -2347,7 +2369,13 @@  static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
 
 	if (!entries)
 		return 0;
-	rc = send_subcrq_indirect(adapter, handle, dma_addr, entries);
+
+	if (indirect)
+		rc = send_subcrq_indirect(adapter, handle, dma_addr, entries);
+	else
+		rc = send_subcrq_direct(adapter, handle,
+					(u64 *)ind_bufp->indir_arr);
+
 	if (rc)
 		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
 	else
@@ -2405,7 +2433,7 @@  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_dropped++;
 		tx_send_failed++;
 		ret = NETDEV_TX_OK;
-		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true);
 		if (lpar_rc != H_SUCCESS)
 			goto tx_err;
 		goto out;
@@ -2423,7 +2451,7 @@  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_send_failed++;
 		tx_dropped++;
 		ret = NETDEV_TX_OK;
-		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true);
 		if (lpar_rc != H_SUCCESS)
 			goto tx_err;
 		goto out;
@@ -2518,6 +2546,16 @@  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_crq.v1.flags1 |= IBMVNIC_TX_LSO;
 		tx_crq.v1.mss = cpu_to_be16(skb_shinfo(skb)->gso_size);
 		hdrs += 2;
+	} else if (!ind_bufp->index && !netdev_xmit_more()) {
+		ind_bufp->indir_arr[0] = tx_crq;
+		ind_bufp->index = 1;
+		tx_buff->num_entries = 1;
+		netdev_tx_sent_queue(txq, skb->len);
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, false);
+		if (lpar_rc != H_SUCCESS)
+			goto tx_err;
+
+		goto early_exit;
 	}
 
 	if ((*hdrs >> 7) & 1)
@@ -2527,7 +2565,7 @@  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	tx_buff->num_entries = num_entries;
 	/* flush buffer if current entry can not fit */
 	if (num_entries + ind_bufp->index > IBMVNIC_MAX_IND_DESCS) {
-		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true);
 		if (lpar_rc != H_SUCCESS)
 			goto tx_flush_err;
 	}
@@ -2535,15 +2573,17 @@  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	indir_arr[0] = tx_crq;
 	memcpy(&ind_bufp->indir_arr[ind_bufp->index], &indir_arr[0],
 	       num_entries * sizeof(struct ibmvnic_generic_scrq));
+
 	ind_bufp->index += num_entries;
 	if (__netdev_tx_sent_queue(txq, skb->len,
 				   netdev_xmit_more() &&
 				   ind_bufp->index < IBMVNIC_MAX_IND_DESCS)) {
-		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq, true);
 		if (lpar_rc != H_SUCCESS)
 			goto tx_err;
 	}
 
+early_exit:
 	if (atomic_add_return(num_entries, &tx_scrq->used)
 					>= adapter->req_tx_entries_per_subcrq) {
 		netdev_dbg(netdev, "Stopping queue %d\n", queue_num);

[net-next,5/7] ibmvnic: Introduce send sub-crq direct

Checks

Commit Message

Patch