diff mbox series

[net-next,1/1] Enter smc_tx_wait when the tx length exceeds the available space

Message ID 20241226122217.1125-2-liqiang64@huawei.com (mailing list archive)
State New
Delegated to: Netdev Maintainers
Headers show
Series net/smc: An issue of smc sending messages | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 5 maintainers not CCed: edumazet@google.com linux-rdma@vger.kernel.org horms@kernel.org pabeni@redhat.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 45 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-12-27--00-00 (tests: 881)

Commit Message

Li Qiang Dec. 26, 2024, 12:22 p.m. UTC
The variable send_done records the number of bytes that have been 
successfully sent in the context of the code. It is more reasonable 
to rename it to sent_bytes here.

Another modification point is that if the ring buf is full after 
sendmsg has sent part of the data, the current code will return 
directly without entering smc_tx_wait, so the judgment of send_done 
in front of smc_tx_wait is removed.

Signed-off-by: liqiang <liqiang64@huawei.com>
---
 net/smc/smc_tx.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

Comments

Wen Gu Dec. 26, 2024, 1:20 p.m. UTC | #1
On 2024/12/26 20:22, liqiang wrote:
> The variable send_done records the number of bytes that have been
> successfully sent in the context of the code. It is more reasonable
> to rename it to sent_bytes here.
> 
> Another modification point is that if the ring buf is full after
> sendmsg has sent part of the data, the current code will return
> directly without entering smc_tx_wait, so the judgment of send_done
> in front of smc_tx_wait is removed.
> 
> Signed-off-by: liqiang <liqiang64@huawei.com>
> ---

Hi liqiang,

I think this discussion thread[1] can help you understand why this is the case.
The current design is to avoid the stalled connection problem.

Some other small points: issues should be posted to 'net' tree instead of 'net-next'
tree[2], and currently net-next is closed[3].

[1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
[2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
[3] https://patchwork.hopto.org/net-next.html

Regards.

>   net/smc/smc_tx.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> index 214ac3cbcf9a..6ecabc10793c 100644
> --- a/net/smc/smc_tx.c
> +++ b/net/smc/smc_tx.c
> @@ -180,7 +180,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
>    */
>   int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   {
> -	size_t copylen, send_done = 0, send_remaining = len;
> +	size_t copylen, sent_bytes = 0, send_remaining = len;
>   	size_t chunk_len, chunk_off, chunk_len_sum;
>   	struct smc_connection *conn = &smc->conn;
>   	union smc_host_cursor prep;
> @@ -216,14 +216,12 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   		    conn->killed)
>   			return -EPIPE;
>   		if (smc_cdc_rxed_any_close(conn))
> -			return send_done ?: -ECONNRESET;
> +			return sent_bytes ?: -ECONNRESET;
>   
>   		if (msg->msg_flags & MSG_OOB)
>   			conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
>   
>   		if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
> -			if (send_done)
> -				return send_done;
>   			rc = smc_tx_wait(smc, msg->msg_flags);
>   			if (rc)
>   				goto out_err;
> @@ -250,11 +248,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   					     msg, chunk_len);
>   			if (rc) {
>   				smc_sndbuf_sync_sg_for_device(conn);
> -				if (send_done)
> -					return send_done;
> +				if (sent_bytes)
> +					return sent_bytes;
>   				goto out_err;
>   			}
> -			send_done += chunk_len;
> +			sent_bytes += chunk_len;
>   			send_remaining -= chunk_len;
>   
>   			if (chunk_len_sum == copylen)
> @@ -287,7 +285,7 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   		trace_smc_tx_sendmsg(smc, copylen);
>   	} /* while (msg_data_left(msg)) */
>   
> -	return send_done;
> +	return sent_bytes;
>   
>   out_err:
>   	rc = sk_stream_error(sk, msg->msg_flags, rc);
Li Qiang Dec. 27, 2024, 7:53 a.m. UTC | #2
在 2024/12/26 21:20, Wen Gu 写道:
> 
> 
> On 2024/12/26 20:22, liqiang wrote:
>> The variable send_done records the number of bytes that have been
>> successfully sent in the context of the code. It is more reasonable
>> to rename it to sent_bytes here.
>>
>> Another modification point is that if the ring buf is full after
>> sendmsg has sent part of the data, the current code will return
>> directly without entering smc_tx_wait, so the judgment of send_done
>> in front of smc_tx_wait is removed.
>>
>> Signed-off-by: liqiang <liqiang64@huawei.com>
>> ---
> 
> Hi liqiang,
> 
> I think this discussion thread[1] can help you understand why this is the case.
> The current design is to avoid the stalled connection problem.

Yes, I read that discussion and the problem does exist. So we should correctly handle
fewer bytes sent than expected in user space (like netperf).

However, according to my verification, in the TCP network or loopback (without smc),
after increasing the memory sent by the client at one time to a large enough size,
a connection deadlock seems to occur. SMC processing will not be stuck due to the
expansion of the sending memory.But when the socket is blocking and sends messages,
its behavior is different from TCP socket.

> 
> Some other small points: issues should be posted to 'net' tree instead of 'net-next'
> tree[2], and currently net-next is closed[3].

Thank you for pointing out the problem, I learned from it.

> 
> [1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
> [2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> [3] https://patchwork.hopto.org/net-next.html
> 
> Regards.
>
diff mbox series

Patch

diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 214ac3cbcf9a..6ecabc10793c 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -180,7 +180,7 @@  static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
  */
 int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 {
-	size_t copylen, send_done = 0, send_remaining = len;
+	size_t copylen, sent_bytes = 0, send_remaining = len;
 	size_t chunk_len, chunk_off, chunk_len_sum;
 	struct smc_connection *conn = &smc->conn;
 	union smc_host_cursor prep;
@@ -216,14 +216,12 @@  int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		    conn->killed)
 			return -EPIPE;
 		if (smc_cdc_rxed_any_close(conn))
-			return send_done ?: -ECONNRESET;
+			return sent_bytes ?: -ECONNRESET;
 
 		if (msg->msg_flags & MSG_OOB)
 			conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
 
 		if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
-			if (send_done)
-				return send_done;
 			rc = smc_tx_wait(smc, msg->msg_flags);
 			if (rc)
 				goto out_err;
@@ -250,11 +248,11 @@  int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 					     msg, chunk_len);
 			if (rc) {
 				smc_sndbuf_sync_sg_for_device(conn);
-				if (send_done)
-					return send_done;
+				if (sent_bytes)
+					return sent_bytes;
 				goto out_err;
 			}
-			send_done += chunk_len;
+			sent_bytes += chunk_len;
 			send_remaining -= chunk_len;
 
 			if (chunk_len_sum == copylen)
@@ -287,7 +285,7 @@  int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		trace_smc_tx_sendmsg(smc, copylen);
 	} /* while (msg_data_left(msg)) */
 
-	return send_done;
+	return sent_bytes;
 
 out_err:
 	rc = sk_stream_error(sk, msg->msg_flags, rc);