[net] mctp i3c: fix MCTP I3C driver multi-thread issue

Message ID	20241226025319.1724209-1-Leo-Yang@quantatw.com (mailing list archive)
State	New
Delegated to:	Netdev Maintainers
Headers	Received: from mail-pl1-f178.google.com (mail-pl1-f178.google.com [209.85.214.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFA7417C77; Thu, 26 Dec 2024 02:59:04 +0000 (UTC) From: Leo Yang <leo.yang.sy0@gmail.com> To: jk@codeconstruct.com.au, matt@codeconstruct.com.au, andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Leo Yang <Leo-Yang@quantatw.com> Subject: [PATCH net] mctp i3c: fix MCTP I3C driver multi-thread issue Date: Thu, 26 Dec 2024 10:53:19 +0800 Message-Id: <20241226025319.1724209-1-Leo-Yang@quantatw.com> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	[net] mctp i3c: fix MCTP I3C driver multi-thread issue

Context	Check	Description
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for net
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	fail	Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1 this patch: 1
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	success	CCed 7 of 7 maintainers
netdev/build_clang	success	Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1 this patch: 1
netdev/checkpatch	warning	WARNING: From:/Signed-off-by: email address mismatch: 'From: Leo Yang <leo.yang.sy0@gmail.com>' != 'Signed-off-by: Leo Yang <Leo-Yang@quantatw.com>'
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-12-26--09-00 (tests: 881)

Commit Message

Leo Yang Dec. 26, 2024, 2:53 a.m. UTC

We found a timeout problem with PLDM commands on our system.  The
cause is a race condition in the MCTP-I3C driver: when multiple threads
receive the packets of a multiple-packet message, the packets can be
delivered in the wrong order.

We identified this problem by adding a debug message to the
mctp_i3c_read function.

According to the MCTP spec, the packets of a multiple-packet message
must be assembled in sequence; if a packet arrives out of sequence, the
whole message is discarded and the receiver waits for the next SOM.
An example of such a wrong order is
SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM.
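
The sequence rule described above can be sketched in plain C.  This is a
minimal userspace illustration, not the kernel's reassembly code: the
names struct pkt, struct reasm, reasm_feed and message_completes are
hypothetical, and the only spec detail assumed is that the packet
sequence number is a 2-bit counter that increments modulo 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the flags carried by one MCTP packet. */
struct pkt {
	bool som;	/* start of message */
	bool eom;	/* end of message */
	uint8_t seq;	/* 2-bit packet sequence number, 0..3 */
};

/* Hypothetical reassembly state for a single message flow. */
struct reasm {
	bool active;		/* a message is currently being assembled */
	uint8_t next_seq;	/* sequence number expected next */
};

enum reasm_result { PKT_ACCEPTED, MSG_COMPLETE, PKT_DROPPED };

/* Feed one packet into the reassembler and report what happened to it. */
enum reasm_result reasm_feed(struct reasm *r, const struct pkt *p)
{
	if (p->som) {
		/* SOM always starts a fresh message. */
		r->active = true;
		r->next_seq = (p->seq + 1) & 0x3;
	} else if (!r->active) {
		/* No message in progress: wait for the next SOM. */
		return PKT_DROPPED;
	} else if (p->seq != r->next_seq) {
		/* Out of sequence: discard the whole message. */
		r->active = false;
		return PKT_DROPPED;
	} else {
		r->next_seq = (r->next_seq + 1) & 0x3;
	}

	if (p->eom) {
		r->active = false;
		return MSG_COMPLETE;
	}
	return PKT_ACCEPTED;
}

/* Return true if feeding the packets in the given order completes a message. */
bool message_completes(const struct pkt *pkts, size_t n)
{
	struct reasm r = { .active = false, .next_seq = 0 };
	bool complete = false;

	for (size_t i = 0; i < n; i++) {
		if (reasm_feed(&r, &pkts[i]) == MSG_COMPLETE)
			complete = true;
	}
	return complete;
}
```

In this sketch a four-packet message completes when fed in order, but
swapping the two middle packets causes every packet after the SOM to be
dropped, so the requester never sees a response and times out.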

Therefore, we try to solve this problem by taking mi->lock in the
mctp_i3c_read function.  Before the modification, when a command
requesting a multiple-packet message response is sent consecutively, an
error usually occurs within 100 loops.  With the mutex, it can go
through 40000 loops without any error, and it seems to run well.
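
As a rough userspace analogy of why the lock helps, the sketch below
uses pthreads rather than kernel mutexes.  Everything in it is
hypothetical (struct fake_dev, reader, run_readers_in_order): several
threads pull packets from a fake device and hand them to a delivery
log, and because one lock is held across both steps, the log stays in
device order.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NPKTS		4000
#define NTHREADS	4

/* Hypothetical stand-in for the device and the receive path. */
struct fake_dev {
	pthread_mutex_t lock;
	int next_pkt;		/* next packet the "hardware" returns */
	int delivered[NPKTS];	/* order in which packets were handed on */
	int ndelivered;
};

static void *reader(void *arg)
{
	struct fake_dev *d = arg;

	for (;;) {
		/* Hold the lock across both the read and the hand-off. */
		pthread_mutex_lock(&d->lock);
		if (d->next_pkt >= NPKTS) {
			pthread_mutex_unlock(&d->lock);
			return NULL;
		}
		int pkt = d->next_pkt++;		/* "read" */
		d->delivered[d->ndelivered++] = pkt;	/* "deliver" */
		pthread_mutex_unlock(&d->lock);
	}
}

/* Run the readers; return true if packets were delivered in order. */
bool run_readers_in_order(void)
{
	struct fake_dev d = { .next_pkt = 0, .ndelivered = 0 };
	pthread_t threads[NTHREADS];
	int started = 0;
	bool ok = true;

	pthread_mutex_init(&d.lock, NULL);

	for (int i = 0; i < NTHREADS; i++) {
		if (pthread_create(&threads[i], NULL, reader, &d) != 0) {
			ok = false;
			break;
		}
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	if (d.ndelivered != NPKTS)
		ok = false;
	for (int i = 0; ok && i < NPKTS; i++) {
		if (d.delivered[i] != i)
			ok = false;
	}

	pthread_mutex_destroy(&d.lock);
	return ok;
}
```

If the lock covered only the read step, two threads could read packets
N and N+1 in order and still deliver them in the opposite order, which
is the kind of reordering described above.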

However, I'm a little worried about the performance of the mutex under
high load (the spec seems to allow different endpoints to respond at
the same time).  Do you think this is a feasible solution?

Signed-off-by: Leo Yang <Leo-Yang@quantatw.com>
---
 drivers/net/mctp/mctp-i3c.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/mctp/mctp-i3c.c b/drivers/net/mctp/mctp-i3c.c
index 9adad59b8676..0d625b351ebd 100644
--- a/drivers/net/mctp/mctp-i3c.c
+++ b/drivers/net/mctp/mctp-i3c.c
@@ -125,6 +125,7 @@  static int mctp_i3c_read(struct mctp_i3c_device *mi)
 
 	xfer.data.in = skb_put(skb, mi->mrl);
 
+	mutex_lock(&mi->lock);
 	rc = i3c_device_do_priv_xfers(mi->i3c, &xfer, 1);
 	if (rc < 0)
 		goto err;
@@ -166,8 +167,10 @@  static int mctp_i3c_read(struct mctp_i3c_device *mi)
 		stats->rx_dropped++;
 	}
 
+	mutex_unlock(&mi->lock);
 	return 0;
 err:
+	mutex_unlock(&mi->lock);
 	kfree_skb(skb);
 	return rc;
 }
