[08/32] lnet: socklnd: Duplicate ksock_conn_cb

Message ID 1659577097-19253-9-git-send-email-jsimmons@infradead.org (mailing list archive)
State New, archived
Series lustre: Update to OpenSFS as of Aug 3 2022

Commit Message

James Simmons Aug. 4, 2022, 1:37 a.m. UTC
From: Chris Horn <chris.horn@hpe.com>

If two threads enter ksocknal_add_peer(), the first one to acquire
the ksnd_global_lock will create a ksock_peer_ni and associate a
ksock_conn_cb with it.

When the second thread acquires the ksnd_global_lock it will find the
existing ksock_peer_ni, but it does not check for an existing
ksock_conn_cb. As a result, it overwrites the existing ksock_conn_cb
(ksock_peer_ni::ksnp_conn_cb) and the ksock_conn_cb from the first
thread becomes stranded.

Modify ksocknal_add_peer() to check whether the peer_ni has an
existing ksock_conn_cb associated with it. If it does, drop the
reference on the newly allocated ksock_conn_cb instead of adding it.
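
For illustration only, the check described above can be modelled in a
few lines of standalone C. This is a simplified sketch, not the socklnd
code: the demo_* names, the plain integer refcount and the
demo_dropped counter are all hypothetical, and locking is omitted on
the assumption that the two callers are serialized by the global lock,
so the race reduces to two back-to-back calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for ksock_conn_cb and ksock_peer_ni. */
struct demo_conn_cb {
	int refcount;
};

struct demo_peer {
	struct demo_conn_cb *conn_cb;
};

static struct demo_peer demo_peer;	/* the shared peer both callers find */
static int demo_dropped;		/* duplicates released so far */

static struct demo_conn_cb *demo_conn_cb_create(void)
{
	struct demo_conn_cb *cb = calloc(1, sizeof(*cb));

	if (cb)
		cb->refcount = 1;	/* creation reference */
	return cb;
}

static void demo_conn_cb_decref(struct demo_conn_cb *cb)
{
	assert(cb->refcount > 0);
	if (--cb->refcount == 0) {
		free(cb);
		demo_dropped++;
	}
}

/*
 * Models one caller of the add-peer path. In the real code this
 * section would run with the global lock held (assumed here).
 * Returns 0 on success, -1 if allocation fails.
 */
static int demo_add_peer(void)
{
	struct demo_conn_cb *cb = demo_conn_cb_create();

	if (!cb)
		return -1;

	if (demo_peer.conn_cb) {
		/* Another caller got here first: release the duplicate. */
		demo_conn_cb_decref(cb);
	} else {
		/* First caller: attach the new conn_cb to the peer. */
		demo_peer.conn_cb = cb;
	}
	return 0;
}
```

Calling demo_add_peer() twice leaves the first conn_cb attached and
frees the second one; without the check, the second call would
overwrite the pointer and strand the first allocation.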

Fixes: 3ffceb7502 ("lnet: socklnd: replace route construct")
HPE-bug-id: LUS-10956
WC-bug-id: https://jira.whamcloud.com/browse/LU-15860
Lustre-commit: 0c91d49a44e1214b5 ("LU-15860 socklnd: Duplicate ksock_conn_cb")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/47361
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

Patch

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 01b434f..2b08501 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -645,14 +645,17 @@  struct ksock_peer_ni *
 			 nidhash(&id->nid));
 	}
 
-	ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
-
-	/* Remember conns_per_peer setting at the time
-	 * of connection initiation. It will define the
-	 * max number of conns per type for this conn_cb
-	 * while it's in use.
-	 */
-	conn_cb->ksnr_max_conns = ksocknal_get_conns_per_peer(peer_ni);
+	if (peer_ni->ksnp_conn_cb) {
+		ksocknal_conn_cb_decref(conn_cb);
+	} else {
+		ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
+		/* Remember conns_per_peer setting at the time
+		 * of connection initiation. It will define the
+		 * max number of conns per type for this conn_cb
+		 * while it's in use.
+		 */
+		conn_cb->ksnr_max_conns = ksocknal_get_conns_per_peer(peer_ni);
+	}
 
 	write_unlock_bh(&ksocknal_data.ksnd_global_lock);