diff mbox series

[V3] iscsi: Do Not set param when sock is NULL

Message ID 20210128061753.1206620-1-gulam.mohamed@oracle.com (mailing list archive)
State Changes Requested
Headers show
Series [V3] iscsi: Do Not set param when sock is NULL | expand

Commit Message

Gulam Mohamed Jan. 28, 2021, 6:17 a.m. UTC
Description
===========
1. This Kernel panic could be due to a timing issue when there is a race
   between the sync thread and the initiator was processing of a login r
   esponse from the target. The session re-open can be invoked from two
   places
     a. Sessions sync thread when the iscsid restart b. From iscsid thro
        ugh iscsi error handler 2. The session reopen sequence is as fol
        lows in user-space (iscsi-initiator-utils) a. Disconnect the con
        nection b. Then send the stop connection request to the kernel w
        hich releases the connection (releases the socket) c. Queues the
        reopen for 2 seconds delay d. Once the delay expires, create the
        TCP connection again by calling the connect() call e. Poll for t
        he connection f. When poll is successful i.e when the TCP connec
        tion is established, it performs i. Creation of session and conn
        ection data structures ii. Bind the connection to the session. T
        his is the place where we assign the sock to tcp_sw_conn->sock i
     ii. Sets the parameters like target name, persistent address etc. i
     v. Creates the login pdu v. Sends the login pdu to kernel vi. Retur
        ns to the main loop to process further events. The kernel then s
        ends the login request over to the target node g. Once login res
        ponse with success is received, it enters into full feature phas
        e and sets the negotiable parameters like max_recv_data_length,m
        ax_transmit_length, data_digest etc. 3. While setting the negoti
        a ble parameters by calling "iscsi_session_set_neg_params()", ke
        rn el panicked as sock was NULL

What happened here is
---------------------
1. Before initiator received the login response mentioned in above point
    2.f.v, another reopen request was sent from the error handler/sync s
    ession for the same session, as the initiator utils was in main loop
    to process further events (as mentioned in point 2.f.vi above). 2. W
    hile processing this reopen, it stopped the connection which release
    d the socket and queued this connection and at this point of time th
    e login response was received for the earlier on

Fix
---
1. Add a new connection state ISCSI_CONN_BOUND in "enum iscsi_connection
   _state" 2. Set the connection state value to ISCSI_CONN_DOWN upon isc
   si_if_ep_disconnect() and iscsi_if_stop_conn() 3. Set the connection 
   state to the newly created value ISCSI_CONN_BOUND after bind connecti
   on (transport->bind_conn()) 4. In iscsi_set_param, return -ENOTCONN i
   f the connection state is not ISCSI_CONN_BOUND

Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 9 ++++++++-
 include/scsi/scsi_transport_iscsi.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

Comments

Mike Christie Feb. 3, 2021, 2:16 a.m. UTC | #1
On 1/28/21 12:17 AM, Gulam Mohamed wrote:
> Description
> ===========
> 1. This Kernel panic could be due to a timing issue when there is a race
>    between the sync thread and the initiator was processing of a login r

Hey,

Sorry. When I had said that we want to limit the width, I didn't mean that
it should split words like above.

>  	default:
> +		if (conn->state != ISCSI_CONN_BOUND)
> +			return -ENOTCONN;

How about making this a check for BOUND or UP? Some of the settings, like
the TMF related ones, can be set after the conn is connected. open-iscsi
doesn't support it, but maybe other tools do. Also there might be the case
where a tool sets a value then forces a relogin and the new value would get
used for some drivers.
Gulam Mohamed Feb. 17, 2021, 2:12 p.m. UTC | #2
Hi Michael,

        Regarding " Also there might be the case where a tool sets a value then forces a relogin and the new value would get used for some drivers." in your below mail, I was trying to understand this. Can you please give me an example? It will help me to understand clearly.

Regards,
Gulam Mohamed.

-----Original Message-----
From: Michael Christie <michael.christie@oracle.com> 
Sent: Wednesday, February 3, 2021 7:47 AM
To: Gulam Mohamed <gulam.mohamed@oracle.com>; lduncan@suse.com; cleech@redhat.com; Martin Petersen <martin.petersen@oracle.com>; linux-scsi@vger.kernel.org
Subject: Re: [PATCH V3] iscsi: Do Not set param when sock is NULL

On 1/28/21 12:17 AM, Gulam Mohamed wrote:
> Description
> ===========
> 1. This Kernel panic could be due to a timing issue when there is a race
>    between the sync thread and the initiator was processing of a login 
> r

Hey,

Sorry. When I had said that we want to limit the width, I didn't mean that it should split words like above.

>  	default:
> +		if (conn->state != ISCSI_CONN_BOUND)
> +			return -ENOTCONN;

How about making this a check for BOUND or UP? Some of the settings, like the TMF related ones, can be set after the conn is connected. open-iscsi doesn't support it, but maybe other tools do. Also there might be the case where a tool sets a value then forces a relogin and the new value would get used for some drivers.
Mike Christie Feb. 17, 2021, 9:05 p.m. UTC | #3
> On Feb 17, 2021, at 8:12 AM, Gulam Mohamed <gulam.mohamed@oracle.com> wrote:
> 
> Hi Michael,
> 
>        Regarding " Also there might be the case where a tool sets a value then forces a relogin and the new value would get used for some drivers." in your below mail, I was trying to understand this. Can you please give me an example? It will help me to understand clearly.
> 

Open-iscsi stores the config values in a bunch of files on the system's FS. But other implementations could use sysfs like a tmp local FS. For this, if you wanted to use a new max burst value then you would do set_param, and during login your tools could read sysfs and pick up that value.

We used to do something like this for boot. We would read the values from a mix of cmdline, ibft, etc, then create a session. When iscsid started up again on the real root it could read the values used from sysfs. In this example, we used sysfs as a tmp place to store our info, but people would want to extend this and instead of using that “bunch of files”, just use sysfs/set_param to get/store the iscsi info for the life of the boot.



> Regards,
> Gulam Mohamed.
> 
> -----Original Message-----
> From: Michael Christie <michael.christie@oracle.com> 
> Sent: Wednesday, February 3, 2021 7:47 AM
> To: Gulam Mohamed <gulam.mohamed@oracle.com>; lduncan@suse.com; cleech@redhat.com; Martin Petersen <martin.petersen@oracle.com>; linux-scsi@vger.kernel.org
> Subject: Re: [PATCH V3] iscsi: Do Not set param when sock is NULL
> 
> On 1/28/21 12:17 AM, Gulam Mohamed wrote:
>> Description
>> ===========
>> 1. This Kernel panic could be due to a timing issue when there is a race
>>   between the sync thread and the initiator was processing of a login 
>> r
> 
> Hey,
> 
> Sorry. When I had said that we want to limit the width, I didn't mean that it should split words like above.
> 
>> 	default:
>> +		if (conn->state != ISCSI_CONN_BOUND)
>> +			return -ENOTCONN;
> 
> How about making this a check for BOUND or UP? Some of the settings, like the TMF related ones, can be set after the conn is connected. open-iscsi doesn't support it, but maybe other tools do. Also there might be the case where a tool sets a value then forces a relogin and the new value would get used for some drivers.
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 2e68c0a87698..555fdcf5a3ae 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2474,6 +2474,7 @@  static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag)
 	 */
 	mutex_lock(&conn_mutex);
 	conn->transport->stop_conn(conn, flag);
+	conn->state = ISCSI_CONN_DOWN;
 	mutex_unlock(&conn_mutex);
 
 }
@@ -2895,6 +2896,8 @@  iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev)
 			session->recovery_tmo = value;
 		break;
 	default:
+		if (conn->state != ISCSI_CONN_BOUND)
+			return -ENOTCONN;
 		err = transport->set_param(conn, ev->u.set_param.param,
 					   data, ev->u.set_param.len);
 	}
@@ -2956,6 +2959,7 @@  static int iscsi_if_ep_disconnect(struct iscsi_transport *transport,
 		mutex_lock(&conn->ep_mutex);
 		conn->ep = NULL;
 		mutex_unlock(&conn->ep_mutex);
+		conn->state = ISCSI_CONN_DOWN;
 	}
 
 	transport->ep_disconnect(ep);
@@ -3716,6 +3720,8 @@  iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
 		ev->r.retcode =	transport->bind_conn(session, conn,
 						ev->u.b_conn.transport_eph,
 						ev->u.b_conn.is_leading);
+		if (!ev->r.retcode)
+			conn->state = ISCSI_CONN_BOUND;
 		mutex_unlock(&conn_mutex);
 
 		if (ev->r.retcode || !transport->ep_connect)
@@ -3947,7 +3953,8 @@  iscsi_conn_attr(local_ipaddr, ISCSI_PARAM_LOCAL_IPADDR);
 static const char *const connection_state_names[] = {
 	[ISCSI_CONN_UP] = "up",
 	[ISCSI_CONN_DOWN] = "down",
-	[ISCSI_CONN_FAILED] = "failed"
+	[ISCSI_CONN_FAILED] = "failed",
+	[ISCSI_CONN_BOUND] = "bound"
 };
 
 static ssize_t show_conn_state(struct device *dev,
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 8a26a2ffa952..fc5a39839b4b 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -193,6 +193,7 @@  enum iscsi_connection_state {
 	ISCSI_CONN_UP = 0,
 	ISCSI_CONN_DOWN,
 	ISCSI_CONN_FAILED,
+	ISCSI_CONN_BOUND,
 };
 
 struct iscsi_cls_conn {