diff mbox series

[v3,05/17] scsi: iscsi: wait on cmds before freeing conn

Message ID 20210416020440.259271-6-michael.christie@oracle.com (mailing list archive)
State Changes Requested
Headers show
Series libicsi and qedi TMF fixes | expand

Commit Message

Mike Christie April 16, 2021, 2:04 a.m. UTC
If we haven't done an unbind target call, we can race during conn
destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
still accessing a task while iscsi_conn_teardown is freeing the conn. This
patch has us wait for all threads to drop their refs to outstanding tasks
during conn destruction.

There is also an issue where we could be accessing the conn directly via
fields like conn->ehwait in the eh callbacks. The next patch will fix
those.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

Comments

Lee Duncan April 22, 2021, 3:02 p.m. UTC | #1
On 4/15/21 7:04 PM, Mike Christie wrote:
> If we haven't done an unbind target call, we can race during conn
> destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
> still accessing a task while iscsi_conn_teardown is freeing the conn. This
> patch has us wait for all threads to drop their refs to outstanding tasks
> during conn destruction.
> 
> There is also an issue where we could be accessing the conn directly via
> fields like conn->ehwait in the eh callbacks. The next patch will fix
> those.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index ce3898fdb10f..ce6d04035c64 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -3120,6 +3120,24 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>  }
>  EXPORT_SYMBOL_GPL(iscsi_conn_setup);
>  
> +static bool iscsi_session_has_tasks(struct iscsi_session *session)
> +{
> +	struct iscsi_task *task;
> +	int i;
> +
> +	spin_lock_bh(&session->back_lock);
> +	for (i = 0; i < session->cmds_max; i++) {
> +		task = session->cmds[i];
> +
> +		if (task->sc) {
> +			spin_unlock_bh(&session->back_lock);
> +			return true;
> +		}
> +	}
> +	spin_unlock_bh(&session->back_lock);
> +	return false;
> +}
> +
>  /**
>   * iscsi_conn_teardown - teardown iscsi connection
>   * @cls_conn: iscsi class connection
> @@ -3144,7 +3162,17 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
>  		session->state = ISCSI_STATE_TERMINATE;
>  		wake_up(&conn->ehwait);
>  	}
> +
>  	spin_unlock_bh(&session->frwd_lock);
> +	mutex_unlock(&session->eh_mutex);
> +	/*
> +	 * If the caller didn't do a target unbind we could be exiting a
> +	 * scsi-ml entry point that had a task ref. Wait on them here.
> +	 */
> +	while (iscsi_session_has_tasks(session))
> +		msleep(50);

Is there a limit on the amount of time this might spin?

> +
> +	mutex_lock(&session->eh_mutex);
>  
>  	/* flush queued up work because we free the connection below */
>  	iscsi_suspend_tx(conn);
>
Mike Christie April 22, 2021, 8:09 p.m. UTC | #2
On 4/22/21 10:02 AM, Lee Duncan wrote:
> On 4/15/21 7:04 PM, Mike Christie wrote:
>> If we haven't done an unbind target call, we can race during conn
>> destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
>> still accessing a task while iscsi_conn_teardown is freeing the conn. This
>> patch has us wait for all threads to drop their refs to outstanding tasks
>> during conn destruction.
>>
>> There is also an issue where we could be accessing the conn directly via
>> fields like conn->ehwait in the eh callbacks. The next patch will fix
>> those.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>  drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index ce3898fdb10f..ce6d04035c64 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -3120,6 +3120,24 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>>  }
>>  EXPORT_SYMBOL_GPL(iscsi_conn_setup);
>>  
>> +static bool iscsi_session_has_tasks(struct iscsi_session *session)
>> +{
>> +	struct iscsi_task *task;
>> +	int i;
>> +
>> +	spin_lock_bh(&session->back_lock);
>> +	for (i = 0; i < session->cmds_max; i++) {
>> +		task = session->cmds[i];
>> +
>> +		if (task->sc) {
>> +			spin_unlock_bh(&session->back_lock);
>> +			return true;
>> +		}
>> +	}
>> +	spin_unlock_bh(&session->back_lock);
>> +	return false;
>> +}
>> +
>>  /**
>>   * iscsi_conn_teardown - teardown iscsi connection
>>   * @cls_conn: iscsi class connection
>> @@ -3144,7 +3162,17 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
>>  		session->state = ISCSI_STATE_TERMINATE;
>>  		wake_up(&conn->ehwait);
>>  	}
>> +
>>  	spin_unlock_bh(&session->frwd_lock);
>> +	mutex_unlock(&session->eh_mutex);
>> +	/*
>> +	 * If the caller didn't do a target unbind we could be exiting a
>> +	 * scsi-ml entry point that had a task ref. Wait on them here.
>> +	 */
>> +	while (iscsi_session_has_tasks(session))
>> +		msleep(50);
> 
> Is there a limit on the amount of time this might spin?

No.

Are you asking because you think there should be one or because you just wanted
to check?

It's really like the patch description says and is a quick wait.

If you want to add a limit, we can return and leak mem and maybe crash because
we've left some objects running and setup. Maybe just get a warning when we remove
the session. Or we can let the crash happen.
diff mbox series

Patch

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index ce3898fdb10f..ce6d04035c64 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3120,6 +3120,24 @@  iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
 }
 EXPORT_SYMBOL_GPL(iscsi_conn_setup);
 
+static bool iscsi_session_has_tasks(struct iscsi_session *session)
+{
+	struct iscsi_task *task;
+	int i;
+
+	spin_lock_bh(&session->back_lock);
+	for (i = 0; i < session->cmds_max; i++) {
+		task = session->cmds[i];
+
+		if (task->sc) {
+			spin_unlock_bh(&session->back_lock);
+			return true;
+		}
+	}
+	spin_unlock_bh(&session->back_lock);
+	return false;
+}
+
 /**
  * iscsi_conn_teardown - teardown iscsi connection
  * @cls_conn: iscsi class connection
@@ -3144,7 +3162,17 @@  void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
 		session->state = ISCSI_STATE_TERMINATE;
 		wake_up(&conn->ehwait);
 	}
+
 	spin_unlock_bh(&session->frwd_lock);
+	mutex_unlock(&session->eh_mutex);
+	/*
+	 * If the caller didn't do a target unbind we could be exiting a
+	 * scsi-ml entry point that had a task ref. Wait on them here.
+	 */
+	while (iscsi_session_has_tasks(session))
+		msleep(50);
+
+	mutex_lock(&session->eh_mutex);
 
 	/* flush queued up work because we free the connection below */
 	iscsi_suspend_tx(conn);