
iscsi-target: Fix initial login PDU asynchronous socket close OOPs

Message ID: 1495776751-4746-1-git-send-email-nab@linux-iscsi.org
State: Not Applicable, archived

Commit Message

Nicholas A. Bellinger May 26, 2017, 5:32 a.m. UTC
From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch fixes an OOPs originally introduced by:

   commit bb048357dad6d604520c91586334c9c230366a14
   Author: Nicholas Bellinger <nab@linux-iscsi.org>
   Date:   Thu Sep 5 14:54:04 2013 -0700

   iscsi-target: Add sk->sk_state_change to cleanup after TCP failure

which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.

To address this issue, this patch makes the following changes.

First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.

Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().

The original NULL pointer dereference case reported by MNC is now
handled by iscsi_target_do_login() calling iscsi_target_sk_check_close()
before transitioning to FFP to determine whether the socket has already
closed, or by iscsi_target_start_negotiation() when the login needs to
exchange more PDUs (eg: iscsi_target_do_login returned 0) but the socket
has closed.  For both of these cases, the cleanup of remaining
connection resources will occur in iscsi_target_start_negotiation() from
iscsi_np process context once the failure is detected.

Finally, to handle the case where iscsi_target_sk_state_change() is
called after the initial PDU processing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.
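The resulting queuing decision in iscsi_target_sk_state_change() can be modeled as a pure function of the flag bits (the function name and bool parameters below are illustrative, not driver code):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the decision at the tail of
 * iscsi_target_sk_state_change() after this patch: on a socket drop
 * LOGIN_FLAGS_CLOSED is always set, but conn->login_work is only
 * scheduled when no read is active and initial PDU processing has
 * already finished (LOGIN_FLAGS_INITIAL_PDU clear); otherwise the
 * context that owns the login detects the close itself. */
static bool should_queue_login_work(bool socket_dropped,
				    bool read_active,
				    bool initial_pdu)
{
	if (!socket_dropped)
		return false;  /* nothing failed; nothing to clean up */
	if (read_active)
		return false;  /* iscsi_target_do_login_rx() already running */
	return !initial_pdu;   /* iscsi_np context owns initial-PDU cleanup */
}
```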

Reported-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/iscsi/iscsi_target_nego.c | 194 +++++++++++++++++++++----------
 include/target/iscsi/iscsi_target_core.h |   1 +
 2 files changed, 133 insertions(+), 62 deletions(-)

Comments

Mike Christie May 27, 2017, 3:14 a.m. UTC | #1
Thanks for the patch.

On 05/26/2017 12:32 AM, Nicholas A. Bellinger wrote:
>  
> -	state = iscsi_target_sk_state_check(sk);
> -	write_unlock_bh(&sk->sk_callback_lock);
> -
> -	pr_debug("iscsi_target_sk_state_change: state: %d\n", state);
> +		orig_state_change(sk);
>  
> -	if (!state) {
> -		pr_debug("iscsi_target_sk_state_change got failed state\n");
> -		schedule_delayed_work(&conn->login_cleanup_work, 0);

I think login_cleanup_work is no longer used so you can also remove it
and its code.

The patch fixes the crash for me. However, is there a possible
regression where if the initiator attempts new relogins we could run out
of memory? With the old code, we would free the login attempts resources
at this time, but with the new code the initiator will send more login
attempts and so we just keep allocating more memory for each attempt
until we run out or the login is finally able to complete.
Nicholas A. Bellinger May 31, 2017, 4:58 a.m. UTC | #2
Hey MNC,

On Fri, 2017-05-26 at 22:14 -0500, Mike Christie wrote:
> Thanks for the patch.
> 

Btw, after running DATERA's internal longevity and scale tests across
~20 racks on v4.1.y with this patch over the long weekend, there haven't
been any additional regressions.

> On 05/26/2017 12:32 AM, Nicholas A. Bellinger wrote:
> >  
> > -	state = iscsi_target_sk_state_check(sk);
> > -	write_unlock_bh(&sk->sk_callback_lock);
> > -
> > -	pr_debug("iscsi_target_sk_state_change: state: %d\n", state);
> > +		orig_state_change(sk);
> >  
> > -	if (!state) {
> > -		pr_debug("iscsi_target_sk_state_change got failed state\n");
> > -		schedule_delayed_work(&conn->login_cleanup_work, 0);
> 
> I think login_cleanup_work is no longer used so you can also remove it
> and its code.

Yep, since this needs to goto stable, I left that part out for now..

Will take care of that post -rc4.

> 
> The patch fixes the crash for me. However, is there a possible
> regression where if the initiator attempts new relogins we could run out
> of memory? With the old code, we would free the login attempts resources
> at this time, but with the new code the initiator will send more login
> attempts and so we just keep allocating more memory for each attempt
> until we run out or the login is finally able to complete.

AFAICT, no. For the two cases in question:

 - Initial login request PDU processing done within iscsi_np kthread
context in iscsi_target_start_negotiation(), and
 - subsequent login request PDU processing done by delayed work-queue
kthread context in iscsi_target_do_login_rx() 

this patch doesn't change how aggressively connection cleanup happens
for failed login attempts in the face of new connection login attempts
for either case.

For the first case when iscsi_np process context invokes
iscsi_target_start_negotiation() -> iscsi_target_do_login() ->
iscsi_check_for_session_reinstatement() to wait for backend I/O to
complete, it still blocks other new connections from being accepted on
the specific iscsi_np process context.

This patch doesn't change this behavior.

What it does change is that when the host closes the connection and
iscsi_target_sk_state_change() gets invoked, iscsi_np process context
waits for iscsi_check_for_session_reinstatement() to complete before
releasing the connection resources.

However since iscsi_np process context is blocked, new connections won't
be accepted until the new connection forcing session reinstatement
finishes waiting for outstanding backend I/O to complete.

For the second case of subsequent non-initial login request PDUs handled
within delayed work-queue process context, AFAICT this patch doesn't
change the original behavior either.

Namely when iscsi_target_do_login_rx() is active and host closes the
connection causing iscsi_target_sk_state_change() to be invoked, it
still checks for LOGIN_FLAGS_READ_ACTIVE and doesn't queue shutdown to
occur.

As per the original logic preceding this change, it continues to wait
for iscsi_target_do_login_rx() to complete in delayed work-queue
context, unless sock_recvmsg() returns a failure (which it should once
TCP_CLOSE occurs) or times out via iscsi_target_login_timeout().  Once
the failure is detected from iscsi_target_do_login_rx(), the remaining
connection resources are released from there.

That said, was there another case you had in mind..?
Mike Christie May 31, 2017, 8:28 p.m. UTC | #3
On 05/30/2017 11:58 PM, Nicholas A. Bellinger wrote:
> Hey MNC,
> 
> On Fri, 2017-05-26 at 22:14 -0500, Mike Christie wrote:
>> Thanks for the patch.
>>
> 
> Btw, after running DATERA's internal longevity and scale tests across
> ~20 racks on v4.1.y with this patch over the long weekend, there haven't
> been any additional regressions.
> 
>> On 05/26/2017 12:32 AM, Nicholas A. Bellinger wrote:
>>>  
>>> -	state = iscsi_target_sk_state_check(sk);
>>> -	write_unlock_bh(&sk->sk_callback_lock);
>>> -
>>> -	pr_debug("iscsi_target_sk_state_change: state: %d\n", state);
>>> +		orig_state_change(sk);
>>>  
>>> -	if (!state) {
>>> -		pr_debug("iscsi_target_sk_state_change got failed state\n");
>>> -		schedule_delayed_work(&conn->login_cleanup_work, 0);
>>
>> I think login_cleanup_work is no longer used so you can also remove it
>> and its code.
> 
> Yep, since this needs to goto stable, I left that part out for now..
> 
> Will take care of that post -rc4.
> 
>>
>> The patch fixes the crash for me. However, is there a possible
>> regression where if the initiator attempts new relogins we could run out
>> of memory? With the old code, we would free the login attempts resources
>> at this time, but with the new code the initiator will send more login
>> attempts and so we just keep allocating more memory for each attempt
>> until we run out or the login is finally able to complete.
> 
> AFAICT, no. For the two cases in question:
> 
>  - Initial login request PDU processing done within iscsi_np kthread
> context in iscsi_target_start_negotiation(), and
>  - subsequent login request PDU processing done by delayed work-queue
> kthread context in iscsi_target_do_login_rx() 
> 
> this patch doesn't change how aggressively connection cleanup happens
> for failed login attempts in the face of new connection login attempts
> for either case.
> 
> For the first case when iscsi_np process context invokes
> iscsi_target_start_negotiation() -> iscsi_target_do_login() ->
> iscsi_check_for_session_reinstatement() to wait for backend I/O to
> complete, it still blocks other new connections from being accepted on
> the specific iscsi_np process context.
> 
> This patch doesn't change this behavior.
> 
> What it does change is when the host closes the connection and
> iscsi_target_sk_state_change() gets invoked, iscsi_np process context
> waits for iscsi_check_for_session_reinstatement() to complete before
> releasing the connection resources.
> 
> However since iscsi_np process context is blocked, new connections won't
> be accepted until the new connection forcing session reinstatement
> finishes waiting for outstanding backend I/O to complete.

I was seeing this. My original mail asked about iscsi login resources
incorrectly, but like you said we do not get that far. I get a giant
backlog (1 connection request per 5 seconds that we waited) of tcp level
connection requests and drops. When the wait is done I get a flood of
"iSCSI Login negotiation failed" due to the target handling all those
now stale requests/drops.

If we do not care about the memory use at the network level for this
case (it seems like a little and reconnects are not aggressive), then
patch works ok for me. I am guessing it gets nasty to handle, so maybe
not worth it to handle right now? I tried to do it in my patch which is
why it got all crazy with the waits/wakeups :)

Thanks, and you can add a tested-by or reviewed-by from me.
Nicholas A. Bellinger May 31, 2017, 11:53 p.m. UTC | #4
On Wed, 2017-05-31 at 15:28 -0500, Mike Christie wrote:
> On 05/30/2017 11:58 PM, Nicholas A. Bellinger wrote:
> > Hey MNC,
> > 
> > On Fri, 2017-05-26 at 22:14 -0500, Mike Christie wrote:
> >> Thanks for the patch.

<SNIP>

> >> The patch fixes the crash for me. However, is there a possible
> >> regression where if the initiator attempts new relogins we could run out
> >> of memory? With the old code, we would free the login attempts resources
> >> at this time, but with the new code the initiator will send more login
> >> attempts and so we just keep allocating more memory for each attempt
> >> until we run out or the login is finally able to complete.
> > 
> > AFAICT, no. For the two cases in question:
> > 
> >  - Initial login request PDU processing done within iscsi_np kthread
> > context in iscsi_target_start_negotiation(), and
> >  - subsequent login request PDU processing done by delayed work-queue
> > kthread context in iscsi_target_do_login_rx() 
> > 
> > this patch doesn't change how aggressively connection cleanup happens
> > for failed login attempts in the face of new connection login attempts
> > for either case.
> > 
> > For the first case when iscsi_np process context invokes
> > iscsi_target_start_negotiation() -> iscsi_target_do_login() ->
> > iscsi_check_for_session_reinstatement() to wait for backend I/O to
> > complete, it still blocks other new connections from being accepted on
> > the specific iscsi_np process context.
> > 
> > This patch doesn't change this behavior.
> > 
> > What it does change is when the host closes the connection and
> > iscsi_target_sk_state_change() gets invoked, iscsi_np process context
> > waits for iscsi_check_for_session_reinstatement() to complete before
> > releasing the connection resources.
> > 
> > However since iscsi_np process context is blocked, new connections won't
> > be accepted until the new connection forcing session reinstatement
> > finishes waiting for outstanding backend I/O to complete.
> 
> I was seeing this. My original mail asked about iscsi login resources
> incorrectly, but like you said we do not get that far. I get a giant
> backlog (1 connection request per 5 seconds that we waited) of tcp level
> connection requests and drops. When the wait is done I get a flood of
> "iSCSI Login negotiation failed" due to the target handling all those
> now stale requests/drops.

Ah, I see what you mean.  The default TCP backlog of 256 can fill up
when a small host-side login timeout is used while iscsi_np is blocked
waiting for session reinstatement to complete.
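As back-of-the-envelope arithmetic (assuming the one-relogin-per-5-seconds rate MNC observed above, and the default backlog of 256), the accept queue only fills after roughly 21 minutes of iscsi_np being blocked:

```c
#include <assert.h>

/* Rough arithmetic for the backlog concern: with one relogin attempt
 * every `relogin_interval_s` seconds and `backlog` pending-connection
 * slots, the accept queue fills after backlog * interval seconds of
 * iscsi_np being blocked in session reinstatement. */
static int seconds_to_fill_backlog(int backlog, int relogin_interval_s)
{
	return backlog * relogin_interval_s;
}
```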

> 
> If we do not care about the memory use at the network level for this
> case (it seems like a little and reconnects are not aggressive), then
> patch works ok for me. I am guessing it gets nasty to handle, so maybe
> not worth it to handle right now? 

Yeah, since it's an issue separate from the root cause here, getting
this merged first makes sense.

> I tried to do it in my patch which is why it got all crazy with the waits/wakeups :)
> 

One option to consider is to immediately queue into delayed work-queue
context from iscsi_target_start_negotiation() instead of doing the
iscsi_target_do_login() and session reinstatement from iscsi_np context.

Just taking a quick look, this seems like it would be a pretty
straightforward change.

> Thanks, and you can add a tested-by or reviewed-by from me.

Great, thanks MNC.

Will send out a PULL request for -rc4 shortly.

Patch

diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index 7ccc9c1..6f88b31 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -493,14 +493,60 @@  static void iscsi_target_restore_sock_callbacks(struct iscsi_conn *conn)
 
 static int iscsi_target_do_login(struct iscsi_conn *, struct iscsi_login *);
 
-static bool iscsi_target_sk_state_check(struct sock *sk)
+static bool __iscsi_target_sk_check_close(struct sock *sk)
 {
 	if (sk->sk_state == TCP_CLOSE_WAIT || sk->sk_state == TCP_CLOSE) {
-		pr_debug("iscsi_target_sk_state_check: TCP_CLOSE_WAIT|TCP_CLOSE,"
+		pr_debug("__iscsi_target_sk_check_close: TCP_CLOSE_WAIT|TCP_CLOSE,"
 			"returning FALSE\n");
-		return false;
+		return true;
 	}
-	return true;
+	return false;
+}
+
+static bool iscsi_target_sk_check_close(struct iscsi_conn *conn)
+{
+	bool state = false;
+
+	if (conn->sock) {
+		struct sock *sk = conn->sock->sk;
+
+		read_lock_bh(&sk->sk_callback_lock);
+		state = (__iscsi_target_sk_check_close(sk) ||
+			 test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags));
+		read_unlock_bh(&sk->sk_callback_lock);
+	}
+	return state;
+}
+
+static bool iscsi_target_sk_check_flag(struct iscsi_conn *conn, unsigned int flag)
+{
+	bool state = false;
+
+	if (conn->sock) {
+		struct sock *sk = conn->sock->sk;
+
+		read_lock_bh(&sk->sk_callback_lock);
+		state = test_bit(flag, &conn->login_flags);
+		read_unlock_bh(&sk->sk_callback_lock);
+	}
+	return state;
+}
+
+static bool iscsi_target_sk_check_and_clear(struct iscsi_conn *conn, unsigned int flag)
+{
+	bool state = false;
+
+	if (conn->sock) {
+		struct sock *sk = conn->sock->sk;
+
+		write_lock_bh(&sk->sk_callback_lock);
+		state = (__iscsi_target_sk_check_close(sk) ||
+			 test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags));
+		if (!state)
+			clear_bit(flag, &conn->login_flags);
+		write_unlock_bh(&sk->sk_callback_lock);
+	}
+	return state;
 }
 
 static void iscsi_target_login_drop(struct iscsi_conn *conn, struct iscsi_login *login)
@@ -540,6 +586,20 @@  static void iscsi_target_do_login_rx(struct work_struct *work)
 
 	pr_debug("entering iscsi_target_do_login_rx, conn: %p, %s:%d\n",
 			conn, current->comm, current->pid);
+	/*
+	 * If iscsi_target_do_login_rx() has been invoked by ->sk_data_ready()
+	 * before initial PDU processing in iscsi_target_start_negotiation()
+	 * has completed, go ahead and retry until it's cleared.
+	 *
+	 * Otherwise if the TCP connection drops while this is occuring,
+	 * iscsi_target_start_negotiation() will detect the failure, call
+	 * cancel_delayed_work_sync(&conn->login_work), and cleanup the
+	 * remaining iscsi connection resources from iscsi_np process context.
+	 */
+	if (iscsi_target_sk_check_flag(conn, LOGIN_FLAGS_INITIAL_PDU)) {
+		schedule_delayed_work(&conn->login_work, msecs_to_jiffies(10));
+		return;
+	}
 
 	spin_lock(&tpg->tpg_state_lock);
 	state = (tpg->tpg_state == TPG_STATE_ACTIVE);
@@ -547,26 +607,12 @@  static void iscsi_target_do_login_rx(struct work_struct *work)
 
 	if (!state) {
 		pr_debug("iscsi_target_do_login_rx: tpg_state != TPG_STATE_ACTIVE\n");
-		iscsi_target_restore_sock_callbacks(conn);
-		iscsi_target_login_drop(conn, login);
-		iscsit_deaccess_np(np, tpg, tpg_np);
-		return;
+		goto err;
 	}
 
-	if (conn->sock) {
-		struct sock *sk = conn->sock->sk;
-
-		read_lock_bh(&sk->sk_callback_lock);
-		state = iscsi_target_sk_state_check(sk);
-		read_unlock_bh(&sk->sk_callback_lock);
-
-		if (!state) {
-			pr_debug("iscsi_target_do_login_rx, TCP state CLOSE\n");
-			iscsi_target_restore_sock_callbacks(conn);
-			iscsi_target_login_drop(conn, login);
-			iscsit_deaccess_np(np, tpg, tpg_np);
-			return;
-		}
+	if (iscsi_target_sk_check_close(conn)) {
+		pr_debug("iscsi_target_do_login_rx, TCP state CLOSE\n");
+		goto err;
 	}
 
 	conn->login_kworker = current;
@@ -584,34 +630,29 @@  static void iscsi_target_do_login_rx(struct work_struct *work)
 	flush_signals(current);
 	conn->login_kworker = NULL;
 
-	if (rc < 0) {
-		iscsi_target_restore_sock_callbacks(conn);
-		iscsi_target_login_drop(conn, login);
-		iscsit_deaccess_np(np, tpg, tpg_np);
-		return;
-	}
+	if (rc < 0)
+		goto err;
 
 	pr_debug("iscsi_target_do_login_rx after rx_login_io, %p, %s:%d\n",
 			conn, current->comm, current->pid);
 
 	rc = iscsi_target_do_login(conn, login);
 	if (rc < 0) {
-		iscsi_target_restore_sock_callbacks(conn);
-		iscsi_target_login_drop(conn, login);
-		iscsit_deaccess_np(np, tpg, tpg_np);
+		goto err;
 	} else if (!rc) {
-		if (conn->sock) {
-			struct sock *sk = conn->sock->sk;
-
-			write_lock_bh(&sk->sk_callback_lock);
-			clear_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags);
-			write_unlock_bh(&sk->sk_callback_lock);
-		}
+		if (iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_READ_ACTIVE))
+			goto err;
 	} else if (rc == 1) {
 		iscsi_target_nego_release(conn);
 		iscsi_post_login_handler(np, conn, zero_tsih);
 		iscsit_deaccess_np(np, tpg, tpg_np);
 	}
+	return;
+
+err:
+	iscsi_target_restore_sock_callbacks(conn);
+	iscsi_target_login_drop(conn, login);
+	iscsit_deaccess_np(np, tpg, tpg_np);
 }
 
 static void iscsi_target_do_cleanup(struct work_struct *work)
@@ -659,31 +700,54 @@  static void iscsi_target_sk_state_change(struct sock *sk)
 		orig_state_change(sk);
 		return;
 	}
+	state = __iscsi_target_sk_check_close(sk);
+	pr_debug("__iscsi_target_sk_close_change: state: %d\n", state);
+
 	if (test_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
 		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1 sk_state_change"
 			 " conn: %p\n", conn);
+		if (state)
+			set_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags);
 		write_unlock_bh(&sk->sk_callback_lock);
 		orig_state_change(sk);
 		return;
 	}
-	if (test_and_set_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) {
+	if (test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) {
 		pr_debug("Got LOGIN_FLAGS_CLOSED=1 sk_state_change conn: %p\n",
 			 conn);
 		write_unlock_bh(&sk->sk_callback_lock);
 		orig_state_change(sk);
 		return;
 	}
+	/*
+	 * If the TCP connection has dropped, go ahead and set LOGIN_FLAGS_CLOSED,
+	 * but only queue conn->login_work -> iscsi_target_do_login_rx()
+	 * processing if LOGIN_FLAGS_INITIAL_PDU has already been cleared.
+	 *
+	 * When iscsi_target_do_login_rx() runs, iscsi_target_sk_check_close()
+	 * will detect the dropped TCP connection from delayed workqueue context.
+	 *
+	 * If LOGIN_FLAGS_INITIAL_PDU is still set, which means the initial
+	 * iscsi_target_start_negotiation() is running, iscsi_target_do_login()
+	 * via iscsi_target_sk_check_close() or iscsi_target_start_negotiation()
+	 * via iscsi_target_sk_check_and_clear() is responsible for detecting the
+	 * dropped TCP connection in iscsi_np process context, and cleaning up
+	 * the remaining iscsi connection resources.
+	 */
+	if (state) {
+		pr_debug("iscsi_target_sk_state_change got failed state\n");
+		set_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags);
+		state = test_bit(LOGIN_FLAGS_INITIAL_PDU, &conn->login_flags);
+		write_unlock_bh(&sk->sk_callback_lock);
 
-	state = iscsi_target_sk_state_check(sk);
-	write_unlock_bh(&sk->sk_callback_lock);
-
-	pr_debug("iscsi_target_sk_state_change: state: %d\n", state);
+		orig_state_change(sk);
 
-	if (!state) {
-		pr_debug("iscsi_target_sk_state_change got failed state\n");
-		schedule_delayed_work(&conn->login_cleanup_work, 0);
+		if (!state)
+			schedule_delayed_work(&conn->login_work, 0);
 		return;
 	}
+	write_unlock_bh(&sk->sk_callback_lock);
+
 	orig_state_change(sk);
 }
 
@@ -946,6 +1010,15 @@  static int iscsi_target_do_login(struct iscsi_conn *conn, struct iscsi_login *lo
 			if (iscsi_target_handle_csg_one(conn, login) < 0)
 				return -1;
 			if (login_rsp->flags & ISCSI_FLAG_LOGIN_TRANSIT) {
+				/*
+				 * Check to make sure the TCP connection has not
+				 * dropped asynchronously while session reinstatement
+				 * was occuring in this kthread context, before
+				 * transitioning to full feature phase operation.
+				 */
+				if (iscsi_target_sk_check_close(conn))
+					return -1;
+
 				login->tsih = conn->sess->tsih;
 				login->login_complete = 1;
 				iscsi_target_restore_sock_callbacks(conn);
@@ -972,21 +1045,6 @@  static int iscsi_target_do_login(struct iscsi_conn *conn, struct iscsi_login *lo
 		break;
 	}
 
-	if (conn->sock) {
-		struct sock *sk = conn->sock->sk;
-		bool state;
-
-		read_lock_bh(&sk->sk_callback_lock);
-		state = iscsi_target_sk_state_check(sk);
-		read_unlock_bh(&sk->sk_callback_lock);
-
-		if (!state) {
-			pr_debug("iscsi_target_do_login() failed state for"
-				 " conn: %p\n", conn);
-			return -1;
-		}
-	}
-
 	return 0;
 }
 
@@ -1255,10 +1313,22 @@  int iscsi_target_start_negotiation(
 
 		write_lock_bh(&sk->sk_callback_lock);
 		set_bit(LOGIN_FLAGS_READY, &conn->login_flags);
+		set_bit(LOGIN_FLAGS_INITIAL_PDU, &conn->login_flags);
 		write_unlock_bh(&sk->sk_callback_lock);
 	}
-
+	/*
+	 * If iscsi_target_do_login returns zero to signal more PDU
+	 * exchanges are required to complete the login, go ahead and
+	 * clear LOGIN_FLAGS_INITIAL_PDU but only if the TCP connection
+	 * is still active.
+	 *
+	 * Otherwise if TCP connection dropped asynchronously, go ahead
+	 * and perform connection cleanup now.
+	 */
 	ret = iscsi_target_do_login(conn, login);
+	if (!ret && iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_INITIAL_PDU))
+		ret = -1;
+
 	if (ret < 0) {
 		cancel_delayed_work_sync(&conn->login_work);
 		cancel_delayed_work_sync(&conn->login_cleanup_work);
diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h
index 275581d..5f17fb7 100644
--- a/include/target/iscsi/iscsi_target_core.h
+++ b/include/target/iscsi/iscsi_target_core.h
@@ -557,6 +557,7 @@  struct iscsi_conn {
 #define LOGIN_FLAGS_READ_ACTIVE		1
 #define LOGIN_FLAGS_CLOSED		2
 #define LOGIN_FLAGS_READY		4
+#define LOGIN_FLAGS_INITIAL_PDU		8
 	unsigned long		login_flags;
 	struct delayed_work	login_work;
 	struct delayed_work	login_cleanup_work;