diff mbox

[v6,14/33] target: Avoid that target drivers hang if a command is aborted

Message ID 20170215002612.14566-15-bart.vanassche@sandisk.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche Feb. 15, 2017, 12:25 a.m. UTC
For several target drivers (e.g. ib_srpt and ib_isert) sleeping inside
transport_generic_free_cmd() causes RDMA completion processing to stall.
Hence only sleep inside this function if the (iSCSI) target driver
requires this.

This patch avoids that messages similar to the following appear in the
kernel log:

INFO: task kworker/u25:0:1013 blocked for more than 480 seconds.
      Tainted: G        W       4.10.0-rc7-dbg+ #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u25:0   D    0  1013      2 0x00000000
Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Call Trace:
 __schedule+0x2da/0xb00
 schedule+0x38/0x90
 schedule_timeout+0x2fe/0x640
 wait_for_completion+0xfe/0x160
 transport_generic_free_cmd+0x2e/0x80 [target_core_mod]
 srpt_send_done+0x59/0x9f [ib_srpt]
 __ib_process_cq+0x4b/0xd0 [ib_core]
 ib_cq_poll_work+0x1b/0x60 [ib_core]
 process_one_work+0x208/0x6a0
 worker_thread+0x49/0x4a0
 kthread+0x107/0x140
 ret_from_fork+0x2e/0x40

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org>
---
 drivers/target/target_core_transport.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Comments

Nicholas A. Bellinger Feb. 20, 2017, 9:38 p.m. UTC | #1
On Tue, 2017-02-14 at 16:25 -0800, Bart Van Assche wrote:
> For several target drivers (e.g. ib_srpt and ib_isert) sleeping inside
> transport_generic_free_cmd() causes RDMA completion processing to stall.
> Hence only sleep inside this function if the (iSCSI) target driver
> requires this.
> 
> This patch avoids that messages similar to the following appear in the
> kernel log:
> 
> INFO: task kworker/u25:0:1013 blocked for more than 480 seconds.
>       Tainted: G        W       4.10.0-rc7-dbg+ #3
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/u25:0   D    0  1013      2 0x00000000
> Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> Call Trace:
>  __schedule+0x2da/0xb00
>  schedule+0x38/0x90
>  schedule_timeout+0x2fe/0x640
>  wait_for_completion+0xfe/0x160
>  transport_generic_free_cmd+0x2e/0x80 [target_core_mod]
>  srpt_send_done+0x59/0x9f [ib_srpt]
>  __ib_process_cq+0x4b/0xd0 [ib_core]
>  ib_cq_poll_work+0x1b/0x60 [ib_core]
>  process_one_work+0x208/0x6a0
>  worker_thread+0x49/0x4a0
>  kthread+0x107/0x140
>  ret_from_fork+0x2e/0x40
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/target/target_core_transport.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 2ed9721a7202..ab1c493a9a16 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -2504,15 +2504,13 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks)
>  	int ret = 0;
>  	bool aborted = false, tas = false;
>  
> -	if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) {
> -		if (wait_for_tasks && (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB))
> -			target_wait_free_cmd(cmd, &aborted, &tas);
> +	if (wait_for_tasks)
> +		target_wait_free_cmd(cmd, &aborted, &tas);
>  
> +	if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) {
>  		if (!aborted || tas)
>  			ret = transport_put_cmd(cmd);
>  	} else {
> -		if (wait_for_tasks)
> -			target_wait_free_cmd(cmd, &aborted, &tas);
>  		/*
>  		 * Handle WRITE failure case where transport_generic_new_cmd()
>  		 * has already added se_cmd to state_list, but fabric has
> @@ -2535,7 +2533,6 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks)
>  	 */
>  	if (aborted) {
>  		pr_debug("Detected CMD_T_ABORTED for ITT: %llu\n", cmd->tag);
> -		wait_for_completion(&cmd->cmd_wait_comp);
>  		cmd->se_tfo->release_cmd(cmd);
>  		ret = 1;
>  	}

Removing the aborted wait_for_completion breaks the second order
scenario described for iscsi-target previously here:

http://www.spinics.net/lists/target-devel/msg14456.html

where a concurrent CMD_T_ABORTED occurs while an iscsi connection is
shutting down, and transport_generic_free_cmd() must not return until
cmd->cmd_kref reaches zero and does the complete.

However, looking at transport_generic_free_cmd() in the context of
ib_srpt and ib_isert, I don't see how this scenario can ever happen in
as-is in upstream code.

Namely, all of the transport_generic_free_cmd() callers in ib_srpt and
ib_isert pass wait_for_tasks = false.  And the only time
transport_generic_free_cmd() will block is when wait_for_tasks = true,
or when wait_for_tasks = true && abort = true.

So as-is in upstream, how can transport_generic_free_cmd() ever block
when wait_for_tasks = false..?

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 21, 2017, 6:58 p.m. UTC | #2
On 02/20/2017 01:38 PM, Nicholas A. Bellinger wrote:
> However, looking at transport_generic_free_cmd() in the context of
> ib_srpt and ib_isert, I don't see how this scenario can ever happen in
> as-is in upstream code.
> 
> Namely, all of the transport_generic_free_cmd() callers in ib_srpt and
> ib_isert pass wait_for_tasks = false.  And the only time
> transport_generic_free_cmd() will block is when wait_for_tasks = true,
> or when wait_for_tasks = true && abort = true.
> 
> So as-is in upstream, how can transport_generic_free_cmd() ever block
> when wait_for_tasks = false..?

I will check the other patches in this series for changes that trigger a
call to wait_for_completion() if wait_for_tasks == false.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger March 2, 2017, 5:21 a.m. UTC | #3
On Tue, 2017-02-21 at 18:58 +0000, Bart Van Assche wrote:
> On 02/20/2017 01:38 PM, Nicholas A. Bellinger wrote:
> > However, looking at transport_generic_free_cmd() in the context of
> > ib_srpt and ib_isert, I don't see how this scenario can ever happen in
> > as-is in upstream code.
> > 
> > Namely, all of the transport_generic_free_cmd() callers in ib_srpt and
> > ib_isert pass wait_for_tasks = false.  And the only time
> > transport_generic_free_cmd() will block is when wait_for_tasks = true,
> > or when wait_for_tasks = true && abort = true.
> > 
> > So as-is in upstream, how can transport_generic_free_cmd() ever block
> > when wait_for_tasks = false..?
> 
> I will check the other patches in this series for changes that trigger a
> call to wait_for_completion() if wait_for_tasks == false.
> 

Given there has not been any follow-up to identify a case in upstream
where ib_isert or ib_srpt is using target_generic_free_cmd() with
wait_for_tasks = true or a case where wait_for_tasks = false blocks,
I'll conclude this was a bugfix incorrectly CC'ed to stable for breakage
introduced by other changes in this series.

To reiterate the importance of having bug-fixes, especially those
intended for stable, always be leading other patches..

There is no way for a maintainer to know which bug-fixes are to existing
code unless they precede all other patches and not intermixed with
various other changes.  The bug-fixes to existing upstream code need to
be tested on their own without the other changes (especially those that
effect the same area) invalidating the tests.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche March 2, 2017, 5:24 a.m. UTC | #4
T24gV2VkLCAyMDE3LTAzLTAxIGF0IDIxOjIxIC0wODAwLCBOaWNob2xhcyBBLiBCZWxsaW5nZXIg
d3JvdGU6DQo+IFRvIHJlaXRlcmF0ZSB0aGUgaW1wb3J0YW5jZSBvZiBoYXZpbmcgYnVnLWZpeGVz
LCBlc3BlY2lhbGx5IHRob3NlDQo+IGludGVuZGVkIGZvciBzdGFibGUsIGFsd2F5cyBiZSBsZWFk
aW5nIG90aGVyIHBhdGNoZXMuLg0KDQpZb3Ugc2hvdWxkIGtub3cgdGhhdCBJIGNhbid0IGdyb3Vw
IGFsbCBidWdmaXhlcyBhdCB0aGUgc3RhcnQgb2YgdGhlIHNlcmllcw0KYmVjYXVzZSBzb21lIG9m
IHRoZSBidWdmaXhlcyBkZXBlbmQgb24gcGF0Y2hlcyB0aGF0IGFyZSBub3QgYnVnZml4ZXMuDQoN
CkJhcnQu
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger March 2, 2017, 7:02 a.m. UTC | #5
On Thu, 2017-03-02 at 05:24 +0000, Bart Van Assche wrote:
> On Wed, 2017-03-01 at 21:21 -0800, Nicholas A. Bellinger wrote:
> > To reiterate the importance of having bug-fixes, especially those
> > intended for stable, always be leading other patches..
> 
> You should know that I can't group all bugfixes at the start of the series
> because some of the bugfixes depend on patches that are not bugfixes.
> 

That makes no sense.  Either it's a bug-fix to existing upstream code,
or it's not.

This patch was not a bug-fix to upstream, because it detailed a scenario
that doesn't existing in upstream.  That is, a case where ib_isert or
ib_srpt calls target_generic_free_cmd() with wait_for_tasks = true or a
case where wait_for_tasks = false blocks.






--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 2ed9721a7202..ab1c493a9a16 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2504,15 +2504,13 @@  int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks)
 	int ret = 0;
 	bool aborted = false, tas = false;
 
-	if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) {
-		if (wait_for_tasks && (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB))
-			target_wait_free_cmd(cmd, &aborted, &tas);
+	if (wait_for_tasks)
+		target_wait_free_cmd(cmd, &aborted, &tas);
 
+	if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) {
 		if (!aborted || tas)
 			ret = transport_put_cmd(cmd);
 	} else {
-		if (wait_for_tasks)
-			target_wait_free_cmd(cmd, &aborted, &tas);
 		/*
 		 * Handle WRITE failure case where transport_generic_new_cmd()
 		 * has already added se_cmd to state_list, but fabric has
@@ -2535,7 +2533,6 @@  int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks)
 	 */
 	if (aborted) {
 		pr_debug("Detected CMD_T_ABORTED for ITT: %llu\n", cmd->tag);
-		wait_for_completion(&cmd->cmd_wait_comp);
 		cmd->se_tfo->release_cmd(cmd);
 		ret = 1;
 	}