diff mbox series

[2/2] target: iscsi: fix a race condition when aborting a task

Message ID 20201007145326.56850-3-mlombard@redhat.com (mailing list archive)
State New, archived
Headers show
Series fix race conditions with task aborts | expand

Commit Message

Maurizio Lombardi Oct. 7, 2020, 2:53 p.m. UTC
The iscsit_release_commands_from_conn() function does the following
operations:

1) locks the cmd_lock spinlock
2) Scans the list of commands and sets the CMD_T_FABRIC_STOP flag
3) Releases the cmd_lock spinlock
4) Rescans the list again and clears the i_conn_node link of each command

If an abort task timer is fired between 3) and 4), it will find
the CMD_T_FABRIC_STOP flag set and won't call list_del_init();
therefore it may end up calling __iscsit_free_cmd() with a
non-empty i_conn_node list, thus triggering the warning.

Considering that:

- we expect list_del_init() to be executed by
  iscsit_release_commands_from_conn() when the CMD_T_FABRIC_STOP is set.
- iscsit_aborted_task() is the only function that calls __iscsit_free_cmd()
  directly, while all the other functions call iscsit_free_cmd().
- the warning in __iscsit_free_cmd() is a duplicate (the same warning
  can be found in iscsit_free_cmd().

We can fix the bug by simply removing the warning from __iscsit_free_cmd()

kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 1 PID: 21173 at drivers/target/iscsi/iscsi_target_util.c:720 __iscsit_free_cmd+0x26e/0x290 [iscsi_target_mod]
...
kernel: CPU: 1 PID: 21173 Comm: kworker/u8:3 Kdump: loaded Not tainted 3.10.0-1062.4.1.el7.x86_64 #1
kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
kernel: Workqueue: tmr-user target_tmr_work [target_core_mod]
kernel: Call Trace:
kernel: [<ffffffff91d78ba4>] dump_stack+0x19/0x1b
kernel: [<ffffffff9169a958>] __warn+0xd8/0x100
kernel: [<ffffffff9169aa9d>] warn_slowpath_null+0x1d/0x20
kernel: [<ffffffffc0a6f69e>] __iscsit_free_cmd+0x26e/0x290 [iscsi_target_mod]
kernel: [<ffffffffc0a70bc4>] iscsit_aborted_task+0x64/0x70 [iscsi_target_mod]
kernel: [<ffffffffc0a7830a>] lio_aborted_task+0x2a/0x30 [iscsi_target_mod]
kernel: [<ffffffffc09fa516>] transport_cmd_finish_abort+0x66/0xb0 [target_core_mod]
kernel: [<ffffffffc09f4d92>] core_tmr_abort_task+0x102/0x180 [target_core_mod]
kernel: [<ffffffffc09f7bb2>] target_tmr_work+0x152/0x170 [target_core_mod]
kernel: [<ffffffff916bd1df>] process_one_work+0x17f/0x440
kernel: [<ffffffff916be2f6>] worker_thread+0x126/0x3c0
kernel: [<ffffffff916be1d0>] ? manage_workers.isra.26+0x2a0/0x2a0
kernel: [<ffffffff916c51b1>] kthread+0xd1/0xe0
kernel: [<ffffffff916c50e0>] ? insert_kthread_work+0x40/0x40
kernel: [<ffffffff91d8bd37>] ret_from_fork_nospec_begin+0x21/0x21
kernel: [<ffffffff916c50e0>] ? insert_kthread_work+0x40/0x40
kernel: ---[ end trace ed2119501826ec7a ]---


Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/target/iscsi/iscsi_target_util.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Mike Christie Oct. 22, 2020, 2:42 a.m. UTC | #1
On 10/7/20 9:53 AM, Maurizio Lombardi wrote:
> The iscsit_release_commands_from_conn() function does the following
> operations:
> 
> 1) locks the cmd_lock spinlock
> 2) Scans the list of commands and sets the CMD_T_FABRIC_STOP flag
> 3) Releases the cmd_lock spinlock
> 4) Rescans the list again and clears the i_conn_node link of each command
> 
> If an abort task timer is fired between 3) and 4), it will find
> the CMD_T_FABRIC_STOP flag set and won't call list_del_init();
> therefore it may end up calling __iscsit_free_cmd() with a
> non-empty i_conn_node list, thus triggering the warning.
> 
> Considering that:
> 
> - we expect list_del_init() to be executed by
>   iscsit_release_commands_from_conn() when the CMD_T_FABRIC_STOP is set.
> - iscsit_aborted_task() is the only function that calls __iscsit_free_cmd()
>   directly, while all the other functions call iscsit_free_cmd().
> - the warning in __iscsit_free_cmd() is a duplicate (the same warning
>   can be found in iscsit_free_cmd().
> 

For iscsi, what does the last put on the cmd for a successful abort. Ignore this conn stop case.

If we abort a command successfully, the abort path does:

target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd

When we create the cmd we do transport_init_se_cmd and that sets the refcount=1. We do target_get_sess_cmd with ack_kref=true so that sets refcount=2.

On the abort completion path, target_handle_abort does target_put_sess_cmd (refcount=1 after), check_stop_free ends up doing a put (refcount=0 after).

If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:

1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:

target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd

The cmd is now freed.
3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
Maurizio Lombardi Oct. 27, 2020, 1:49 p.m. UTC | #2
Hello Mike,

Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
> 
> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
> 
> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
> 
> The cmd is now freed.
> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
> 
> 

Thanks for the review!

There are definitely some problems with task aborts and commands' refcounting *
but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
unless you are saying that abort tasks should never be executed when the connection 
is going down and we have to prevent such cases from happening at all.

* I did a test a few days ago: 
1) blocked the backstore device on the target
2) issued a write command on the initiator
3) waited for abort task and connection shutdown
4) switched the target backstore device to state "running"

I triggered the following warnings in target_handle_abort():

        } else {
                /*
                 * Allow the fabric driver to unmap any resources before
                 * releasing the descriptor via TFO->release_cmd().
                 */
                cmd->se_tfo->aborted_task(cmd);
                if (ack_kref)
                        WARN_ON_ONCE(target_put_sess_cmd(cmd) != 0); <-----------------------
                /*
                 * To do: establish a unit attention condition on the I_T
                 * nexus associated with cmd. See also the paragraph "Aborting
                 * commands" in SAM.
                 */
        }

        WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0); <----------------------

        transport_lun_remove_cmd(cmd);

        transport_cmd_check_stop_to_fabric(cmd);
}

target_handle_abort() is executed against a command which has already been released.

[  240.883865] iSCSI/iqn.1994-05.com.redhat:9489e049fb5e: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[  353.848593] ABORT_TASK: Found referenced iSCSI task_tag: 101
[  386.521582] iSCSI Login timeout on Network Portal 0.0.0.0:3260
[  396.537133] WARNING: CPU: 1 PID: 2936 at drivers/target/target_core_transport.c:805 target_handle_abort+0xce/0x180 [target_core_mod]
[  396.537139] Modules linked in: iscsi_target_mod target_core_user uio target_core_pscsi target_core_file target_core_iblock target_core_mod uinput nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ppdev drm joydev pcspkr i2c_piix4 parport_pc parport xfs libcrc32c sr_mod cdrom sd_mod sg ata_generic ata_piix libata virtio_net net_failover failover serio_raw fuse [last unloaded: ip_tables]
[  396.537302] CPU: 1 PID: 2936 Comm: kworker/1:2 Kdump: loaded Not tainted 4.18.0-240.7.el8.x86_64 #1
[  396.537306] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[  396.537353] Workqueue: target_completion target_abort_work [target_core_mod]
[  396.537394] RIP: 0010:target_handle_abort+0xce/0x180 [target_core_mod]
[  396.537399] Code: 00 e8 e6 db 00 f1 41 81 e4 00 00 40 00 74 9d 48 8d bb 0c 01 00 00 f0 ff 8b 0c 01 00 00 0f 88 ed cd 00 00 75 87 e8 32 ed ff ff <0f> 0b e9 7b ff ff ff c6 03 40 0f 1f 44 00 00 0f 1f 44 00 00 48 8b
[  396.537402] RSP: 0018:ffffac1b818bbe80 EFLAGS: 00010286
[  396.537407] RAX: 0000312f0022d690 RBX: ffff9aec21595350 RCX: 0000000000000000
[  396.537410] RDX: 0000000000000000 RSI: 0000000000000282 RDI: 0000000000000282
[  396.537413] RBP: 0000000000000000 R08: ffff9aebf4d65a00 R09: ffff9aebf4c12d98
[  396.537416] R10: 0000000000000082 R11: 000000000000002e R12: 0000000000400000
[  396.537418] R13: 0000000000000000 R14: ffff9aec02cd9c00 R15: ffff9aec21595488
[  396.537423] FS:  0000000000000000(0000) GS:ffff9aec7fa80000(0000) knlGS:0000000000000000
[  396.537427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.537430] CR2: 000055bfa25e9110 CR3: 00000000614c6000 CR4: 00000000000006e0
[  396.537444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  396.537447] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  396.537449] Call Trace:
[  396.537504]  process_one_work+0x1a7/0x360
[  396.537525]  worker_thread+0x30/0x390
[  396.537532]  ? create_worker+0x1a0/0x1a0
[  396.537536]  kthread+0x112/0x130
[  396.537542]  ? kthread_flush_work_fn+0x10/0x10
[  396.537550]  ret_from_fork+0x35/0x40
[  396.537570] ---[ end trace d6b5bf9aea218e55 ]---
[  396.537734] WARNING: CPU: 1 PID: 2936 at drivers/target/target_core_transport.c:813 target_handle_abort+0x178/0x180 [target_core_mod]
[  396.537737] Modules linked in: iscsi_target_mod target_core_user uio target_core_pscsi target_core_file target_core_iblock target_core_mod uinput nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ppdev drm joydev pcspkr i2c_piix4 parport_pc parport xfs libcrc32c sr_mod cdrom sd_mod sg ata_generic ata_piix libata virtio_net net_failover failover serio_raw fuse [last unloaded: ip_tables]
[  396.537811] CPU: 1 PID: 2936 Comm: kworker/1:2 Kdump: loaded Tainted: G        W        --------- -  - 4.18.0-240.7.el8.x86_64 #1
[  396.537814] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[  396.537860] Workqueue: target_completion target_abort_work [target_core_mod]
[  396.537903] RIP: 0010:target_handle_abort+0x178/0x180 [target_core_mod]
[  396.537909] Code: 67 03 00 48 85 ed 74 1d 48 8b 45 00 48 8b 7d 08 48 83 c5 18 48 89 de e8 26 db 00 f1 48 8b 45 00 48 85 c0 75 e7 e9 6a ff ff ff <0f> 0b e9 df fe ff ff 90 0f 1f 44 00 00 55 48 8d af 08 01 00 00 53
[  396.537912] RSP: 0018:ffffac1b818bbe80 EFLAGS: 00010246
[  396.537916] RAX: 0000000000000000 RBX: ffff9aec21595350 RCX: 0000000000000000
[  396.537919] RDX: 0000000000000000 RSI: 0000000000000282 RDI: 0000000000000282
[  396.537922] RBP: 0000000000000000 R08: ffff9aebf4d65a00 R09: ffff9aebf4c12d98
[  396.537925] R10: 0000000000000082 R11: 000000000000002e R12: 0000000000400000
[  396.537928] R13: 0000000000000000 R14: ffff9aec02cd9c00 R15: ffff9aec21595488
[  396.537933] FS:  0000000000000000(0000) GS:ffff9aec7fa80000(0000) knlGS:0000000000000000
[  396.537936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.537939] CR2: 000055bfa25e9110 CR3: 00000000614c6000 CR4: 00000000000006e0
[  396.537951] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  396.537954] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  396.537956] Call Trace:
[  396.537971]  process_one_work+0x1a7/0x360
[  396.537979]  worker_thread+0x30/0x390
[  396.537985]  ? create_worker+0x1a0/0x1a0
[  396.537989]  kthread+0x112/0x130
[  396.537994]  ? kthread_flush_work_fn+0x10/0x10
[  396.538009]  ret_from_fork+0x35/0x40
[  396.538016] ---[ end trace d6b5bf9aea218e56 ]---
[  396.538221] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 101

Maurizio
Mike Christie Oct. 27, 2020, 5:54 p.m. UTC | #3
On 10/27/20 8:49 AM, Maurizio Lombardi wrote:
> Hello Mike,
> 
> Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
>> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
>>
>> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
>>
>> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
>>
>> The cmd is now freed.
>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
>>
>>
> 
> Thanks for the review!
> 
> There are definitely some problems with task aborts and commands' refcounting *
> but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
> unless you are saying that abort tasks should never be executed when the connection 
> is going down and we have to prevent such cases from happening at all.

Yeah, I think if we prevent the race then we fix the refcount issue and your issue.
Here is a patch that is only compile tested:

From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001
From: Mike Christie <michael.christie@oracle.com>
Date: Tue, 27 Oct 2020 12:30:53 -0500
Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race

The abort and cmd stop paths can race where:

1. thread1 runs iscsit_release_commands_from_conn and sets
CMD_T_FABRIC_STOP.
2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It
then returns from the aborted_task callout and we finish
target_handle_abort and do:

target_handle_abort -> transport_cmd_check_stop_to_fabric ->
lio_check_stop_free -> target_put_sess_cmd

The cmd is now freed.
3. thread1 now finishes iscsit_release_commands_from_conn and runs
iscsit_free_cmd while accessing a command we just released.

In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the
CMD_T_ABORTED if the driver is not cleaning up the cmd because of
a session shutdown. However, iscsit_release_commands_from_conn only
sets the CMD_T_FABRIC_STOP and does not check to see if the abort path
has claimed completion ownership of the command.

This adds a check in iscsit_release_commands_from_conn so only the
abort or fabric stop path cleanup the command.
---
 drivers/target/iscsi/iscsi_target.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index f77e5ee..85027d3 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 {
 	spin_lock_bh(&conn->cmd_lock);
-	if (!list_empty(&cmd->i_conn_node) &&
-	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
+	if (!list_empty(&cmd->i_conn_node))
 		list_del_init(&cmd->i_conn_node);
 	spin_unlock_bh(&conn->cmd_lock);
 
@@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
 
 		if (se_cmd->se_tfo != NULL) {
 			spin_lock_irq(&se_cmd->t_state_lock);
+			if (se_cmd->transport_state & CMD_T_ABORTED) {
+				/*
+				 * LIO's abort path owns the cleanup for this,
+				 * so put it back on the list and let
+				 * aborted_task handle it.
+				 */
+				list_add_tail(&cmd->i_conn_node,
+					      &conn->conn_cmd_list);
+				continue;
+			}
 			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
 			spin_unlock_irq(&se_cmd->t_state_lock);
 		}
Mike Christie Oct. 27, 2020, 8:03 p.m. UTC | #4
> On Oct 27, 2020, at 12:54 PM, Mike Christie <michael.christie@oracle.com> wrote:
> 
> On 10/27/20 8:49 AM, Maurizio Lombardi wrote:
>> Hello Mike,
>> 
>> Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
>>> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
>>> 
>>> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
>>> 
>>> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
>>> 
>>> The cmd is now freed.
>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
>>> 
>>> 
>> 
>> Thanks for the review!
>> 
>> There are definitely some problems with task aborts and commands' refcounting *
>> but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
>> unless you are saying that abort tasks should never be executed when the connection 
>> is going down and we have to prevent such cases from happening at all.
> 
> Yeah, I think if we prevent the race then we fix the refcount issue and your issue.
> Here is a patch that is only compile tested:
> 
> From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001
> From: Mike Christie <michael.christie@oracle.com>
> Date: Tue, 27 Oct 2020 12:30:53 -0500
> Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race
> 
> The abort and cmd stop paths can race where:
> 
> 1. thread1 runs iscsit_release_commands_from_conn and sets
> CMD_T_FABRIC_STOP.
> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It
> then returns from the aborted_task callout and we finish
> target_handle_abort and do:
> 
> target_handle_abort -> transport_cmd_check_stop_to_fabric ->
> lio_check_stop_free -> target_put_sess_cmd
> 
> The cmd is now freed.
> 3. thread1 now finishes iscsit_release_commands_from_conn and runs
> iscsit_free_cmd while accessing a command we just released.
> 
> In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the
> CMD_T_ABORTED if the driver is not cleaning up the cmd because of
> a session shutdown. However, iscsit_release_commands_from_conn only
> sets the CMD_T_FABRIC_STOP and does not check to see if the abort path
> has claimed completion ownership of the command.
> 
> This adds a check in iscsit_release_commands_from_conn so only the
> abort or fabric stop path cleanup the command.
> ---
> drivers/target/iscsi/iscsi_target.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index f77e5ee..85027d3 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
> void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
> {
> 	spin_lock_bh(&conn->cmd_lock);
> -	if (!list_empty(&cmd->i_conn_node) &&
> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
> +	if (!list_empty(&cmd->i_conn_node))
> 		list_del_init(&cmd->i_conn_node);
> 	spin_unlock_bh(&conn->cmd_lock);
> 
> @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
> 
> 		if (se_cmd->se_tfo != NULL) {
> 			spin_lock_irq(&se_cmd->t_state_lock);
> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +				/*
> +				 * LIO's abort path owns the cleanup for this,
> +				 * so put it back on the list and let
> +				 * aborted_task handle it.
> +				 */
> +				list_add_tail(&cmd->i_conn_node,
> +					      &conn->conn_cmd_list);


That should have been a move from the tmp list back to the conn_cmd_list.


> +				continue;
> +			}
> 			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
> 			spin_unlock_irq(&se_cmd->t_state_lock);
> 		}
> -- 
> 1.8.3.1
>
Maurizio Lombardi Oct. 28, 2020, 5:09 p.m. UTC | #5
Dne 27. 10. 20 v 21:03 Michael Christie napsal(a):
> 
> 
>> On Oct 27, 2020, at 12:54 PM, Mike Christie <michael.christie@oracle.com> wrote:
>>
>> On 10/27/20 8:49 AM, Maurizio Lombardi wrote:
>>> Hello Mike,
>>>
>>> Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
>>>> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
>>>>
>>>> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
>>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
>>>>
>>>> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
>>>>
>>>> The cmd is now freed.
>>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
>>>>
>>>>
>>>
>>> Thanks for the review!
>>>
>>> There are definitely some problems with task aborts and commands' refcounting *
>>> but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
>>> unless you are saying that abort tasks should never be executed when the connection 
>>> is going down and we have to prevent such cases from happening at all.
>>
>> Yeah, I think if we prevent the race then we fix the refcount issue and your issue.
>> Here is a patch that is only compile tested:
>>
>> From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001
>> From: Mike Christie <michael.christie@oracle.com>
>> Date: Tue, 27 Oct 2020 12:30:53 -0500
>> Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race
>>
>> The abort and cmd stop paths can race where:
>>
>> 1. thread1 runs iscsit_release_commands_from_conn and sets
>> CMD_T_FABRIC_STOP.
>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It
>> then returns from the aborted_task callout and we finish
>> target_handle_abort and do:
>>
>> target_handle_abort -> transport_cmd_check_stop_to_fabric ->
>> lio_check_stop_free -> target_put_sess_cmd
>>
>> The cmd is now freed.
>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs
>> iscsit_free_cmd while accessing a command we just released.
>>
>> In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the
>> CMD_T_ABORTED if the driver is not cleaning up the cmd because of
>> a session shutdown. However, iscsit_release_commands_from_conn only
>> sets the CMD_T_FABRIC_STOP and does not check to see if the abort path
>> has claimed completion ownership of the command.
>>
>> This adds a check in iscsit_release_commands_from_conn so only the
>> abort or fabric stop path cleanup the command.
>> ---
>> drivers/target/iscsi/iscsi_target.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
>> index f77e5ee..85027d3 100644
>> --- a/drivers/target/iscsi/iscsi_target.c
>> +++ b/drivers/target/iscsi/iscsi_target.c
>> @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>> void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>> {
>> 	spin_lock_bh(&conn->cmd_lock);
>> -	if (!list_empty(&cmd->i_conn_node) &&
>> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
>> +	if (!list_empty(&cmd->i_conn_node))
>> 		list_del_init(&cmd->i_conn_node);
>> 	spin_unlock_bh(&conn->cmd_lock);
>>
>> @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>>
>> 		if (se_cmd->se_tfo != NULL) {
>> 			spin_lock_irq(&se_cmd->t_state_lock);
>> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
>> +				/*
>> +				 * LIO's abort path owns the cleanup for this,
>> +				 * so put it back on the list and let
>> +				 * aborted_task handle it.
>> +				 */
>> +				list_add_tail(&cmd->i_conn_node,
>> +					      &conn->conn_cmd_list);
> 
> 
> That should have been a move from the tmp list back to the conn_cmd_list.

Nice, it looks simple and I will test it.
I am a bit worried there could be other possible race conditions.

Example: 

thread1: connection is going to be closed,
iscsit_release_commands_from_conn() finds a command that is about
to be aborted, re-adds it to conn_cmd_list and proceeds.
iscsit_close_connection() decreases the conn usage count and finally calls iscsit_free_conn(conn)
that destroys the conn structure.

thread2: iscsit_aborted_task() gets called and tries to lock the conn->cmd_lock spinlock, dereferencing
an invalid pointer.

Possible solutions that I can think of:

- Make iscsit_release_commands_from_conn() wait for the abort task to finish
or
- abort handler could hold a reference to the conn structure so that iscsit_close_connection()
will sleep when calling iscsit_check_conn_usage_count(conn) until abort finishes.


Maurizio
Mike Christie Oct. 28, 2020, 8:37 p.m. UTC | #6
On 10/28/20 12:09 PM, Maurizio Lombardi wrote:
> 
> 
> Dne 27. 10. 20 v 21:03 Michael Christie napsal(a):
>>
>>
>>> On Oct 27, 2020, at 12:54 PM, Mike Christie <michael.christie@oracle.com> wrote:
>>>
>>> On 10/27/20 8:49 AM, Maurizio Lombardi wrote:
>>>> Hello Mike,
>>>>
>>>> Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
>>>>> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
>>>>>
>>>>> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
>>>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
>>>>>
>>>>> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
>>>>>
>>>>> The cmd is now freed.
>>>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
>>>>>
>>>>>
>>>>
>>>> Thanks for the review!
>>>>
>>>> There are definitely some problems with task aborts and commands' refcounting *
>>>> but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
>>>> unless you are saying that abort tasks should never be executed when the connection
>>>> is going down and we have to prevent such cases from happening at all.
>>>
>>> Yeah, I think if we prevent the race then we fix the refcount issue and your issue.
>>> Here is a patch that is only compile tested:
>>>
>>>  From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001
>>> From: Mike Christie <michael.christie@oracle.com>
>>> Date: Tue, 27 Oct 2020 12:30:53 -0500
>>> Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race
>>>
>>> The abort and cmd stop paths can race where:
>>>
>>> 1. thread1 runs iscsit_release_commands_from_conn and sets
>>> CMD_T_FABRIC_STOP.
>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It
>>> then returns from the aborted_task callout and we finish
>>> target_handle_abort and do:
>>>
>>> target_handle_abort -> transport_cmd_check_stop_to_fabric ->
>>> lio_check_stop_free -> target_put_sess_cmd
>>>
>>> The cmd is now freed.
>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs
>>> iscsit_free_cmd while accessing a command we just released.
>>>
>>> In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the
>>> CMD_T_ABORTED if the driver is not cleaning up the cmd because of
>>> a session shutdown. However, iscsit_release_commands_from_conn only
>>> sets the CMD_T_FABRIC_STOP and does not check to see if the abort path
>>> has claimed completion ownership of the command.
>>>
>>> This adds a check in iscsit_release_commands_from_conn so only the
>>> abort or fabric stop path cleanup the command.
>>> ---
>>> drivers/target/iscsi/iscsi_target.c | 13 +++++++++++--
>>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
>>> index f77e5ee..85027d3 100644
>>> --- a/drivers/target/iscsi/iscsi_target.c
>>> +++ b/drivers/target/iscsi/iscsi_target.c
>>> @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>>> void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>>> {
>>> 	spin_lock_bh(&conn->cmd_lock);
>>> -	if (!list_empty(&cmd->i_conn_node) &&
>>> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
>>> +	if (!list_empty(&cmd->i_conn_node))
>>> 		list_del_init(&cmd->i_conn_node);
>>> 	spin_unlock_bh(&conn->cmd_lock);
>>>
>>> @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>>>
>>> 		if (se_cmd->se_tfo != NULL) {
>>> 			spin_lock_irq(&se_cmd->t_state_lock);
>>> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
>>> +				/*
>>> +				 * LIO's abort path owns the cleanup for this,
>>> +				 * so put it back on the list and let
>>> +				 * aborted_task handle it.
>>> +				 */
>>> +				list_add_tail(&cmd->i_conn_node,
>>> +					      &conn->conn_cmd_list);
>>
>>
>> That should have been a move from the tmp list back to the conn_cmd_list.
> 
> Nice, it looks simple and I will test it.
> I am a bit worried there could be other possible race conditions.
> 
> Example:
> 
> thread1: connection is going to be closed,
> iscsit_release_commands_from_conn() finds a command that is about
> to be aborted, re-adds it to conn_cmd_list and proceeds.
> iscsit_close_connection() decreases the conn usage count and finally calls iscsit_free_conn(conn)
> that destroys the conn structure.
> 
> thread2: iscsit_aborted_task() gets called and tries to lock the conn->cmd_lock spinlock, dereferencing
> an invalid pointer.

Nice catch. You are right.

> 
> Possible solutions that I can think of:
> 
> - Make iscsit_release_commands_from_conn() wait for the abort task to finish

Yeah you could set a completion in there then have aborted_task do the 
complete() call maybe?

There is also the transport_cmd_check_stop_to_fabric 
t_transport_stop_comp completion and transport_wait_for_tasks pairing 
that seems close to what we want. I think we would have to change it to 
make it work for this case. Oh yeah, that is more of a brain stroming 
guess than a review type of comment :)


> or
> - abort handler could hold a reference to the conn structure so that iscsit_close_connection()
> will sleep when calling iscsit_check_conn_usage_count(conn) until abort finishes.
>
Maurizio Lombardi Nov. 10, 2020, 9:29 p.m. UTC | #7
Dne 28. 10. 20 v 21:37 Mike Christie napsal(a):
>>
>> Possible solutions that I can think of:
>>
>> - Make iscsit_release_commands_from_conn() wait for the abort task to finish
> 
> Yeah you could set a completion in there then have aborted_task do the complete() call maybe?
> 

We could do something like this, what do you think?

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 067074ef50818..ffd3dbc53a42f 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -490,13 +490,16 @@ EXPORT_SYMBOL(iscsit_queue_rsp);
 
 void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 {
+	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
+
 	spin_lock_bh(&conn->cmd_lock);
-	if (!list_empty(&cmd->i_conn_node) &&
-	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
+	if (!list_empty(&cmd->i_conn_node))
 		list_del_init(&cmd->i_conn_node);
 	spin_unlock_bh(&conn->cmd_lock);
 
 	__iscsit_free_cmd(cmd, true);
+	if (se_cmd && se_cmd->abrt_task_compl)
+		complete(se_cmd->abrt_task_compl);
 }
 EXPORT_SYMBOL(iscsit_aborted_task);
 
@@ -4080,6 +4083,7 @@ int iscsi_target_rx_thread(void *arg)
 
 static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
 {
+	DECLARE_COMPLETION_ONSTACK(compl);
 	LIST_HEAD(tmp_list);
 	struct iscsi_cmd *cmd = NULL, *cmd_tmp = NULL;
 	struct iscsi_session *sess = conn->sess;
@@ -4096,8 +4100,24 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
 
 		if (se_cmd->se_tfo != NULL) {
 			spin_lock_irq(&se_cmd->t_state_lock);
+			if (se_cmd->transport_state & CMD_T_ABORTED) {
+				/*
+				 * LIO's abort path owns the cleanup for this,
+				 * so put it back on the list and let
+				 * aborted_task handle it.
+				 */
+				list_move_tail(&cmd->i_conn_node, &conn->conn_cmd_list);
+				WARN_ON_ONCE(se_cmd->abrt_task_compl);
+				se_cmd->abrt_task_compl = &compl;
+			}
 			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
 			spin_unlock_irq(&se_cmd->t_state_lock);
+
+			if (se_cmd->abrt_task_compl) {
+				spin_unlock_bh(&conn->cmd_lock);
+				wait_for_completion(&compl);
+				spin_lock_bh(&conn->cmd_lock);
+			}
 		}
 	}
 	spin_unlock_bh(&conn->cmd_lock);
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index db53a0d649da7..5611e6c00f18c 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1391,6 +1391,7 @@ void transport_init_se_cmd(
 	init_completion(&cmd->t_transport_stop_comp);
 	cmd->free_compl = NULL;
 	cmd->abrt_compl = NULL;
+	cmd->abrt_task_compl = NULL;
 	spin_lock_init(&cmd->t_state_lock);
 	INIT_WORK(&cmd->work, NULL);
 	kref_init(&cmd->cmd_kref);
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 549947d407cfd..25cc451930281 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -491,6 +491,7 @@ struct se_cmd {
 	struct list_head	se_cmd_list;
 	struct completion	*free_compl;
 	struct completion	*abrt_compl;
+	struct completion	*abrt_task_compl;
 	const struct target_core_fabric_ops *se_tfo;
 	sense_reason_t		(*execute_cmd)(struct se_cmd *);
 	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
Mike Christie Nov. 10, 2020, 11:08 p.m. UTC | #8
On 11/10/20 3:29 PM, Maurizio Lombardi wrote:
> 
> 
> Dne 28. 10. 20 v 21:37 Mike Christie napsal(a):
>>>
>>> Possible solutions that I can think of:
>>>
>>> - Make iscsit_release_commands_from_conn() wait for the abort task to finish
>>
>> Yeah you could set a completion in there then have aborted_task do the complete() call maybe?
>>
> 
> We could do something like this, what do you think?
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 067074ef50818..ffd3dbc53a42f 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -490,13 +490,16 @@ EXPORT_SYMBOL(iscsit_queue_rsp);
>   
>   void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>   {
> +	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
> +
>   	spin_lock_bh(&conn->cmd_lock);
> -	if (!list_empty(&cmd->i_conn_node) &&
> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
> +	if (!list_empty(&cmd->i_conn_node))
>   		list_del_init(&cmd->i_conn_node);
>   	spin_unlock_bh(&conn->cmd_lock);
>   
>   	__iscsit_free_cmd(cmd, true);
> +	if (se_cmd && se_cmd->abrt_task_compl)
> +		complete(se_cmd->abrt_task_compl);
>   }
>   EXPORT_SYMBOL(iscsit_aborted_task);
>   
> @@ -4080,6 +4083,7 @@ int iscsi_target_rx_thread(void *arg)
>   
>   static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   {
> +	DECLARE_COMPLETION_ONSTACK(compl);
>   	LIST_HEAD(tmp_list);
>   	struct iscsi_cmd *cmd = NULL, *cmd_tmp = NULL;
>   	struct iscsi_session *sess = conn->sess;
> @@ -4096,8 +4100,24 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   
>   		if (se_cmd->se_tfo != NULL) {
>   			spin_lock_irq(&se_cmd->t_state_lock);
> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +				/*
> +				 * LIO's abort path owns the cleanup for this,
> +				 * so put it back on the list and let
> +				 * aborted_task handle it.
> +				 */
> +				list_move_tail(&cmd->i_conn_node, &conn->conn_cmd_list);
> +				WARN_ON_ONCE(se_cmd->abrt_task_compl);
> +				se_cmd->abrt_task_compl = &compl;
> +			}
>   			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>   			spin_unlock_irq(&se_cmd->t_state_lock);
> +
> +			if (se_cmd->abrt_task_compl) {
> +				spin_unlock_bh(&conn->cmd_lock);
> +				wait_for_completion(&compl);
> +				spin_lock_bh(&conn->cmd_lock);

You can still hit your freed conn race I think. The aborted_task callout 
is not the last time we are referencing the iscsi cmd and conn. That 
code path still has to do the release_cmd callout for example. Once we 
get past this wait_for_completion the abort code path could be still 
running. If this iscsit_release_commands_from_conn completes first then 
we can hit your case where we free the conn before the abort code path 
has done release_cmd and we could hit that cmd->conn->sess reference.

I think if you do the complete in iscsit_release_cmd then we would not 
hit that issue.

There might be a second issue though. What happens if aborted_task ran 
first and deleted the cmd from the conn_cmd_list. It would then be 
running while iscsit_release_commands_from_conn is running. We would 
then not do the wait_for_completion above.


> +			}
>   		}
>   	}
>   	spin_unlock_bh(&conn->cmd_lock);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index db53a0d649da7..5611e6c00f18c 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1391,6 +1391,7 @@ void transport_init_se_cmd(
>   	init_completion(&cmd->t_transport_stop_comp);
>   	cmd->free_compl = NULL;
>   	cmd->abrt_compl = NULL;
> +	cmd->abrt_task_compl = NULL;
>   	spin_lock_init(&cmd->t_state_lock);
>   	INIT_WORK(&cmd->work, NULL);
>   	kref_init(&cmd->cmd_kref);
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index 549947d407cfd..25cc451930281 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -491,6 +491,7 @@ struct se_cmd {
>   	struct list_head	se_cmd_list;
>   	struct completion	*free_compl;
>   	struct completion	*abrt_compl;
> +	struct completion	*abrt_task_compl;

This should be on the iscsi cmd since only iscsi uses it.

>   	const struct target_core_fabric_ops *se_tfo;
>   	sense_reason_t		(*execute_cmd)(struct se_cmd *);
>   	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
>
Mike Christie Nov. 11, 2020, 2:16 a.m. UTC | #9
On 10/28/20 12:09 PM, Maurizio Lombardi wrote:
> 
> 
> Dne 27. 10. 20 v 21:03 Michael Christie napsal(a):
>>
>>
>>> On Oct 27, 2020, at 12:54 PM, Mike Christie <michael.christie@oracle.com> wrote:
>>>
>>> On 10/27/20 8:49 AM, Maurizio Lombardi wrote:
>>>> Hello Mike,
>>>>
>>>> Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
>>>>> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
>>>>>
>>>>> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
>>>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
>>>>>
>>>>> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
>>>>>
>>>>> The cmd is now freed.
>>>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
>>>>>
>>>>>
>>>>
>>>> Thanks for the review!
>>>>
>>>> There are definitely some problems with task aborts and commands' refcounting *
>>>> but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
>>>> unless you are saying that abort tasks should never be executed when the connection
>>>> is going down and we have to prevent such cases from happening at all.
>>>
>>> Yeah, I think if we prevent the race then we fix the refcount issue and your issue.
>>> Here is a patch that is only compile tested:
>>>
>>>  From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001
>>> From: Mike Christie <michael.christie@oracle.com>
>>> Date: Tue, 27 Oct 2020 12:30:53 -0500
>>> Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race
>>>
>>> The abort and cmd stop paths can race where:
>>>
>>> 1. thread1 runs iscsit_release_commands_from_conn and sets
>>> CMD_T_FABRIC_STOP.
>>> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It
>>> then returns from the aborted_task callout and we finish
>>> target_handle_abort and do:
>>>
>>> target_handle_abort -> transport_cmd_check_stop_to_fabric ->
>>> lio_check_stop_free -> target_put_sess_cmd
>>>
>>> The cmd is now freed.
>>> 3. thread1 now finishes iscsit_release_commands_from_conn and runs
>>> iscsit_free_cmd while accessing a command we just released.
>>>
>>> In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the
>>> CMD_T_ABORTED if the driver is not cleaning up the cmd because of
>>> a session shutdown. However, iscsit_release_commands_from_conn only
>>> sets the CMD_T_FABRIC_STOP and does not check to see if the abort path
>>> has claimed completion ownership of the command.
>>>
>>> This adds a check in iscsit_release_commands_from_conn so only the
>>> abort or fabric stop path cleanup the command.
>>> ---
>>> drivers/target/iscsi/iscsi_target.c | 13 +++++++++++--
>>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
>>> index f77e5ee..85027d3 100644
>>> --- a/drivers/target/iscsi/iscsi_target.c
>>> +++ b/drivers/target/iscsi/iscsi_target.c
>>> @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>>> void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>>> {
>>> 	spin_lock_bh(&conn->cmd_lock);
>>> -	if (!list_empty(&cmd->i_conn_node) &&
>>> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
>>> +	if (!list_empty(&cmd->i_conn_node))
>>> 		list_del_init(&cmd->i_conn_node);
>>> 	spin_unlock_bh(&conn->cmd_lock);
>>>
>>> @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>>>
>>> 		if (se_cmd->se_tfo != NULL) {
>>> 			spin_lock_irq(&se_cmd->t_state_lock);
>>> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
>>> +				/*
>>> +				 * LIO's abort path owns the cleanup for this,
>>> +				 * so put it back on the list and let
>>> +				 * aborted_task handle it.
>>> +				 */
>>> +				list_add_tail(&cmd->i_conn_node,
>>> +					      &conn->conn_cmd_list);
>>
>>
>> That should have been a move from the tmp list back to the conn_cmd_list.
> 
> Nice, it looks simple and I will test it.
> I am a bit worried there could be other possible race conditions.
> 
> Example:
> 
> thread1: connection is going to be closed,
> iscsit_release_commands_from_conn() finds a command that is about
> to be aborted, re-adds it to conn_cmd_list and proceeds.
> iscsit_close_connection() decreases the conn usage count and finally calls iscsit_free_conn(conn)
> that destroys the conn structure.
> 
> thread2: iscsit_aborted_task() gets called and tries to lock the conn->cmd_lock spinlock, dereferencing
> an invalid pointer.

Hey, I tested this out and I do not think this will happen. We will get 
stuck waiting on the TMF completion for the affected cmd/cmds.

In conn_cmd_list we would have [CMD1 -> ABORT TMF]. Those cmds get moved 
to the tmp list. It might happen where CMD1's CMD_T_ABORTED bit is set, 
and iscsit_release_commands_from_conn will would put it back onto the 
conn_cmd_list. But then it will see the ABORT on the list. We will then 
wait on the ABORT in:

iscsit_release_commands_from_conn -> iscsit_free_cmd -> 
transport_generic_free_cmd.

The abort at this time would be waiting on CMD1 in 
target_put_cmd_and_wait in target_core_tmr.c. Eventually CMD1 completes, 
aborted_task is run and the cmd is released. target_release_cmd_kref 
will then do complete on abort_compl. The abort will then wake from 
target_put_cmd_and_wait and will complete. We will then return from 
transport_generic_free_cmd for the abort.

A LUN RESET should work similarly but we would be waiting on the cmds 
and aborts and then the LUN RESET would complete.
Maurizio Lombardi Nov. 11, 2020, 2:58 p.m. UTC | #10
Dne 11. 11. 20 v 3:16 Mike Christie napsal(a):
> Hey, I tested this out and I do not think this will happen. We will get stuck waiting on the TMF completion for the affected cmd/cmds.
> 
> In conn_cmd_list we would have [CMD1 -> ABORT TMF]. Those cmds get moved to the tmp list. It might happen where CMD1's CMD_T_ABORTED bit is set, and iscsit_release_commands_from_conn will would put it back onto the conn_cmd_list. But then it will see the ABORT on the list. We will then wait on the ABORT in:
> 
> iscsit_release_commands_from_conn -> iscsit_free_cmd -> transport_generic_free_cmd.

Hi Mike,

I'm not sure if I understood this part.

The commands are moved to the tmp_list;
we check for CMD_T_ABORTED and eventually move the commands from tmp_list back to conn_cmd_list
because it's the abort task the one that should do the cleanup.

iscsit_release_commands_from_conn() then scans the tmp_list and calls iscsit_free_cmd()... but not against
those commands with CMD_T_ABORTED flag set because we just moved them back to conn_cmd_list
and aren't linked to tmp_list anymore.

Am I missing something?

Maurizio
Mike Christie Nov. 11, 2020, 3:37 p.m. UTC | #11
> On Nov 11, 2020, at 8:58 AM, Maurizio Lombardi <mlombard@redhat.com> wrote:
> 
> 
> 
> Dne 11. 11. 20 v 3:16 Mike Christie napsal(a):
>> Hey, I tested this out and I do not think this will happen. We will get stuck waiting on the TMF completion for the affected cmd/cmds.
>> 
>> In conn_cmd_list we would have [CMD1 -> ABORT TMF]. Those cmds get moved to the tmp list. It might happen where CMD1's CMD_T_ABORTED bit is set, and iscsit_release_commands_from_conn will would put it back onto the conn_cmd_list. But then it will see the ABORT on the list. We will then wait on the ABORT in:
>> 
>> iscsit_release_commands_from_conn -> iscsit_free_cmd -> transport_generic_free_cmd.
> 
> Hi Mike,
> 
> I'm not sure if I understood this part.
> 
> The commands are moved to the tmp_list;
> we check for CMD_T_ABORTED and eventually move the commands from tmp_list back to conn_cmd_list
> because it's the abort task the one that should do the cleanup.

I’m not sure what you mean here. Are you saying both CMD1’s se_cmd and the ABORT’s se_cmd will have the CMD_T_ABORTED bit set and will both go through the aborted_task callout?


> 
> iscsit_release_commands_from_conn() then scans the tmp_list and calls iscsit_free_cmd()... but not against
> those commands with CMD_T_ABORTED flag set because we just moved them back to conn_cmd_list
> and aren't linked to tmp_list anymore.
> 
> Am I missing something?


If you have a SCSI READ/WRITE se_cmd (CMD1 in my example) and a ABORT se_cmd (ABORT TMF in my example) on the conn_cmd_list, then the ABORT’s se_cmd would not have the CMD_T_ABORTED bit set, right? If so, what sets it?

If the SCSI R/W has the CMD_T_ABORTED bit set, we move it it back to the conn_cmd_list and the abort code path cleans it up. But then we still have the ABORT’s se_cmd on the tmp_list. We will then call 

transport_generic_free_cmd(wait_for_tasks=true) -> __transport_wait_for_tasks(fabric_stop=true)

And wait for the ABORT to complete, and the ABORT does not complete until the last ref on the command it’s aborting completes.

If you have a LUN RESET in the mix like:

[CMD1 -> ABORT TMF -> LUN RESET TMF]

Then CMD1 and the ABORT could have their CMD_T_ABORTED bit set. core_tmr_drain_tmr_list would call __target_check_io_state during the RESET processing. However, in this case, the LUN RESET’s se_cmd would not have the bit set, so we would end up waiting like I described for that to complete. In that case though the RESET waits for the cmds and tmfs it is cleaning up.
Maurizio Lombardi Nov. 11, 2020, 3:48 p.m. UTC | #12
Dne 11. 11. 20 v 16:37 Michael Christie napsal(a):
> If the SCSI R/W has the CMD_T_ABORTED bit set, we move it it back to the conn_cmd_list and the abort code path cleans it up. But then we still have the ABORT’s se_cmd on the tmp_list. We will then call 
> 
> transport_generic_free_cmd(wait_for_tasks=true) -> __transport_wait_for_tasks(fabric_stop=true)
> 
> And wait for the ABORT to complete, and the ABORT does not complete until the last ref on the command it’s aborting completes.


Right. now I understand it.
Thanks.

Maurizio
diff mbox series

Patch

diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 3082f5bde9fa..61233755ece0 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -741,8 +741,6 @@  void __iscsit_free_cmd(struct iscsi_cmd *cmd, bool check_queues)
 {
 	struct iscsi_conn *conn = cmd->conn;
 
-	WARN_ON(!list_empty(&cmd->i_conn_node));
-
 	if (cmd->data_direction == DMA_TO_DEVICE) {
 		iscsit_stop_dataout_timer(cmd);
 		iscsit_free_r2ts_from_list(cmd);