Message ID: 20200710104817.19462-2-bstroesser@ts.fujitsu.com (mailing list archive)
State: Superseded
Series: scsi: target: tcmu: Add TMR notification for tcmu
On 7/10/20 5:48 AM, Bodo Stroesser wrote:
> This patch modifies core_tmr_abort_task() to use same looping
> and locking scheme as core_tmr_drain_state_list() does.
>
> This frees the state_list element in se_cmd for later use
> by tmr notification handling.
>
> Note: __target_check_io_state() now is called with param 0
> instead of dev->dev_attrib.emulate_tas, because tas is not
> relevant since we always get ABRT on same session like the
> aborted command.
>
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> ---
>  drivers/target/target_core_tmr.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 89c84d472cd7..b65d7a0a5df1 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -116,14 +116,15 @@ void core_tmr_abort_task(
>  	struct se_tmr_req *tmr,
>  	struct se_session *se_sess)
>  {
> -	struct se_cmd *se_cmd;
> +	struct se_cmd *se_cmd, *next;
>  	unsigned long flags;
> +	bool rc;
>  	u64 ref_tag;
>
> -	spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> -	list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) {
> +	spin_lock_irqsave(&dev->execute_task_lock, flags);
> +	list_for_each_entry_safe(se_cmd, next, &dev->state_list, state_list) {
>
> -		if (dev != se_cmd->se_dev)
> +		if (se_sess != se_cmd->se_sess)
>  			continue;
>
>  		/* skip task management functions, including tmr->task_cmd */
> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>  			se_cmd->se_tfo->fabric_name, ref_tag);
>
> -		if (!__target_check_io_state(se_cmd, se_sess,
> -					     dev->dev_attrib.emulate_tas))
> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +		if (!rc)
>  			continue;
>
> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +		list_del_init(&se_cmd->state_list);

Do we only want to do this if in the next patch tmr_notify can
successfully accept the tmr?

If tmr_notify can't allocate memory for the abort case, we would delete
it from the list. Later the initiator would send a LUN_RESET but
core_tmr_drain_state_list won't see it and pass it to the tmr_notify call.

> +		se_cmd->state_active = false;
> +
> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>
>  		/*
>  		 * Ensure that this ABORT request is visible to the LU RESET
> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>  			atomic_long_inc(&dev->aborts_complete);
>  			return;
>  		}
> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>
>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>  		tmr->ref_task_tag);
On 2020-07-12 02:49, Mike Christie wrote:
> On 7/10/20 5:48 AM, Bodo Stroesser wrote:

...

>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>> -		if (!__target_check_io_state(se_cmd, se_sess,
>> -					     dev->dev_attrib.emulate_tas))
>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +		if (!rc)
>>  			continue;
>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +		list_del_init(&se_cmd->state_list);
>
> Do we only want to do this if in the next patch tmr_notify can
> successfully accept the tmr?

This change does not modify behavior of the core. Please see how
core_tmr_drain_state_list() uses state_list to queue commands after
successfully calling __target_check_io_state() for a later call of
target_put_cmd_and_wait().

This patch just unifies ABORT_TASK and LUN_RESET handling such that
in the next patch we can call the tmr_notify handler with a list of
aborted commands in both cases. The list uses the state_list list_head
in the se_cmd.

> If tmr_notify can't allocate memory for the abort case, we would delete
> it from the list. Later the initiator would send a LUN_RESET but
> core_tmr_drain_state_list won't see it and pass it to the tmr_notify call.

The tmr_notify handler will not and must not modify the list of aborted
cmds. So if the backend just does nothing for whatever reason, we will
end up with the "old" behavior of the core as it was before the change.

The general idea is to send info about received TMRs together with the
list of aborted commands to the backend, to allow TMR tracing and
canceling of aborted commands. But the backend must not interfere with
TMR processing in core.

>> +		se_cmd->state_active = false;
>> +
>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>
>>  		/*
>>  		 * Ensure that this ABORT request is visible to the LU RESET
>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>  			atomic_long_inc(&dev->aborts_complete);
>>  			return;
>>  		}
>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>
>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>>  		tmr->ref_task_tag);
On 7/13/20 6:39 AM, Bodo Stroesser wrote:
> On 2020-07-12 02:49, Mike Christie wrote:
>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>
> ...
>
>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>> -					     dev->dev_attrib.emulate_tas))
>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +		if (!rc)
>>>  			continue;
>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +		list_del_init(&se_cmd->state_list);
>>
>> Do we only want to do this if in the next patch tmr_notify can
>> successfully accept the tmr?
>
> This change does not modify behavior of the core. Please see how
> core_tmr_drain_state_list() uses state_list to queue commands after
> successfully calling __target_check_io_state() for later call of
> target_put_cmd_and_wait().
>
> This patch just unifies ABORT_TASK and LUN_RESET handling such,
> that in next patch we can call tmr_notify handler with a list of
> aborted commands in both cases. The list uses the state_list list_head
> in the se_cmd.
>
>> If tmr_notify can't allocate memory for the abort case, we would
>> delete it from the list. Later the initiator would send a LUN_RESET
>> but core_tmr_drain_state_list won't see it and pass it to the
>> tmr_notify call.
>
> tmr_notify handler will not and must not modify the list of aborted
> cmds. So if backend just does nothing for whatever reason we will end
> up with the "old" behavior of the core as it was before the change.
>
> The general idea is to send info about received TMRs together with the
> list of aborted commands to the backend, to allow TMR tracing and
> canceling of aborted commands. But backend must not interfere with TMR
> processing in core.

I don't think we are talking about the same thing.

For the abort case, let's say tmr_notify fails. So for the tcmu case,
the tmr does not get queued and userspace never knows about it. The
initiator then sends a LUN reset. For the reset, the tmr_notify call's
list will be missing the command that got the abort, right? How will
the backend know to clean up that command during the LUN reset handling?

>>> +		se_cmd->state_active = false;
>>> +
>>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>
>>>  		/*
>>>  		 * Ensure that this ABORT request is visible to the LU RESET
>>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>>  			atomic_long_inc(&dev->aborts_complete);
>>>  			return;
>>>  		}
>>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>
>>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>>>  		tmr->ref_task_tag);
On 2020-07-14 19:43, Mike Christie wrote:
> On 7/13/20 6:39 AM, Bodo Stroesser wrote:
>> On 2020-07-12 02:49, Mike Christie wrote:
>>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>>
>> ...
>>
>>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>>> -					     dev->dev_attrib.emulate_tas))
>>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +		if (!rc)
>>>>  			continue;
>>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +		list_del_init(&se_cmd->state_list);
>>>
>>> Do we only want to do this if in the next patch tmr_notify can
>>> successfully accept the tmr?
>>
>> This change does not modify behavior of the core. Please see how
>> core_tmr_drain_state_list() uses state_list to queue commands after
>> successfully calling __target_check_io_state() for later call of
>> target_put_cmd_and_wait().
>>
>> This patch just unifies ABORT_TASK and LUN_RESET handling such,
>> that in next patch we can call tmr_notify handler with a list of
>> aborted commands in both cases. The list uses the state_list list_head
>> in the se_cmd.
>>
>>> If tmr_notify can't allocate memory for the abort case, we would
>>> delete it from the list. Later the initiator would send a LUN_RESET
>>> but core_tmr_drain_state_list won't see it and pass it to the
>>> tmr_notify call.
>>
>> tmr_notify handler will not and must not modify the list of aborted
>> cmds. So if backend just does nothing for whatever reason we will end
>> up with the "old" behavior of the core as it was before the change.
>>
>> The general idea is to send info about received TMRs together with the
>> list of aborted commands to the backend, to allow TMR tracing and
>> canceling of aborted commands. But backend must not interfere with TMR
>> processing in core.
>
> I don't think we are talking about the same thing.

Yeah, probably I misunderstood.

> For the abort case, let's say tmr_notify fails. So for the tcmu case,
> the tmr does not get queued and userspace never knows about it.

True.

For the tcmu case tmr_notify can fail only if kmalloc(..., GFP_KERNEL)
fails.

> The initiator then sends a LUN reset. For the reset, the tmr_notify
> call's list will be missing the command that got the abort, right?

Yes.

In core, the LUN RESET will wait in core_tmr_drain_tmr_list for the
previous ABORT_TASK to complete, while the ABORT_TASK itself waits for
the aborted command to complete in core_tmr_abort_task.

So, when LUN_RESET handling comes to calling core_tmr_drain_state_list,
the aborted command will definitely already be done.

> How will the backend know to clean up that command during the LUN
> reset handling?

It will not know, so core just has to wait until the backend completes
the aborted cmd - maybe after a long time - just like it always did
before my change.

I don't see a clean way for error handling if tmr notification fails in
the backend. Do you have an idea?

Regarding tcmu I think if we run into OOM, then losing tmr notifications
might be one of the smallest problems.

>>>> +		se_cmd->state_active = false;
>>>> +
>>>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>>
>>>>  		/*
>>>>  		 * Ensure that this ABORT request is visible to the LU RESET
>>>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>>>  			atomic_long_inc(&dev->aborts_complete);
>>>>  			return;
>>>>  		}
>>>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>>
>>>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for
>>>> ref_tag: %lld\n",
>>>>  		tmr->ref_task_tag);
On 7/15/20 9:25 AM, Bodo Stroesser wrote:
> On 2020-07-14 19:43, Mike Christie wrote:
>> On 7/13/20 6:39 AM, Bodo Stroesser wrote:
>>> On 2020-07-12 02:49, Mike Christie wrote:
>>>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>>>
>>> ...
>>>
>>>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>>>> -					     dev->dev_attrib.emulate_tas))
>>>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>>> +		if (!rc)
>>>>>  			continue;
>>>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>>> +		list_del_init(&se_cmd->state_list);
>>>>
>>>> Do we only want to do this if in the next patch tmr_notify can
>>>> successfully accept the tmr?
>>>
>>> This change does not modify behavior of the core. Please see how
>>> core_tmr_drain_state_list() uses state_list to queue commands after
>>> successfully calling __target_check_io_state() for later call of
>>> target_put_cmd_and_wait().
>>>
>>> This patch just unifies ABORT_TASK and LUN_RESET handling such,
>>> that in next patch we can call tmr_notify handler with a list of
>>> aborted commands in both cases. The list uses the state_list list_head
>>> in the se_cmd.
>>>
>>>> If tmr_notify can't allocate memory for the abort case, we would
>>>> delete it from the list. Later the initiator would send a LUN_RESET
>>>> but core_tmr_drain_state_list won't see it and pass it to the
>>>> tmr_notify call.
>>>
>>> tmr_notify handler will not and must not modify the list of aborted
>>> cmds. So if backend just does nothing for whatever reason we will end
>>> up with the "old" behavior of the core as it was before the change.
>>>
>>> The general idea is to send info about received TMRs together with the
>>> list of aborted commands to the backend, to allow TMR tracing and
>>> canceling of aborted commands. But backend must not interfere with TMR
>>> processing in core.
>>
>> I don't think we are talking about the same thing.
>
> Yeah, probably I misunderstood.
>
>> For the abort case, let's say tmr_notify fails. So for the tcmu case,
>> the tmr does not get queued and userspace never knows about it.
>
> True.
>
> For the tcmu case tmr_notify can fail only if kmalloc(..., GFP_KERNEL)
> fails.
>
>> The initiator then sends a LUN reset. For the reset, the tmr_notify
>> call's list will be missing the command that got the abort, right?
>
> Yes.
>
> In core, the LUN RESET will wait in core_tmr_drain_tmr_list for the
> previous ABORT_TASK to complete, while the ABORT_TASK itself waits for
> the aborted command to complete in core_tmr_abort_task.
>
> So, when LUN_RESET handling comes to calling core_tmr_drain_state_list,
> the aborted command will definitely already be done.
>
>> How will the backend know to clean up that command during the LUN
>> reset handling?
>
> It will not know, so core just has to wait until backend completes the
> aborted cmd - maybe after long time - just like it always did before my
> change.

Ah I see. When I made the original comment and was asking about if the
callback was supposed to perform some of the TMF operations, I was
thinking we were going to change up a lot of this code.

I think your idea where it's more of a way to notify the modules and we
are not changing any of this behavior is better since no one wants to
get into that :)

Seems ok to me.
diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
index 89c84d472cd7..b65d7a0a5df1 100644
--- a/drivers/target/target_core_tmr.c
+++ b/drivers/target/target_core_tmr.c
@@ -116,14 +116,15 @@ void core_tmr_abort_task(
 	struct se_tmr_req *tmr,
 	struct se_session *se_sess)
 {
-	struct se_cmd *se_cmd;
+	struct se_cmd *se_cmd, *next;
 	unsigned long flags;
+	bool rc;
 	u64 ref_tag;
 
-	spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
-	list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) {
+	spin_lock_irqsave(&dev->execute_task_lock, flags);
+	list_for_each_entry_safe(se_cmd, next, &dev->state_list, state_list) {
 
-		if (dev != se_cmd->se_dev)
+		if (se_sess != se_cmd->se_sess)
 			continue;
 
 		/* skip task management functions, including tmr->task_cmd */
@@ -137,11 +138,16 @@ void core_tmr_abort_task(
 		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
 			se_cmd->se_tfo->fabric_name, ref_tag);
 
-		if (!__target_check_io_state(se_cmd, se_sess,
-					     dev->dev_attrib.emulate_tas))
+		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
+		rc = __target_check_io_state(se_cmd, se_sess, 0);
+		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+		if (!rc)
 			continue;
 
-		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+		list_del_init(&se_cmd->state_list);
+		se_cmd->state_active = false;
+
+		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
 
 		/*
 		 * Ensure that this ABORT request is visible to the LU RESET
@@ -159,7 +165,7 @@ void core_tmr_abort_task(
 			atomic_long_inc(&dev->aborts_complete);
 			return;
 		}
-	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
 
 	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
 		tmr->ref_task_tag);
This patch modifies core_tmr_abort_task() to use the same looping
and locking scheme as core_tmr_drain_state_list() does.

This frees the state_list element in se_cmd for later use
by tmr notification handling.

Note: __target_check_io_state() is now called with param 0
instead of dev->dev_attrib.emulate_tas, because TAS is not
relevant since we always get the ABORT on the same session as the
aborted command.

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
---
 drivers/target/target_core_tmr.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)