Message ID: 20200710104817.19462-2-bstroesser@ts.fujitsu.com (mailing list archive)
State: Superseded
Series: scsi: target: tcmu: Add TMR notification for tcmu
On 7/10/20 5:48 AM, Bodo Stroesser wrote:
> This patch modifies core_tmr_abort_task() to use same looping
> and locking scheme as core_tmr_drain_state_list() does.
>
> This frees the state_list element in se_cmd for later use
> by tmr notification handling.
>
> Note: __target_check_io_state() now is called with param 0
> instead of dev->dev_attrib.emulate_tas, because tas is not
> relevant since we always get ABRT on same session like the
> aborted command.
>
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> ---
>  drivers/target/target_core_tmr.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 89c84d472cd7..b65d7a0a5df1 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -116,14 +116,15 @@ void core_tmr_abort_task(
>  	struct se_tmr_req *tmr,
>  	struct se_session *se_sess)
>  {
> -	struct se_cmd *se_cmd;
> +	struct se_cmd *se_cmd, *next;
>  	unsigned long flags;
> +	bool rc;
>  	u64 ref_tag;
>
> -	spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> -	list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) {
> +	spin_lock_irqsave(&dev->execute_task_lock, flags);
> +	list_for_each_entry_safe(se_cmd, next, &dev->state_list, state_list) {
>
> -		if (dev != se_cmd->se_dev)
> +		if (se_sess != se_cmd->se_sess)
>  			continue;
>
>  		/* skip task management functions, including tmr->task_cmd */
> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>  			se_cmd->se_tfo->fabric_name, ref_tag);
>
> -		if (!__target_check_io_state(se_cmd, se_sess,
> -					     dev->dev_attrib.emulate_tas))
> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +		if (!rc)
>  			continue;
>
> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +		list_del_init(&se_cmd->state_list);

Do we only want to do this if in the next patch tmr_notify can
successfully accept the tmr?

If tmr_notify can't allocate memory for the abort case, we would delete
it from the list. Later the initiator would send a LUN_RESET but
core_tmr_drain_state_list won't see it and pass it to the tmr_notify call.

> +		se_cmd->state_active = false;
> +
> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>
>  		/*
>  		 * Ensure that this ABORT request is visible to the LU RESET
> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>  			atomic_long_inc(&dev->aborts_complete);
>  			return;
>  		}
> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>
>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>  		tmr->ref_task_tag);
On 2020-07-12 02:49, Mike Christie wrote:
> On 7/10/20 5:48 AM, Bodo Stroesser wrote:

...

>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>> -		if (!__target_check_io_state(se_cmd, se_sess,
>> -					     dev->dev_attrib.emulate_tas))
>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +		if (!rc)
>>  			continue;
>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +		list_del_init(&se_cmd->state_list);
>
> Do we only want to do this if in the next patch tmr_notify can
> successfully accept the tmr?

This change does not modify behavior of the core. Please see how
core_tmr_drain_state_list() uses state_list to queue commands after
successfully calling __target_check_io_state() for a later call of
target_put_cmd_and_wait().

This patch just unifies ABORT_TASK and LUN_RESET handling such that
in the next patch we can call the tmr_notify handler with a list of
aborted commands in both cases. The list uses the state_list list_head
in the se_cmd.

> If tmr_notify can't allocate memory for the abort case, we would delete
> it from the list. Later the initiator would send a LUN_RESET but
> core_tmr_drain_state_list won't see it and pass it to the tmr_notify call.

The tmr_notify handler will not and must not modify the list of aborted
cmds. So if the backend just does nothing for whatever reason, we will
end up with the "old" behavior of the core as it was before the change.

The general idea is to send info about received TMRs together with the
list of aborted commands to the backend, to allow TMR tracing and
canceling of aborted commands. But the backend must not interfere with
TMR processing in core.

>> +		se_cmd->state_active = false;
>> +
>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>
>>  		/*
>>  		 * Ensure that this ABORT request is visible to the LU RESET
>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>  			atomic_long_inc(&dev->aborts_complete);
>>  			return;
>>  		}
>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>
>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>>  		tmr->ref_task_tag);
On 7/13/20 6:39 AM, Bodo Stroesser wrote:
> On 2020-07-12 02:49, Mike Christie wrote:
>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>
> ...
>
>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>> -					     dev->dev_attrib.emulate_tas))
>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +		if (!rc)
>>>  			continue;
>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +		list_del_init(&se_cmd->state_list);
>>
>> Do we only want to do this if in the next patch tmr_notify can
>> successfully accept the tmr?
>
> This change does not modify behavior of the core. Please see how
> core_tmr_drain_state_list() uses state_list to queue commands after
> successfully calling __target_check_io_state() for later call of
> target_put_cmd_and_wait().
>
> This patch just unifies ABORT_TASK and LUN_RESET handling such,
> that in next patch we can call tmr_notify handler with a list of
> aborted commands in both cases. The list uses the state_list list_head
> in the se_cmd.
>
>> If tmr_notify can't allocate memory for the abort case, we would
>> delete it from the list. Later the initiator would send a LUN_RESET
>> but core_tmr_drain_state_list won't see it and pass it to the
>> tmr_notify call.
>
> tmr_notify handler will not and must not modify the list of aborted
> cmds. So if backend just does nothing for whatever reason we will end
> up with the "old" behavior of the core as it was before the change.
>
> The general idea is to send info about received TMRs together with the
> list of aborted commands to the backend, to allow TMR tracing and
> canceling of aborted commands. But backend must not interfere with TMR
> processing in core.

I don't think we are talking about the same thing.

For the abort case, let's say tmr_notify fails. So for the tcmu case,
the tmr does not get queued and userspace never knows about it. The
initiator then sends a LUN reset. For the reset, the tmr_notify call's
list will be missing the command that got the abort, right? How will
the backend know to clean up that command during the LUN reset handling?

>>> +		se_cmd->state_active = false;
>>> +
>>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>
>>>  		/*
>>>  		 * Ensure that this ABORT request is visible to the LU RESET
>>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>>  			atomic_long_inc(&dev->aborts_complete);
>>>  			return;
>>>  		}
>>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>
>>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
>>>  		tmr->ref_task_tag);
On 2020-07-14 19:43, Mike Christie wrote:
> On 7/13/20 6:39 AM, Bodo Stroesser wrote:
>> On 2020-07-12 02:49, Mike Christie wrote:
>>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>>
>> ...
>>
>>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>>> -					     dev->dev_attrib.emulate_tas))
>>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +		if (!rc)
>>>>  			continue;
>>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +		list_del_init(&se_cmd->state_list);
>>>
>>> Do we only want to do this if in the next patch tmr_notify can
>>> successfully accept the tmr?
>>
>> This change does not modify behavior of the core. Please see how
>> core_tmr_drain_state_list() uses state_list to queue commands after
>> successfully calling __target_check_io_state() for later call of
>> target_put_cmd_and_wait().
>>
>> This patch just unifies ABORT_TASK and LUN_RESET handling such,
>> that in next patch we can call tmr_notify handler with a list of
>> aborted commands in both cases. The list uses the state_list list_head
>> in the se_cmd.
>>
>>> If tmr_notify can't allocate memory for the abort case, we would
>>> delete it from the list. Later the initiator would send a LUN_RESET
>>> but core_tmr_drain_state_list won't see it and pass it to the
>>> tmr_notify call.
>>
>> tmr_notify handler will not and must not modify the list of aborted
>> cmds. So if backend just does nothing for whatever reason we will end
>> up with the "old" behavior of the core as it was before the change.
>>
>> The general idea is to send info about received TMRs together with the
>> list of aborted commands to the backend, to allow TMR tracing and
>> canceling of aborted commands. But backend must not interfere with TMR
>> processing in core.
>
> I don't think we are talking about the same thing.

Yeah, probably I misunderstood.

> For the abort case, let's say tmr_notify fails. So for the tcmu case,
> the tmr does not get queued and userspace never knows about it.

True.

For the tcmu case tmr_notify can fail only if kmalloc(..., GFP_KERNEL)
fails.

> The initiator then sends a LUN reset. For the reset, the tmr_notify
> call's list will be missing the command that got the abort, right?

Yes.

In core, the LUN RESET will wait in core_tmr_drain_tmr_list for the
previous ABORT_TASK to complete, while the ABORT_TASK itself waits for
the aborted command to complete in core_tmr_abort_task.

So, when LUN_RESET handling comes to calling core_tmr_drain_state_list,
the aborted command will definitely already be done.

> How will the backend know to clean up that command during the LUN
> reset handling?

It will not know, so core just has to wait until the backend completes
the aborted cmd - maybe after a long time - just like it always did
before my change.

I don't see a clean way for error handling if tmr notification fails in
the backend. Do you have an idea?

Regarding tcmu I think if we run into OOM, then losing tmr notifications
might be one of the smallest problems.

>>>> +		se_cmd->state_active = false;
>>>> +
>>>> +		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>>
>>>>  		/*
>>>>  		 * Ensure that this ABORT request is visible to the LU RESET
>>>> @@ -159,7 +165,7 @@ void core_tmr_abort_task(
>>>>  			atomic_long_inc(&dev->aborts_complete);
>>>>  			return;
>>>>  		}
>>>> -	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>> +	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>>>>
>>>>  	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for
>>>> ref_tag: %lld\n",
>>>>  		tmr->ref_task_tag);
On 7/15/20 9:25 AM, Bodo Stroesser wrote:
> On 2020-07-14 19:43, Mike Christie wrote:
>> On 7/13/20 6:39 AM, Bodo Stroesser wrote:
>>> On 2020-07-12 02:49, Mike Christie wrote:
>>>> On 7/10/20 5:48 AM, Bodo Stroesser wrote:
>>>
>>> ...
>>>
>>>>> @@ -137,11 +138,16 @@ void core_tmr_abort_task(
>>>>>  		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
>>>>>  			se_cmd->se_tfo->fabric_name, ref_tag);
>>>>> -		if (!__target_check_io_state(se_cmd, se_sess,
>>>>> -					     dev->dev_attrib.emulate_tas))
>>>>> +		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
>>>>> +		rc = __target_check_io_state(se_cmd, se_sess, 0);
>>>>> +		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>>> +		if (!rc)
>>>>>  			continue;
>>>>> -		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
>>>>> +		list_del_init(&se_cmd->state_list);
>>>>
>>>> Do we only want to do this if in the next patch tmr_notify can
>>>> successfully accept the tmr?
>>>
>>> This change does not modify behavior of the core. Please see how
>>> core_tmr_drain_state_list() uses state_list to queue commands after
>>> successfully calling __target_check_io_state() for later call of
>>> target_put_cmd_and_wait().
>>>
>>> This patch just unifies ABORT_TASK and LUN_RESET handling such,
>>> that in next patch we can call tmr_notify handler with a list of
>>> aborted commands in both cases. The list uses the state_list list_head
>>> in the se_cmd.
>>>
>>>> If tmr_notify can't allocate memory for the abort case, we would
>>>> delete it from the list. Later the initiator would send a LUN_RESET
>>>> but core_tmr_drain_state_list won't see it and pass it to the
>>>> tmr_notify call.
>>>
>>> tmr_notify handler will not and must not modify the list of aborted
>>> cmds. So if backend just does nothing for whatever reason we will end
>>> up with the "old" behavior of the core as it was before the change.
>>>
>>> The general idea is to send info about received TMRs together with the
>>> list of aborted commands to the backend, to allow TMR tracing and
>>> canceling of aborted commands. But backend must not interfere with TMR
>>> processing in core.
>>
>> I don't think we are talking about the same thing.
>
> Yeah, probably I misunderstood.
>
>> For the abort case, let's say tmr_notify fails. So for the tcmu case,
>> the tmr does not get queued and userspace never knows about it.
>
> True.
>
> For the tcmu case tmr_notify can fail only if kmalloc(..., GFP_KERNEL)
> fails.
>
>> The initiator then sends a LUN reset. For the reset, the tmr_notify
>> call's list will be missing the command that got the abort, right?
>
> Yes.
>
> In core, the LUN RESET will wait in core_tmr_drain_tmr_list for the
> previous ABORT_TASK to complete, while the ABORT_TASK itself waits for
> the aborted command to complete in core_tmr_abort_task.
>
> So, when LUN_RESET handling comes to calling core_tmr_drain_state_list,
> the aborted command will definitely already be done.
>
>> How will the backend know to clean up that command during the LUN
>> reset handling?
>
> It will not know, so core just has to wait until backend completes the
> aborted cmd - maybe after long time - just like it always did before my
> change.

Ah I see. When I made the original comment and was asking about if the
callback was supposed to perform some of the TMF operations, I was
thinking we were going to change up a lot of this code.

I think your idea where it's more of a way to notify the modules and we
are not changing any of this behavior is better since no one wants to
get into that :)

Seems ok to me.
diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
index 89c84d472cd7..b65d7a0a5df1 100644
--- a/drivers/target/target_core_tmr.c
+++ b/drivers/target/target_core_tmr.c
@@ -116,14 +116,15 @@ void core_tmr_abort_task(
 	struct se_tmr_req *tmr,
 	struct se_session *se_sess)
 {
-	struct se_cmd *se_cmd;
+	struct se_cmd *se_cmd, *next;
 	unsigned long flags;
+	bool rc;
 	u64 ref_tag;
 
-	spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
-	list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) {
+	spin_lock_irqsave(&dev->execute_task_lock, flags);
+	list_for_each_entry_safe(se_cmd, next, &dev->state_list, state_list) {
 
-		if (dev != se_cmd->se_dev)
+		if (se_sess != se_cmd->se_sess)
 			continue;
 
 		/* skip task management functions, including tmr->task_cmd */
@@ -137,11 +138,16 @@ void core_tmr_abort_task(
 		printk("ABORT_TASK: Found referenced %s task_tag: %llu\n",
 			se_cmd->se_tfo->fabric_name, ref_tag);
 
-		if (!__target_check_io_state(se_cmd, se_sess,
-					     dev->dev_attrib.emulate_tas))
+		spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
+		rc = __target_check_io_state(se_cmd, se_sess, 0);
+		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+		if (!rc)
 			continue;
 
-		spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+		list_del_init(&se_cmd->state_list);
+		se_cmd->state_active = false;
+
+		spin_unlock_irqrestore(&dev->execute_task_lock, flags);
 
 		/*
 		 * Ensure that this ABORT request is visible to the LU RESET
@@ -159,7 +165,7 @@ void core_tmr_abort_task(
 			atomic_long_inc(&dev->aborts_complete);
 			return;
 		}
-	spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
+	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
 
 	printk("ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: %lld\n",
 		tmr->ref_task_tag);
This patch modifies core_tmr_abort_task() to use the same looping
and locking scheme as core_tmr_drain_state_list() does.

This frees the state_list element in se_cmd for later use
by tmr notification handling.

Note: __target_check_io_state() is now called with param 0
instead of dev->dev_attrib.emulate_tas, because TAS is not
relevant since we always get the ABORT on the same session as the
aborted command.

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
---
 drivers/target/target_core_tmr.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)