
[13/13] target: flush submission work during TMR processing

Message ID 20210209123845.4856-14-michael.christie@oracle.com (mailing list archive)
State New, archived
Series target: fix cmd plugging and completion

Commit Message

Mike Christie Feb. 9, 2021, 12:38 p.m. UTC
If a cmd is on the submission workqueue then the TMR code will
miss it, and end up returning "task not found" for aborts, or
success for LUN resets. The fabric driver might then tell the
initiator that the outstanding cmds have been handled when in fact
they are about to run.

This adds a cancel of the submission work while we are processing
TMRs.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_tmr.c | 4 ++++
 1 file changed, 4 insertions(+)
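
For context, the race looks roughly like this. The sketch below is a
simplified illustration, not the actual code: the helper name
queue_cmd_sketch() is made up, and the sq/cmd_list/state_list field
names are assumed from the earlier patches in this series.

/*
 * Submission path: the cmd is parked on a per-queue llist and the
 * submission work is queued; the cmd only reaches the state_list
 * once that work actually runs.
 */
static void queue_cmd_sketch(struct se_device *dev, struct se_cmd *se_cmd,
			     int cpu)
{
	struct se_cmd_queue *sq = &dev->queues[cpu].sq;

	llist_add(&se_cmd->se_cmd_list, &sq->cmd_list);
	queue_work(target_submission_wq, &sq->work);
	/* se_cmd is NOT yet on dev->queues[cpu].state_list here */
}

/*
 * TMR path: core_tmr_abort_task() and core_tmr_drain_state_list()
 * only walk dev->queues[i].state_list, so a cmd still sitting on
 * sq->cmd_list is invisible to them until the work above runs.
 */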

Comments

Laurence Oberman Feb. 9, 2021, 2:31 p.m. UTC | #1
On Tue, 2021-02-09 at 06:38 -0600, Mike Christie wrote:
> If a cmd is on the submission workqueue then the TMR code will
> miss it, and end up returning task not found or success for
> lun resets. The fabric driver might then tell the initiator that
> the running cmds have been handled when they are about to run.
> 
> This adds a cancel when we are processing TMRs.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/target/target_core_tmr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/target/target_core_tmr.c
> b/drivers/target/target_core_tmr.c
> index 7347285471fa..9b7f159f9341 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>  	int i;
>  
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>  					 state_list) {
> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>  	 * in the Control Mode Page.
>  	 */
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>  					 state_list) {

Hello Mike
Thanks for these
This one in particular is the one that I think will help our case. I
will pull all of these and test later this week as a bundle.

Many Thanks
Laurence Oberman
Bodo Stroesser Feb. 9, 2021, 5:05 p.m. UTC | #2
On 09.02.21 13:38, Mike Christie wrote:
> If a cmd is on the submission workqueue then the TMR code will
> miss it, and end up returning task not found or success for
> lun resets. The fabric driver might then tell the initiator that
> the running cmds have been handled when they are about to run.
> 
> This adds a cancel when we are processing TMRs.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/target/target_core_tmr.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 7347285471fa..9b7f159f9341 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>   	int i;
>   
>   	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>   		spin_lock_irqsave(&dev->queues[i].lock, flags);
>   		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>   					 state_list) {
> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>   	 * in the Control Mode Page.
>   	 */
>   	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>   		spin_lock_irqsave(&dev->queues[i].lock, flags);
>   		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>   					 state_list) {
> 

Is there a possible race?

I mean, if cancel_work_sync() is called for a work that was queued but not started,
can we end up with cmds sleeping in a queue until a further cmd is queued to the same queue?
Mike Christie Feb. 9, 2021, 6:49 p.m. UTC | #3
On 2/9/21 11:05 AM, Bodo Stroesser wrote:
> On 09.02.21 13:38, Mike Christie wrote:
>> If a cmd is on the submission workqueue then the TMR code will
>> miss it, and end up returning task not found or success for
>> lun resets. The fabric driver might then tell the initiator that
>> the running cmds have been handled when they are about to run.
>>
>> This adds a cancel when we are processing TMRs.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>   drivers/target/target_core_tmr.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
>> index 7347285471fa..9b7f159f9341 100644
>> --- a/drivers/target/target_core_tmr.c
>> +++ b/drivers/target/target_core_tmr.c
>> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>>   	int i;
>>   
>>   	for (i = 0; i < dev->queue_cnt; i++) {
>> +		cancel_work_sync(&dev->queues[i].sq.work);
>> +
>>   		spin_lock_irqsave(&dev->queues[i].lock, flags);
>>   		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>>   					 state_list) {
>> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>>   	 * in the Control Mode Page.
>>   	 */
>>   	for (i = 0; i < dev->queue_cnt; i++) {
>> +		cancel_work_sync(&dev->queues[i].sq.work);
>> +
>>   		spin_lock_irqsave(&dev->queues[i].lock, flags);
>>   		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>>   					 state_list) {
>>
> 
> Is there a possible race?
> 
> I mean, if cancel_work_sync() is called for a work that was queued but not started,
> can we end up with cmds sleeping in a queue until a further cmd is queued to the same queue?

Oh yeah, you are right: cancel is wrong. It should be a flush.
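
For reference, the workqueue semantics behind that reply, as a minimal
sketch (the loop mirrors the hunks above with flush_work() swapped in;
this is not the final submitted version):

/*
 * cancel_work_sync(): if the work is queued but not yet running, it
 * is dequeued and never runs, so cmds already on sq.cmd_list would
 * sit there until a later cmd re-queues the work -- exactly the race
 * raised above.
 *
 * flush_work(): waits for the pending/running work to finish, so
 * every cmd queued before the TMR started has reached the state_list
 * before we walk it.
 */
for (i = 0; i < dev->queue_cnt; i++)
	flush_work(&dev->queues[i].sq.work);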
Laurence Oberman Feb. 10, 2021, 2:25 p.m. UTC | #4
On Tue, 2021-02-09 at 09:31 -0500, Laurence Oberman wrote:
> On Tue, 2021-02-09 at 06:38 -0600, Mike Christie wrote:
> > If a cmd is on the submission workqueue then the TMR code will
> > miss it, and end up returning task not found or success for
> > lun resets. The fabric driver might then tell the initiator that
> > the running cmds have been handled when they are about to run.
> > 
> > This adds a cancel when we are processing TMRs.
> > 
> > Signed-off-by: Mike Christie <michael.christie@oracle.com>
> > ---
> >  drivers/target/target_core_tmr.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/target/target_core_tmr.c
> > b/drivers/target/target_core_tmr.c
> > index 7347285471fa..9b7f159f9341 100644
> > --- a/drivers/target/target_core_tmr.c
> > +++ b/drivers/target/target_core_tmr.c
> > @@ -124,6 +124,8 @@ void core_tmr_abort_task(
> >  	int i;
> >  
> >  	for (i = 0; i < dev->queue_cnt; i++) {
> > +		cancel_work_sync(&dev->queues[i].sq.work);
> > +
> >  		spin_lock_irqsave(&dev->queues[i].lock, flags);
> >  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
> >  					 state_list) {
> > @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
> >  	 * in the Control Mode Page.
> >  	 */
> >  	for (i = 0; i < dev->queue_cnt; i++) {
> > +		cancel_work_sync(&dev->queues[i].sq.work);
> > +
> >  		spin_lock_irqsave(&dev->queues[i].lock, flags);
> >  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
> >  					 state_list) {
> 
> Hello Mike
> Thanks for these
> This one in particular is the one that I think will help our case. I
> will pull all of these and test later this week as a bundle.
> 
> Many Thanks
> Laurence Oberman
> 

I pulled v2 and built a test kernel to start testing, but I see
Christoph has also suggested changes to your v3 submission.

I will wait until we are finalized and then do a full test.

Thanks
Laurence

Patch

diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
index 7347285471fa..9b7f159f9341 100644
--- a/drivers/target/target_core_tmr.c
+++ b/drivers/target/target_core_tmr.c
@@ -124,6 +124,8 @@ void core_tmr_abort_task(
 	int i;
 
 	for (i = 0; i < dev->queue_cnt; i++) {
+		cancel_work_sync(&dev->queues[i].sq.work);
+
 		spin_lock_irqsave(&dev->queues[i].lock, flags);
 		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
 					 state_list) {
@@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
 	 * in the Control Mode Page.
 	 */
 	for (i = 0; i < dev->queue_cnt; i++) {
+		cancel_work_sync(&dev->queues[i].sq.work);
+
 		spin_lock_irqsave(&dev->queues[i].lock, flags);
 		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
 					 state_list) {
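
For readers looking at this hunk in isolation: the queues[i].sq.work
field referenced above comes from earlier patches in this series,
which add per-device submission queues roughly along these lines (a
sketch from the rest of the series, not definitions taken from this
patch):

struct se_cmd_queue {
	struct llist_head	cmd_list;
	struct work_struct	work;
};

struct se_device_queue {
	struct list_head	state_list;
	spinlock_t		lock;
	struct se_cmd_queue	sq;
};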