Message ID: 20210209123845.4856-14-michael.christie@oracle.com (mailing list archive)
State: New, archived
Series: target: fix cmd plugging and completion
On Tue, 2021-02-09 at 06:38 -0600, Mike Christie wrote:
> If a cmd is on the submission workqueue then the TMR code will
> miss it, and end up returning task not found or success for
> lun resets. The fabric driver might then tell the initiator that
> the running cmds have been handled when they are about to run.
> 
> This adds a cancel when we are processing TMRs.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/target/target_core_tmr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 7347285471fa..9b7f159f9341 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>  	int i;
> 
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>  					 state_list) {
> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>  	 * in the Control Mode Page.
>  	 */
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>  					 state_list) {

Hello Mike
Thanks for these
This one in particular is the one that I think will help our case. I
will pull all of these and test later this week as a bundle.

Many Thanks
Laurence Oberman
On 09.02.21 13:38, Mike Christie wrote:
> If a cmd is on the submission workqueue then the TMR code will
> miss it, and end up returning task not found or success for
> lun resets. The fabric driver might then tell the initiator that
> the running cmds have been handled when they are about to run.
> 
> This adds a cancel when we are processing TMRs.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/target/target_core_tmr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 7347285471fa..9b7f159f9341 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>  	int i;
> 
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>  					 state_list) {
> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>  	 * in the Control Mode Page.
>  	 */
>  	for (i = 0; i < dev->queue_cnt; i++) {
> +		cancel_work_sync(&dev->queues[i].sq.work);
> +
>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>  					 state_list) {

Is there a possible race?

I mean, if cancel_work_sync() is called for a work that was queued but
not started, can we end up with cmds sleeping in a queue until a
further cmd is queued to the same queue?
On 2/9/21 11:05 AM, Bodo Stroesser wrote:
> On 09.02.21 13:38, Mike Christie wrote:
>> If a cmd is on the submission workqueue then the TMR code will
>> miss it, and end up returning task not found or success for
>> lun resets. The fabric driver might then tell the initiator that
>> the running cmds have been handled when they are about to run.
>>
>> This adds a cancel when we are processing TMRs.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>  drivers/target/target_core_tmr.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
>> index 7347285471fa..9b7f159f9341 100644
>> --- a/drivers/target/target_core_tmr.c
>> +++ b/drivers/target/target_core_tmr.c
>> @@ -124,6 +124,8 @@ void core_tmr_abort_task(
>>  	int i;
>>
>>  	for (i = 0; i < dev->queue_cnt; i++) {
>> +		cancel_work_sync(&dev->queues[i].sq.work);
>> +
>>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>>  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
>>  					 state_list) {
>> @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
>>  	 * in the Control Mode Page.
>>  	 */
>>  	for (i = 0; i < dev->queue_cnt; i++) {
>> +		cancel_work_sync(&dev->queues[i].sq.work);
>> +
>>  		spin_lock_irqsave(&dev->queues[i].lock, flags);
>>  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
>>  					 state_list) {
>>
> 
> Is there a possible race?
> 
> I mean, if cancel_work_sync() is called for a work that was queued but
> not started, can we end up with cmds sleeping in a queue until a
> further cmd is queued to the same queue?

Oh yeah, you are right, cancel is wrong. It should be a flush.
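A sketch of what the hunks would look like with a flush instead, as the reply above implies (this is my reading of the agreed fix, not necessarily the final v3 form):

```diff
 	for (i = 0; i < dev->queue_cnt; i++) {
-		cancel_work_sync(&dev->queues[i].sq.work);
+		flush_work(&dev->queues[i].sq.work);
 
 		spin_lock_irqsave(&dev->queues[i].lock, flags);
```

flush_work() waits for a queued-but-unstarted work item to run rather than discarding it, so the submitted cmds reach the state list before the TMR code scans it.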
On Tue, 2021-02-09 at 09:31 -0500, Laurence Oberman wrote:
> On Tue, 2021-02-09 at 06:38 -0600, Mike Christie wrote:
> > If a cmd is on the submission workqueue then the TMR code will
> > miss it, and end up returning task not found or success for
> > lun resets. The fabric driver might then tell the initiator that
> > the running cmds have been handled when they are about to run.
> > 
> > This adds a cancel when we are processing TMRs.
> > 
> > Signed-off-by: Mike Christie <michael.christie@oracle.com>
> > ---
> >  drivers/target/target_core_tmr.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> > index 7347285471fa..9b7f159f9341 100644
> > --- a/drivers/target/target_core_tmr.c
> > +++ b/drivers/target/target_core_tmr.c
> > @@ -124,6 +124,8 @@ void core_tmr_abort_task(
> >  	int i;
> > 
> >  	for (i = 0; i < dev->queue_cnt; i++) {
> > +		cancel_work_sync(&dev->queues[i].sq.work);
> > +
> >  		spin_lock_irqsave(&dev->queues[i].lock, flags);
> >  		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
> >  					 state_list) {
> > @@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
> >  	 * in the Control Mode Page.
> >  	 */
> >  	for (i = 0; i < dev->queue_cnt; i++) {
> > +		cancel_work_sync(&dev->queues[i].sq.work);
> > +
> >  		spin_lock_irqsave(&dev->queues[i].lock, flags);
> >  		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
> >  					 state_list) {
> 
> Hello Mike
> Thanks for these
> This one in particular is the one that I think will help our case. I
> will pull all of these and test later this week as a bundle.
> 
> Many Thanks
> Laurence Oberman

I pulled v2 and built a test kernel to start the testing, but I see
Christoph has also suggested changes to your v3 submission. I will
wait until we are finalized and then do a full test.

Thanks
Laurence
diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
index 7347285471fa..9b7f159f9341 100644
--- a/drivers/target/target_core_tmr.c
+++ b/drivers/target/target_core_tmr.c
@@ -124,6 +124,8 @@ void core_tmr_abort_task(
 	int i;
 
 	for (i = 0; i < dev->queue_cnt; i++) {
+		cancel_work_sync(&dev->queues[i].sq.work);
+
 		spin_lock_irqsave(&dev->queues[i].lock, flags);
 		list_for_each_entry_safe(se_cmd, next, &dev->queues[i].state_list,
 					 state_list) {
@@ -302,6 +304,8 @@ static void core_tmr_drain_state_list(
 	 * in the Control Mode Page.
 	 */
 	for (i = 0; i < dev->queue_cnt; i++) {
+		cancel_work_sync(&dev->queues[i].sq.work);
+
 		spin_lock_irqsave(&dev->queues[i].lock, flags);
 		list_for_each_entry_safe(cmd, next, &dev->queues[i].state_list,
 					 state_list) {
If a cmd is on the submission workqueue then the TMR code will
miss it, and end up returning task not found or success for
lun resets. The fabric driver might then tell the initiator that
the running cmds have been handled when they are about to run.

This adds a cancel when we are processing TMRs.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_tmr.c | 4 ++++
 1 file changed, 4 insertions(+)