qedi: Fix cmd_cleanup_cmpl counter mismatch issue.

Message ID 20211123122115.8599-1-mrangankar@marvell.com (mailing list archive)
State Superseded
Series qedi: Fix cmd_cleanup_cmpl counter mismatch issue.

Commit Message

Manish Rangankar Nov. 23, 2021, 12:21 p.m. UTC
When a LUN reset is issued under heavy I/O, we hit the qedi WARN_ON
because of a mismatch between the firmware I/O cmd cleanup request
count and the I/O cmd cleanup response count received. The mismatch
is caused by the race on the postfix increment of cmd_cleanup_cmpl.

[qedi_clearsq:1295]:18: fatal error, need hard reset, cid=0x0
WARNING: CPU: 48 PID: 110963 at drivers/scsi/qedi/qedi_fw.c:1296 qedi_clearsq+0xa5/0xd0 [qedi]
CPU: 48 PID: 110963 Comm: kworker/u130:0 Kdump: loaded Tainted: G        W
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 04/15/2020
Workqueue: iscsi_conn_cleanup iscsi_cleanup_conn_work_fn [scsi_transport_iscsi]
RIP: 0010:qedi_clearsq+0xa5/0xd0 [qedi]
 RSP: 0018:ffffac2162c7fd98 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff975213c40ab8 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff9761bf816858 RDI: ffff9761bf816858
 RBP: ffff975247018628 R08: 000000000000522c R09: 000000000000005b
 R10: 0000000000000000 R11: ffffac2162c7fbd8 R12: ffff97522e1b2be8
 R13: 0000000000000000 R14: ffff97522e1b2800 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff9761bf800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f1a34e3e1a0 CR3: 0000000108bb2000 CR4: 0000000000350ee0
 Call Trace:
  qedi_ep_disconnect+0x533/0x550 [qedi]
  ? iscsi_dbg_trace+0x63/0x80 [scsi_transport_iscsi]
  ? _cond_resched+0x15/0x30
  ? iscsi_suspend_queue+0x19/0x40 [libiscsi]
  iscsi_ep_disconnect+0xb0/0x130 [scsi_transport_iscsi]
  iscsi_cleanup_conn_work_fn+0x82/0x130 [scsi_transport_iscsi]
  process_one_work+0x1a7/0x360
  ? create_worker+0x1a0/0x1a0
  worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  kthread+0x116/0x130
  ? kthread_flush_work_fn+0x10/0x10
  ret_from_fork+0x22/0x40
 ---[ end trace 5f1441f59082235c ]---

Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Lee Duncan Nov. 23, 2021, 6:04 p.m. UTC | #1
On 11/23/21 4:21 AM, Manish Rangankar wrote:
> [...]

Reviewed-by: Lee Duncan <lduncan@suse.com>
Mike Christie Nov. 23, 2021, 9:16 p.m. UTC | #2
On 11/23/21 6:21 AM, Manish Rangankar wrote:
> [...]
> 
> diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
> index 84a4204a2cb4..2eed2c6cf424 100644
> --- a/drivers/scsi/qedi/qedi_fw.c
> +++ b/drivers/scsi/qedi/qedi_fw.c
> @@ -813,10 +813,11 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
>  
>  check_cleanup_reqs:
>  	if (qedi_conn->cmd_cleanup_req > 0) {
> -		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
> -			  "Freeing tid=0x%x for cid=0x%x\n",
> -			  cqe->itid, qedi_conn->iscsi_conn_id);
> -		qedi_conn->cmd_cleanup_cmpl++;
> +		++qedi_conn->cmd_cleanup_cmpl;
> +		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
> +			  "Freeing tid=0x%x for cid=0x%x cleanup count=%d\n",
> +			  cqe->itid, qedi_conn->iscsi_conn_id,
> +			  qedi_conn->cmd_cleanup_cmpl);

Is the issue that cmd_cleanup_cmpl's increment is not seen by
qedi_cleanup_all_io's wait_event_interruptible_timeout call when it
wakes up, and your patch fixes this by doing a pre increment?

Does doing a pre increment give you barrier like behavior and is that
why this works? I thought if wake_up ends up waking up the other thread it
does a barrier already, so it's not clear to me how changing to a
pre-increment helps.

Is doing a pre-increment a common way to handle this? It looks like we do a
post increment and wake_up* in other places. However, like in the scsi layer
we do wake_up_process and memory-barriers.txt says that always does a general
barrier, so is that why we can do a post increment there?

Does pre-increment give you barrier like behavior, and is the wake_up call
not waking up the process so we didn't get a barrier from that, and so that's
why this works?


>  		wake_up(&qedi_conn->wait_queue);
>  	} else {
>  		QEDI_ERR(&qedi->dbg_ctx,
>
Manish Rangankar Nov. 24, 2021, 6:05 a.m. UTC | #3
> >
> >  check_cleanup_reqs:
> >  	if (qedi_conn->cmd_cleanup_req > 0) {
> > -		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
> > -			  "Freeing tid=0x%x for cid=0x%x\n",
> > -			  cqe->itid, qedi_conn->iscsi_conn_id);
> > -		qedi_conn->cmd_cleanup_cmpl++;
> > +		++qedi_conn->cmd_cleanup_cmpl;
> > +		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
> > +			  "Freeing tid=0x%x for cid=0x%x cleanup count=%d\n",
> > +			  cqe->itid, qedi_conn->iscsi_conn_id,
> > +			  qedi_conn->cmd_cleanup_cmpl);
> 
> Is the issue that cmd_cleanup_cmpl's increment is not seen by
> qedi_cleanup_all_io's wait_event_interruptible_timeout call when it wakes up,
> and your patch fixes this by doing a pre increment?
> 

Yes, cmd_cleanup_cmpl's increment is not seen by qedi_cleanup_all_io's
wait_event_interruptible_timeout call when it wakes up, even after the firmware
posts all the ISCSI_CQE_TYPE_TASK_CLEANUP events for the requested cmd_cleanup_req.
Yes, the pre-increment did address this issue. Do you feel otherwise?

> Does doing a pre increment give you barrier like behavior and is that why this
> works? I thought if wake_up ends up waking up the other thread it does a barrier
> already, so it's not clear to me how changing to a pre-increment helps.
> 
> Is doing a pre-increment a common way to handle this? It looks like we do a
> post increment and wake_up* in other places. However, like in the scsi layer we
> do wake_up_process and memory-barriers.txt says that always does a general
> barrier, so is that why we can do a post increment there?
> 
> Does pre-increment give you barrier like behavior, and is the wake_up call not
> waking up the process so we didn't get a barrier from that, and so that's why this
> works?
> 

The issue happens before wake_up is called. When we get a surge of
ISCSI_CQE_TYPE_TASK_CLEANUP events on multiple Rx threads, cmd_cleanup_cmpl
tends to miss increments. The scenario is multiple threads accessing
cmd_cleanup_cmpl and racing on the postfix increment, most likely because two
threads read the same value at the same time.

Now that I am explaining it, it feels like instead of pre-incrementing
cmd_cleanup_cmpl, it should be an atomic variable. Do you see any issue?

From logs,
-------------------------------------------------------
[root@rhel82-leo RHEL90_LOGS]# grep -inr "qedi_iscsi_cleanup_task:2160" conn_err.log | wc -l
99

[root@rhel82-leo RHEL90_LOGS]# grep -inr "qedi_cleanup_all_io:1215" conn_err.log | wc -l
99

[root@rhel82-leo RHEL90_LOGS]# grep -inr "qedi_fp_process_cqes:925" conn_err.log | wc -l
99

[root@rhel82-leo RHEL90_LOGS]# grep -inr "qedi_fp_process_cqes:922" conn_err.log | wc -l
99

[Thu Oct 21 22:03:32 2021] [0000:a5:00.5]:[qedi_cleanup_all_io:1246]:18: i/o cmd_cleanup_req=99, not equal to cmd_cleanup_cmpl=97, cid=0x0   <<<
[Thu Oct 21 22:03:38 2021] [0000:a5:00.5]:[qedi_clearsq:1299]:18: fatal error, need hard reset, cid=0x0
-----------------------------------------------------
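
To make the lost-update race described above concrete, here is a minimal
userspace sketch, an illustration only and not qedi code: several threads bump
a plain counter and a C11 atomic counter the same number of times, and the
plain counter routinely comes up short, mirroring the cmd_cleanup_req=99 vs
cmd_cleanup_cmpl=97 mismatch in the log above. The thread and iteration counts
are arbitrary.

/* lost_update_demo.c - illustration of the racy counter, not driver code */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 4
#define NITERS   100000

static int        plain_cmpl;   /* incremented like cmd_cleanup_cmpl++ today */
static atomic_int atomic_cmpl;  /* incremented atomically, for comparison */

static void *rx_thread(void *arg)
{
        (void)arg;
        for (int i = 0; i < NITERS; i++) {
                plain_cmpl++;                      /* read-modify-write, updates can be lost */
                atomic_fetch_add(&atomic_cmpl, 1); /* never loses an update */
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, rx_thread, NULL);
        for (int i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);

        printf("plain=%d atomic=%d expected=%d\n",
               plain_cmpl, atomic_load(&atomic_cmpl), NTHREADS * NITERS);
        return 0;
}

Built with gcc -pthread and run a few times, plain= usually prints less than
expected while atomic= always matches; the driver-side equivalent of that
shortfall is what trips the WARN_ON in qedi_clearsq.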
Mike Christie Nov. 24, 2021, 5:41 p.m. UTC | #4
On 11/24/21 12:05 AM, Manish Rangankar wrote:
> [...]
> 
> Now that I am explaining it, it feels like instead of pre-incrementing
> cmd_cleanup_cmpl, it should be an atomic variable. Do you see any issue?
> 

Yeah, atomic.

And then I guess for this:

        if (qedi_conn->cmd_cleanup_req > 0) {
                QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
                          "Freeing tid=0x%x for cid=0x%x\n",
                          cqe->itid, qedi_conn->iscsi_conn_id);
                qedi_conn->cmd_cleanup_cmpl++;
                wake_up(&qedi_conn->wait_queue);


we might only want to do the wake_up once:

if (atomic_inc_return(&qedi_conn->cmd_cleanup_cmpl) ==
    qedi_conn->cmd_cleanup_req) {

?
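
A rough sketch of that suggestion, assuming struct qedi_conn's cmd_cleanup_cmpl
is converted to atomic_t, reset with atomic_set() wherever the cleanup requests
are armed, and checked with atomic_read() in qedi_cleanup_all_io's
wait_event_interruptible_timeout() condition; illustrative only, not the
submitted v2:

        if (qedi_conn->cmd_cleanup_req > 0) {
                int cleanup_cmpl;

                /* atomic increment cannot lose updates across Rx threads */
                cleanup_cmpl = atomic_inc_return(&qedi_conn->cmd_cleanup_cmpl);
                QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
                          "Freeing tid=0x%x for cid=0x%x cleanup count=%d\n",
                          cqe->itid, qedi_conn->iscsi_conn_id, cleanup_cmpl);

                /* wake the waiter only once, on the final cleanup response */
                if (cleanup_cmpl == qedi_conn->cmd_cleanup_req)
                        wake_up(&qedi_conn->wait_queue);
        }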
Manish Rangankar Nov. 25, 2021, 5:34 a.m. UTC | #5
> >
> > Now that I am explaining it, it feels like instead of pre-incrementing
> > cmd_cleanup_cmpl, it should be an atomic variable. Do you see any issue?
> >
> 
> Yeah, atomic.
> 
> And then I guess for this:
> 
>         if (qedi_conn->cmd_cleanup_req > 0) {
>                 QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
>                           "Freeing tid=0x%x for cid=0x%x\n",
>                           cqe->itid, qedi_conn->iscsi_conn_id);
>                 qedi_conn->cmd_cleanup_cmpl++;
>                 wake_up(&qedi_conn->wait_queue);
> 
> 
> we might only want to do the wake_up once:
> 
> if (atomic_inc_return(&qedi_conn->cmd_cleanup_cmpl) ==
>     qedi_conn->cmd_cleanup_req) {
> 
> ?

Agreed, I will share the updated patch.

Patch

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 84a4204a2cb4..2eed2c6cf424 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -813,10 +813,11 @@  static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 
 check_cleanup_reqs:
 	if (qedi_conn->cmd_cleanup_req > 0) {
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
-			  "Freeing tid=0x%x for cid=0x%x\n",
-			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_conn->cmd_cleanup_cmpl++;
+		++qedi_conn->cmd_cleanup_cmpl;
+		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
+			  "Freeing tid=0x%x for cid=0x%x cleanup count=%d\n",
+			  cqe->itid, qedi_conn->iscsi_conn_id,
+			  qedi_conn->cmd_cleanup_cmpl);
 		wake_up(&qedi_conn->wait_queue);
 	} else {
 		QEDI_ERR(&qedi->dbg_ctx,