diff mbox series

lpfc: Retry FLOGI if previous attempt was rejected with busy

Message ID 20220110145937.129224-1-dwagner@suse.de (mailing list archive)
State Changes Requested
Headers show
Series lpfc: Retry FLOGI if previous attempt was rejected with busy | expand

Commit Message

Daniel Wagner Jan. 10, 2022, 2:59 p.m. UTC
The login state machine stops at the first FLOGI attempt which fails
marked as busy:

lpfc 0000:58:00.0: 1:(0):2858 FLOGI failure Status:x9/x50000 TMO:x14 Data x19140820 x0

Add the FLOGI cmd to the list of commands which are allowed to retry.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---

we observerd log below during failover operations. With this patch all
is good. FLOGI is retried and succeeds eventually.

lpfc 0000:58:00.0: 29: [  575.971250] 1:0392 Async Event: word0:x8010140, word1:x3204000, word2:x3, word3:xc0011000
lpfc 0000:58:00.0: 30: [  575.971260] 1:2896 Async FC event - Speed:8000GBaud Topology:x1 LA Type:x1 Port Type:1 Port Number:0 Logical speed:8000Mbps Fault:0
lpfc 0000:58:00.0: 31: [  575.971264] 1:(0):0354 Mbox cmd issue - Enqueue Data: x95 (x0/x0) x0 xc600 x2
lpfc 0000:58:00.0: 32: [  575.971266] 1:(0):0355 Mailbox cmd x95 (x0/x0) issue Data: x0 xc700
lpfc 0000:58:00.0: 33: [  575.971461] 1:(0):0307 Mailbox cmd x95 (x0/x0) Cmpl lpfc_mbx_cmpl_read_topology [lpfc] Data: x9500 x3 x2001 x1 x80 x277cd000 x44 x80002005 x200a x0 x76cf064 x0
lpfc 0000:58:00.0: 34: [  575.971472] 1:(0):0354 Mbox cmd issue - Enqueue Data: x8d (x0/x0) x0 xc600 x2
lpfc 0000:58:00.0: 35: [  575.971473] 1:(0):0354 Mbox cmd issue - Enqueue Data: x7 (x0/x0) x6 xc600 x2
lpfc 0000:58:00.0: 36: [  575.971475] 1:(0):0355 Mailbox cmd x8d (x0/x0) issue Data: x6 xc700
lpfc 0000:58:00.0: 37: [  575.971682] 1:(0):0355 Mailbox cmd x7 (x0/x0) issue Data: x6 xc700
lpfc 0000:58:00.0: 38: [  575.971689] 1:(0):0307 Mailbox cmd x8d (x0/x0) Cmpl lpfc_mbx_cmpl_read_sparam [lpfc] Data: x8d00 x0 x0 x70 x277cd800 x44 x1 x0 x0 x0 x0 x0
lpfc 0000:58:00.0: 39: [  575.971826] 1:(0):0307 Mailbox cmd x7 (x0/x0) Cmpl lpfc_mbx_cmpl_local_config_link [lpfc] Data: x700 x0 x0 x0 x7d0 x76c xa x0 xf x0 x1800 x0
lpfc 0000:58:00.0: 40: [  575.971827] 1:(0):0354 Mbox cmd issue - Enqueue Data: x8d (x0/x0) x6 xc600 x2
lpfc 0000:58:00.0: 41: [  575.971827] 1:(0):0355 Mailbox cmd x8d (x0/x0) issue Data: x6 xc700
lpfc 0000:58:00.0: 42: [  575.972048] 1:(0):0307 Mailbox cmd x8d (x0/x0) Cmpl lpfc_mbx_cmpl_read_sparam [lpfc] Data: x8d00 x0 x0 x70 x277cd800 x44 x1 x0 x0 x0 x0 x0
lpfc 0000:58:00.0: 43: [  575.972050] 1:(0):0247 Start Discovery Timer state x7 Data: x21 xffff8804c6b149e8 x0 x0
lpfc 0000:58:00.0: 44: [  575.972051] 1:(0):0932 FIND node did xfffffe NOT FOUND.
lpfc 0000:58:00.0: 45: [  575.972052] 1:0001 Allocated rpi:x0 max:x3000 lim:x3000
lpfc 0000:58:00.0: 46: [  575.972053] 1:(0):0007 Init New ndlp xffff8804c715d000, rpi:x0 DID:fffffe flg:x0 refcnt:1
lpfc 0000:58:00.0: 47: [  575.972055] 1:(0):0116 Xmit ELS command x4 to remote NPORT xfffffe I/O tag: x2fc0, port state:x7 rpi x0 fc_flag:x810114
lpfc 0000:58:00.0: 48: [  575.972055] 1:(0):0247 Start Discovery Timer state x7 Data: x21 xffff8804c6b149e8 x0 x0
lpfc 0000:58:00.0: 49: [  576.011558] 1:0357 ELS CQE error: status=x9: CQE: 2fc00900 00000000 00050000 80010000
lpfc 0000:58:00.0: 50: [  576.011566] 1:0328 Rsp Ring 2 error: IOCB Data: x40000000 x277cd400 x44 x0 x50000 xfffffe x12fc0 x14428a96 x0 x0 x0 x0 x0 x0 x0 x0
lpfc 0000:58:00.0: 1:(0):2858 FLOGI failure Status:x9/x50000 TMO:x14 Data x19140820 x0


 drivers/scsi/lpfc/lpfc_els.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

James Smart Jan. 11, 2022, 10:08 p.m. UTC | #1
On 1/10/2022 6:59 AM, Daniel Wagner wrote:
> The login state machine stops at the first FLOGI attempt which fails
> marked as busy:
> 
> lpfc 0000:58:00.0: 1:(0):2858 FLOGI failure Status:x9/x50000 TMO:x14 Data x19140820 x0
> 
> Add the FLOGI cmd to the list of commands which are allowed to retry.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> 
> we observerd log below during failover operations. With this patch all
> is good. FLOGI is retried and succeeds eventually.
> 
> lpfc 0000:58:00.0: 29: [  575.971250] 1:0392 Async Event: word0:x8010140, word1:x3204000, word2:x3, word3:xc0011000
> lpfc 0000:58:00.0: 30: [  575.971260] 1:2896 Async FC event - Speed:8000GBaud Topology:x1 LA Type:x1 Port Type:1 Port Number:0 Logical speed:8000Mbps Fault:0
> lpfc 0000:58:00.0: 31: [  575.971264] 1:(0):0354 Mbox cmd issue - Enqueue Data: x95 (x0/x0) x0 xc600 x2
> lpfc 0000:58:00.0: 32: [  575.971266] 1:(0):0355 Mailbox cmd x95 (x0/x0) issue Data: x0 xc700
> lpfc 0000:58:00.0: 33: [  575.971461] 1:(0):0307 Mailbox cmd x95 (x0/x0) Cmpl lpfc_mbx_cmpl_read_topology [lpfc] Data: x9500 x3 x2001 x1 x80 x277cd000 x44 x80002005 x200a x0 x76cf064 x0
> lpfc 0000:58:00.0: 34: [  575.971472] 1:(0):0354 Mbox cmd issue - Enqueue Data: x8d (x0/x0) x0 xc600 x2
> lpfc 0000:58:00.0: 35: [  575.971473] 1:(0):0354 Mbox cmd issue - Enqueue Data: x7 (x0/x0) x6 xc600 x2
> lpfc 0000:58:00.0: 36: [  575.971475] 1:(0):0355 Mailbox cmd x8d (x0/x0) issue Data: x6 xc700
> lpfc 0000:58:00.0: 37: [  575.971682] 1:(0):0355 Mailbox cmd x7 (x0/x0) issue Data: x6 xc700
> lpfc 0000:58:00.0: 38: [  575.971689] 1:(0):0307 Mailbox cmd x8d (x0/x0) Cmpl lpfc_mbx_cmpl_read_sparam [lpfc] Data: x8d00 x0 x0 x70 x277cd800 x44 x1 x0 x0 x0 x0 x0
> lpfc 0000:58:00.0: 39: [  575.971826] 1:(0):0307 Mailbox cmd x7 (x0/x0) Cmpl lpfc_mbx_cmpl_local_config_link [lpfc] Data: x700 x0 x0 x0 x7d0 x76c xa x0 xf x0 x1800 x0
> lpfc 0000:58:00.0: 40: [  575.971827] 1:(0):0354 Mbox cmd issue - Enqueue Data: x8d (x0/x0) x6 xc600 x2
> lpfc 0000:58:00.0: 41: [  575.971827] 1:(0):0355 Mailbox cmd x8d (x0/x0) issue Data: x6 xc700
> lpfc 0000:58:00.0: 42: [  575.972048] 1:(0):0307 Mailbox cmd x8d (x0/x0) Cmpl lpfc_mbx_cmpl_read_sparam [lpfc] Data: x8d00 x0 x0 x70 x277cd800 x44 x1 x0 x0 x0 x0 x0
> lpfc 0000:58:00.0: 43: [  575.972050] 1:(0):0247 Start Discovery Timer state x7 Data: x21 xffff8804c6b149e8 x0 x0
> lpfc 0000:58:00.0: 44: [  575.972051] 1:(0):0932 FIND node did xfffffe NOT FOUND.
> lpfc 0000:58:00.0: 45: [  575.972052] 1:0001 Allocated rpi:x0 max:x3000 lim:x3000
> lpfc 0000:58:00.0: 46: [  575.972053] 1:(0):0007 Init New ndlp xffff8804c715d000, rpi:x0 DID:fffffe flg:x0 refcnt:1
> lpfc 0000:58:00.0: 47: [  575.972055] 1:(0):0116 Xmit ELS command x4 to remote NPORT xfffffe I/O tag: x2fc0, port state:x7 rpi x0 fc_flag:x810114
> lpfc 0000:58:00.0: 48: [  575.972055] 1:(0):0247 Start Discovery Timer state x7 Data: x21 xffff8804c6b149e8 x0 x0
> lpfc 0000:58:00.0: 49: [  576.011558] 1:0357 ELS CQE error: status=x9: CQE: 2fc00900 00000000 00050000 80010000
> lpfc 0000:58:00.0: 50: [  576.011566] 1:0328 Rsp Ring 2 error: IOCB Data: x40000000 x277cd400 x44 x0 x50000 xfffffe x12fc0 x14428a96 x0 x0 x0 x0 x0 x0 x0 x0
> lpfc 0000:58:00.0: 1:(0):2858 FLOGI failure Status:x9/x50000 TMO:x14 Data x19140820 x0
> 
> 
>   drivers/scsi/lpfc/lpfc_els.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
> index db5ccae1b63d..1880e95cb785 100644
> --- a/drivers/scsi/lpfc/lpfc_els.c
> +++ b/drivers/scsi/lpfc/lpfc_els.c
> @@ -4664,7 +4664,8 @@ lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
>   			break;
>   
>   		case LSRJT_LOGICAL_BSY:
> -			if ((cmd == ELS_CMD_PLOGI) ||
> +			if ((cmd == ELS_CMD_FLOGI) ||
> +			    (cmd == ELS_CMD_PLOGI) ||
>   			    (cmd == ELS_CMD_PRLI) ||
>   			    (cmd == ELS_CMD_NVMEPRLI)) {
>   				delay = 1000;

Daniel,

We want to look more closely at this. We do have FLOGI retry logic, but 
perhaps not tied to lsrjt logical busy. We have had OEM requirements as 
to how we do FLOGIs and respond to different status's. This change may 
disrupt those requirements (works for this config, not others). I'll get 
back to you shortly.

-- james
diff mbox series

Patch

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index db5ccae1b63d..1880e95cb785 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -4664,7 +4664,8 @@  lpfc_els_retry(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 			break;
 
 		case LSRJT_LOGICAL_BSY:
-			if ((cmd == ELS_CMD_PLOGI) ||
+			if ((cmd == ELS_CMD_FLOGI) ||
+			    (cmd == ELS_CMD_PLOGI) ||
 			    (cmd == ELS_CMD_PRLI) ||
 			    (cmd == ELS_CMD_NVMEPRLI)) {
 				delay = 1000;