[21/42] lpfc: Fix error in remote port address change
Message ID 20190814235712.4487-22-jsmart2021@gmail.com
State Accepted
  • lpfc: Update lpfc to revision
James Smart Aug. 14, 2019, 11:56 p.m. UTC
In a test with high nvme remote port counts connected via a
multi-hop FC switch config where switches were systematically
reset (e.g. fabric partitioning and re-establishment), the nvme
remote ports would switch addresses based on the switch
reconfiguration events. The driver would get into a situation
where the nvme port changed address, PLOGI and PRLI succeeded,
and nvme transport registration occurred, but subsequent LS
requests by the nvme subsystem failed due to a bad ndlp state,
and connectivity to the device was lost.

The driver hit a race condition when multiple devices swapped
addresses simultaneously. When the driver noticed that the
remote port structure came back with the same value as before
(meaning an nvme_rport structure was re-enabled and did not go
through devloss_tmo/connect_tmo_failures on all controllers), it
would unconditionally exit, assuming the ndlp information was
correct. But if the ndlps had been swapped, the ndlp held stale
port state information which, when used by the LS request
commands, caused those commands to fail.

Fix by checking whether a node swap has occurred, and exit early
only if the driver is registering with the same ndlp as before.
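
As a rough user-space illustration of the check described above
(not the driver code): the demo_* types and demo_rebind() below
are hypothetical, heavily simplified stand-ins for the ndlp and
nvme rport structures, and model only the decision taken when the
same remote port structure comes back from registration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lpfc_nodelist (ndlp). */
struct demo_node {
	unsigned int did;	/* FC address (D_ID) of the remote port */
};

/* Hypothetical stand-in for the driver's per-remoteport data. */
struct demo_rport {
	struct demo_node *ndlp;	/* node currently bound to this rport */
};

enum demo_bind_result {
	DEMO_BIND_KEPT,		/* same node: existing binding is valid */
	DEMO_BIND_REDONE,	/* node swapped: binding was rebuilt */
};

/*
 * Models the path where the transport returns the same remote port
 * structure as before. The early exit is taken only when the node is
 * unchanged; otherwise the rport is re-pointed at the new node so that
 * later requests do not use stale state.
 */
static enum demo_bind_result demo_rebind(struct demo_rport *rp,
					 struct demo_node *ndlp)
{
	struct demo_node *prev_ndlp;

	assert(rp != NULL && ndlp != NULL);
	prev_ndlp = rp->ndlp;

	if (prev_ndlp == ndlp)
		return DEMO_BIND_KEPT;

	/* Node swap detected: redo the ndlp <-> rport association. */
	rp->ndlp = ndlp;
	return DEMO_BIND_REDONE;
}
```

Before the fix, the equivalent of DEMO_BIND_KEPT was returned
unconditionally on this path, which in this model would leave
rp->ndlp pointing at the previous node after a swap.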

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
 drivers/scsi/lpfc/lpfc_nvme.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index e8924e90c4eb..103708503592 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -2348,7 +2348,7 @@  lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
 				lpfc_printf_vlog(ndlp->vport, KERN_INFO,
-						 "6014 Rebinding lport to "
+						 "6014 Rebind lport to current "
 						 "remoteport %p wwpn 0x%llx, "
 						 "Data: x%x x%x %p %p x%x x%06x\n",
@@ -2359,7 +2359,16 @@  lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
-				return 0;
+				/* It's a complete rebind only if the driver
+				 * is registering with the same ndlp. Otherwise
+				 * the driver likely executed a node swap
+				 * prior to this registration and the ndlp to
+				 * remoteport binding needs to be redone.
+				 */
+				if (prev_ndlp == ndlp)
+					return 0;
 			/* Sever the ndlp<->rport association
@@ -2393,8 +2402,8 @@  lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
 		lpfc_printf_vlog(vport, KERN_INFO,
-				 "6022 Binding new rport to "
-				 "lport %p Remoteport %p rport %p WWNN 0x%llx, "
+				 "6022 Bind lport x%px to remoteport x%px "
+				 "rport x%px WWNN 0x%llx, "
 				 "Rport WWPN 0x%llx DID "
 				 "x%06x Role x%x, ndlp %p prev_ndlp %p\n",
 				 lport, remote_port, rport,