Message ID | 1498638793-44672-10-git-send-email-hare@suse.de (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Hi Hannes, unfortunately I only realized now by accident that there's stuff to review. Would be nice to send it also explicitly to driver maintainers in addition to the list. Since you've asked for this multiple times, I happened to just now code a patch series of 6 patches in order to decouple zfcp from scsi_cmnd for device, target, and host reset. While I provide some review comments below, I think it might be clearer and easier to review if you would rebase your series on top of my decoupling. Let me know how urgent you'd like to see my code. I planned to send it as RFC soon anyway. However, it hasn't seen any function testing yet. If you don't care, let me know and I can send it. On 06/28/2017 10:32 AM, Hannes Reinecke wrote: > When issuing a host reset we should be waiting for all > ports to become unblocked; just waiting for one might > be resulting in host reset to return too early. > > Signed-off-by: Hannes Reinecke <hare@suse.com> > --- > drivers/s390/scsi/zfcp_scsi.c | 27 +++++++++++++++++++++++---- > 1 file changed, 23 insertions(+), 4 deletions(-) > > diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c > index 0678cf7..3d18659 100644 > --- a/drivers/s390/scsi/zfcp_scsi.c > +++ b/drivers/s390/scsi/zfcp_scsi.c > @@ -311,13 +311,32 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) > > static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > { > - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); > - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; > - int ret; > + struct Scsi_Host *host = scpnt->device->host; > + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; > + int ret = 0; > + unsigned long flags; > + struct zfcp_port *port; > > zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); > zfcp_erp_wait(adapter); > - ret = fc_block_scsi_eh(scpnt); > +retry_rport_blocked: > + spin_lock_irqsave(host->host_lock, flags); missing read_lock(&adapter->port_list_lock); Hm, well, I have to think about lock ordering, because my patch has the port_list as outer loop and inside it calls fc_block_scsi_eh (also modified with fc_rport as argument). If there's any other location taking both locks we better get them in the same order. > + list_for_each_entry(port, &adapter->port_list, list) { > + struct fc_rport *rport = port->rport; port->rport can be NULL, so need to check > + > + if (rport->port_state == FC_PORTSTATE_BLOCKED) { > + if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) > + ret = FAST_IO_FAIL; Hm, doesn't this get lost if a next iteration hits ret=NEEDS_RETRY? I was pondering in my own patch version what to return of just a subset of ports ran into fast_io_fail. And for now I thought just fast_io_fail would be good to let I/O bubble up for path failover, even if this would include of rport which meanwhile unblocked properly and would not need bubbling up pending requests because they could service them with a simple retry. > + else > + ret = NEEDS_RETRY; > + break; > + } Why do you open code fc_block_scsi_eh() instead of calling it with port->rport (if it's !=NULL)? Typically all rports would be blocked after adapter recovery, until they become unblocked (via zfcp's async rport_work). So we can wait for each in turn which should still only wait in summary for the last one to unblock. E.g. if the first rport takes longest we wait for it, and the rest of the loop will be fast since the others happen to be unblocked (or fast_io_fail) already? > + } missing read_unlock(&adapter->port_list_lock); > + spin_unlock_irqrestore(host->host_lock, flags); > + if (ret == NEEDS_RETRY) { > + msleep(1000); > + goto retry_rport_blocked; > + } > if (ret) > return ret; >
On 06/28/2017 10:32 AM, Hannes Reinecke wrote: > When issuing a host reset we should be waiting for all > ports to become unblocked; just waiting for one might > be resulting in host reset to return too early. > > Signed-off-by: Hannes Reinecke <hare@suse.com> > --- > drivers/s390/scsi/zfcp_scsi.c | 27 +++++++++++++++++++++++---- > 1 file changed, 23 insertions(+), 4 deletions(-) > > diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c > index 0678cf7..3d18659 100644 > --- a/drivers/s390/scsi/zfcp_scsi.c > +++ b/drivers/s390/scsi/zfcp_scsi.c > @@ -311,13 +311,32 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) > > static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > { > - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); > - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; > - int ret; > + struct Scsi_Host *host = scpnt->device->host; > + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; Oh, only realized when reading the later "[PATCH 14/47] scsi: Use Scsi_Host as argument for eh_host_reset_handler" that this part already anticipates parts of that later patch. Seems not fully topically separated patches to me, if this one is only about the fc_block_scsi_eh aspect in zfcp_scsi_eh_host_reset_handler().
On 07/24/2017 06:18 PM, Steffen Maier wrote: > Hi Hannes, > > unfortunately I only realized now by accident that there's stuff to > review. Would be nice to send it also explicitly to driver maintainers > in addition to the list. > Well, the entire patchset got send as an RFC anyway, and as such I didn't include all the individual driver maintainers. But will be doing so for the actual patchset. > Since you've asked for this multiple times, I happened to just now code > a patch series of 6 patches in order to decouple zfcp from scsi_cmnd for > device, target, and host reset. > Oh, cool. That's precisely what I need. > While I provide some review comments below, I think it might be clearer > and easier to review if you would rebase your series on top of my > decoupling. > Sure. Get me the patches and I'll be doing it :-) > Let me know how urgent you'd like to see my code. I planned to send it > as RFC soon anyway. However, it hasn't seen any function testing yet. If > you don't care, let me know and I can send it. > At this stage I don't really care; the idea is to get the preliminary patches in (preferably before the next merge window), so that the actual patch to modify SCSI EH syntax doesn't have to modify the drivers themselves, only the calling sequence. > On 06/28/2017 10:32 AM, Hannes Reinecke wrote: >> When issuing a host reset we should be waiting for all >> ports to become unblocked; just waiting for one might >> be resulting in host reset to return too early. >> >> Signed-off-by: Hannes Reinecke <hare@suse.com> >> --- >> drivers/s390/scsi/zfcp_scsi.c | 27 +++++++++++++++++++++++---- >> 1 file changed, 23 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/s390/scsi/zfcp_scsi.c >> b/drivers/s390/scsi/zfcp_scsi.c >> index 0678cf7..3d18659 100644 >> --- a/drivers/s390/scsi/zfcp_scsi.c >> +++ b/drivers/s390/scsi/zfcp_scsi.c >> @@ -311,13 +311,32 @@ static int >> zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) >> >> static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) >> { >> - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); >> - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; >> - int ret; >> + struct Scsi_Host *host = scpnt->device->host; >> + struct zfcp_adapter *adapter = (struct zfcp_adapter >> *)host->hostdata[0]; >> + int ret = 0; >> + unsigned long flags; >> + struct zfcp_port *port; >> >> zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); >> zfcp_erp_wait(adapter); >> - ret = fc_block_scsi_eh(scpnt); >> +retry_rport_blocked: >> + spin_lock_irqsave(host->host_lock, flags); > > missing read_lock(&adapter->port_list_lock); > > Hm, well, I have to think about lock ordering, because my patch has the > port_list as outer loop and inside it calls fc_block_scsi_eh (also > modified with fc_rport as argument). > If there's any other location taking both locks we better get them in > the same order. > In general I'm not terribly happy with this; after all, there is no guarantee that the rport list after host reset is identical to the list prior to it; in extremis we could even end up with no rports whatsoever. In which case we wouldn't have anything to synchronize upon, leaving host reset in a somewhat dubious state. I'd be far happier if we could have a synchronisation point independent on the rport states; then this problem wouldn't occur. (And rports would be handled separately via the dev_loss_tmo mechanism anyway, so it wouldn't matter if the rports are still recovering after host reset returned.) Cheers, Hannes
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c index 0678cf7..3d18659 100644 --- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c @@ -311,13 +311,32 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) { - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; - int ret; + struct Scsi_Host *host = scpnt->device->host; + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; + int ret = 0; + unsigned long flags; + struct zfcp_port *port; zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); zfcp_erp_wait(adapter); - ret = fc_block_scsi_eh(scpnt); +retry_rport_blocked: + spin_lock_irqsave(host->host_lock, flags); + list_for_each_entry(port, &adapter->port_list, list) { + struct fc_rport *rport = port->rport; + + if (rport->port_state == FC_PORTSTATE_BLOCKED) { + if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) + ret = FAST_IO_FAIL; + else + ret = NEEDS_RETRY; + break; + } + } + spin_unlock_irqrestore(host->host_lock, flags); + if (ret == NEEDS_RETRY) { + msleep(1000); + goto retry_rport_blocked; + } if (ret) return ret;
When issuing a host reset we should be waiting for all ports to become unblocked; just waiting for one might be resulting in host reset to return too early. Signed-off-by: Hannes Reinecke <hare@suse.com> --- drivers/s390/scsi/zfcp_scsi.c | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-)