Message ID | 5633EA98.8050604@sandisk.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 10/30/2015 03:09 PM, Bart Van Assche wrote: > When dropping a lock while iterating a list we must restart the search > as other threads could have manipulated the list under us. Without this > we can get stuck in an endless loop. > > This is a slightly modified version of a patch from Christoph Hellwig > (see also https://www.spinics.net/lists/linux-scsi/msg89416.html). > > Reported-by: Johannes Thumshirn <jthumshirn@suse.de> > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > Cc: Johannes Thumshirn <jthumshirn@suse.de> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Dan Williams <dan.j.williams@intel.com> > Cc: stable <stable@vger.kernel.org> > --- > drivers/scsi/scsi_sysfs.c | 16 ++++------------ > 1 file changed, 4 insertions(+), 12 deletions(-) > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c > index b9fb61a..5a183d1 100644 > --- a/drivers/scsi/scsi_sysfs.c > +++ b/drivers/scsi/scsi_sysfs.c > @@ -1158,32 +1158,24 @@ static void __scsi_remove_target(struct scsi_target *starget) > void scsi_remove_target(struct device *dev) > { > struct Scsi_Host *shost = dev_to_shost(dev->parent); > - struct scsi_target *starget, *last = NULL; > + struct scsi_target *starget; > unsigned long flags; > > - /* remove targets being careful to lookup next entry before > - * deleting the last > - */ > +restart: > spin_lock_irqsave(shost->host_lock, flags); > list_for_each_entry(starget, &shost->__targets, siblings) { > if (starget->reaped) > continue; > if (starget->dev.parent == dev || &starget->dev == dev) { > - /* assuming new targets arrive at the end */ > kref_get(&starget->reap_ref); > starget->reaped = true; > spin_unlock_irqrestore(shost->host_lock, flags); > - if (last) > - scsi_target_reap(last); > - last = starget; > __scsi_remove_target(starget); > - spin_lock_irqsave(shost->host_lock, flags); > + scsi_target_reap(starget); > + goto restart; > } > } > spin_unlock_irqrestore(shost->host_lock, flags); > - > - if (last) > - scsi_target_reap(last); > } > EXPORT_SYMBOL(scsi_remove_target); (replying to my own e-mail) Hello Christoph, Is it OK for you if I mention you as author of this e-mail ? Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-11-04 at 14:35 -0800, Bart Van Assche wrote: > On 10/30/2015 03:09 PM, Bart Van Assche wrote: > > When dropping a lock while iterating a list we must restart the search > > as other threads could have manipulated the list under us. Without this > > we can get stuck in an endless loop. > > > > This is a slightly modified version of a patch from Christoph Hellwig > > (see also https://www.spinics.net/lists/linux-scsi/msg89416.html). > > > > Reported-by: Johannes Thumshirn <jthumshirn@suse.de> > > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > > Cc: Johannes Thumshirn <jthumshirn@suse.de> > > Cc: Christoph Hellwig <hch@lst.de> > > Cc: Dan Williams <dan.j.williams@intel.com> > > Cc: stable <stable@vger.kernel.org> > > --- > > drivers/scsi/scsi_sysfs.c | 16 ++++------------ > > 1 file changed, 4 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c > > index b9fb61a..5a183d1 100644 > > --- a/drivers/scsi/scsi_sysfs.c > > +++ b/drivers/scsi/scsi_sysfs.c > > @@ -1158,32 +1158,24 @@ static void __scsi_remove_target(struct scsi_target *starget) > > void scsi_remove_target(struct device *dev) > > { > > struct Scsi_Host *shost = dev_to_shost(dev->parent); > > - struct scsi_target *starget, *last = NULL; > > + struct scsi_target *starget; > > unsigned long flags; > > > > - /* remove targets being careful to lookup next entry before > > - * deleting the last > > - */ > > +restart: > > spin_lock_irqsave(shost->host_lock, flags); > > list_for_each_entry(starget, &shost->__targets, siblings) { > > if (starget->reaped) > > continue; > > if (starget->dev.parent == dev || &starget->dev == dev) { > > - /* assuming new targets arrive at the end */ > > kref_get(&starget->reap_ref); > > starget->reaped = true; > > spin_unlock_irqrestore(shost->host_lock, flags); > > - if (last) > > - scsi_target_reap(last); > > - last = starget; > > __scsi_remove_target(starget); > > - spin_lock_irqsave(shost->host_lock, flags); > > + scsi_target_reap(starget); > > + goto restart; > > } > > } > > spin_unlock_irqrestore(shost->host_lock, flags); > > - > > - if (last) > > - scsi_target_reap(last); > > } > > EXPORT_SYMBOL(scsi_remove_target); > > (replying to my own e-mail) > > Hello Christoph, > > Is it OK for you if I mention you as author of this e-mail ? Could you just both co-operate, especially since there's not much difference between the patches. The fundamental problem with this is how have the conditions that caused us to move away from list restart: commit bc3f02a795d3b4faa99d37390174be2a75d091bd Author: Dan Williams <djbw@fb.com> Date: Tue Aug 28 22:12:10 2012 -0700 [SCSI] scsi_remove_target: fix softlockup regression on hot remove Which was triggered by this bug report http://thread.gmane.org/gmane.linux.kernel/1348679 been mitigated? James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/04/2015 02:44 PM, James Bottomley wrote: > On Wed, 2015-11-04 at 14:35 -0800, Bart Van Assche wrote: >> (replying to my own e-mail) >> >> Hello Christoph, >> >> Is it OK for you if I mention you as author of this e-mail ? > > Could you just both co-operate, especially since there's not much > difference between the patches. Hello James and Christoph, Patch [2/2] is identical to the patch Christoph had posted except that it has been rebased on top of patch [1/2] of this series. I will be happy to step back if Christoph or someone else wants to take the lead here. The only reason I started looking into this is that there was no further progress on the original e-mail thread about this subject (http://thread.gmane.org/gmane.linux.kernel/2052359). Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 04, 2015 at 02:35:40PM -0800, Bart Van Assche wrote: > (replying to my own e-mail) > > Hello Christoph, > > Is it OK for you if I mention you as author of this e-mail ? I don't care whom this fix is attributed to, but let's get it in: Reviewed-by: Christoph Hellwig <hch@lst.de> or Signed-off-by: Christoph Hellwig <hch@lst.de> at your preference :) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 4, 2015 at 2:44 PM, James Bottomley <James.Bottomley@hansenpartnership.com> wrote: [..] > The fundamental problem with this is how have the conditions that caused > us to move away from list restart: > > commit bc3f02a795d3b4faa99d37390174be2a75d091bd > Author: Dan Williams <djbw@fb.com> > Date: Tue Aug 28 22:12:10 2012 -0700 > > [SCSI] scsi_remove_target: fix softlockup regression on hot remove > > Which was triggered by this bug report > > http://thread.gmane.org/gmane.linux.kernel/1348679 > > been mitigated? > I think it has because the problem that led to that report was the fact that scsi_target_destroy() did not advance the target state, but we changed that in f2495e228fce. http://marc.info/?l=linux-scsi&m=144527527308725&w=2 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-11-05 at 08:55 -0800, Dan Williams wrote: > On Wed, Nov 4, 2015 at 2:44 PM, James Bottomley > <James.Bottomley@hansenpartnership.com> wrote: > [..] > > The fundamental problem with this is how have the conditions that caused > > us to move away from list restart: > > > > commit bc3f02a795d3b4faa99d37390174be2a75d091bd > > Author: Dan Williams <djbw@fb.com> > > Date: Tue Aug 28 22:12:10 2012 -0700 > > > > [SCSI] scsi_remove_target: fix softlockup regression on hot remove > > > > Which was triggered by this bug report > > > > http://thread.gmane.org/gmane.linux.kernel/1348679 > > > > been mitigated? > > > > I think it has because the problem that led to that report was the > fact that scsi_target_destroy() did not advance the target state, but > we changed that in f2495e228fce. > > http://marc.info/?l=linux-scsi&m=144527527308725&w=2 Great, thanks, we'll put Christoph's version in then, because it can be cc'd to stable without problems. Bart, you can redo your state updates on top of this because that's inessential to the current problem. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Bart, can you resend your patch on top of 4.4-rc1? I think we really need it so we should get it into 4.4 and backport it to -stable. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index b9fb61a..5a183d1 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1158,32 +1158,24 @@ static void __scsi_remove_target(struct scsi_target *starget) void scsi_remove_target(struct device *dev) { struct Scsi_Host *shost = dev_to_shost(dev->parent); - struct scsi_target *starget, *last = NULL; + struct scsi_target *starget; unsigned long flags; - /* remove targets being careful to lookup next entry before - * deleting the last - */ +restart: spin_lock_irqsave(shost->host_lock, flags); list_for_each_entry(starget, &shost->__targets, siblings) { if (starget->reaped) continue; if (starget->dev.parent == dev || &starget->dev == dev) { - /* assuming new targets arrive at the end */ kref_get(&starget->reap_ref); starget->reaped = true; spin_unlock_irqrestore(shost->host_lock, flags); - if (last) - scsi_target_reap(last); - last = starget; __scsi_remove_target(starget); - spin_lock_irqsave(shost->host_lock, flags); + scsi_target_reap(starget); + goto restart; } } spin_unlock_irqrestore(shost->host_lock, flags); - - if (last) - scsi_target_reap(last); } EXPORT_SYMBOL(scsi_remove_target);
When dropping a lock while iterating a list we must restart the search as other threads could have manipulated the list under us. Without this we can get stuck in an endless loop. This is a slightly modified version of a patch from Christoph Hellwig (see also https://www.spinics.net/lists/linux-scsi/msg89416.html). Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: stable <stable@vger.kernel.org> --- drivers/scsi/scsi_sysfs.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-)