Message ID | 20151001043844.GA8397@xzibit.linux.bs1.fc.nec.co.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Oct 01, 2015 at 04:38:45AM +0000, Junichi Nomura wrote: > But another workload using dm-multipath still caues the same crash. > I think a patch like the following is needed. > What do you think? Thanks Junichi, I suspected that we should defer the release until we tear down the sdev instead of unregister time. Give me a little more time to review your patch and actually reproduce your issue before a formal ACK, though. Any chance you could share all your multipath tests in a git repository somewhere? It seems like you're the only one actually having a good set of reproducable but minimalistic tests. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/01/15 14:21, Christoph Hellwig wrote: > Any chance you could share all your multipath tests in a git repository > somewhere? It seems like you're the only one actually having a good > set of reproducable but minimalistic tests. Hmm, sorry I don't have a public git repository... I'm using a pair of KVM guest, one for iSCSI target and the other for testing. I could reproduce the crash using this loop, within a few minutes: service multipathd start while true; do multipath -F iscsiadm -m node --logout iscsiadm -m node --login done It might implicitly depend on udev to do some small amount of I/O after device uevent though. /etc/multipath.conf is like this: blacklist { devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|vd)[0-9]*" } defaults { user_friendly_names no find_multipaths yes path_grouping_policy group_by_prio path_selector "queue-length 0" path_checker "tur" failback immediate prio alua retain_attached_hw_handler yes }
On Thu, Oct 01, 2015 at 11:40:23AM +0000, Junichi Nomura wrote: > On 10/01/15 14:21, Christoph Hellwig wrote: > > Any chance you could share all your multipath tests in a git repository > > somewhere? It seems like you're the only one actually having a good > > set of reproducable but minimalistic tests. > > Hmm, sorry I don't have a public git repository... > > I'm using a pair of KVM guest, one for iSCSI target and the other > for testing. Any chance you could share your various scripts in some way to that people doing multipath changes can run them to verify those changes? > > I could reproduce the crash using this loop, within a few minutes: > > service multipathd start > while true; do > multipath -F > iscsiadm -m node --logout > iscsiadm -m node --login > done > > It might implicitly depend on udev to do some small amount of I/O > after device uevent though. I can't reproduce this unfortunately. I suspect udev doesn't do enough stupid things on my old Debian test system. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
This patch looks fine to me:
Acked-by: Christoph Hellwig <hch@lst.de>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/04/15 16:42, Christoph Hellwig wrote: > Any chance you could share your various scripts in some way to that > people doing multipath changes can run them to verify those changes? Hi Christoph, I've refined some pieces of test scripts which could be useful for people and posted here: http://marc.info/?l=linux-scsi&m=144419652913948&w=2 https://www.redhat.com/archives/dm-devel/2015-October/msg00034.html The scripts are stress testing and expected to trigger crash/stall/ data corruption if you run long enough. Though it's not perfect I hope this helps people do regression testing for dm-mpath related code changes.
diff --git a/drivers/scsi/scsi_dh.c b/drivers/scsi/scsi_dh.c index edb044a..19bf9db 100644 --- a/drivers/scsi/scsi_dh.c +++ b/drivers/scsi/scsi_dh.c @@ -232,10 +232,14 @@ int scsi_dh_add_device(struct scsi_device *sdev) return err; } -void scsi_dh_remove_device(struct scsi_device *sdev) +void scsi_dh_release_device(struct scsi_device *sdev) { if (sdev->handler) scsi_dh_handler_detach(sdev); +} + +void scsi_dh_remove_device(struct scsi_device *sdev) +{ device_remove_file(&sdev->sdev_gendev, &scsi_dh_state_attr); } diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index 644bb73..4d01cdb 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -173,9 +173,11 @@ extern struct async_domain scsi_sd_probe_domain; /* scsi_dh.c */ #ifdef CONFIG_SCSI_DH int scsi_dh_add_device(struct scsi_device *sdev); +void scsi_dh_release_device(struct scsi_device *sdev); void scsi_dh_remove_device(struct scsi_device *sdev); #else static inline int scsi_dh_add_device(struct scsi_device *sdev) { return 0; } +static inline void scsi_dh_release_device(struct scsi_device *sdev) { } static inline void scsi_dh_remove_device(struct scsi_device *sdev) { } #endif diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index b333389..dff8faf 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -399,6 +399,8 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) sdev = container_of(work, struct scsi_device, ew.work); + scsi_dh_release_device(sdev); + parent = sdev->sdev_gendev.parent; spin_lock_irqsave(sdev->host->host_lock, flags);--