diff mbox

Failed path will not be recovered when disabling/enabling remote port

Message ID 20090720164654.GA22977@mars.virtualiron.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show

Commit Message

Konrad Rzeszutek July 20, 2009, 4:46 p.m. UTC
> > Could also be a race condition that is present in SLES10 + RHEL5
> > kernels. Where the SysFS directories are created (and the udev event it
> > sent out), but the kernel hasn't populated the SysFS directories. So
> > when multipathd tries to read them it finds no pertient information and
> > shoves it off to the 'orphan' state.
> > 
> Really? With SLES10? Have you actually observed this?

With SLES10 SP2 to be exact. It wasn't an issue with SLES10 since the
initial patch was there. The equipment I used to test this was an
AX150FC with failed batteries (so no cache writes) and with a failed
controller so it would run extra slow.

> We're running multipath _after_ udev has processed the event.

Right, the one where the SysFS directory is created. Then multipatd
reads the data. I remember posting it here and mentioning that this
problem exists on SLES10SP2 and RHEL5 but not on the upstream kernels.

> And udev already waited for sysfs, so we should be safe there.

Not so. The udev gets the SCSI uevent creation, creates the /dev/sdX, and
so. But the kernel hasn't yet fully populated the SysFS entries (so
/sys/block/sdX/device/vendor does exist, but has no data in it).
> 
> It might be applicable to mainline multipath-tools, but

It really depends on how the SysFS directories are populated and how
slow the SCSI target is.

> the SLES10 one ... I'd be surprised.
> 
> Well, reasonably surprised. multipath keeps on throwing
> an amazing number of issues still.
> 
> Do you have more information here?

Here is the patch along with a detailed description.

The "multipath-tools-add-wait" patch is a backport/write of the
wait_for_file routine used in the sysfs_get_[vendor|model|rev]
macros. The SLES10 SP2 back-ported a lot of the upstream features
of multipath, and one of those was getting rid of this function.
I haven't yet found out the reason why it was deleted - looks
as if a mistake as the upstream kernel _should_ cause the same
set of problems with multipath.
[update: Upstream kernel has this fixed]

The reason a wait is necessary is due to the way the kernel
sends the event. When a SCSI device is added the SCSI subsystem
pursues this path:

_sysfs_add_sdev:
	calls device_add ...
 	[ '/devices/platform/host16/session6/target16:0:0/16:0:0:17'] uevent
		bus_attach_device
			bus_for_each_drv
				driver_probe_device
					sd_probe
 					['/class/scsi_disk/16:0:0:17' ] uevent
						add_disk
 						['/block/sdai'] [ Here multipath starts its job ]

	calls class_device_add ...
		[ '/class/scsi_device/16:0:0:17' ] uevent
		sg_add:
			[ '/class/scsi_generic/sg35' ] uevent


	done with device_add, and now we add the  attributes:
	--> scsi_sysfs_sdev_attrs[i].vendor, model, rev <-- THIS is the
problem.

[Multipathd at the 'block/sdai' event has started analyzing the data, and
it reads the SysFS, but the 'vendor', 'model' have no data so multipathd
discards them an orphans the devices. That data gets to be there once
'device_add' is finished.]


There are four uevents sent from the kernel in the creation of
a SysFS representation of the device. After the last event, the
SysFs entries for vendor,model, rev are populated. Which can
lead to a race condition when multipath investigates the
the new block device and finds it can't read the vendor. This
patch adds the wait_for_file routine which adds some wait
time.


> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke		      zSeries & Storage
> hare@suse.de			      +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Hannes Reinecke July 21, 2009, 6:19 a.m. UTC | #1
Hi Konrad,

Konrad Rzeszutek wrote:
>>> Could also be a race condition that is present in SLES10 + RHEL5
>>> kernels. Where the SysFS directories are created (and the udev event it
>>> sent out), but the kernel hasn't populated the SysFS directories. So
>>> when multipathd tries to read them it finds no pertient information and
>>> shoves it off to the 'orphan' state.
>>>
>> Really? With SLES10? Have you actually observed this?
> 
> With SLES10 SP2 to be exact. It wasn't an issue with SLES10 since the
> initial patch was there. The equipment I used to test this was an
> AX150FC with failed batteries (so no cache writes) and with a failed
> controller so it would run extra slow.
> 
>> We're running multipath _after_ udev has processed the event.
> 
> Right, the one where the SysFS directory is created. Then multipatd
> reads the data. I remember posting it here and mentioning that this
> problem exists on SLES10SP2 and RHEL5 but not on the upstream kernels.
> 
>> And udev already waited for sysfs, so we should be safe there.
> 
> Not so. The udev gets the SCSI uevent creation, creates the /dev/sdX, and
> so. But the kernel hasn't yet fully populated the SysFS entries (so
> /sys/block/sdX/device/vendor does exist, but has no data in it).
>> It might be applicable to mainline multipath-tools, but
> 
> It really depends on how the SysFS directories are populated and how
> slow the SCSI target is.
> 
>> the SLES10 one ... I'd be surprised.
>>
>> Well, reasonably surprised. multipath keeps on throwing
>> an amazing number of issues still.
>>
>> Do you have more information here?
> 
> Here is the patch along with a detailed description.
> 
> The "multipath-tools-add-wait" patch is a backport/write of the
> wait_for_file routine used in the sysfs_get_[vendor|model|rev]
> macros. The SLES10 SP2 back-ported a lot of the upstream features
> of multipath, and one of those was getting rid of this function.
> I haven't yet found out the reason why it was deleted - looks
> as if a mistake as the upstream kernel _should_ cause the same
> set of problems with multipath.
> [update: Upstream kernel has this fixed]
> 
> The reason a wait is necessary is due to the way the kernel
> sends the event. When a SCSI device is added the SCSI subsystem
> pursues this path:
> 
> _sysfs_add_sdev:
> 	calls device_add ...
>  	[ '/devices/platform/host16/session6/target16:0:0/16:0:0:17'] uevent
> 		bus_attach_device
> 			bus_for_each_drv
> 				driver_probe_device
> 					sd_probe
>  					['/class/scsi_disk/16:0:0:17' ] uevent
> 						add_disk
>  						['/block/sdai'] [ Here multipath starts its job ]
> 
> 	calls class_device_add ...
> 		[ '/class/scsi_device/16:0:0:17' ] uevent
> 		sg_add:
> 			[ '/class/scsi_generic/sg35' ] uevent
> 
> 
> 	done with device_add, and now we add the  attributes:
> 	--> scsi_sysfs_sdev_attrs[i].vendor, model, rev <-- THIS is the
> problem.
> 
> [Multipathd at the 'block/sdai' event has started analyzing the data, and
> it reads the SysFS, but the 'vendor', 'model' have no data so multipathd
> discards them an orphans the devices. That data gets to be there once
> 'device_add' is finished.]
> 
Ah. Hmm. Seems you are correct.

I'll have to apply the patch, then.

Fancy opening a bugzilla for it?

Cheers,

Hannes
Konrad Rzeszutek July 21, 2009, 9:42 p.m. UTC | #2
> Ah. Hmm. Seems you are correct.
> 
> I'll have to apply the patch, then.
> 
> Fancy opening a bugzilla for it?

Done. BZ #524018.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke		      zSeries & Storage
> hare@suse.de			      +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff -uNrp multipath-tools-0.4.7.orig/libmultipath/discovery.c multipath-tools-0.4.7/libmultipath/discovery.c
--- multipath-tools-0.4.7.orig/libmultipath/discovery.c	2008-09-25 14:02:28.000000000 -0400
+++ multipath-tools-0.4.7/libmultipath/discovery.c	2008-09-25 19:07:50.000000000 -0400
@@ -125,11 +125,54 @@  path_discovery (vector pathvec, struct c
 	return r;
 }
 
+
+/*
+ * the daemon can race udev upon path add,
+ * not multipath(8), ran by udev
+ */
+#if DAEMON
+#define WAIT_MAX_SECONDS 5
+#define WAIT_LOOP_PER_SECOND 5
+
+static int
+wait_for_file (char * filename)
+{
+	int loop;
+	struct stat stats;
+	
+	loop = WAIT_MAX_SECONDS * WAIT_LOOP_PER_SECOND;
+	
+	while (--loop) {
+		if (stat(filename, &stats) == 0)
+			return 0;
+
+		if (errno != ENOENT)
+			return 1;
+
+		usleep(1000 * 1000 / WAIT_LOOP_PER_SECOND);
+	}
+	return 1;
+}
+#else
+static int
+wait_for_file (char * filename)
+{
+	return 0;
+}
+#endif
+
 #define declare_sysfs_get_str(fname) \
 extern int \
 sysfs_get_##fname (struct sysfs_device * dev, char * buff, size_t len) \
 { \
 	char *attr; \
+	char attr_path[SYSFS_PATH_SIZE]; \
+\
+	if (safe_sprintf(attr_path, "%s/%s/%s", sysfs_path, dev->devpath, #fname)) \
+		return 1; \
+\
+	if (wait_for_file(attr_path)) \
+		return 1; \
 \
 	attr = sysfs_attr_get_value(dev->devpath, #fname); \
 	if (!attr) \