Message ID | 1477709726-5442-10-git-send-email-bmarzins@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | christophe varoqui |
Headers | show |
On 10/29/2016 04:55 AM, Benjamin Marzinski wrote: > If a LUN on a storage device gets remapped while in-use by multipath, > it's possible that the multipath device will continue writing to this > new LUN, causing corruption. This is not multipath's fault (users > should go remapping in-use LUNs), but it's possible for multipath to > detect this and disable IO to the device. If disable_changed_wwids > is set to "yes", multipathd will detect when a LUN changes wwids when it > receives the uevent for this, and will disable access to the device > until the LUN is mapped back. > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> > --- > libmultipath/config.c | 1 + > libmultipath/config.h | 1 + > libmultipath/defaults.h | 1 + > libmultipath/dict.c | 4 ++++ > libmultipath/discovery.c | 15 +++++++-------- > libmultipath/discovery.h | 1 + > libmultipath/structs.h | 1 + > multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ > 8 files changed, 48 insertions(+), 8 deletions(-) > Hmm. Not sure if the really buys us anything. By the time we process the uevent it might already be too late, and I/O might already have been written to that device. I do agree on the warning, though. Cheers, Hannes
On Sun, Oct 30, 2016 at 02:54:14PM +0100, Hannes Reinecke wrote: > On 10/29/2016 04:55 AM, Benjamin Marzinski wrote: > >If a LUN on a storage device gets remapped while in-use by multipath, > >it's possible that the multipath device will continue writing to this > >new LUN, causing corruption. This is not multipath's fault (users > >should go remapping in-use LUNs), but it's possible for multipath to > >detect this and disable IO to the device. If disable_changed_wwids > >is set to "yes", multipathd will detect when a LUN changes wwids when it > >receives the uevent for this, and will disable access to the device > >until the LUN is mapped back. > > > >Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> > >--- > > libmultipath/config.c | 1 + > > libmultipath/config.h | 1 + > > libmultipath/defaults.h | 1 + > > libmultipath/dict.c | 4 ++++ > > libmultipath/discovery.c | 15 +++++++-------- > > libmultipath/discovery.h | 1 + > > libmultipath/structs.h | 1 + > > multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ > > 8 files changed, 48 insertions(+), 8 deletions(-) > > > Hmm. Not sure if the really buys us anything. By the time we process the > uevent it might already be too late, and I/O might already have been written > to that device. > I do agree on the warning, though. I have taken some pains to try to explain this short-coming to the customer who filed this bugzilla. The good news is that in practice, the device ususally goes down when you are remapping the LUN, and you disable the device before it ever comes back. In my testing, I wasn't able to actually ever write to the wrong device with disable_changed_wwids enabled, but I do agree that you have the potential to race here. -Ben > > Cheers, > > Hannes > -- > Dr. Hannes Reinecke zSeries & Storage > hare@suse.de +49 911 74053 688 > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) > > -- > dm-devel mailing list > dm-devel@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On 10/29/2016 04:55 AM, Benjamin Marzinski wrote: > If a LUN on a storage device gets remapped while in-use by multipath, > it's possible that the multipath device will continue writing to this > new LUN, causing corruption. This is not multipath's fault (users > should go remapping in-use LUNs), but it's possible for multipath to > detect this and disable IO to the device. If disable_changed_wwids > is set to "yes", multipathd will detect when a LUN changes wwids when it > receives the uevent for this, and will disable access to the device > until the LUN is mapped back. Some info is needed at multipath.conf.5 for this new keyword. > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> > --- > libmultipath/config.c | 1 + > libmultipath/config.h | 1 + > libmultipath/defaults.h | 1 + > libmultipath/dict.c | 4 ++++ > libmultipath/discovery.c | 15 +++++++-------- > libmultipath/discovery.h | 1 + > libmultipath/structs.h | 1 + > multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ > 8 files changed, 48 insertions(+), 8 deletions(-) > > diff --git a/libmultipath/config.c b/libmultipath/config.c > index bdcad80..a97b318 100644 > --- a/libmultipath/config.c > +++ b/libmultipath/config.c > @@ -618,6 +618,7 @@ load_config (char * file) > conf->uev_wait_timeout = DEFAULT_UEV_WAIT_TIMEOUT; > conf->deferred_remove = DEFAULT_DEFERRED_REMOVE; > conf->skip_kpartx = DEFAULT_SKIP_KPARTX; > + conf->disable_changed_wwids = DEFAULT_DISABLE_CHANGED_WWIDS; > > /* > * preload default hwtable > diff --git a/libmultipath/config.h b/libmultipath/config.h > index d59a993..dbdaa44 100644 > --- a/libmultipath/config.h > +++ b/libmultipath/config.h > @@ -144,6 +144,7 @@ struct config { > int delayed_reconfig; > int uev_wait_timeout; > int skip_kpartx; > + int disable_changed_wwids; > unsigned int version[3]; > > char * multipath_dir; > diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h > index a1fee9b..a72078f 100644 > --- a/libmultipath/defaults.h > +++ b/libmultipath/defaults.h > @@ -36,6 +36,7 @@ > #define DEFAULT_FORCE_SYNC 0 > #define DEFAULT_PARTITION_DELIM NULL > #define DEFAULT_SKIP_KPARTX SKIP_KPARTX_OFF > +#define DEFAULT_DISABLE_CHANGED_WWIDS 0 > > #define DEFAULT_CHECKINT 5 > #define MAX_CHECKINT(a) (a << 2) > diff --git a/libmultipath/dict.c b/libmultipath/dict.c > index e0a3014..61b6910 100644 > --- a/libmultipath/dict.c > +++ b/libmultipath/dict.c > @@ -412,6 +412,9 @@ declare_hw_snprint(skip_kpartx, print_yes_no_undef) > declare_mp_handler(skip_kpartx, set_yes_no_undef) > declare_mp_snprint(skip_kpartx, print_yes_no_undef) > > +declare_def_handler(disable_changed_wwids, set_yes_no) > +declare_def_snprint(disable_changed_wwids, print_yes_no) > + > static int > def_config_dir_handler(struct config *conf, vector strvec) > { > @@ -1395,6 +1398,7 @@ init_keywords(vector keywords) > install_keyword("retrigger_delay", &def_retrigger_delay_handler, &snprint_def_retrigger_delay); > install_keyword("missing_uev_wait_timeout", &def_uev_wait_timeout_handler, &snprint_def_uev_wait_timeout); > install_keyword("skip_kpartx", &def_skip_kpartx_handler, &snprint_def_skip_kpartx); > + install_keyword("disable_changed_wwids", &def_disable_changed_wwids_handler, &snprint_def_disable_changed_wwids); > __deprecated install_keyword("default_selector", &def_selector_handler, NULL); > __deprecated install_keyword("default_path_grouping_policy", &def_pgpolicy_handler, NULL); > __deprecated install_keyword("default_uid_attribute", &def_uid_attribute_handler, NULL); > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c > index bb3116d..756344f 100644 > --- a/libmultipath/discovery.c > +++ b/libmultipath/discovery.c > @@ -1538,13 +1538,12 @@ get_prio (struct path * pp) > } > > static int > -get_udev_uid(struct path * pp, char *uid_attribute) > +get_udev_uid(struct path * pp, char *uid_attribute, struct udev_device *udev) > { > ssize_t len; > const char *value; > > - value = udev_device_get_property_value(pp->udev, > - uid_attribute); > + value = udev_device_get_property_value(udev, uid_attribute); > if (!value || strlen(value) == 0) > value = getenv(uid_attribute); > if (value && strlen(value)) { > @@ -1625,8 +1624,8 @@ get_vpd_uid(struct path * pp) > return get_vpd_sysfs(parent, 0x83, pp->wwid, WWID_SIZE); > } > > -static int > -get_uid (struct path * pp, int path_state) > +int > +get_uid (struct path * pp, int path_state, struct udev_device *udev) > { > char *c; > const char *origin = "unknown"; > @@ -1639,7 +1638,7 @@ get_uid (struct path * pp, int path_state) > put_multipath_config(conf); > } > > - if (!pp->udev) { > + if (!udev) { > condlog(1, "%s: no udev information", pp->dev); > return 1; > } > @@ -1669,7 +1668,7 @@ get_uid (struct path * pp, int path_state) > int retrigger; > > if (pp->uid_attribute) { > - len = get_udev_uid(pp, pp->uid_attribute); > + len = get_udev_uid(pp, pp->uid_attribute, udev); > origin = "udev"; > if (len <= 0) > condlog(1, > @@ -1798,7 +1797,7 @@ pathinfo (struct path *pp, struct config *conf, int mask) > } > > if ((mask & DI_WWID) && !strlen(pp->wwid)) { > - get_uid(pp, path_state); > + get_uid(pp, path_state, pp->udev); > if (!strlen(pp->wwid)) { > pp->initialized = INIT_MISSING_UDEV; > pp->tick = conf->retrigger_delay; > diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h > index 0f5b1e6..176eac1 100644 > --- a/libmultipath/discovery.h > +++ b/libmultipath/discovery.h > @@ -49,6 +49,7 @@ ssize_t sysfs_get_vpd (struct udev_device * udev, int pg, unsigned char * buff, > size_t len); > int sysfs_get_asymmetric_access_state(struct path *pp, > char *buff, int buflen); > +int get_uid(struct path * pp, int path_state, struct udev_device *udev); > > /* > * discovery bitmask > diff --git a/libmultipath/structs.h b/libmultipath/structs.h > index 3a716d8..58508f6 100644 > --- a/libmultipath/structs.h > +++ b/libmultipath/structs.h > @@ -217,6 +217,7 @@ struct path { > int fd; > int initialized; > int retriggers; > + int wwid_changed; > > /* configlet pointers */ > struct hwentry * hwe; > diff --git a/multipathd/main.c b/multipathd/main.c > index dbb4554..bb96cca 100644 > --- a/multipathd/main.c > +++ b/multipathd/main.c > @@ -958,6 +958,12 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > { > int ro, retval = 0; > struct path * pp; > + struct config *conf; > + int disable_changed_wwids; > + > + conf = get_multipath_config(); > + disable_changed_wwids = conf->disable_changed_wwids; > + put_multipath_config(conf); > > ro = uevent_get_disk_ro(uev); > > @@ -969,6 +975,25 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > if (pp) { > struct multipath *mpp = pp->mpp; > > + if (disable_changed_wwids && > + (strlen(pp->wwid) || pp->wwid_changed)) { > + char wwid[WWID_SIZE]; > + > + strcpy(wwid, pp->wwid); > + get_uid(pp, pp->state, uev->udev); > + if (strcmp(wwid, pp->wwid) != 0) { > + condlog(0, "%s: path wwid changed from '%s' to '%s'. disallowing", uev->kernel, wwid, pp->wwid); > + strcpy(pp->wwid, wwid); > + if (!pp->wwid_changed) { > + pp->wwid_changed = 1; > + pp->tick = 1; > + dm_fail_path(pp->mpp->alias, pp->dev_t); > + } > + goto out; > + } else > + pp->wwid_changed = 0; > + } > + > if (pp->initialized == INIT_REQUESTED_UDEV) > retval = uev_add_path(uev, vecs); > else if (mpp && ro >= 0) { > @@ -983,6 +1008,7 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > } > } > } > +out: > lock_cleanup_pop(vecs->lock); > if (!pp) > condlog(0, "%s: spurious uevent, path not found", uev->kernel); > @@ -1509,6 +1535,12 @@ check_path (struct vectors * vecs, struct path * pp, int ticks) > } else > checker_clear_message(&pp->checker); > > + if (pp->wwid_changed) { > + condlog(2, "%s: path wwid has changed. Refusing to use", > + pp->dev); > + newstate = PATH_DOWN; > + } > + > if (newstate == PATH_WILD || newstate == PATH_UNCHECKED) { > condlog(2, "%s: unusable path", pp->dev); > conf = get_multipath_config(); > -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Sun, Nov 06, 2016 at 01:03:31AM +0100, Xose Vazquez Perez wrote: > On 10/29/2016 04:55 AM, Benjamin Marzinski wrote: > > > If a LUN on a storage device gets remapped while in-use by multipath, > > it's possible that the multipath device will continue writing to this > > new LUN, causing corruption. This is not multipath's fault (users > > should go remapping in-use LUNs), but it's possible for multipath to > > detect this and disable IO to the device. If disable_changed_wwids > > is set to "yes", multipathd will detect when a LUN changes wwids when it > > receives the uevent for this, and will disable access to the device > > until the LUN is mapped back. > > Some info is needed at multipath.conf.5 for this new keyword. > Oops! Thanks. I'll send a patch with the manpage info. -Ben > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> > > --- > > libmultipath/config.c | 1 + > > libmultipath/config.h | 1 + > > libmultipath/defaults.h | 1 + > > libmultipath/dict.c | 4 ++++ > > libmultipath/discovery.c | 15 +++++++-------- > > libmultipath/discovery.h | 1 + > > libmultipath/structs.h | 1 + > > multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ > > 8 files changed, 48 insertions(+), 8 deletions(-) > > > > diff --git a/libmultipath/config.c b/libmultipath/config.c > > index bdcad80..a97b318 100644 > > --- a/libmultipath/config.c > > +++ b/libmultipath/config.c > > @@ -618,6 +618,7 @@ load_config (char * file) > > conf->uev_wait_timeout = DEFAULT_UEV_WAIT_TIMEOUT; > > conf->deferred_remove = DEFAULT_DEFERRED_REMOVE; > > conf->skip_kpartx = DEFAULT_SKIP_KPARTX; > > + conf->disable_changed_wwids = DEFAULT_DISABLE_CHANGED_WWIDS; > > > > /* > > * preload default hwtable > > diff --git a/libmultipath/config.h b/libmultipath/config.h > > index d59a993..dbdaa44 100644 > > --- a/libmultipath/config.h > > +++ b/libmultipath/config.h > > @@ -144,6 +144,7 @@ struct config { > > int delayed_reconfig; > > int uev_wait_timeout; > > int skip_kpartx; > > + int disable_changed_wwids; > > unsigned int version[3]; > > > > char * multipath_dir; > > diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h > > index a1fee9b..a72078f 100644 > > --- a/libmultipath/defaults.h > > +++ b/libmultipath/defaults.h > > @@ -36,6 +36,7 @@ > > #define DEFAULT_FORCE_SYNC 0 > > #define DEFAULT_PARTITION_DELIM NULL > > #define DEFAULT_SKIP_KPARTX SKIP_KPARTX_OFF > > +#define DEFAULT_DISABLE_CHANGED_WWIDS 0 > > > > #define DEFAULT_CHECKINT 5 > > #define MAX_CHECKINT(a) (a << 2) > > diff --git a/libmultipath/dict.c b/libmultipath/dict.c > > index e0a3014..61b6910 100644 > > --- a/libmultipath/dict.c > > +++ b/libmultipath/dict.c > > @@ -412,6 +412,9 @@ declare_hw_snprint(skip_kpartx, print_yes_no_undef) > > declare_mp_handler(skip_kpartx, set_yes_no_undef) > > declare_mp_snprint(skip_kpartx, print_yes_no_undef) > > > > +declare_def_handler(disable_changed_wwids, set_yes_no) > > +declare_def_snprint(disable_changed_wwids, print_yes_no) > > + > > static int > > def_config_dir_handler(struct config *conf, vector strvec) > > { > > @@ -1395,6 +1398,7 @@ init_keywords(vector keywords) > > install_keyword("retrigger_delay", &def_retrigger_delay_handler, &snprint_def_retrigger_delay); > > install_keyword("missing_uev_wait_timeout", &def_uev_wait_timeout_handler, &snprint_def_uev_wait_timeout); > > install_keyword("skip_kpartx", &def_skip_kpartx_handler, &snprint_def_skip_kpartx); > > + install_keyword("disable_changed_wwids", &def_disable_changed_wwids_handler, &snprint_def_disable_changed_wwids); > > __deprecated install_keyword("default_selector", &def_selector_handler, NULL); > > __deprecated install_keyword("default_path_grouping_policy", &def_pgpolicy_handler, NULL); > > __deprecated install_keyword("default_uid_attribute", &def_uid_attribute_handler, NULL); > > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c > > index bb3116d..756344f 100644 > > --- a/libmultipath/discovery.c > > +++ b/libmultipath/discovery.c > > @@ -1538,13 +1538,12 @@ get_prio (struct path * pp) > > } > > > > static int > > -get_udev_uid(struct path * pp, char *uid_attribute) > > +get_udev_uid(struct path * pp, char *uid_attribute, struct udev_device *udev) > > { > > ssize_t len; > > const char *value; > > > > - value = udev_device_get_property_value(pp->udev, > > - uid_attribute); > > + value = udev_device_get_property_value(udev, uid_attribute); > > if (!value || strlen(value) == 0) > > value = getenv(uid_attribute); > > if (value && strlen(value)) { > > @@ -1625,8 +1624,8 @@ get_vpd_uid(struct path * pp) > > return get_vpd_sysfs(parent, 0x83, pp->wwid, WWID_SIZE); > > } > > > > -static int > > -get_uid (struct path * pp, int path_state) > > +int > > +get_uid (struct path * pp, int path_state, struct udev_device *udev) > > { > > char *c; > > const char *origin = "unknown"; > > @@ -1639,7 +1638,7 @@ get_uid (struct path * pp, int path_state) > > put_multipath_config(conf); > > } > > > > - if (!pp->udev) { > > + if (!udev) { > > condlog(1, "%s: no udev information", pp->dev); > > return 1; > > } > > @@ -1669,7 +1668,7 @@ get_uid (struct path * pp, int path_state) > > int retrigger; > > > > if (pp->uid_attribute) { > > - len = get_udev_uid(pp, pp->uid_attribute); > > + len = get_udev_uid(pp, pp->uid_attribute, udev); > > origin = "udev"; > > if (len <= 0) > > condlog(1, > > @@ -1798,7 +1797,7 @@ pathinfo (struct path *pp, struct config *conf, int mask) > > } > > > > if ((mask & DI_WWID) && !strlen(pp->wwid)) { > > - get_uid(pp, path_state); > > + get_uid(pp, path_state, pp->udev); > > if (!strlen(pp->wwid)) { > > pp->initialized = INIT_MISSING_UDEV; > > pp->tick = conf->retrigger_delay; > > diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h > > index 0f5b1e6..176eac1 100644 > > --- a/libmultipath/discovery.h > > +++ b/libmultipath/discovery.h > > @@ -49,6 +49,7 @@ ssize_t sysfs_get_vpd (struct udev_device * udev, int pg, unsigned char * buff, > > size_t len); > > int sysfs_get_asymmetric_access_state(struct path *pp, > > char *buff, int buflen); > > +int get_uid(struct path * pp, int path_state, struct udev_device *udev); > > > > /* > > * discovery bitmask > > diff --git a/libmultipath/structs.h b/libmultipath/structs.h > > index 3a716d8..58508f6 100644 > > --- a/libmultipath/structs.h > > +++ b/libmultipath/structs.h > > @@ -217,6 +217,7 @@ struct path { > > int fd; > > int initialized; > > int retriggers; > > + int wwid_changed; > > > > /* configlet pointers */ > > struct hwentry * hwe; > > diff --git a/multipathd/main.c b/multipathd/main.c > > index dbb4554..bb96cca 100644 > > --- a/multipathd/main.c > > +++ b/multipathd/main.c > > @@ -958,6 +958,12 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > > { > > int ro, retval = 0; > > struct path * pp; > > + struct config *conf; > > + int disable_changed_wwids; > > + > > + conf = get_multipath_config(); > > + disable_changed_wwids = conf->disable_changed_wwids; > > + put_multipath_config(conf); > > > > ro = uevent_get_disk_ro(uev); > > > > @@ -969,6 +975,25 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > > if (pp) { > > struct multipath *mpp = pp->mpp; > > > > + if (disable_changed_wwids && > > + (strlen(pp->wwid) || pp->wwid_changed)) { > > + char wwid[WWID_SIZE]; > > + > > + strcpy(wwid, pp->wwid); > > + get_uid(pp, pp->state, uev->udev); > > + if (strcmp(wwid, pp->wwid) != 0) { > > + condlog(0, "%s: path wwid changed from '%s' to '%s'. disallowing", uev->kernel, wwid, pp->wwid); > > + strcpy(pp->wwid, wwid); > > + if (!pp->wwid_changed) { > > + pp->wwid_changed = 1; > > + pp->tick = 1; > > + dm_fail_path(pp->mpp->alias, pp->dev_t); > > + } > > + goto out; > > + } else > > + pp->wwid_changed = 0; > > + } > > + > > if (pp->initialized == INIT_REQUESTED_UDEV) > > retval = uev_add_path(uev, vecs); > > else if (mpp && ro >= 0) { > > @@ -983,6 +1008,7 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) > > } > > } > > } > > +out: > > lock_cleanup_pop(vecs->lock); > > if (!pp) > > condlog(0, "%s: spurious uevent, path not found", uev->kernel); > > @@ -1509,6 +1535,12 @@ check_path (struct vectors * vecs, struct path * pp, int ticks) > > } else > > checker_clear_message(&pp->checker); > > > > + if (pp->wwid_changed) { > > + condlog(2, "%s: path wwid has changed. Refusing to use", > > + pp->dev); > > + newstate = PATH_DOWN; > > + } > > + > > if (newstate == PATH_WILD || newstate == PATH_UNCHECKED) { > > condlog(2, "%s: unusable path", pp->dev); > > conf = get_multipath_config(); > > -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Hi, everyone In my testing in the use of 3Par storage device, when a LUN changes LUN number multipathd can not receive the uevent and can not disable access to the device. but for HP 6300 storage device, it can receive the uevent and disable access to the device. So I think 3Par storage device don't support the case. Now I don't have a right way to deal with the case. Thanks for your advise. root@ubuntu98:~# udevadm monitor monitor will print the received events for: UDEV - the event which udev sends out after rule processing KERNEL - the kernel uevent KERNEL[68031.677399] change /devices/virtual/block/dm-1 (block) UDEV [68031.711148] change /devices/virtual/block/dm-1 (block) KERNEL[68032.678563] change /devices/virtual/block/dm-1 (block) UDEV [68032.705265] change /devices/virtual/block/dm-1 (block) KERNEL[68033.681120] change /devices/virtual/block/dm-1 (block) UDEV [68033.709744] change /devices/virtual/block/dm-1 (block) KERNEL[68035.682421] change /devices/virtual/block/dm-1 (block) KERNEL[68051.693362] change /devices/virtual/block/dm-2 (block) UDEV [68051.722144] change /devices/virtual/block/dm-2 (block) KERNEL[68052.694393] change /devices/virtual/block/dm-2 (block) KERNEL[68052.694761] change /devices/virtual/block/dm-2 (block) UDEV [68052.721103] change /devices/virtual/block/dm-2 (block) root@ubuntu98:~# lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 16.04 LTS Release: 16.04 Codename: xenial ________________________________ All the best wishes for you. zhangguanghui From: dm-devel-bounces@redhat.com<mailto:dm-devel-bounces@redhat.com> Date: 2016-11-04 23:32 To: Hannes Reinecke<mailto:hare@suse.de> CC: dm-devel@redhat.com<mailto:dm-devel@redhat.com> Subject: Re: [dm-devel] [PATCH 09/10] add disable_changed_wwids option On Sun, Oct 30, 2016 at 02:54:14PM +0100, Hannes Reinecke wrote: > On 10/29/2016 04:55 AM, Benjamin Marzinski wrote: > >If a LUN on a storage device gets remapped while in-use by multipath, > >it's possible that the multipath device will continue writing to this > >new LUN, causing corruption. This is not multipath's fault (users > >should go remapping in-use LUNs), but it's possible for multipath to > >detect this and disable IO to the device. If disable_changed_wwids > >is set to "yes", multipathd will detect when a LUN changes wwids when it > >receives the uevent for this, and will disable access to the device > >until the LUN is mapped back. > > > >Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> > >--- > > libmultipath/config.c | 1 + > > libmultipath/config.h | 1 + > > libmultipath/defaults.h | 1 + > > libmultipath/dict.c | 4 ++++ > > libmultipath/discovery.c | 15 +++++++-------- > > libmultipath/discovery.h | 1 + > > libmultipath/structs.h | 1 + > > multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ > > 8 files changed, 48 insertions(+), 8 deletions(-) > > > Hmm. Not sure if the really buys us anything. By the time we process the > uevent it might already be too late, and I/O might already have been written > to that device. > I do agree on the warning, though. I have taken some pains to try to explain this short-coming to the customer who filed this bugzilla. The good news is that in practice, the device ususally goes down when you are remapping the LUN, and you disable the device before it ever comes back. In my testing, I wasn't able to actually ever write to the wrong device with disable_changed_wwids enabled, but I do agree that you have the potential to race here. -Ben > > Cheers, > > Hannes > -- > Dr. Hannes Reinecke zSeries & Storage > hare@suse.de +49 911 74053 688 > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) > > -- > dm-devel mailing list > dm-devel@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ------------------------------------------------------------------------------------------------------------------------------------- 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本 邮件! This e-mail and its attachments contain confidential information from H3C, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
diff --git a/libmultipath/config.c b/libmultipath/config.c index bdcad80..a97b318 100644 --- a/libmultipath/config.c +++ b/libmultipath/config.c @@ -618,6 +618,7 @@ load_config (char * file) conf->uev_wait_timeout = DEFAULT_UEV_WAIT_TIMEOUT; conf->deferred_remove = DEFAULT_DEFERRED_REMOVE; conf->skip_kpartx = DEFAULT_SKIP_KPARTX; + conf->disable_changed_wwids = DEFAULT_DISABLE_CHANGED_WWIDS; /* * preload default hwtable diff --git a/libmultipath/config.h b/libmultipath/config.h index d59a993..dbdaa44 100644 --- a/libmultipath/config.h +++ b/libmultipath/config.h @@ -144,6 +144,7 @@ struct config { int delayed_reconfig; int uev_wait_timeout; int skip_kpartx; + int disable_changed_wwids; unsigned int version[3]; char * multipath_dir; diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h index a1fee9b..a72078f 100644 --- a/libmultipath/defaults.h +++ b/libmultipath/defaults.h @@ -36,6 +36,7 @@ #define DEFAULT_FORCE_SYNC 0 #define DEFAULT_PARTITION_DELIM NULL #define DEFAULT_SKIP_KPARTX SKIP_KPARTX_OFF +#define DEFAULT_DISABLE_CHANGED_WWIDS 0 #define DEFAULT_CHECKINT 5 #define MAX_CHECKINT(a) (a << 2) diff --git a/libmultipath/dict.c b/libmultipath/dict.c index e0a3014..61b6910 100644 --- a/libmultipath/dict.c +++ b/libmultipath/dict.c @@ -412,6 +412,9 @@ declare_hw_snprint(skip_kpartx, print_yes_no_undef) declare_mp_handler(skip_kpartx, set_yes_no_undef) declare_mp_snprint(skip_kpartx, print_yes_no_undef) +declare_def_handler(disable_changed_wwids, set_yes_no) +declare_def_snprint(disable_changed_wwids, print_yes_no) + static int def_config_dir_handler(struct config *conf, vector strvec) { @@ -1395,6 +1398,7 @@ init_keywords(vector keywords) install_keyword("retrigger_delay", &def_retrigger_delay_handler, &snprint_def_retrigger_delay); install_keyword("missing_uev_wait_timeout", &def_uev_wait_timeout_handler, &snprint_def_uev_wait_timeout); install_keyword("skip_kpartx", &def_skip_kpartx_handler, &snprint_def_skip_kpartx); + install_keyword("disable_changed_wwids", &def_disable_changed_wwids_handler, &snprint_def_disable_changed_wwids); __deprecated install_keyword("default_selector", &def_selector_handler, NULL); __deprecated install_keyword("default_path_grouping_policy", &def_pgpolicy_handler, NULL); __deprecated install_keyword("default_uid_attribute", &def_uid_attribute_handler, NULL); diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c index bb3116d..756344f 100644 --- a/libmultipath/discovery.c +++ b/libmultipath/discovery.c @@ -1538,13 +1538,12 @@ get_prio (struct path * pp) } static int -get_udev_uid(struct path * pp, char *uid_attribute) +get_udev_uid(struct path * pp, char *uid_attribute, struct udev_device *udev) { ssize_t len; const char *value; - value = udev_device_get_property_value(pp->udev, - uid_attribute); + value = udev_device_get_property_value(udev, uid_attribute); if (!value || strlen(value) == 0) value = getenv(uid_attribute); if (value && strlen(value)) { @@ -1625,8 +1624,8 @@ get_vpd_uid(struct path * pp) return get_vpd_sysfs(parent, 0x83, pp->wwid, WWID_SIZE); } -static int -get_uid (struct path * pp, int path_state) +int +get_uid (struct path * pp, int path_state, struct udev_device *udev) { char *c; const char *origin = "unknown"; @@ -1639,7 +1638,7 @@ get_uid (struct path * pp, int path_state) put_multipath_config(conf); } - if (!pp->udev) { + if (!udev) { condlog(1, "%s: no udev information", pp->dev); return 1; } @@ -1669,7 +1668,7 @@ get_uid (struct path * pp, int path_state) int retrigger; if (pp->uid_attribute) { - len = get_udev_uid(pp, pp->uid_attribute); + len = get_udev_uid(pp, pp->uid_attribute, udev); origin = "udev"; if (len <= 0) condlog(1, @@ -1798,7 +1797,7 @@ pathinfo (struct path *pp, struct config *conf, int mask) } if ((mask & DI_WWID) && !strlen(pp->wwid)) { - get_uid(pp, path_state); + get_uid(pp, path_state, pp->udev); if (!strlen(pp->wwid)) { pp->initialized = INIT_MISSING_UDEV; pp->tick = conf->retrigger_delay; diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h index 0f5b1e6..176eac1 100644 --- a/libmultipath/discovery.h +++ b/libmultipath/discovery.h @@ -49,6 +49,7 @@ ssize_t sysfs_get_vpd (struct udev_device * udev, int pg, unsigned char * buff, size_t len); int sysfs_get_asymmetric_access_state(struct path *pp, char *buff, int buflen); +int get_uid(struct path * pp, int path_state, struct udev_device *udev); /* * discovery bitmask diff --git a/libmultipath/structs.h b/libmultipath/structs.h index 3a716d8..58508f6 100644 --- a/libmultipath/structs.h +++ b/libmultipath/structs.h @@ -217,6 +217,7 @@ struct path { int fd; int initialized; int retriggers; + int wwid_changed; /* configlet pointers */ struct hwentry * hwe; diff --git a/multipathd/main.c b/multipathd/main.c index dbb4554..bb96cca 100644 --- a/multipathd/main.c +++ b/multipathd/main.c @@ -958,6 +958,12 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) { int ro, retval = 0; struct path * pp; + struct config *conf; + int disable_changed_wwids; + + conf = get_multipath_config(); + disable_changed_wwids = conf->disable_changed_wwids; + put_multipath_config(conf); ro = uevent_get_disk_ro(uev); @@ -969,6 +975,25 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) if (pp) { struct multipath *mpp = pp->mpp; + if (disable_changed_wwids && + (strlen(pp->wwid) || pp->wwid_changed)) { + char wwid[WWID_SIZE]; + + strcpy(wwid, pp->wwid); + get_uid(pp, pp->state, uev->udev); + if (strcmp(wwid, pp->wwid) != 0) { + condlog(0, "%s: path wwid changed from '%s' to '%s'. disallowing", uev->kernel, wwid, pp->wwid); + strcpy(pp->wwid, wwid); + if (!pp->wwid_changed) { + pp->wwid_changed = 1; + pp->tick = 1; + dm_fail_path(pp->mpp->alias, pp->dev_t); + } + goto out; + } else + pp->wwid_changed = 0; + } + if (pp->initialized == INIT_REQUESTED_UDEV) retval = uev_add_path(uev, vecs); else if (mpp && ro >= 0) { @@ -983,6 +1008,7 @@ uev_update_path (struct uevent *uev, struct vectors * vecs) } } } +out: lock_cleanup_pop(vecs->lock); if (!pp) condlog(0, "%s: spurious uevent, path not found", uev->kernel); @@ -1509,6 +1535,12 @@ check_path (struct vectors * vecs, struct path * pp, int ticks) } else checker_clear_message(&pp->checker); + if (pp->wwid_changed) { + condlog(2, "%s: path wwid has changed. Refusing to use", + pp->dev); + newstate = PATH_DOWN; + } + if (newstate == PATH_WILD || newstate == PATH_UNCHECKED) { condlog(2, "%s: unusable path", pp->dev); conf = get_multipath_config();
If a LUN on a storage device gets remapped while in-use by multipath, it's possible that the multipath device will continue writing to this new LUN, causing corruption. This is not multipath's fault (users should go remapping in-use LUNs), but it's possible for multipath to detect this and disable IO to the device. If disable_changed_wwids is set to "yes", multipathd will detect when a LUN changes wwids when it receives the uevent for this, and will disable access to the device until the LUN is mapped back. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> --- libmultipath/config.c | 1 + libmultipath/config.h | 1 + libmultipath/defaults.h | 1 + libmultipath/dict.c | 4 ++++ libmultipath/discovery.c | 15 +++++++-------- libmultipath/discovery.h | 1 + libmultipath/structs.h | 1 + multipathd/main.c | 32 ++++++++++++++++++++++++++++++++ 8 files changed, 48 insertions(+), 8 deletions(-)