Message ID: Pine.LNX.4.64.0906231548060.18687@e-smith.charlieb.ott.istop.com (mailing list archive)
State: Not Applicable, archived
Delegated to: christophe varoqui
Hi Charlie,

We do not want to print these messages as they would flood your
/var/log/messages even when everything is good. We want to print these
messages only to get the user's attention.

If your concern is that the last message is "down", then we should have a
static variable and print the up or ghost message only once (when we
toggle from down to up/ghost).

chandra

On Tue, 2009-06-23 at 15:50 -0400, Charlie Brady wrote:
> It seems to me that this patch is required, to fix a problem with "path
> down" messages being logged when the path transitions to up (and
> presumably ghost).
>
> --- libcheckers/rdac.c.orig	2009-06-23 15:38:05.371422000 -0400
> +++ libcheckers/rdac.c	2009-06-23 15:39:28.020843000 -0400
> @@ -105,5 +105,12 @@
>  		return PATH_DOWN;
>  	}
>  
> -	return ((inq.avtcvp & 0x1) ? PATH_UP : PATH_GHOST);
> +	if (inq.avtcvp & 0x1) {
> +		MSG(c, MSG_RDAC_UP);
> +		return PATH_UP;
> +	}
> +	else {
> +		MSG(c, MSG_RDAC_GHOST);
> +		return PATH_GHOST;
> +	}
>  }

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
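[Editor's note: the transition-only logging Chandra describes can be sketched as a stand-alone model. This is illustrative only, not the libcheckers code; the names `log_path_state`, `path_state_t`, `last_state`, and `messages_emitted` are hypothetical.]

```c
#include <stdio.h>

/* Illustrative sketch of transition-only logging: remember the last
 * reported state in a static variable and emit a message only when the
 * state toggles. Only the PATH_* names mirror the real code; everything
 * else here is invented for the example. */
typedef enum { PATH_DOWN, PATH_UP, PATH_GHOST } path_state_t;

static int messages_emitted = 0;    /* total messages actually printed */

/* Returns 1 if a message was emitted, 0 if suppressed as a repeat. */
int log_path_state(path_state_t newstate)
{
    static path_state_t last_state;
    static int first_call = 1;
    const char *names[] = { "path down", "path up", "path ghost" };

    if (!first_call && newstate == last_state)
        return 0;               /* same state as last time: stay quiet */
    first_call = 0;
    last_state = newstate;
    printf("rdac checker reports %s\n", names[newstate]);
    messages_emitted++;
    return 1;
}
```

With this pattern a flapping-free path produces exactly one message per state change, regardless of how often the checker runs.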
On Wed, 24 Jun 2009, Chandra Seetharaman wrote:

> We do not want to print these messages as they will be flooding
> your /var/log/messages even when everything is good.

Agreed.

> We want to print these messages only to get User's attention.

Agreed.

> If your concern is that the last message is "down", then we should be
> having a static variable and print up or ghost message only once (when
> we toggle from down to up/ghost).

You do have a static variable, so that messages are only printed when the
status changes. However, because of the missing MSG() calls when the status
is PATH_UP and PATH_GHOST, the string printed is "path down" when it should
be "path up" or "path ghost".

I believe that the patch is good. The MSG() calls will change the message
string, and then the higher layer will notice the status change and print
the message.

Please check the code and correct me if I am wrong.

> chandra
> On Tue, 2009-06-23 at 15:50 -0400, Charlie Brady wrote:
>> It seems to me that this patch is required, to fix a problem with "path
>> down" messages being logged when the path transitions to up (and
>> presumably ghost).
>>
>> [patch snipped]

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Wed, Jun 24, 2009 at 04:19:00PM -0400, Charlie Brady wrote:
> On Wed, 24 Jun 2009, Chandra Seetharaman wrote:
>
>> We do not want to print these messages as they will be flooding
>> your /var/log/messages even when everything is good.
>
> Agreed.
>
>> We want to print these messages only to get User's attention.
>
> Agreed.
>
>> If your concern is that the last message is "down", then we should be
>> having a static variable and print up or ghost message only once (when
>> we toggle from down to up/ghost).
>
> You do have a static variable, so that messages are only printed when the
> status changes. However, because of the missing MSG() calls when the
> status is PATH_UP and PATH_GHOST, the string printed is "path down",
> when it should be "path up" or "path ghost".
>
> I believe that the patch is good. The MSG() calls will change the message
> string, and then the higher layer will notice the status change, and
> print the message.
>
> Please check the code and correct me if I am wrong.

You are correct. Looks fine by me. I'll add this for RHEL 5.5. It will
make it into CentOS after that.

-Ben

>> [patch snipped]

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Wed, 24 Jun 2009, Charlie Brady wrote:

>> If your concern is that the last message is "down", then we should be
>> having a static variable and print up or ghost message only once (when
>> we toggle from down to up/ghost).
>
> You do have a static variable, so that messages are only printed when the
> status changes.

I think I need to retract that. The messages appear to be printed when the
status is not changing, at least for the "path down" messages.

I've patched scsi_dh_rdac.c so that SUN/LCSM100_I is included in
rdac_dev_list[], and restarted iscsi and multipathd. I've now provoked a
path failure via iptables. Here are the logs I see (with unpatched
multipathd) - the duplicate message suppression from the rdac path checker
is not working:

Jun 25 12:35:02 sun4150node1 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jun 25 12:35:08 sun4150node1 multipathd: cannot open /sbin/dasd_id : No such file or directory
Jun 25 12:35:08 sun4150node1 multipathd: cannot open /sbin/gnbd_import : No such file or directory
Jun 25 12:35:08 sun4150node1 multipathd: [copy.c] cannot open /sbin/dasd_id
Jun 25 12:35:08 sun4150node1 multipathd: cannot copy /sbin/dasd_id in ramfs : No such file or directory
Jun 25 12:35:08 sun4150node1 multipathd: [copy.c] cannot open /sbin/gnbd_import
Jun 25 12:35:08 sun4150node1 multipathd: cannot copy /sbin/gnbd_import in ramfs : No such file or directory
Jun 25 12:35:08 sun4150node1 multipathd: mpath0: event checker started
Jun 25 12:35:02 sun4150node1 kernel: device-mapper: multipath: Using scsi_dh module scsi_dh_rdac for failover/failback and device management.
Jun 25 12:46:55 sun4150node1 kernel: ping timeout of 5 secs expired, last rx 1703243, last ping 1708243, now 1713243
Jun 25 12:46:55 sun4150node1 kernel: connection3:0: iscsi: detected conn error (1011)
Jun 25 12:46:55 sun4150node1 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
Jun 25 12:48:55 sun4150node1 kernel: session3: iscsi: session recovery timed out after 120 secs
Jun 25 12:48:55 sun4150node1 kernel: iscsi: cmd 0x12 is not queued (8)
Jun 25 12:48:55 sun4150node1 multipathd: path checkers start up
Jun 25 12:48:55 sun4150node1 multipathd: sdd: rdac checker reports path is down
Jun 25 12:48:55 sun4150node1 kernel: device-mapper: multipath: Failing path 8:48.
Jun 25 12:48:55 sun4150node1 multipathd: checker failed path 8:48 in map mpath0
Jun 25 12:48:55 sun4150node1 multipathd: mpath0: remaining active paths: 1
Jun 25 12:48:55 sun4150node1 multipathd: dm-2: add map (uevent)
Jun 25 12:48:55 sun4150node1 multipathd: dm-2: devmap already registered
Jun 25 12:49:00 sun4150node1 kernel: iscsi: cmd 0x12 is not queued (8)
Jun 25 12:49:00 sun4150node1 multipathd: sdd: rdac checker reports path is down
Jun 25 12:49:35 sun4150node1 last message repeated 7 times
Jun 25 12:49:35 sun4150node1 last message repeated 7 times
Jun 25 12:50:05 sun4150node1 last message repeated 6 times
Jun 25 12:50:05 sun4150node1 kernel: iscsi: cmd 0x28 is not queued (8)
Jun 25 12:50:05 sun4150node1 kernel: sd 14:0:0:0: SCSI error: return code = 0x00010000
Jun 25 12:50:05 sun4150node1 kernel: end_request: I/O error, dev sdd, sector 0
Jun 25 12:50:05 sun4150node1 kernel: Buffer I/O error on device sdd, logical block 0
Jun 25 12:50:05 sun4150node1 kernel: Buffer I/O error on device sdd, logical block 1
Jun 25 12:50:05 sun4150node1 kernel: Buffer I/O error on device sdd, logical block 2
Jun 25 12:50:05 sun4150node1 kernel: Buffer I/O error on device sdd, logical block 3
Jun 25 12:50:05 sun4150node1 kernel: iscsi: cmd 0x28 is not queued (8)

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 25 Jun 2009, Charlie Brady wrote:

> On Wed, 24 Jun 2009, Charlie Brady wrote:
>
>>> If your concern is that the last message is "down", then we should be
>>> having a static variable and print up or ghost message only once (when
>>> we toggle from down to up/ghost).
>>
>> You do have a static variable, so that messages are only printed when
>> the status changes.
>
> I think I need to retract that. The messages appear to be printed when
> the status is not changing, at least for the "path down" messages.
>
> I've patched scsi_dh_rdac.c so that SUN/LCSM100_I is included in
> rdac_dev_list[], and restarted iscsi and multipathd. I've now provoked a
> path failure via iptables. Here are the logs I see (with unpatched
> multipathd) - the duplicate message suppression from the rdac path
> checker is not working:

OTOH, on path recovery, the wrong message is displayed ("path down"
instead of "path up") but the message is not repeated:

Jun 25 12:57:51 sun4150node1 last message repeated 9 times
Jun 25 12:57:54 sun4150node1 iscsid: received iferror -38
Jun 25 12:57:54 sun4150node1 last message repeated 2 times
Jun 25 12:57:54 sun4150node1 iscsid: connection3:0 is operational after recovery (44 attempts)
Jun 25 12:57:56 sun4150node1 multipathd: sdd: rdac checker reports path is down
Jun 25 12:57:56 sun4150node1 multipathd: 8:48: reinstated
Jun 25 12:57:56 sun4150node1 multipathd: mpath0: remaining active paths: 2
Jun 25 12:57:56 sun4150node1 multipathd: dm-2: add map (uevent)
Jun 25 12:57:56 sun4150node1 multipathd: dm-2: devmap already registered

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 25 Jun 2009, Charlie Brady wrote:

> On Thu, 25 Jun 2009, Charlie Brady wrote:
>
>> I think I need to retract that. The messages appear to be printed when
>> the status is not changing, at least for the "path down" messages.
>>
>> I've patched scsi_dh_rdac.c so that SUN/LCSM100_I is included in
>> rdac_dev_list[], and restarted iscsi and multipathd. I've now provoked
>> a path failure via iptables. Here are the logs I see (with unpatched
>> multipathd) - the duplicate message suppression from the rdac path
>> checker is not working:
>
> OTOH, on path recovery, the wrong message is displayed ("path down"
> instead of "path up") but the message is not repeated:

From multipathd/main.c, there are three conditions where the messages are
printed:

	...
	if (newstate != pp->state) {
		int oldstate = pp->state;
		pp->state = newstate;
		LOG_MSG(1, checker_message(&pp->checker));
		...
	}
	else if (newstate == PATH_UP || newstate == PATH_GHOST) {
		LOG_MSG(4, checker_message(&pp->checker));
	...
	else if (newstate == PATH_DOWN)
		LOG_MSG(2, checker_message(&pp->checker));
	...

I think only the first might be required, and only the first will be seen
if logging verbosity < 2 (default is 2).

So there are two path checker problems I can see (in
multipath-tools-0.4.7.rhel5.13):

- the rdac checker leaves the message set to "path down" when the path
  transitions back to up or ghost.

- multipathd prints the message on every PATH_DOWN except the first when
  verbosity is >= 2, and on every PATH_UP or PATH_GHOST except the first
  when verbosity is >= 4.

---
Charlie

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
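[Editor's note: the three conditions described above can be modeled as a small stand-alone program, which makes the repeated "path down" messages at the default verbosity easy to reproduce. This is a simplified sketch, not the multipathd source; `log_msg`, `check_iteration`, `verbosity`, and `printed` are invented for the example, and LOG_MSG is modeled simply as "print if verbosity >= level".]

```c
#include <stdio.h>

/* Simplified model of the quoted multipathd/main.c logic. The checker
 * message string is whatever the path checker last set - which is why the
 * unpatched rdac checker keeps reporting "path down" after recovery. */
enum { PATH_DOWN, PATH_UP, PATH_GHOST };

static int verbosity = 2;       /* multipathd default, per the thread */
static int printed = 0;         /* count of emitted messages, for testing */

static void log_msg(int level, const char *msg)
{
    if (verbosity >= level) {
        printf("%s\n", msg);
        printed++;
    }
}

/* One checker iteration: prev models pp->state, newstate the checker
 * result. Returns the new state so callers can thread it through. */
int check_iteration(int prev, int newstate, const char *checker_msg)
{
    if (newstate != prev)
        log_msg(1, checker_msg);        /* state transition: verbosity >= 1 */
    else if (newstate == PATH_UP || newstate == PATH_GHOST)
        log_msg(4, checker_msg);        /* repeated up/ghost: verbosity >= 4 */
    else if (newstate == PATH_DOWN)
        log_msg(2, checker_msg);        /* repeated down: verbosity >= 2 */
    return newstate;
}
```

Running this with verbosity = 2 shows exactly the asymmetry Charlie observes: repeated down states print on every iteration, while repeated up/ghost states are silent after the transition.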
Yes, I agree... Your patch should work as you expect.

Thanks

chandra

On Wed, 2009-06-24 at 16:19 -0400, Charlie Brady wrote:
> You do have a static variable, so that messages are only printed when the
> status changes. However, because of the missing MSG() calls when the
> status is PATH_UP and PATH_GHOST, the string printed is "path down",
> when it should be "path up" or "path ghost".
>
> I believe that the patch is good. The MSG() calls will change the message
> string, and then the higher layer will notice the status change, and
> print the message.
>
> [patch snipped]

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 25 Jun 2009, Chandra Seetharaman wrote:
> Yes, I agree... Your patch should work as you expect.
Thanks - bmarzins@sourceware.org has already committed the fix.
Please see subsequent messages (including patch) re excessive path checker
message logging.
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, Jun 25, 2009 at 01:28:40PM -0400, Charlie Brady wrote:
> OTOH, on path recovery, the wrong message is displayed ("path down"
> instead of "path up") but the message is not repeated:
>
> From multipathd/main.c, there are three conditions where the messages
> are printed:
>
> 	...
> 	if (newstate != pp->state) {
> 		int oldstate = pp->state;
> 		pp->state = newstate;
> 		LOG_MSG(1, checker_message(&pp->checker));
> 		...
> 	}
> 	else if (newstate == PATH_UP || newstate == PATH_GHOST) {
> 		LOG_MSG(4, checker_message(&pp->checker));
> 	...
> 	else if (newstate == PATH_DOWN)
> 		LOG_MSG(2, checker_message(&pp->checker));
> 	...
>
> I think only the first might be required, and only the first will be
> seen if logging verbosity < 2 (default is 2).
>
> So there are two path checker problems I can see (in
> multipath-tools-0.4.7.rhel5.13):
>
> - the rdac checker leaves the message set to "path down" when path
>   transitions back to up or ghost.

And your patch fixes that.

> - multipathd prints the message on every PATH_DOWN except the first when
>   verbosity is >= 2, and on every PATH_UP or PATH_GHOST except the first
>   when verbosity is >= 4.

Why is this a problem? On verbosity = 1, you only get messages when you
change states. On verbosity = 2, you additionally get repeated messages
for failed paths, which are often something that the sysadmin should be
doing something about. On verbosity = 4, you get a message whenever the
path checker runs, which is helpful for debugging, which is the only time
you would ever want verbosity = 4.

This seems sensible to me. If you don't like the repeated path down
messages, you can change your verbosity level to 1. Right now that
involves editing the init script, but in future releases, you can change
it in /etc/multipath.conf.

-Ben

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
--- libcheckers/rdac.c.orig	2009-06-23 15:38:05.371422000 -0400
+++ libcheckers/rdac.c	2009-06-23 15:39:28.020843000 -0400
@@ -105,5 +105,12 @@
 		return PATH_DOWN;
 	}
 
-	return ((inq.avtcvp & 0x1) ? PATH_UP : PATH_GHOST);
+	if (inq.avtcvp & 0x1) {
+		MSG(c, MSG_RDAC_UP);
+		return PATH_UP;
+	}
+	else {
+		MSG(c, MSG_RDAC_GHOST);
+		return PATH_GHOST;
+	}
 }
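[Editor's note: the decision the patch changes can be exercised in isolation with the sketch below. It is an illustrative model, not the checker itself: `rdac_classify` is a hypothetical name, the message strings are taken from the log output quoted earlier in the thread rather than from the MSG() string table, and a plain buffer stands in for the MSG() macro.]

```c
#include <string.h>

/* Stand-alone model of the patched tail of the rdac checker: the low bit
 * of inq.avtcvp selects UP vs GHOST, and - the point of the patch - the
 * checker message is updated in both branches, so a buffer that previously
 * held "path is down" no longer leaks through after recovery. */
enum { PATH_DOWN, PATH_UP, PATH_GHOST };

int rdac_classify(unsigned char avtcvp, char *msg, size_t len)
{
    if (avtcvp & 0x1) {
        strncpy(msg, "rdac checker reports path is up", len);
        return PATH_UP;
    } else {
        strncpy(msg, "rdac checker reports path is ghost", len);
        return PATH_GHOST;
    }
}
```

The key property to check is that a buffer pre-loaded with the "down" message is overwritten on both the UP and GHOST branches, which is exactly what the missing MSG() calls failed to do before the patch.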