Message ID | 20180209080225.5137-3-qi.fuli@jp.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote: > This patch is used to add $ndctl create-monitor command, by which users can > create a new monitor. Users can select the DIMMS to be monitored by using > [--dimm] [--bus] [--region] [--namespace] options. The notifications can > be outputed to a special file or syslog by using [--output] option, the > special file will be placed under /var/log/ndctl. A name is also required for > a monitor,so users can destroy the monitor by the name. When a monitor is > created successfully, a file with same name will be created under > /var/ndctl/monitor. > Example: > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log Hi Qi, This is getting closer to where I want to see this go, but still some architecture details to incorporate. I mentioned on the cover letter that systemd can handle starting, stopping, and show the status of the monitor. The other detail to incorporate is that monitor events can come DIMMs, but also namespaces, regions, and the bus. The event list I have collected to date is: dimm-spares-remaining dimm-media-temperature dimm-controller-temperature dimm-health-state dimm-unclean-shutdown dimm-detected namespace-media-error namespace-detected region-media-error region-detected bus-media-error bus-address-range-scrub-complete bus-detected ...and I think all of those should be separate options, probably something like the following, but I'd Vishal to comment if this scheme can be handled with the bash tab-completion implementation: ndctl monitor --dimm-events=spares-remaining,media-temperature --namespace-events=all --regions-events --bus=ACPI.NFIT ...where an empty --<object>-events option is equivalent to --<object>-events=all. Also, similar to "ndctl list" specifying specific buses, namespaces, etc causes the monitor to filter the objects based on those properties. Since "ndctl list" already has this filtering implemented I'd like to see it refactored and shared between the 2 implementations rather than duplicated as is done in this patch. In other words rework cmd_list() into a generic nvdimm object walking routine with callback functions to 'list' or 'monitor' a given object that matches the filter. Let me know if the above makes sense. I'm thinking the 'ndctl list' refactoring might be something I need to handle.
On Sun, 2018-02-11 at 12:48 -0800, Dan Williams wrote: > On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> > wrote: > > This patch is used to add $ndctl create-monitor command, by which > > users can > > create a new monitor. Users can select the DIMMS to be monitored by > > using > > [--dimm] [--bus] [--region] [--namespace] options. The > > notifications can > > be outputed to a special file or syslog by using [--output] option, > > the > > special file will be placed under /var/log/ndctl. A name is also > > required for > > a monitor,so users can destroy the monitor by the name. When a > > monitor is > > created successfully, a file with same name will be created under > > /var/ndctl/monitor. > > Example: > > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output > > m_nmem.log > > Hi Qi, > > This is getting closer to where I want to see this go, but still some > architecture details to incorporate. I mentioned on the cover letter > that systemd can handle starting, stopping, and show the status of > the > monitor. The other detail to incorporate is that monitor events can > come DIMMs, but also namespaces, regions, and the bus. > > The event list I have collected to date is: > > dimm-spares-remaining > dimm-media-temperature > dimm-controller-temperature > dimm-health-state > dimm-unclean-shutdown > dimm-detected > namespace-media-error > namespace-detected > region-media-error > region-detected > bus-media-error > bus-address-range-scrub-complete > bus-detected > > ...and I think all of those should be separate options, probably > something like the following, but I'd Vishal to comment if this > scheme > can be handled with the bash tab-completion implementation: > > ndctl monitor --dimm-events=spares-remaining,media-temperature > --namespace-events=all --regions-events --bus=ACPI.NFIT Yes I think we should be able to extend bash completion for a comma separated list of arguments. > > ...where an empty --<object>-events option is equivalent to > --<object>-events=all. Also, similar to "ndctl list" specifying > specific buses, namespaces, etc causes the monitor to filter the > objects based on those properties. > > Since "ndctl list" already has this filtering implemented I'd like to > see it refactored and shared between the 2 implementations rather > than > duplicated as is done in this patch. In other words rework cmd_list() > into a generic nvdimm object walking routine with callback functions > to 'list' or 'monitor' a given object that matches the filter. > > Let me know if the above makes sense. I'm thinking the 'ndctl list' > refactoring might be something I need to handle.
Hi, > On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote: > > This patch is used to add $ndctl create-monitor command, by which users can > > create a new monitor. Users can select the DIMMS to be monitored by using > > [--dimm] [--bus] [--region] [--namespace] options. The notifications can > > be outputed to a special file or syslog by using [--output] option, the > > special file will be placed under /var/log/ndctl. A name is also required for > > a monitor,so users can destroy the monitor by the name. When a monitor is > > created successfully, a file with same name will be created under > > /var/ndctl/monitor. > > Example: > > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log > > Hi Qi, > > This is getting closer to where I want to see this go, but still some > architecture details to incorporate. I mentioned on the cover letter > that systemd can handle starting, stopping, and show the status of the > monitor. The other detail to incorporate is that monitor events can > come DIMMs, but also namespaces, regions, and the bus. > > The event list I have collected to date is: > > dimm-spares-remaining > dimm-media-temperature > dimm-controller-temperature > dimm-health-state > dimm-unclean-shutdown > dimm-detected > namespace-media-error > namespace-detected > region-media-error > region-detected > bus-media-error > bus-address-range-scrub-complete > bus-detected > > ...and I think all of those should be separate options, probably > something like the following, but I'd Vishal to comment if this scheme > can be handled with the bash tab-completion implementation: > > ndctl monitor --dimm-events=spares-remaining,media-temperature > --namespace-events=all --regions-events --bus=ACPI.NFIT > > ...where an empty --<object>-events option is equivalent to > --<object>-events=all. Also, similar to "ndctl list" specifying > specific buses, namespaces, etc causes the monitor to filter the > objects based on those properties. Hmmmm.... Currently, I'm confusing what features/options are required for nvdimm daemon. For example, what is use-case of "--bus=ACPI.NFIT"? For normal administorator of a server, what he/she's interest is "need to replace nvdimm module or not", and "need to backup/restore on the nvdimm module or not". For normal programs, they just use device name or directory/filename of the filesystem on the nvdimm. To backup thier data, he/she need to solve relationship between nvdimm modules and device name (/dev/pmem* or /dev/dax*). So, IMHO, I suppose "namespace(device name) specifying (or all namespace)" is enough the following events which requires replace the nvdimm module. - spare-remaining - helth-state - media-error And I'm not sure what is use-case of specifying region, bus, and dimm on these events. In addition, could you tell me what administrator/program can do on the following events? What nvdimm daemon should do on each event? - media-temperature - controller-temperature - address-range-scrub-complete - unclean-shutdown Thanks, > > Since "ndctl list" already has this filtering implemented I'd like to > see it refactored and shared between the 2 implementations rather than > duplicated as is done in this patch. In other words rework cmd_list() > into a generic nvdimm object walking routine with callback functions > to 'list' or 'monitor' a given object that matches the filter. > > Let me know if the above makes sense. I'm thinking the 'ndctl list' > refactoring might be something I need to handle. > _______________________________________________ > Linux-nvdimm mailing list > Linux-nvdimm@lists.01.org > https://lists.01.org/mailman/listinfo/linux-nvdimm
> Hi, > > > On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote: > > > This patch is used to add $ndctl create-monitor command, by which users can > > > create a new monitor. Users can select the DIMMS to be monitored by using > > > [--dimm] [--bus] [--region] [--namespace] options. The notifications can > > > be outputed to a special file or syslog by using [--output] option, the > > > special file will be placed under /var/log/ndctl. A name is also required for > > > a monitor,so users can destroy the monitor by the name. When a monitor is > > > created successfully, a file with same name will be created under > > > /var/ndctl/monitor. > > > Example: > > > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log > > > > Hi Qi, > > > > This is getting closer to where I want to see this go, but still some > > architecture details to incorporate. I mentioned on the cover letter > > that systemd can handle starting, stopping, and show the status of the > > monitor. The other detail to incorporate is that monitor events can > > come DIMMs, but also namespaces, regions, and the bus. > > > > The event list I have collected to date is: > > > > dimm-spares-remaining > > dimm-media-temperature > > dimm-controller-temperature > > dimm-health-state > > dimm-unclean-shutdown > > dimm-detected > > namespace-media-error > > namespace-detected > > region-media-error > > region-detected > > bus-media-error > > bus-address-range-scrub-complete > > bus-detected > > > > ...and I think all of those should be separate options, probably > > something like the following, but I'd Vishal to comment if this scheme > > can be handled with the bash tab-completion implementation: > > > > ndctl monitor --dimm-events=spares-remaining,media-temperature > > --namespace-events=all --regions-events --bus=ACPI.NFIT > > > > ...where an empty --<object>-events option is equivalent to > > --<object>-events=all. Also, similar to "ndctl list" specifying > > specific buses, namespaces, etc causes the monitor to filter the > > objects based on those properties. > > Hmmmm.... > > Currently, I'm confusing what features/options are required for nvdimm daemon. > For example, what is use-case of "--bus=ACPI.NFIT"? > > For normal administorator of a server, what he/she's interest is > "need to replace nvdimm module or not", and "need to backup/restore > on the nvdimm module or not". > > For normal programs, they just use device name or directory/filename of > the filesystem on the nvdimm. > > To backup thier data, he/she need to solve relationship between > nvdimm modules and device name (/dev/pmem* or /dev/dax*). > > So, IMHO, I suppose "namespace(device name) specifying (or all namespace)" > is enough the following events which requires replace the nvdimm module. > > - spare-remaining > - helth-state > - media-error > > And I'm not sure what is use-case of specifying region, bus, and dimm > on these events. > > > In addition, could you tell me what administrator/program can do > on the following events? What nvdimm daemon should do on each event? > > - media-temperature > - controller-temperature > - address-range-scrub-complete > - unclean-shutdown > I would like to ask one more thing. What should nvdimm daemon do on detected events of bus/region/namespace/dimm ? IIRC, udev handles such hotplug events. What is relationship/roles between nvdimm daemon and udev? Bye, > > Thanks, > > > > > Since "ndctl list" already has this filtering implemented I'd like to > > see it refactored and shared between the 2 implementations rather than > > duplicated as is done in this patch. In other words rework cmd_list() > > into a generic nvdimm object walking routine with callback functions > > to 'list' or 'monitor' a given object that matches the filter. > > > > Let me know if the above makes sense. I'm thinking the 'ndctl list' > > refactoring might be something I need to handle. > > _______________________________________________ > > Linux-nvdimm mailing list > > Linux-nvdimm@lists.01.org > > https://lists.01.org/mailman/listinfo/linux-nvdimm >
>> Hi Qi, >> >> This is getting closer to where I want to see this go, but still some >> architecture details to incorporate. I mentioned on the cover letter >> that systemd can handle starting, stopping, and show the status of >> the >> monitor. The other detail to incorporate is that monitor events can >> come DIMMs, but also namespaces, regions, and the bus. >> >> The event list I have collected to date is: >> >> dimm-spares-remaining >> dimm-media-temperature >> dimm-controller-temperature >> dimm-health-state >> dimm-unclean-shutdown >> dimm-detected >> namespace-media-error >> namespace-detected >> region-media-error >> region-detected >> bus-media-error >> bus-address-range-scrub-complete >> bus-detected By referring to dimm_listen() in smart-listen.c, I understand how to monitor events come DIMMs. I have checked the ndctl, but I could not find how to monitor events come namespaces, region and bus. Could you please mention more? >> ...and I think all of those should be separate options, probably >> something like the following, but I'd Vishal to comment if this >> scheme >> can be handled with the bash tab-completion implementation: >> >> ndctl monitor --dimm-events=spares-remaining,media-temperature >> --namespace-events=all --regions-events --bus=ACPI.NFIT > Yes I think we should be able to extend bash completion for a comma > separated list of arguments. > >> ...where an empty --<object>-events option is equivalent to >> --<object>-events=all. Also, similar to "ndctl list" specifying >> specific buses, namespaces, etc causes the monitor to filter the >> objects based on those properties. >> >> Since "ndctl list" already has this filtering implemented I'd like to >> see it refactored and shared between the 2 implementations rather >> than >> duplicated as is done in this patch. In other words rework cmd_list() >> into a generic nvdimm object walking routine with callback functions >> to 'list' or 'monitor' a given object that matches the filter. >> >> Let me know if the above makes sense. I'm thinking the 'ndctl list' >> refactoring might be something I need to handle. Yes, I agree with refactoring ndctl_list and add a structure shared between ndctl list and ndctl monitor. But I prefer to use tokens rather than callback functions. I also think the option should support multiple parameters likes: ndclt monitor --dimm nmem1,nmem2 --region region1,region2 The parameters from the same option can be stored in a list_node. #define filter_node(field) \ struct filter_##field##_node { \ const char *name; \ struct list_node list; \ }; filter_node(bus); filter_node(dimm); filter_node(region); filter_node(namespace); struct ndctl_filter { struct list_head filter_bus; struct list_head filter_dimm; struct list_head filter_region; struct list_head filter_namespace; } nf; Make the tokens like follows: #define list_util_field_filter(field) \ list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ if (util_##field##_filter(field, filter_##field##_node->name)) { \ flag = true; \ break; \ } \ } #define list_util_bus_filter_by_field(field) \ list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ if (util_bus_filter_by_##field(bus, filter_##field##_node->name)) { \ flag = true; \ break; \ } \ } #define list_util_dimm_filter_by_field(field) \ list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ if (util_dimm_filter_by_##field(dimm, filter_##field##_node->name)) { \ flag = true; \ break; \ } \ } Here is how to use the tokens: ndctl_bus_foreach(ctx, bus) { struct filter_bus_node *filter_bus_node; list_util_field_filter(bus); if (!flag) { struct filter_region_node *filter_region_node; list_util_bus_filter_by_field(region); } if (!flag) { struct filter_namespace_node *filter_namespace_node; list_util_bus_filter_by_field(namespace); } if (!flag) continue; ndctl_dimm_foreach(bus, dimm) { flag = false; struct filter_dimm_node *filter_dimm_node; list_util_field_filter(dimm); if (!flag) { struct filter_region_node *filter_region_node; list_util_dimm_filter_by_field(region); } if (!flag) { struct filter_namespace_node *filter_namespace_node; list_util_dimm_filter_by_field(namespace); } if (!flag) continue; } } To make sure the util_ndctl_filter() goes right, I would like you give some comments?
On Tue, Feb 13, 2018 at 1:58 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote: > Hi, > >> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote: >> > This patch is used to add $ndctl create-monitor command, by which users can >> > create a new monitor. Users can select the DIMMS to be monitored by using >> > [--dimm] [--bus] [--region] [--namespace] options. The notifications can >> > be outputed to a special file or syslog by using [--output] option, the >> > special file will be placed under /var/log/ndctl. A name is also required for >> > a monitor,so users can destroy the monitor by the name. When a monitor is >> > created successfully, a file with same name will be created under >> > /var/ndctl/monitor. >> > Example: >> > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log >> >> Hi Qi, >> >> This is getting closer to where I want to see this go, but still some >> architecture details to incorporate. I mentioned on the cover letter >> that systemd can handle starting, stopping, and show the status of the >> monitor. The other detail to incorporate is that monitor events can >> come DIMMs, but also namespaces, regions, and the bus. >> >> The event list I have collected to date is: >> >> dimm-spares-remaining >> dimm-media-temperature >> dimm-controller-temperature >> dimm-health-state >> dimm-unclean-shutdown >> dimm-detected >> namespace-media-error >> namespace-detected >> region-media-error >> region-detected >> bus-media-error >> bus-address-range-scrub-complete >> bus-detected >> >> ...and I think all of those should be separate options, probably >> something like the following, but I'd Vishal to comment if this scheme >> can be handled with the bash tab-completion implementation: >> >> ndctl monitor --dimm-events=spares-remaining,media-temperature >> --namespace-events=all --regions-events --bus=ACPI.NFIT >> >> ...where an empty --<object>-events option is equivalent to >> --<object>-events=all. Also, similar to "ndctl list" specifying >> specific buses, namespaces, etc causes the monitor to filter the >> objects based on those properties. > > Hmmmm.... > > Currently, I'm confusing what features/options are required for nvdimm daemon. > For example, what is use-case of "--bus=ACPI.NFIT"? Other platforms may support different bus types, there are also proposals like this one for custom NVDIMM buses [1]. The other use case is allowing the user to monitor for any media error on the bus, or the completion of ARS. > For normal administorator of a server, what he/she's interest is > "need to replace nvdimm module or not", and "need to backup/restore > on the nvdimm module or not". I think that's only part of it. Data center operations want to know more than just when it is time to replace a module, they want to collect almost any data that the operating system can provide about the platform. > For normal programs, they just use device name or directory/filename of > the filesystem on the nvdimm. > > To backup thier data, he/she need to solve relationship between > nvdimm modules and device name (/dev/pmem* or /dev/dax*). > > So, IMHO, I suppose "namespace(device name) specifying (or all namespace)" > is enough the following events which requires replace the nvdimm module. > > - spare-remaining > - helth-state > - media-error > > And I'm not sure what is use-case of specifying region, bus, and dimm > on these events. The reason for supporting those other event sources in my mind is having a unified interface for tracking topology, health/status, and error events. > > In addition, could you tell me what administrator/program can do > on the following events? What nvdimm daemon should do on each event? > > - media-temperature > - controller-temperature > - address-range-scrub-complete > - unclean-shutdown Media temperature and controller temperature alarms can signal to data center operations that the server is getting too hot, and might need remediation, perhaps a specific fan has failed and replacing that fan becomes a high priority when these alarms start firing. Address-range-scrub complete might be a signal that the server may get a boost in throughput since the overhead of the background operation is now complete. ARS may continue to run long after the server has booted, so the end of that event may be important to server loading decisions. Unclean shutdown notification allows events that occurred at the last shutdown to be recorded at the next boot. Otherwise an operator would need to write a separate tool to go retrieve this information. Having it all in one place reduces the number of tools / data-sources that operations infrastructure needs to consider. [1]: https://lists.01.org/pipermail/linux-nvdimm/2018-January/013926.html
On Tue, Feb 13, 2018 at 2:12 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote: >> Hi, [..] > I would like to ask one more thing. > > What should nvdimm daemon do on detected events of bus/region/namespace/dimm ? > IIRC, udev handles such hotplug events. > What is relationship/roles between nvdimm daemon and udev? UDEV handles the hotplug events, but I imagine device topology changes are of interest to whatever agent is going to be consuming monitor events, or monitoring changes that an administrator makes to a machine. My inspiration for including these events in the monitor comes from the monitor functionality in mdadm. It has events like: DeviceDisappeared RebuildStarted RebuildNN RebuildFinished Fail FailSpare SpareActive NewArray DegradedArray MoveSpare SparesMissing To be clear, I'm not suggesting that all of these events need to be supported in the first implementation of the monitor, but I want the architecture and the user interface to be ready to support events from any device in the topology.
On Wed, Feb 14, 2018 at 9:51 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote: > >>> Hi Qi, >>> >>> This is getting closer to where I want to see this go, but still some >>> architecture details to incorporate. I mentioned on the cover letter >>> that systemd can handle starting, stopping, and show the status of >>> the >>> monitor. The other detail to incorporate is that monitor events can >>> come DIMMs, but also namespaces, regions, and the bus. >>> >>> The event list I have collected to date is: >>> >>> dimm-spares-remaining >>> dimm-media-temperature >>> dimm-controller-temperature >>> dimm-health-state >>> dimm-unclean-shutdown >>> dimm-detected >>> namespace-media-error >>> namespace-detected >>> region-media-error >>> region-detected >>> bus-media-error >>> bus-address-range-scrub-complete >>> bus-detected > > By referring to dimm_listen() in smart-listen.c, I understand how to monitor > events > come DIMMs. I have checked the ndctl, but I could not find how to monitor > events > come namespaces, region and bus. Could you please mention more? >>> >>> ...and I think all of those should be separate options, probably >>> something like the following, but I'd Vishal to comment if this >>> scheme >>> can be handled with the bash tab-completion implementation: >>> >>> ndctl monitor --dimm-events=spares-remaining,media-temperature >>> --namespace-events=all --regions-events --bus=ACPI.NFIT >> >> Yes I think we should be able to extend bash completion for a comma >> separated list of arguments. >> >>> ...where an empty --<object>-events option is equivalent to >>> --<object>-events=all. Also, similar to "ndctl list" specifying >>> specific buses, namespaces, etc causes the monitor to filter the >>> objects based on those properties. >>> >>> Since "ndctl list" already has this filtering implemented I'd like to >>> see it refactored and shared between the 2 implementations rather >>> than >>> duplicated as is done in this patch. In other words rework cmd_list() >>> into a generic nvdimm object walking routine with callback functions >>> to 'list' or 'monitor' a given object that matches the filter. >>> >>> Let me know if the above makes sense. I'm thinking the 'ndctl list' >>> refactoring might be something I need to handle. > > Yes, I agree with refactoring ndctl_list and add a structure shared between > ndctl list and ndctl monitor. > But I prefer to use tokens rather than callback functions. > > I also think the option should support multiple parameters likes: > ndclt monitor --dimm nmem1,nmem2 --region region1,region2 > > The parameters from the same option can be stored in a list_node. > > #define filter_node(field) \ > struct filter_##field##_node { \ > const char *name; \ > struct list_node list; \ > }; > > filter_node(bus); > filter_node(dimm); > filter_node(region); > filter_node(namespace); > > struct ndctl_filter { > struct list_head filter_bus; > struct list_head filter_dimm; > struct list_head filter_region; > struct list_head filter_namespace; > } nf; > > Make the tokens like follows: > > #define list_util_field_filter(field) \ > list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ > if (util_##field##_filter(field, filter_##field##_node->name)) { \ > flag = true; \ > break; \ > } \ > } > > #define list_util_bus_filter_by_field(field) \ > list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ > if (util_bus_filter_by_##field(bus, filter_##field##_node->name)) { > \ > flag = true; \ > break; \ > } \ > } > > #define list_util_dimm_filter_by_field(field) \ > list_for_each(&nf.filter_##field, filter_##field##_node, list) { \ > if (util_dimm_filter_by_##field(dimm, filter_##field##_node->name)) > { \ > flag = true; \ > break; \ > } \ > } > > Here is how to use the tokens: > > ndctl_bus_foreach(ctx, bus) { > struct filter_bus_node *filter_bus_node; > list_util_field_filter(bus); > if (!flag) { > struct filter_region_node *filter_region_node; > list_util_bus_filter_by_field(region); > } > if (!flag) { > struct filter_namespace_node *filter_namespace_node; > list_util_bus_filter_by_field(namespace); > } > if (!flag) > continue; > ndctl_dimm_foreach(bus, dimm) { > flag = false; > struct filter_dimm_node *filter_dimm_node; > list_util_field_filter(dimm); > if (!flag) { > struct filter_region_node *filter_region_node; > list_util_dimm_filter_by_field(region); > } > if (!flag) { > struct filter_namespace_node *filter_namespace_node; > list_util_dimm_filter_by_field(namespace); > } > if (!flag) > continue; > } > } > > To make sure the util_ndctl_filter() goes right, I would like you give some > comments? I think the multi-layer macros makes the code difficult to read, and I'm not seeing how it handles the case where we're building up a combined json record representing the entire bus, i.e. "ndctl list -BRND". I'd rather re-write "ndctl list" in terms of this filter structure: struct ndctl_filter_ops { int (*match_bus)(struct ndctl_bus *bus, void **ctx, unsigned long flags); int (*match_dimm)(struct ndctl_bus *dimm, void *ctx, unsigned long flags); int (*match_region)(struct ndctl_bus *region, void **ctx, unsigned long flags); int (*match_namespace)(struct ndctl_bus *namespace, void *ctx, unsigned long flags); }; ...where the match routine is only populated if that device is selected and it passes filter specific context via the "ctx" variable. In the case of "ndctl list" that might be the "jbus" container json object being returned from ->match_bus() and then passed to ->match_dimm() to hold "jdimm" instances. Note that ->match_dimm() and ->match_namespace() only take context as input because those objects don't have any children.
> On Tue, Feb 13, 2018 at 2:12 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote: > >> Hi, > [..] > > I would like to ask one more thing. > > > > What should nvdimm daemon do on detected events of bus/region/namespace/dimm ? > > IIRC, udev handles such hotplug events. > > What is relationship/roles between nvdimm daemon and udev? > > UDEV handles the hotplug events, but I imagine device topology changes > are of interest to whatever agent is going to be consuming monitor > events, or monitoring changes that an administrator makes to a > machine. My inspiration for including these events in the monitor > comes from the monitor functionality in mdadm. It has events like: > > DeviceDisappeared > RebuildStarted > RebuildNN > RebuildFinished > Fail > FailSpare > SpareActive > NewArray > DegradedArray > MoveSpare > SparesMissing > > To be clear, I'm not suggesting that all of these events need to be > supported in the first implementation of the monitor, but I want the > architecture and the user interface to be ready to support events from > any device in the topology. > Ok, I got it. Thank you for your clarification. Bye, --- Yasunori Goto
diff --git a/builtin.h b/builtin.h index 5e1b7ef..850f6a8 100644 --- a/builtin.h +++ b/builtin.h @@ -36,6 +36,7 @@ int cmd_write_labels(int argc, const char **argv, void *ctx); int cmd_init_labels(int argc, const char **argv, void *ctx); int cmd_check_labels(int argc, const char **argv, void *ctx); int cmd_inject_error(int argc, const char **argv, void *ctx); +int cmd_create_monitor(int argc, const char **argv, void *ctx); int cmd_list(int argc, const char **argv, void *ctx); #ifdef ENABLE_TEST int cmd_test(int argc, const char **argv, void *ctx); diff --git a/configure.ac b/configure.ac index 70ba360..e859e04 100644 --- a/configure.ac +++ b/configure.ac @@ -160,6 +160,9 @@ AC_CONFIG_FILES([ Documentation/daxctl/Makefile ]) +AC_CONFIG_COMMANDS([monitorlogdir], [$MKDIR_P /var/log/ndctl]) +AC_CONFIG_COMMANDS([monitorprocdir], [$MKDIR_P /var/ndctl/monitor]) + AC_OUTPUT AC_MSG_RESULT([ $PACKAGE $VERSION diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am index 6677607..d9a484d 100644 --- a/ndctl/Makefile.am +++ b/ndctl/Makefile.am @@ -13,7 +13,8 @@ ndctl_SOURCES = ndctl.c \ test.c \ ../util/json.c \ util/json-smart.c \ - inject-error.c + inject-error.c \ + monitor.c if ENABLE_DESTRUCTIVE ndctl_SOURCES += ../test/blk_namespaces.c \ diff --git a/ndctl/monitor.c b/ndctl/monitor.c new file mode 100644 index 0000000..cf1cd6e --- /dev/null +++ b/ndctl/monitor.c @@ -0,0 +1,342 @@ +/* + * Copyright (c) 2018, FUJITSU LIMITED. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU Lesser General Public License, + * version 2.1, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT ANY + * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for + * more details. + */ +#include <stdio.h> +#include <json-c/json.h> +#include <signal.h> +#include <dirent.h> +#include <util/parse-options.h> +#include <util/log.h> +#include <util/filter.h> +#include <util/json.h> +#include <ndctl/lib/private.h> +#include <ndctl/libndctl.h> +#include <sys/stat.h> +#define NUM_MAX_DIMM 1024 +#define BUF_SIZE 4096 + +struct monitor_dimm { + struct ndctl_dimm *dimm; + const char *devname; + int health_eventfd; +}; + +static struct parameter { + const char *bus; + const char *region; + const char *dimm; + const char *namespace; + const char *event; + const char *output; + const char *monitor; + bool all; +} param; + +static const char *proc_path = "/var/ndctl/monitor/"; + +static char *get_full_path_filename(const char *path, const char *name) +{ + char *filename; + int len = strlen(path) + strlen (name) +1; + filename = (char *) malloc(len); + if (!filename) + return NULL; + filename[0] = '\0'; + strcpy(filename, path); + strcat(filename, name); + return filename; +} + +static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file, + int line, const char *fn, const char *format, va_list args) +{ + char *buf; + buf = (char *)malloc(BUF_SIZE); + if (!buf) { + syslog(LOG_ERR, "could not get memory for log_syslog\n"); + exit(EXIT_FAILURE); + } + vsnprintf(buf, BUF_SIZE, format, args); + syslog(priority, "%s", buf); + free(buf); +} + +static void log_output(struct ndctl_ctx *ctx, int priority, const char *file, + int line, const char *fn, const char *format, va_list args) +{ + FILE *f; + char *filename, *buf; + const char *log_path = "/var/log/ndctl/"; + filename = get_full_path_filename(log_path, param.output); + + f = fopen(filename, "a+"); + if (!f) { + syslog(LOG_ERR, "open %s failed\n", filename); + exit(EXIT_FAILURE); + } + + buf = (char *)malloc(BUF_SIZE); + if (!buf) { + syslog(LOG_ERR, "could not get memory for log_output\n"); + exit(EXIT_FAILURE); + } + vsnprintf(buf, BUF_SIZE, format, args); + fprintf(f, "%s\n", buf); + free(buf); + fclose(f); +} + +static int notify_json_msg(struct ndctl_ctx *ctx, struct monitor_dimm *m_dimm) +{ + time_t c_time; + char date[32]; + struct json_object *jmsg, *jdatetime, *jpid, *jdimm, *jhealth; + + jmsg = json_object_new_object(); + + c_time = time(NULL); + strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time)); + jdatetime = json_object_new_string(date); + json_object_object_add(jmsg, "datetime", jdatetime); + + jpid = json_object_new_int((int)getpid()); + json_object_object_add(jmsg, "pid", jpid); + + jdimm = util_dimm_to_json(m_dimm->dimm, 0); + json_object_object_add(jmsg, "dimm", jdimm); + + jhealth = util_dimm_health_to_json(m_dimm->dimm); + if (jhealth) + json_object_object_add(jdimm, "health", jhealth); + + notice(ctx, "%s", + json_object_to_json_string_ext(jmsg, JSON_C_TO_STRING_PLAIN)); + return 0; +} + +#define add_param_to_json(field) \ +if (param.field) { \ + j##field = json_object_new_string(param.field); \ + json_object_object_add(jmonitors, #field, j##field); \ +} + +static int create_monitor_proc(struct ndctl_ctx *ctx) +{ + time_t c_time; + char date[32]; + char *filename; + struct json_object *jmonitors, *jmonitor, *jpid, *jcreate, *jbus, + *jdimm, *jregion, *jnamespace, *joutput, *jevent; + jmonitors = json_object_new_object(); + add_param_to_json(monitor); + c_time = time(NULL); + strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time)); + jcreate = json_object_new_string(date); + json_object_object_add(jmonitors, "create", jcreate); + jpid = json_object_new_int((int)getpid()); + json_object_object_add(jmonitors, "pid", jpid); + add_param_to_json(dimm); + add_param_to_json(bus); + add_param_to_json(region); + add_param_to_json(namespace); + add_param_to_json(event); + add_param_to_json(output); + + filename = get_full_path_filename(proc_path, param.monitor); + return json_object_to_file(filename, jmonitors); +} + +static int destroy_monitor_proc(void) +{ + char *filename; + filename = get_full_path_filename(proc_path, param.monitor); + return remove(filename); +} + +static int set_monitor_dimm(struct ndctl_ctx *ctx, fd_set *fds, int *maxfd, + struct monitor_dimm *m_dimm) +{ + struct ndctl_bus *bus; + int fd, num_dimm = 0; + char buf[BUF_SIZE]; + + ndctl_bus_foreach(ctx, bus) { + struct ndctl_dimm *dimm; + if(!util_bus_filter(bus, param.bus) + || !util_bus_filter_by_dimm(bus, param.dimm) + || !util_bus_filter_by_region(bus, param.region) + || !util_bus_filter_by_namespace(bus, param.namespace)) + continue; + ndctl_dimm_foreach(bus, dimm) { + if(!util_dimm_filter(dimm, param.dimm) + || !util_dimm_filter_by_region(dimm, + param.region) + || !util_dimm_filter_by_namespace(dimm, + param.namespace)) + continue; + if (!ndctl_dimm_is_cmd_supported(dimm, + ND_CMD_SMART_THRESHOLD)) + continue; + m_dimm[num_dimm].dimm = dimm; + m_dimm[num_dimm].devname = ndctl_dimm_get_devname(dimm); + fd = ndctl_dimm_get_health_eventfd(dimm); + pread(fd, buf, sizeof(buf), 0); + m_dimm[num_dimm].health_eventfd = fd; + + if (fds) + FD_SET(fd, fds); + if (maxfd) { + if (*maxfd < fd) + *maxfd = fd; + } + num_dimm++; + } + } + return num_dimm; +} + +static bool check_monitor_exist(void) +{ + FILE *f; + char *filename; + filename = get_full_path_filename(proc_path, param.monitor); + f = fopen(filename, "r"); + if (!f) + return false; + fclose(f); + return true; +} +static int monitor_dimm_event(struct ndctl_ctx *ctx) +{ + int rc, maxfd, num_dimm; + struct monitor_dimm *m_dimm; + char buf[BUF_SIZE]; + m_dimm = calloc(NUM_MAX_DIMM, sizeof(struct monitor_dimm)); + if (!m_dimm) { + error("monitor_dimm memory space cannot be allocated\n"); + goto out; + } + + fd_set fds; + FD_ZERO(&fds); + + num_dimm = set_monitor_dimm(ctx, &fds, &maxfd, m_dimm); + if (num_dimm == 0) { + error("no monitor dimms can be found\n"); + goto out; + } + + + if (daemon(0, 0) != 0) { + err(ctx, "daemon start failed\n"); + goto out; + } + + printf("ndctl create-monitor %s started\n", param.monitor); + if (create_monitor_proc(ctx) != 0) { + err(ctx, "daemon start failed\n"); + goto out; + } + + while(1){ + rc = select(maxfd + 1, NULL, NULL, &fds, NULL); + if (rc < 1) { + if (rc == 0) + err(ctx, "select unexpected timeout\n"); + else + err(ctx, "select %s\n", strerror(errno)); + goto out_clean; + } + for (int i = 0; i < num_dimm; i++) { + if (!FD_ISSET(m_dimm[i].health_eventfd, &fds)){ + FD_SET(m_dimm[i].health_eventfd, &fds); + continue; + } + if (notify_json_msg(ctx, &m_dimm[i]) != 0) + goto out_clean; + pread(m_dimm[i].health_eventfd, buf, sizeof(buf), 0); + } + } + free(m_dimm); + return 0; +out: + printf("ndctl create-monitor %s failed\n", param.monitor); + return 1; + +out_clean: + err(ctx, "ndctl monitor %s stoped\n", param.monitor); + if (destroy_monitor_proc() != 0) + err(ctx, "failed to clean temp file"); + return 1; + +} + +int cmd_create_monitor(int argc, const char **argv, void *ctx) +{ + const struct option options[] = { + OPT_STRING('b', "bus", ¶m.bus, "bus-id", "filter by bus"), + OPT_STRING('r', "region", ¶m.region, "region-id", + "filter by region"), + OPT_STRING('d', "dimm", ¶m.dimm, "dimm-id", + "filter by dimm"), + OPT_STRING('n', "namespace", ¶m.namespace, + "namespace-id", "filter by namespace id"), + OPT_STRING('e', "event", ¶m.event, "event-id", + "filter by event"), + OPT_STRING('o', "output", ¶m.output, "output file name", + "monitor output file name"), + OPT_STRING('m', "monitor", ¶m.monitor, "monitor name", + "monitor name") + }; + const char * const u[] = { + "ndctl create-monitor <monitor> <output> [<options>]", + NULL + }; + argc = parse_options(argc, argv, options, u, 0); + for (int i = 0; i < argc; i++) { + error("unknown parameter \"%s\"\n", argv[i]); + goto out; + } + if (!param.monitor) { + error("monitor name --monitor is required\n"); + goto out; + } + if (check_monitor_exist()) { + error("monitor %s is exist\n", param.monitor); + goto out; + } + if (!param.output) { + error("output file name --output is required\n"); + goto out; + } + + ndctl_set_log_priority((struct ndctl_ctx*)ctx, LOG_NOTICE); + + if (strcmp(param.output, "syslog") == 0) + ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_syslog); + else + ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_output); + + if (monitor_dimm_event((struct ndctl_ctx*)ctx) != 0) + goto out; + + char *filename; + filename = get_full_path_filename(proc_path, param.monitor); + if (remove(filename) != 0) { + err((struct ndctl_ctx*)ctx, "failed to delete %s\n", filename); + goto out; + } + return 0; + +out: + return 1; +} diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c index 0f748e1..6c63d79 100644 --- a/ndctl/ndctl.c +++ b/ndctl/ndctl.c @@ -84,6 +84,7 @@ static struct cmd_struct commands[] = { { "init-labels", cmd_init_labels }, { "check-labels", cmd_check_labels }, { "inject-error", cmd_inject_error }, + { "create-monitor", cmd_create_monitor }, { "list", cmd_list }, { "help", cmd_help }, #ifdef ENABLE_TEST
This patch is used to add $ndctl create-monitor command, by which users can create a new monitor. Users can select the DIMMS to be monitored by using [--dimm] [--bus] [--region] [--namespace] options. The notifications can be outputed to a special file or syslog by using [--output] option, the special file will be placed under /var/log/ndctl. A name is also required for a monitor,so users can destroy the monitor by the name. When a monitor is created successfully, a file with same name will be created under /var/ndctl/monitor. Example: #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com> --- builtin.h | 1 + configure.ac | 3 + ndctl/Makefile.am | 3 +- ndctl/monitor.c | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ ndctl/ndctl.c | 1 + 5 files changed, 349 insertions(+), 1 deletion(-) create mode 100644 ndctl/monitor.c