diff mbox

[RFC,v4] ndctl: monitor: add ndctl monitor daemon

Message ID 20180313113308.2136-1-qi.fuli@jp.fujitsu.com (mailing list archive)
State New, archived
Headers show

Commit Message

QI Fuli March 13, 2018, 11:33 a.m. UTC
This is the v4 patch for ndctl monitor daemon, a tiny daemon to monitor the
smart events of nvdimm DIMMs. Users can run a monitor as a one-shot command
or a daemon in background by using the [--daemon] option. DIMMs to monitor
can be selected by [--dimm] [--bus] [--region] [--namespace] options.
When a smart event fires, monitor daemon will log the notification which
including dimm health status to syslog or a logfile by setting [--log] option.
The notification follows json format and can be consumed by log collectors
like Fluented.

For example, a monitor daemon can be started by the following command:
# ndctl monitor --dimm nmem1 --log /var/log/monitor.log --daemon daemon-name

Then check the monitor daemon status by using systemd:
# systemctl status ndctl-monitor@daemon-name.service

To stop the monitor daemon by:
# systemctl stop ndctl-monitor@daemon-name.service

Also, a monitor daemon can be started by systemd:
# systemctl start ndctl-monitor.service
Which monitors all dimms.

In this implemention, when a ndctl monitor starts with [--daemon] option, all
the arguments will be saved into a temp file named as daemon-name and placed
under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
EnvironmentFile by systemd, and it would be deleted automatically when the
systemd service is stopped.
Due to the deletion the following commands will not work.
# systemctl enable ndctl-monitor@daemon-name.service
# systemctl restart ndctl-monitor@daemon-name.service

I am not sure whether these commands are needed for ndctl monitor daemon,
your comments will be appreciated.

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>

Change log since v3:
 - Removing create-monitor, show-monitor, list-monitor, destroy-monitor
 - Adding [--daemon] option to run ndctl monitor as a daemon 
 - Using systemd to manage ndctl monitor daemon
 - Replacing filter_monitor_dimm() with filter_dimm()

Change log since v2:
 - Changing the interface of daemon to the ndctl command line
 - Changing the name of daemon form "nvdimmd" to "monitor"
 - Removing the config file, unit_file, nvdimmd dir
 - Removing nvdimmd_test program
 - Adding ndctl/monitor.c

Change log since v1:
 - Adding a config file(/etc/nvdimmd/nvdimmd.conf)
 - Using struct log_ctx instead of syslog()
    - Using log_syslog() to save the notify messages to syslog
    - Using log_file() to save the notify messages to special file
 - Adding LOG_NOTICE level to log_priority
 - Using automake instead of Makefile
 - Adding a new util file(nvdimmd/util.c) including helper functions
   needed for nvdimm daemon
 - Adding nvdimmd_test program

---
 builtin.h                    |   1 +
 ndctl/Makefile.am            |  13 +-
 ndctl/monitor.c              | 411 +++++++++++++++++++++++++++++++++++++++++++
 ndctl/ndctl-monitor.service  |   7 +
 ndctl/ndctl-monitor@.service |   9 +
 ndctl/ndctl.c                |   1 +
 util/filter.c                |   5 +-
 util/filter.h                |   3 +
 util/parse-options.h         |   1 +
 9 files changed, 448 insertions(+), 3 deletions(-)
 create mode 100644 ndctl/monitor.c
 create mode 100644 ndctl/ndctl-monitor.service
 create mode 100644 ndctl/ndctl-monitor@.service

Comments

Dan Williams March 14, 2018, 6:19 p.m. UTC | #1
On Tue, Mar 13, 2018 at 4:33 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> This is the v4 patch for ndctl monitor daemon, a tiny daemon to monitor the
> smart events of nvdimm DIMMs. Users can run a monitor as a one-shot command
> or a daemon in background by using the [--daemon] option. DIMMs to monitor
> can be selected by [--dimm] [--bus] [--region] [--namespace] options.
> When a smart event fires, monitor daemon will log the notification which
> including dimm health status to syslog or a logfile by setting [--log] option.
> The notification follows json format and can be consumed by log collectors
> like Fluented.
>
> For example, a monitor daemon can be started by the following command:
> # ndctl monitor --dimm nmem1 --log /var/log/monitor.log --daemon daemon-name
>
> Then check the monitor daemon status by using systemd:
> # systemctl status ndctl-monitor@daemon-name.service
>
> To stop the monitor daemon by:
> # systemctl stop ndctl-monitor@daemon-name.service
>
> Also, a monitor daemon can be started by systemd:
> # systemctl start ndctl-monitor.service
> Which monitors all dimms.
>
> In this implemention, when a ndctl monitor starts with [--daemon] option, all
> the arguments will be saved into a temp file named as daemon-name and placed
> under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
> EnvironmentFile by systemd, and it would be deleted automatically when the
> systemd service is stopped.

The monitors started by hand should be kept separate from the monitors
started by systemd. The default monitor started by systemd should get
its configuration from /etc/ndctl.conf, and we should otherwise have a
--conf-file option to the monitor to override the default
configuration. Any other monitors started outside of the systemd
should remain independent.

> Due to the deletion the following commands will not work.
> # systemctl enable ndctl-monitor@daemon-name.service
> # systemctl restart ndctl-monitor@daemon-name.service
>
> I am not sure whether these commands are needed for ndctl monitor daemon,
> your comments will be appreciated.

Both those commands are needed.

>
> Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
>
> Change log since v3:
>  - Removing create-monitor, show-monitor, list-monitor, destroy-monitor
>  - Adding [--daemon] option to run ndctl monitor as a daemon
>  - Using systemd to manage ndctl monitor daemon
>  - Replacing filter_monitor_dimm() with filter_dimm()
>
> Change log since v2:
>  - Changing the interface of daemon to the ndctl command line
>  - Changing the name of daemon form "nvdimmd" to "monitor"
>  - Removing the config file, unit_file, nvdimmd dir
>  - Removing nvdimmd_test program
>  - Adding ndctl/monitor.c
>
> Change log since v1:
>  - Adding a config file(/etc/nvdimmd/nvdimmd.conf)
>  - Using struct log_ctx instead of syslog()
>     - Using log_syslog() to save the notify messages to syslog
>     - Using log_file() to save the notify messages to special file
>  - Adding LOG_NOTICE level to log_priority
>  - Using automake instead of Makefile
>  - Adding a new util file(nvdimmd/util.c) including helper functions
>    needed for nvdimm daemon
>  - Adding nvdimmd_test program
>
> ---
>  builtin.h                    |   1 +
>  ndctl/Makefile.am            |  13 +-
>  ndctl/monitor.c              | 411 +++++++++++++++++++++++++++++++++++++++++++
>  ndctl/ndctl-monitor.service  |   7 +
>  ndctl/ndctl-monitor@.service |   9 +
>  ndctl/ndctl.c                |   1 +
>  util/filter.c                |   5 +-
>  util/filter.h                |   3 +
>  util/parse-options.h         |   1 +
>  9 files changed, 448 insertions(+), 3 deletions(-)
>  create mode 100644 ndctl/monitor.c
>  create mode 100644 ndctl/ndctl-monitor.service
>  create mode 100644 ndctl/ndctl-monitor@.service
>
> diff --git a/builtin.h b/builtin.h
> index b24fc99..4b908f0 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -36,6 +36,7 @@ int cmd_write_labels(int argc, const char **argv, void *ctx);
>  int cmd_init_labels(int argc, const char **argv, void *ctx);
>  int cmd_check_labels(int argc, const char **argv, void *ctx);
>  int cmd_inject_error(int argc, const char **argv, void *ctx);
> +int cmd_monitor(int argc, const char **argv, void *ctx);
>  int cmd_list(int argc, const char **argv, void *ctx);
>  #ifdef ENABLE_TEST
>  int cmd_test(int argc, const char **argv, void *ctx);
> diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
> index e0db97b..e364ef9 100644
> --- a/ndctl/Makefile.am
> +++ b/ndctl/Makefile.am
> @@ -16,7 +16,8 @@ ndctl_SOURCES = ndctl.c \
>                 util/json-firmware.c \
>                 inject-error.c \
>                 update.c \
> -               inject-smart.c
> +               inject-smart.c \
> +               monitor.c
>
>  if ENABLE_DESTRUCTIVE
>  ndctl_SOURCES += ../test/blk_namespaces.c \
> @@ -41,3 +42,13 @@ ndctl_SOURCES += ../test/libndctl.c \
>                  ../test/multi-pmem.c \
>                  ../test/core.c
>  endif
> +
> +unitfiles =\
> +       ndctl-monitor.service \
> +       ndctl-monitor@.service
> +
> +unitdir = /usr/lib/systemd/system/

This directory needs to be configurable as not all distributions are
guaranteed to put their unit files there.

See the "Installing systemd Service Files" section here:

    https://www.freedesktop.org/software/systemd/man/daemon.html

...for some sample code to have autotools probe for the proper directory.

> +
> +unit_DATA = $(unitfiles)
> +
> +EXTRA_DIST = $(unitfiles)
> diff --git a/ndctl/monitor.c b/ndctl/monitor.c
> new file mode 100644
> index 0000000..2164f27
> --- /dev/null
> +++ b/ndctl/monitor.c
> @@ -0,0 +1,411 @@
> +/*
> + * Copyright (c) 2018, FUJITSU LIMITED. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU Lesser General Public License,
> + * version 2.1, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT ANY
> + * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> + * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
> + * more details.
> + */
> +#include <stdio.h>
> +#include <json-c/json.h>
> +#include <signal.h>
> +#include <libgen.h>
> +#include <dirent.h>
> +#include <util/parse-options.h>
> +#include <util/log.h>
> +#include <util/json.h>
> +#include <util/filter.h>
> +#include <ndctl/lib/private.h>
> +#include <ndctl/libndctl.h>
> +#include <sys/stat.h>
> +#define NUM_MAX_DIMM 1024
> +#define BUF_SIZE 4096
> +
> +struct monitor_dimm {
> +       struct ndctl_dimm *dimm;
> +       int health_eventfd;
> +       struct list_node list;
> +};
> +
> +struct monitor_filter_arg {
> +       struct list_head mdimm;
> +       int maxfd;
> +       fd_set fds;
> +       int num_dimm;
> +       unsigned long flags;
> +};
> +
> +struct util_filter_params param;
> +
> +static char *conf_dir = "/etc/sysconfig/ndctl/";
> +static char *def_log_dir = "/var/log/ndctl/";
> +
> +static char *concat(char *str1, char *str2)
> +{
> +       char *result = malloc(strlen(str1) + strlen(str2) + 1);
> +       strcpy(result, str1);
> +       strcat(result, str2);
> +       return result;
> +}
> +
> +static bool is_dir(char *filepath)
> +{
> +       DIR *dir = opendir(filepath);
> +       if (dir) {
> +               closedir(dir);
> +               return true;
> +       }
> +       return false;
> +}
> +
> +static int recur_mkdir(char *filepath, mode_t mode)
> +{
> +       char *p;
> +       char *buf = (char *)malloc(strlen(filepath) + 4);
> +
> +       strcpy(buf, filepath);
> +       for (p = strchr(buf + 1, '/'); p; p = strchr(p + 1, '/')) {
> +               *p = '\0';
> +               if (!is_dir(buf)) {
> +                       if (mkdir(buf, mode) < 0) {
> +                               free(buf);
> +                               return -1;
> +                       }
> +               }
> +               *p = '/';
> +       }
> +
> +       free(buf);
> +       return 0;
> +}
> +
> +static void log_file(struct ndctl_ctx *ctx, int priority, const char *file,
> +       int line, const char *fn, const char *format, va_list args)
> +{
> +       FILE *f;
> +       char *log_name, *log_dir, *buf;
> +       char *errmsg = (char *)malloc(BUF_SIZE);
> +       char *tail = "/";
> +
> +       log_dir = dirname(strdup(param.log));
> +       log_name = basename(strdup(param.log));
> +       if (strcmp(log_dir, ".") == 0)
> +               log_dir = def_log_dir;
> +       else
> +               log_dir = concat(log_dir, tail);
> +       if (log_dir[0] != '/')
> +               log_dir = concat(def_log_dir, log_dir);
> +       log_name = concat(log_dir, log_name);
> +
> +       if (!is_dir(log_dir)) {
> +               if (recur_mkdir(log_dir, 0744) != 0) {
> +                       sprintf(errmsg, "cannot create dir: %s", log_dir);
> +                       goto out;
> +               }
> +       }
> +
> +       f = fopen(log_name, "a+");
> +       if (!f) {
> +               sprintf(errmsg, "open %s failed", log_name);
> +               goto out;
> +       }
> +
> +       buf = (char *)malloc(BUF_SIZE);
> +       if (!buf) {
> +               sprintf(errmsg, "cannot get memory for log_file");
> +               goto out;
> +       }
> +       vsnprintf(buf, BUF_SIZE, format, args);
> +       fprintf(f, "%s\n", buf);
> +       free(buf);
> +       fclose(f);
> +       return;
> +out:
> +       syslog(LOG_ERR, "%s\n", errmsg);
> +       if (!param.fork)
> +               error("%s\n", errmsg);
> +       free(errmsg);
> +       exit(EXIT_FAILURE);
> +}
> +
> +static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
> +       int line, const char *fn, const char *format, va_list args)
> +{
> +       char *buf = (char *)malloc(BUF_SIZE);
> +       vsnprintf(buf, BUF_SIZE, format, args);
> +       syslog(priority, "%s", buf);
> +       free(buf);
> +}

This seems to be reinventing setting up a log file that I'm sure we
could borrow or link to...

ndctl borrows heavily from git, and git does logging in it's daemon.c,
I'd be happier if you copied that.

> +
> +#define fail(fmt, ...) \
> +do { \
> +       err(ctx, "ndctl-%s:%s:%d: " fmt, \
> +                               VERSION, __func__, __LINE__, ##__VA_ARGS__); \
> +       if (!param.fork) \
> +               fprintf(stderr, "ndctl-%s:%s:%d: " fmt, \
> +                               VERSION, __func__, __LINE__, ##__VA_ARGS__); \
> +} while (0)
> +
> +static int notify_json_msg(struct ndctl_ctx *ctx, struct ndctl_dimm *dimm)
> +{
> +       time_t c_time;
> +       char date[32];
> +       struct json_object *jmsg, *jdatetime, *jpid, *jdimm, *jhealth;
> +
> +       jmsg = json_object_new_object();
> +       if (!jmsg) {
> +               fail("\n");
> +               return -1;
> +       }
> +
> +       c_time = time(NULL);
> +       strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time));

Is this a standard date format? I'd expect miliseconds since the epoch
or iso-8601 format?

> +       jdatetime = json_object_new_string(date);
> +       if (!jdatetime) {
> +               fail("\n");
> +               return -1;
> +       }
> +       json_object_object_add(jmsg, "datetime", jdatetime);
> +
> +       jpid = json_object_new_int((int)getpid());
> +       if (!jpid) {
> +               fail("\n");
> +               return -1;
> +       }
> +       json_object_object_add(jmsg, "pid", jpid);
> +
> +       jdimm = util_dimm_to_json(dimm, 0);
> +       if (!dimm) {
> +               fail("\n");
> +               return -1;
> +       }
> +       json_object_object_add(jmsg, "dimm", jdimm);
> +
> +       jhealth = util_dimm_health_to_json(dimm);
> +       if (!jhealth) {
> +               fail("\n");
> +               return -1;
> +       }
> +       json_object_object_add(jdimm, "health", jhealth);
> +
> +       notice(ctx, "%s",
> +               json_object_to_json_string_ext(jmsg, JSON_C_TO_STRING_PLAIN));
> +       if (!param.fork)
> +               printf("%s\n", json_object_to_json_string_ext(jmsg,
> +                               JSON_C_TO_STRING_PRETTY));
> +       return 0;
> +}
> +
> +static void filter_dimm(struct ndctl_dimm *dimm, struct util_filter_ctx *ctx)
> +{
> +       struct monitor_filter_arg *mfa = (struct monitor_filter_arg *)ctx->arg;
> +       int fd;
> +       char buf[BUF_SIZE];
> +
> +       if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_SMART_THRESHOLD))
> +               return;
> +
> +       struct monitor_dimm *m_dimm = malloc(sizeof(*m_dimm));
> +       m_dimm->dimm = dimm;
> +       fd = ndctl_dimm_get_health_eventfd(dimm);
> +       pread(fd, buf, sizeof(buf), 0);
> +       m_dimm->health_eventfd = fd;
> +       list_add_tail(&mfa->mdimm, &m_dimm->list);
> +       FD_SET(fd, &mfa->fds);
> +       if (fd > mfa->maxfd)
> +               mfa->maxfd = fd;
> +       mfa->num_dimm++;
> +}
> +
> +static int monitor_smart_event(struct ndctl_ctx *ctx)
> +{
> +       struct util_filter_ctx fctx = { 0 };
> +       struct monitor_filter_arg mfa = { 0 };
> +       int rc;
> +       char buf[BUF_SIZE];
> +       char *errmsg = (char *)malloc(BUF_SIZE);
> +
> +       fctx.filter_dimm = filter_dimm;
> +       fctx.arg = &mfa;
> +       mfa.flags = 0;
> +       list_head_init(&mfa.mdimm);
> +       FD_ZERO(&mfa.fds);
> +
> +       rc = util_filter_walk(ctx, &fctx, &param);
> +       if (rc)
> +               goto out;
> +       if (mfa.num_dimm == 0) {
> +               errmsg = "no monitor dimms can be found";
> +               goto out;
> +       }
> +
> +       while(1){
> +               rc = select(mfa.maxfd + 1, NULL, NULL, &mfa.fds, NULL);
> +               if (rc < 1) {
> +                       errmsg = "select error";
> +                       if (rc == 0)
> +                               dbg(ctx, "select unexpected timeout\n");
> +                       else
> +                               dbg(ctx, "select %s\n", strerror(errno));
> +                       goto out;
> +               }
> +               struct monitor_dimm *m_dimm;
> +               list_for_each(&mfa.mdimm, m_dimm, list) {
> +                       if (!FD_ISSET(m_dimm->health_eventfd, &mfa.fds)) {
> +                               FD_SET(m_dimm->health_eventfd, &mfa.fds);
> +                               continue;
> +                       }
> +                       if (notify_json_msg(ctx, m_dimm->dimm) != 0)
> +                               goto out;
> +                       pread(m_dimm->health_eventfd, buf, sizeof(buf), 0);
> +               }
> +       }

This does not seem to be prepared to handle eventfds that are not
associated with DIMMs. I'd rather this was based on epoll so we can
directly recall the ndctl object when the event fires rather than
needing to loop through all the objects again and compare fds.

> +       return 0;
> +out:
> +       if (errmsg) {
> +               if (!param.fork)
> +                       error("%s\n", errmsg);
> +               err(ctx, "%s\n", errmsg);
> +       }
> +       return 1;
> +}
> +
> +static int create_confile(char *conf_path)
> +{
> +       FILE *f;
> +       char *buf;
> +
> +       if (!is_dir(conf_dir)) {
> +               if (recur_mkdir(conf_dir, 0744) != 0) {
> +                       error("cannot create dir: %s\n", conf_dir);
> +                       goto out;
> +               }
> +       }
> +
> +       f = fopen(conf_path, "w");
> +       if (!f) {
> +               error("open %s failed\n", conf_path);
> +               goto out;
> +       }
> +
> +       buf = (char *)malloc(BUF_SIZE);
> +       if (!buf) {
> +               error("cannot get memory for daemon config file\n");
> +               goto out;
> +       }
> +       strcpy(buf, "OPTIONS=-f");
> +       if (param.bus) {
> +               strcat(buf, " -b ");
> +               strcat(buf, param.bus);
> +       }
> +       if (param.dimm) {
> +               strcat(buf, " -d ");
> +               strcat(buf, param.dimm);
> +       }
> +       if (param.namespace) {
> +               strcat(buf, " -n ");
> +               strcat(buf, param.namespace);
> +       }
> +       if (param.region) {
> +               strcat(buf, " -r ");
> +               strcat(buf, param.region);
> +       }
> +       if (param.log) {
> +               strcat(buf, " -l ");
> +               strcat(buf, param.log);
> +       }
> +       fprintf(f, "%s", buf);
> +       fclose(f);
> +       free(buf);
> +       return 0;
> +out:
> +       return 1;
> +}

As mentioned earlier we shouldn't be creating a configuration file, it
should be using a pre-existing configuration file in the systemd case.

> +
> +static bool is_monitor_exist(void)
> +{
> +       char *conf_path = strdup(param.daemon);
> +       conf_path = concat(conf_dir, conf_path);
> +       FILE *f = fopen(conf_path, "r");
> +       if (f) {
> +               fclose(f);
> +               return true;
> +       }
> +       return false;
> +}
> +
> +int cmd_monitor(int argc, const char **argv, void *ctx)
> +{
> +       const struct option options[] = {
> +               OPT_STRING('b', "bus", &param.bus, "bus-id", "filter by bus"),
> +               OPT_STRING('r', "region", &param.region, "region-id",
> +                               "filter by region"),
> +               OPT_STRING('d', "dimm", &param.dimm, "dimm-id",
> +                               "filter by dimm"),
> +               OPT_STRING('n', "namespace", &param.namespace,
> +                               "namespace-id", "filter by namespace id"),
> +               OPT_STRING('l', "log", &param.log, "log name",
> +                               "monitor logfile"),
> +               OPT_STRING('D',"daemon", &param.daemon, "daemon-name",
> +                               "run ndctl monitor as a daemon"),
> +               OPT_BOOLEAN_HID('f', &param.fork),
> +               OPT_END(),
> +       };
> +       const char * const u[] = {
> +               "ndctl monitor [<options>]",
> +               NULL
> +       };
> +       argc = parse_options(argc, argv, options, u, 0);
> +       for (int i = 0; i < argc; i++) {
> +               error("unknown parameter \"%s\"\n", argv[i]);
> +               goto out;
> +       }
> +       if (argc)
> +               usage_with_options(u, options);
> +
> +       if (param.daemon) {
> +               if (is_monitor_exist()) {
> +                       error("monitor %s is exist\n", param.daemon);
> +                       goto out;
> +               }
> +               char *conf_path = strdup(param.daemon);
> +               conf_path = concat(conf_dir, conf_path);
> +               if (create_confile(conf_path) != 0)
> +                       goto out;
> +               char *sys_cmd = (char *)malloc(BUF_SIZE);
> +               sprintf(sys_cmd, "systemctl start ndctl-monitor@%s.service",
> +                               param.daemon);
> +               if (system(sys_cmd) != 0) {
> +                       free(sys_cmd);
> +                       remove(conf_path);
> +                       goto out;
> +               }
> +               free(sys_cmd);
> +               return 0;
> +       }
> +
> +       if (param.fork) {
> +               if (daemon(0, 0) != 0) {
> +                       err((struct ndctl_ctx*)ctx, "daemon start failed\n");
> +                       goto out;
> +               }
> +       }
> +
> +       ndctl_set_log_priority((struct ndctl_ctx*)ctx, LOG_NOTICE);
> +
> +       if (!param.log || strcmp(param.log, "syslog") == 0)
> +               ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_syslog);
> +       else
> +               ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_file);
> +
> +       if (monitor_smart_event((struct ndctl_ctx*)ctx) != 0)
> +               goto out;
> +
> +       return 0;
> +out:
> +       return 1;
> +}
> diff --git a/ndctl/ndctl-monitor.service b/ndctl/ndctl-monitor.service
> new file mode 100644
> index 0000000..2f1417b
> --- /dev/null
> +++ b/ndctl/ndctl-monitor.service
> @@ -0,0 +1,7 @@
> +[Unit]
> +Description=Ndctl Monitor Daemon
> +
> +[Service]
> +Type=forking
> +ExecStart=/usr/bin/ndctl monitor -f
> +ExecStop=/usr/bin/kill ${MAINPID}
> diff --git a/ndctl/ndctl-monitor@.service b/ndctl/ndctl-monitor@.service
> new file mode 100644
> index 0000000..91c85d9
> --- /dev/null
> +++ b/ndctl/ndctl-monitor@.service
> @@ -0,0 +1,9 @@
> +[Unit]
> +Description=Ndctl Monitor Daemon
> +
> +[Service]
> +Type=forking
> +EnvironmentFile=/etc/sysconfig/ndctl/%i
> +ExecStart=/usr/bin/ndctl monitor $OPTIONS
> +ExecStop=/usr/bin/kill ${MAINPID}
> +ExecStopPost=/usr/bin/rm /etc/sysconfig/ndctl/%i
> diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
> index d3c6db1..8938621 100644
> --- a/ndctl/ndctl.c
> +++ b/ndctl/ndctl.c
> @@ -86,6 +86,7 @@ static struct cmd_struct commands[] = {
>         { "inject-error", cmd_inject_error },
>         { "update-firmware", cmd_update_firmware },
>         { "inject-smart", cmd_inject_smart },
> +       { "monitor", cmd_monitor },
>         { "list", cmd_list },
>         { "help", cmd_help },
>         #ifdef ENABLE_TEST
> diff --git a/util/filter.c b/util/filter.c
> index b0b7fdf..fba2197 100644
> --- a/util/filter.c
> +++ b/util/filter.c
> @@ -315,7 +315,7 @@ int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
>                                 || !util_bus_filter_by_namespace(bus, param->namespace))
>                         continue;
>
> -               if (!fctx->filter_bus(bus, fctx))
> +               if (fctx->filter_bus && !fctx->filter_bus(bus, fctx))
>                         continue;
>
>                 ndctl_dimm_foreach(bus, dimm) {
> @@ -345,7 +345,8 @@ int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
>                         if (type && ndctl_region_get_type(region) != type)
>                                 continue;
>
> -                       if (!fctx->filter_region(region, fctx))
> +                       if (fctx->filter_region &&
> +                                       !fctx->filter_region(region, fctx))

No, I don't agree with these change. The filter_bus() and
filter_region() callbacks are mandatory because, like list, you need
to be prepared to handle dimms underneath the bus, and namespaces
underneath a region.

>                                 continue;
>
>                         ndctl_namespace_foreach(region, ndns) {
> diff --git a/util/filter.h b/util/filter.h
> index aea5a71..82f3b0d 100644
> --- a/util/filter.h
> +++ b/util/filter.h
> @@ -77,6 +77,9 @@ struct util_filter_params {
>         const char *dimm;
>         const char *mode;
>         const char *namespace;
> +       const char *log;
> +       const char *daemon;
> +       bool fork;

'log', 'daemon', and 'fork' are not util_filter_params, those should
be kept local to monitor.c. This is similar to the --human option to
"ndctl list". That isn't passed to util_filter_walk() it's handled
locally in list.c.

>  };
>
>  struct ndctl_ctx;
> diff --git a/util/parse-options.h b/util/parse-options.h
> index 6fd6b24..2262c42 100644
> --- a/util/parse-options.h
> +++ b/util/parse-options.h
> @@ -123,6 +123,7 @@ struct option {
>  #define OPT_GROUP(h)                { .type = OPTION_GROUP, .help = (h) }
>  #define OPT_BIT(s, l, v, h, b)      { .type = OPTION_BIT, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h), .defval = (b) }
>  #define OPT_BOOLEAN(s, l, v, h)     { .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), .value = check_vtype(v, bool *), .help = (h) }
> +#define OPT_BOOLEAN_HID(s, v)       { .type = OPTION_BOOLEAN, .short_name = (s), .value = check_vtype(v, bool *), .flags = PARSE_OPT_HIDDEN}

What is this for?
QI Fuli March 15, 2018, 10:41 a.m. UTC | #2
>> In this implemention, when a ndctl monitor starts with [--daemon] option, all
>> the arguments will be saved into a temp file named as daemon-name and placed
>> under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
>> EnvironmentFile by systemd, and it would be deleted automatically when the
>> systemd service is stopped.
> The monitors started by hand should be kept separate from the monitors
> started by systemd. The default monitor started by systemd should get
> its configuration from /etc/ndctl.conf, and we should otherwise have a
> --conf-file option to the monitor to override the default
> configuration. Any other monitors started outside of the systemd
> should remain independent.
Thank you very much for your comments. I still have something unclear about
how to manage ndctl monitor daemon. Let's clarify the specification of 
monitor daemon.

I have the following questions.
  a) About command line interface
     1) I would like to confirm when a monitor daemon is launched  by using
        command line interface with [--daemon] option (like # ndctl monitor
        --dimm nmem1 --daemon), then how to manage it? (i.e. stop, check 
status)
     2) When multiple daemons be running at same time, how can we recognize
        each of them?

  b) About using systemctl command
     3) Should we need to limit the maximum number of running daemons to 
be one?
        If yes, I am thinking the option should support multiple values, 
then
        users can modify the default configuration by using 
[--conf-file] option.
        # ndctl monitor --dimm nmem1,nmem2 --conf-file

     4) If multiple daemons can be running at the same time, multiple 
configuration
        files named <daemon-name> are needed. Do you agree that let 
users create the
        configuration file manually when they want to launch a monitor 
daemon?
>> Due to the deletion the following commands will not work.
>> # systemctl enable ndctl-monitor@daemon-name.service
>> # systemctl restart ndctl-monitor@daemon-name.service
>>
>> I am not sure whether these commands are needed for ndctl monitor daemon,
>> your comments will be appreciated.
> Both those commands are needed.
>
Dan Williams March 16, 2018, 1:03 a.m. UTC | #3
On Thu, Mar 15, 2018 at 3:41 AM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>>>
>>> In this implemention, when a ndctl monitor starts with [--daemon] option, all
>>> the arguments will be saved into a temp file named as daemon-name and placed
>>> under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
>>> EnvironmentFile by systemd, and it would be deleted automatically when the
>>> systemd service is stopped.
>>
>> The monitors started by hand should be kept separate from the monitors
>> started by systemd. The default monitor started by systemd should get
>> its configuration from /etc/ndctl.conf, and we should otherwise have a
>> --conf-file option to the monitor to override the default
>> configuration. Any other monitors started outside of the systemd
>> should remain independent.
>
> Thank you very much for your comments. I still have something unclear about
> how to manage ndctl monitor daemon. Let's clarify the specification of monitor daemon.
>
> I have the following questions.
>  a) About command line interface
>     1) I would like to confirm when a monitor daemon is launched  by using
>        command line interface with [--daemon] option (like # ndctl monitor
>        --dimm nmem1 --daemon), then how to manage it? (i.e. stop, check status)

I don't want to invent new policy here, how do other daemons behave
with respect to this, can you investigate? Use mdadm or dnsmasq as an
example.

>     2) When multiple daemons be running at same time, how can we recognize
>        each of them?

Process id.

>
>  b) About using systemctl command
>     3) Should we need to limit the maximum number of running daemons to be one?
>        If yes, I am thinking the option should support multiple values, then
>        users can modify the default configuration by using [--conf-file] option.
>        # ndctl monitor --dimm nmem1,nmem2 --conf-file
>
>     4) If multiple daemons can be running at the same time, multiple configuration
>        files named <daemon-name> are needed. Do you agree that let users create the
>        configuration file manually when they want to launch a monitor daemon?

Yes, we should always support a local configuration file in addition
to the default configuration.
Gotou, Yasunori/五島 康文 March 16, 2018, 6:55 a.m. UTC | #4
Hi,

> > +static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
> > +       int line, const char *fn, const char *format, va_list args)
> > +{
> > +       char *buf = (char *)malloc(BUF_SIZE);
> > +       vsnprintf(buf, BUF_SIZE, format, args);
> > +       syslog(priority, "%s", buf);
> > +       free(buf);
> > +}
> 
> This seems to be reinventing setting up a log file that I'm sure we
> could borrow or link to...
> 
> ndctl borrows heavily from git, and git does logging in it's daemon.c,
> I'd be happier if you copied that.

Maybe silly question...., but is it OK about OSS lisence?

In my understanding,  basically the lisence of git is GPL2,
LGPL is used just for use xdiff/ or some libc code in git....
https://github.com/git/git/blob/master/COPYING
https://github.com/git/git/blob/master/LGPL-2.1

However, the lisence of ndctl is LGPL2.1
https://github.com/pmem/ndctl/blob/master/COPYING

If we copied from git to ndctl, then I suppose that ndctl
daemon may have to use GPL2 rather than LGPL2.1, right?

Bye,
Gotou, Yasunori/五島 康文 March 16, 2018, 7:33 a.m. UTC | #5
> >  b) About using systemctl command
> >     3) Should we need to limit the maximum number of running daemons to be one?
> >        If yes, I am thinking the option should support multiple values, then
> >        users can modify the default configuration by using [--conf-file] option.
> >        # ndctl monitor --dimm nmem1,nmem2 --conf-file

I talked about here with Qi-san, and probably I understand what he really 
wanted to ask here.

Currently, ndctl user can specify just "one" filter option, like --dimm nmem1,
but he/she can not specify 2 or more options like --dimm nmem1,nmem2,nmem3.

So, he would like to ask which we should select in the following.

  a) Change ndctl for users to be able to specify multiple filter option.
     This change affects not only daemon, but also ndctl list command.
     In addition, we need to decide how much filter can be specified.
  b) Keep current filter.
     To monitor plural area(dimm/region/namespace/bus), plural daemons
     are executed for each area which is specified one option.
     This does not require option change.

Though Qi-san wanted to select b) at this question,
it is just workaround, we should select a) I think...

Thanks,
QI Fuli March 16, 2018, 10:01 a.m. UTC | #6
On 2018/03/16 16:33, Yasunori Goto wrote:
>>>   b) About using systemctl command
>>>      3) Should we need to limit the maximum number of running daemons to be one?
>>>         If yes, I am thinking the option should support multiple values, then
>>>         users can modify the default configuration by using [--conf-file] option.
>>>         # ndctl monitor --dimm nmem1,nmem2 --conf-file
> I talked about here with Qi-san, and probably I understand what he really
> wanted to ask here.
>
> Currently, ndctl user can specify just "one" filter option, like --dimm nmem1,
> but he/she can not specify 2 or more options like --dimm nmem1,nmem2,nmem3.
>
> So, he would like to ask which we should select in the following.
>
>    a) Change ndctl for users to be able to specify multiple filter option.
>       This change affects not only daemon, but also ndctl list command.
>       In addition, we need to decide how much filter can be specified.
>    b) Keep current filter.
>       To monitor plural area(dimm/region/namespace/bus), plural daemons
>       are executed for each area which is specified one option.
>       This does not require option change.
>
> Though Qi-san wanted to select b) at this question,
> it is just workaround, we should select a) I think...
>
> Thanks,
>
Goto-san, Thank you for your follow.

I am not sure if it is needed to support "one option multiple arguments" 
in ndctl list.
Think more about it, I am thinking it should be supported in ndctl 
monitor daemon.
If ndctl list does not need it, it can be implemented only in monitor 
daemon.
If list also needs it, we can discuss more details about how to make a 
given one
which can be shared between list and monitor.

Thank you.
Dan Williams March 16, 2018, 1:28 p.m. UTC | #7
On Thu, Mar 15, 2018 at 11:55 PM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote:
> Hi,
>
>> > +static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
>> > +       int line, const char *fn, const char *format, va_list args)
>> > +{
>> > +       char *buf = (char *)malloc(BUF_SIZE);
>> > +       vsnprintf(buf, BUF_SIZE, format, args);
>> > +       syslog(priority, "%s", buf);
>> > +       free(buf);
>> > +}
>>
>> This seems to be reinventing setting up a log file that I'm sure we
>> could borrow or link to...
>>
>> ndctl borrows heavily from git, and git does logging in it's daemon.c,
>> I'd be happier if you copied that.
>
> Maybe silly question...., but is it OK about OSS lisence?
>
> In my understanding,  basically the lisence of git is GPL2,
> LGPL is used just for use xdiff/ or some libc code in git....
> https://github.com/git/git/blob/master/COPYING
> https://github.com/git/git/blob/master/LGPL-2.1
>
> However, the lisence of ndctl is LGPL2.1
> https://github.com/pmem/ndctl/blob/master/COPYING
>
> If we copied from git to ndctl, then I suppose that ndctl
> daemon may have to use GPL2 rather than LGPL2.1, right?

In the ndctl source tree the license of the code in ndctl/ is GPL-2,
the license of the code in ndctl/lib/ is LGPL-2.1. So we can freely
copy GPL-2 code into the utilities, but not the library.
Gotou, Yasunori/五島 康文 March 17, 2018, 12:23 a.m. UTC | #8
> On Thu, Mar 15, 2018 at 11:55 PM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote:
> > Hi,
> >
> >> > +static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
> >> > +       int line, const char *fn, const char *format, va_list args)
> >> > +{
> >> > +       char *buf = (char *)malloc(BUF_SIZE);
> >> > +       vsnprintf(buf, BUF_SIZE, format, args);
> >> > +       syslog(priority, "%s", buf);
> >> > +       free(buf);
> >> > +}
> >>
> >> This seems to be reinventing setting up a log file that I'm sure we
> >> could borrow or link to...
> >>
> >> ndctl borrows heavily from git, and git does logging in it's daemon.c,
> >> I'd be happier if you copied that.
> >
> > Maybe silly question...., but is it OK about OSS lisence?
> >
> > In my understanding,  basically the lisence of git is GPL2,
> > LGPL is used just for use xdiff/ or some libc code in git....
> > https://github.com/git/git/blob/master/COPYING
> > https://github.com/git/git/blob/master/LGPL-2.1
> >
> > However, the lisence of ndctl is LGPL2.1
> > https://github.com/pmem/ndctl/blob/master/COPYING
> >
> > If we copied from git to ndctl, then I suppose that ndctl
> > daemon may have to use GPL2 rather than LGPL2.1, right?
> 
> In the ndctl source tree the license of the code in ndctl/ is GPL-2,
> the license of the code in ndctl/lib/ is LGPL-2.1. So we can freely
> copy GPL-2 code into the utilities, but not the library.

Ok. I see.

Thanks,
---
Yasunori Goto
QI Fuli March 19, 2018, 11:27 a.m. UTC | #9
>> diff --git a/ndctl/ndctl-monitor.service b/ndctl/ndctl-monitor.service
>> new file mode 100644
>> index 0000000..2f1417b
>> --- /dev/null
>> +++ b/ndctl/ndctl-monitor.service
>> @@ -0,0 +1,7 @@
>> +[Unit]
>> +Description=Ndctl Monitor Daemon
>> +
>> +[Service]
>> +Type=forking
>> +ExecStart=/usr/bin/ndctl monitor -f
>> +ExecStop=/usr/bin/kill ${MAINPID}
>> diff --git a/ndctl/ndctl-monitor@.service b/ndctl/ndctl-monitor@.service
>> new file mode 100644
>> index 0000000..91c85d9
>> --- /dev/null
>> +++ b/ndctl/ndctl-monitor@.service
>> @@ -0,0 +1,9 @@
>> +[Unit]
>> +Description=Ndctl Monitor Daemon
>> +
>> +[Service]
>> +Type=forking
>> +EnvironmentFile=/etc/sysconfig/ndctl/%i
>> +ExecStart=/usr/bin/ndctl monitor $OPTIONS
>> +ExecStop=/usr/bin/kill ${MAINPID}
>> +ExecStopPost=/usr/bin/rm /etc/sysconfig/ndctl/%i
>> diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
>> index d3c6db1..8938621 100644
>> --- a/ndctl/ndctl.c
>> +++ b/ndctl/ndctl.c
>> @@ -86,6 +86,7 @@ static struct cmd_struct commands[] = {
>>          { "inject-error", cmd_inject_error },
>>          { "update-firmware", cmd_update_firmware },
>>          { "inject-smart", cmd_inject_smart },
>> +       { "monitor", cmd_monitor },
>>          { "list", cmd_list },
>>          { "help", cmd_help },
>>          #ifdef ENABLE_TEST
>> diff --git a/util/filter.c b/util/filter.c
>> index b0b7fdf..fba2197 100644
>> --- a/util/filter.c
>> +++ b/util/filter.c
>> @@ -315,7 +315,7 @@ int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
>>                                  || !util_bus_filter_by_namespace(bus, param->namespace))
>>                          continue;
>>
>> -               if (!fctx->filter_bus(bus, fctx))
>> +               if (fctx->filter_bus && !fctx->filter_bus(bus, fctx))
>>                          continue;
>>
>>                  ndctl_dimm_foreach(bus, dimm) {
>> @@ -345,7 +345,8 @@ int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
>>                          if (type && ndctl_region_get_type(region) != type)
>>                                  continue;
>>
>> -                       if (!fctx->filter_region(region, fctx))
>> +                       if (fctx->filter_region &&
>> +                                       !fctx->filter_region(region, fctx))
> No, I don't agree with these change. The filter_bus() and
> filter_region() callbacks are mandatory because, like list, you need
> to be prepared to handle dimms underneath the bus, and namespaces
> underneath a region.

The filter_bus() and filter_region() callbacks are mandatory in ndctl list,
but not in ndctl monitor.
Current design only monitors the events coming from DIMMs, does not monitor
events coming from buses yet. When monitoring the events coming from DIMMs,
we just need to put the DIMMs which match filters into a linklist using
filter_dimm(). No additional actions will be taken on buses. It is the 
similar
case as in ndctl list, when the command does not contain [--buses] option,
filter_bus() will always return true.
Dan Williams March 19, 2018, 2:27 p.m. UTC | #10
>>> -                       if (!fctx->filter_region(region, fctx))
>>> +                       if (fctx->filter_region &&
>>> +                                       !fctx->filter_region(region,
>>> fctx))
>>
>> No, I don't agree with these change. The filter_bus() and
>> filter_region() callbacks are mandatory because, like list, you need
>> to be prepared to handle dimms underneath the bus, and namespaces
>> underneath a region.
>
>
> The filter_bus() and filter_region() callbacks are mandatory in ndctl list,
> but not in ndctl monitor.
> Current design only monitors the events coming from DIMMs, does not monitor
> events coming from buses yet. When monitoring the events coming from DIMMs,
> we just need to put the DIMMs which match filters into a linklist using
> filter_dimm(). No additional actions will be taken on buses. It is the
> similar
> case as in ndctl list, when the command does not contain [--buses] option,
> filter_bus() will always return true.

Right, I'd prefer that you create a ->filter_bus() implementation for
your use case that always returns true rather than change the core to
allow a ->filter_bus() routine to be omitted. This way the core stays
generic for all use cases and the bus monitoring can be added without
touching the core.
QI Fuli March 20, 2018, 1:03 a.m. UTC | #11
>>>> -                       if (!fctx->filter_region(region, fctx))
>>>> +                       if (fctx->filter_region &&
>>>> +                                       !fctx->filter_region(region,
>>>> fctx))
>>> No, I don't agree with these change. The filter_bus() and
>>> filter_region() callbacks are mandatory because, like list, you need
>>> to be prepared to handle dimms underneath the bus, and namespaces
>>> underneath a region.
>>
>> The filter_bus() and filter_region() callbacks are mandatory in ndctl list,
>> but not in ndctl monitor.
>> Current design only monitors the events coming from DIMMs, does not monitor
>> events coming from buses yet. When monitoring the events coming from DIMMs,
>> we just need to put the DIMMs which match filters into a linklist using
>> filter_dimm(). No additional actions will be taken on buses. It is the
>> similar
>> case as in ndctl list, when the command does not contain [--buses] option,
>> filter_bus() will always return true.
> Right, I'd prefer that you create a ->filter_bus() implementation for
> your use case that always returns true rather than change the core to
> allow a ->filter_bus() routine to be omitted. This way the core stays
> generic for all use cases and the bus monitoring can be added without
> touching the core.
>
>
OK, I see.
Thank you very much.
QI Fuli March 29, 2018, 10:02 a.m. UTC | #12
> -----Original Message-----

> From: Dan Williams [mailto:dan.j.williams@intel.com]

> Sent: Thursday, March 15, 2018 3:20 AM

> To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>

> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>

> Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon

> 

> On Tue, Mar 13, 2018 at 4:33 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:

> > This is the v4 patch for ndctl monitor daemon, a tiny daemon to monitor the

> > smart events of nvdimm DIMMs. Users can run a monitor as a one-shot

> command

> > or a daemon in background by using the [--daemon] option. DIMMs to monitor

> > can be selected by [--dimm] [--bus] [--region] [--namespace] options.

> > When a smart event fires, monitor daemon will log the notification which

> > including dimm health status to syslog or a logfile by setting [--log] option.

> > The notification follows json format and can be consumed by log collectors

> > like Fluented.

> >

> > For example, a monitor daemon can be started by the following command:

> > # ndctl monitor --dimm nmem1 --log /var/log/monitor.log --daemon

> daemon-name

> >

> > Then check the monitor daemon status by using systemd:

> > # systemctl status ndctl-monitor@daemon-name.service

> >

> > To stop the monitor daemon by:

> > # systemctl stop ndctl-monitor@daemon-name.service

> >

> > Also, a monitor daemon can be started by systemd:

> > # systemctl start ndctl-monitor.service

> > Which monitors all dimms.

> >

> > In this implemention, when a ndctl monitor starts with [--daemon] option, all

> > the arguments will be saved into a temp file named as daemon-name and placed

> > under /etc/sysconfig/ndctl/ directory. The temp file would be used as an

> > EnvironmentFile by systemd, and it would be deleted automatically when the

> > systemd service is stopped.

> 

> The monitors started by hand should be kept separate from the monitors

> started by systemd. The default monitor started by systemd should get

> its configuration from /etc/ndctl.conf, and we should otherwise have a

> --conf-file option to the monitor to override the default

> configuration. Any other monitors started outside of the systemd

> should remain independent.

> 


I prefer to add an EnvironmentFile like /etc/sysconfig/ndctl/monitor to systemd
rather than add a configuration file. According to [1], environment variable
substitution is supported in systemd.service, so we can define the variables through 
"EnvironmentFile=/etc/sysconfig/ndctl/monitor".
In this fashion, we do not need to add any extra codes to parse the configuration file.

In this case, [--conf-file] option is not necessary either.
According to [2], sytemd units can be instantiated from a template file, thus we only
need to add a template unit file in advance. 
If user wants to run multiple monitors with different configurations, they can differentiate
them by adding multiple EnvironmentFiles, like /etc/sysconfig/ndctl/<monitor1...n>.
Then the monitors can be started by command like
"# systemctl start ndctl-monitor@<monitor1...n>.service".

When the monitors started by hand, it will do not need any configuration files,
because we can add options and parameters directly.

[1]https://www.freedesktop.org/software/systemd/man/systemd.service.html
[2]https://www.freedesktop.org/software/systemd/man/systemd.unit.html
Dan Williams March 29, 2018, 10:59 p.m. UTC | #13
On Thu, Mar 29, 2018 at 3:02 AM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>
>
>> -----Original Message-----
>> From: Dan Williams [mailto:dan.j.williams@intel.com]
>> Sent: Thursday, March 15, 2018 3:20 AM
>> To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>
>> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
>> Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon
>>
>> On Tue, Mar 13, 2018 at 4:33 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> > This is the v4 patch for ndctl monitor daemon, a tiny daemon to monitor the
>> > smart events of nvdimm DIMMs. Users can run a monitor as a one-shot
>> command
>> > or a daemon in background by using the [--daemon] option. DIMMs to monitor
>> > can be selected by [--dimm] [--bus] [--region] [--namespace] options.
>> > When a smart event fires, monitor daemon will log the notification which
>> > including dimm health status to syslog or a logfile by setting [--log] option.
>> > The notification follows json format and can be consumed by log collectors
>> > like Fluented.
>> >
>> > For example, a monitor daemon can be started by the following command:
>> > # ndctl monitor --dimm nmem1 --log /var/log/monitor.log --daemon
>> daemon-name
>> >
>> > Then check the monitor daemon status by using systemd:
>> > # systemctl status ndctl-monitor@daemon-name.service
>> >
>> > To stop the monitor daemon by:
>> > # systemctl stop ndctl-monitor@daemon-name.service
>> >
>> > Also, a monitor daemon can be started by systemd:
>> > # systemctl start ndctl-monitor.service
>> > Which monitors all dimms.
>> >
>> > In this implemention, when a ndctl monitor starts with [--daemon] option, all
>> > the arguments will be saved into a temp file named as daemon-name and placed
>> > under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
>> > EnvironmentFile by systemd, and it would be deleted automatically when the
>> > systemd service is stopped.
>>
>> The monitors started by hand should be kept separate from the monitors
>> started by systemd. The default monitor started by systemd should get
>> its configuration from /etc/ndctl.conf, and we should otherwise have a
>> --conf-file option to the monitor to override the default
>> configuration. Any other monitors started outside of the systemd
>> should remain independent.
>>
>
> I prefer to add an EnvironmentFile like /etc/sysconfig/ndctl/monitor to systemd
> rather than add a configuration file. According to [1], environment variable
> substitution is supported in systemd.service, so we can define the variables through
> "EnvironmentFile=/etc/sysconfig/ndctl/monitor".
> In this fashion, we do not need to add any extra codes to parse the configuration file.
>
> In this case, [--conf-file] option is not necessary either.
> According to [2], sytemd units can be instantiated from a template file, thus we only
> need to add a template unit file in advance.
> If user wants to run multiple monitors with different configurations, they can differentiate
> them by adding multiple EnvironmentFiles, like /etc/sysconfig/ndctl/<monitor1...n>.
> Then the monitors can be started by command like
> "# systemctl start ndctl-monitor@<monitor1...n>.service".
>
> When the monitors started by hand, it will do not need any configuration files,
> because we can add options and parameters directly.
>
> [1]https://www.freedesktop.org/software/systemd/man/systemd.service.html
> [2]https://www.freedesktop.org/software/systemd/man/systemd.unit.html

This seems to needlessly tie ndctl to systemd, it should be able to
operate without requiring systemd. I expect it would be
straightforward to copy the configuration file implementation from
git.
QI Fuli March 30, 2018, 7:34 a.m. UTC | #14
> >> > In this implemention, when a ndctl monitor starts with [--daemon]
> >> > option, all the arguments will be saved into a temp file named as
> >> > daemon-name and placed under /etc/sysconfig/ndctl/ directory. The
> >> > temp file would be used as an EnvironmentFile by systemd, and it
> >> > would be deleted automatically when the systemd service is stopped.
> >>
> >> The monitors started by hand should be kept separate from the
> >> monitors started by systemd. The default monitor started by systemd
> >> should get its configuration from /etc/ndctl.conf, and we should
> >> otherwise have a --conf-file option to the monitor to override the
> >> default configuration. Any other monitors started outside of the
> >> systemd should remain independent.
> >>
> >
> > I prefer to add an EnvironmentFile like /etc/sysconfig/ndctl/monitor
> > to systemd rather than add a configuration file. According to [1],
> > environment variable substitution is supported in systemd.service, so
> > we can define the variables through
> "EnvironmentFile=/etc/sysconfig/ndctl/monitor".
> > In this fashion, we do not need to add any extra codes to parse the configuration
> file.
> >
> > In this case, [--conf-file] option is not necessary either.
> > According to [2], sytemd units can be instantiated from a template
> > file, thus we only need to add a template unit file in advance.
> > If user wants to run multiple monitors with different configurations,
> > they can differentiate them by adding multiple EnvironmentFiles, like
> /etc/sysconfig/ndctl/<monitor1...n>.
> > Then the monitors can be started by command like "# systemctl start
> > ndctl-monitor@<monitor1...n>.service".
> >
> > When the monitors started by hand, it will do not need any
> > configuration files, because we can add options and parameters directly.
> >
> > [1]https://www.freedesktop.org/software/systemd/man/systemd.service.ht
> > ml
> > [2]https://www.freedesktop.org/software/systemd/man/systemd.unit.html
> 
> This seems to needlessly tie ndctl to systemd, it should be able to operate without
> requiring systemd. I expect it would be straightforward to copy the configuration file
> implementation from git.

Would you like to explain why it should be able to operate without requiring systemd?
The reason we want to add the configuration file is that when starting the monitors by systemd,
options and arguments cannot be passed by systemd.
When we start monitors by using "# ndctl monitor" command, we do not need any
configuration files, because options and arguments can be added directly.

Furthermore, if we choose configuration file implementation, we may encounter the following problem. 
If we start the monitor with command like "# ndctl monitor --dimm nmem1 --daemon --conf-file /etc/ndctl.conf",
when the variable of dimm in configuration file is not the same as the argument of [--dimm] option,
which argument should the filter_dimm() refer to?
Dan Williams March 30, 2018, 4:34 p.m. UTC | #15
On Fri, Mar 30, 2018 at 12:34 AM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> >> > In this implemention, when a ndctl monitor starts with [--daemon]
>> >> > option, all the arguments will be saved into a temp file named as
>> >> > daemon-name and placed under /etc/sysconfig/ndctl/ directory. The
>> >> > temp file would be used as an EnvironmentFile by systemd, and it
>> >> > would be deleted automatically when the systemd service is stopped.
>> >>
>> >> The monitors started by hand should be kept separate from the
>> >> monitors started by systemd. The default monitor started by systemd
>> >> should get its configuration from /etc/ndctl.conf, and we should
>> >> otherwise have a --conf-file option to the monitor to override the
>> >> default configuration. Any other monitors started outside of the
>> >> systemd should remain independent.
>> >>
>> >
>> > I prefer to add an EnvironmentFile like /etc/sysconfig/ndctl/monitor
>> > to systemd rather than add a configuration file. According to [1],
>> > environment variable substitution is supported in systemd.service, so
>> > we can define the variables through
>> "EnvironmentFile=/etc/sysconfig/ndctl/monitor".
>> > In this fashion, we do not need to add any extra codes to parse the configuration
>> file.
>> >
>> > In this case, [--conf-file] option is not necessary either.
>> > According to [2], sytemd units can be instantiated from a template
>> > file, thus we only need to add a template unit file in advance.
>> > If user wants to run multiple monitors with different configurations,
>> > they can differentiate them by adding multiple EnvironmentFiles, like
>> /etc/sysconfig/ndctl/<monitor1...n>.
>> > Then the monitors can be started by command like "# systemctl start
>> > ndctl-monitor@<monitor1...n>.service".
>> >
>> > When the monitors started by hand, it will do not need any
>> > configuration files, because we can add options and parameters directly.
>> >
>> > [1]https://www.freedesktop.org/software/systemd/man/systemd.service.ht
>> > ml
>> > [2]https://www.freedesktop.org/software/systemd/man/systemd.unit.html
>>
>> This seems to needlessly tie ndctl to systemd, it should be able to operate without
>> requiring systemd. I expect it would be straightforward to copy the configuration file
>> implementation from git.
>
> Would you like to explain why it should be able to operate without requiring systemd?

systemd is not universally available in all distributions and it is
useful to have custom configuration files for manually started
daemons.

Another consideration is that a sub-set of monitoring activities can
be done without root privileges. It would be unforunate if a user
could not specify changes to the configuration because those files are
root-only and owned by systemd.

> The reason we want to add the configuration file is that when starting the monitors by systemd,
> options and arguments cannot be passed by systemd.
> When we start monitors by using "# ndctl monitor" command, we do not need any
> configuration files, because options and arguments can be added directly.
>
> Furthermore, if we choose configuration file implementation, we may encounter the following problem.
> If we start the monitor with command like "# ndctl monitor --dimm nmem1 --daemon --conf-file /etc/ndctl.conf",
> when the variable of dimm in configuration file is not the same as the argument of [--dimm] option,
> which argument should the filter_dimm() refer to?

The command line option should union with the configuration file for
filtering options. I.e. if the config file says to monitor nmem0 and
the command line says to monitor nmem1 then the tool should monitor
both. If there are other settings the might be in conflict then
command line should override the configuration file.

This matches the policy of most daemons that I have encountered in Linux.
QI Fuli April 2, 2018, 12:10 a.m. UTC | #16
> >>
> >> This seems to needlessly tie ndctl to systemd, it should be able to
> >> operate without requiring systemd. I expect it would be
> >> straightforward to copy the configuration file implementation from git.
> >
> > Would you like to explain why it should be able to operate without requiring
> systemd?
> 
> systemd is not universally available in all distributions and it is useful to have custom
> configuration files for manually started daemons.
> 
> Another consideration is that a sub-set of monitoring activities can be done without
> root privileges. It would be unforunate if a user could not specify changes to the
> configuration because those files are root-only and owned by systemd.
> 
> > The reason we want to add the configuration file is that when starting
> > the monitors by systemd, options and arguments cannot be passed by systemd.
> > When we start monitors by using "# ndctl monitor" command, we do not
> > need any configuration files, because options and arguments can be added
> directly.
> >
> > Furthermore, if we choose configuration file implementation, we may encounter
> the following problem.
> > If we start the monitor with command like "# ndctl monitor --dimm
> > nmem1 --daemon --conf-file /etc/ndctl.conf", when the variable of dimm
> > in configuration file is not the same as the argument of [--dimm] option, which
> argument should the filter_dimm() refer to?
> 
> The command line option should union with the configuration file for filtering options.
> I.e. if the config file says to monitor nmem0 and the command line says to monitor
> nmem1 then the tool should monitor both. If there are other settings the might be
> in conflict then command line should override the configuration file.
> 
> This matches the policy of most daemons that I have encountered in Linux.

OK, I see. Thank you very much.
Jeff Moyer April 4, 2018, 2:28 p.m. UTC | #17
"Qi, Fuli" <qi.fuli@jp.fujitsu.com> writes:

>> > In this implemention, when a ndctl monitor starts with [--daemon] option, all
>> > the arguments will be saved into a temp file named as daemon-name and placed
>> > under /etc/sysconfig/ndctl/ directory. The temp file would be used as an
>> > EnvironmentFile by systemd, and it would be deleted automatically when the
>> > systemd service is stopped.

IIRC, /etc/sysconfig is not present on non-Red Hat OSes.  Additionally,
for Red Hat OSes, /etc/sysconfig/<app>/ does not replace daemon
configuration files.

Cheers,
Jeff
QI Fuli April 5, 2018, 9:08 p.m. UTC | #18
> -----Original Message-----
> From: Jeff Moyer [mailto:jmoyer@redhat.com]
> Sent: Wednesday, April 4, 2018 11:29 PM
> To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>
> Cc: 'Dan Williams' <dan.j.williams@intel.com>; linux-nvdimm
> <linux-nvdimm@lists.01.org>
> Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon
> 
> "Qi, Fuli" <qi.fuli@jp.fujitsu.com> writes:
> 
> >> > In this implemention, when a ndctl monitor starts with [--daemon]
> >> > option, all the arguments will be saved into a temp file named as
> >> > daemon-name and placed under /etc/sysconfig/ndctl/ directory. The
> >> > temp file would be used as an EnvironmentFile by systemd, and it
> >> > would be deleted automatically when the systemd service is stopped.
> 
> IIRC, /etc/sysconfig is not present on non-Red Hat OSes.  Additionally, for Red Hat
> OSes, /etc/sysconfig/<app>/ does not replace daemon configuration files.
> 
Thank you very much for your information.
QI Fuli April 5, 2018, 11:17 p.m. UTC | #19
> >
> > I prefer to add an EnvironmentFile like /etc/sysconfig/ndctl/monitor
> > to systemd rather than add a configuration file. According to [1],
> > environment variable substitution is supported in systemd.service, so
> > we can define the variables through
> "EnvironmentFile=/etc/sysconfig/ndctl/monitor".
> > In this fashion, we do not need to add any extra codes to parse the configuration
> file.
> >
> > In this case, [--conf-file] option is not necessary either.
> > According to [2], sytemd units can be instantiated from a template
> > file, thus we only need to add a template unit file in advance.
> > If user wants to run multiple monitors with different configurations,
> > they can differentiate them by adding multiple EnvironmentFiles, like
> /etc/sysconfig/ndctl/<monitor1...n>.
> > Then the monitors can be started by command like "# systemctl start
> > ndctl-monitor@<monitor1...n>.service".
> >
> > When the monitors started by hand, it will do not need any
> > configuration files, because we can add options and parameters directly.
> >
> > [1]https://www.freedesktop.org/software/systemd/man/systemd.service.ht
> > ml
> > [2]https://www.freedesktop.org/software/systemd/man/systemd.unit.html
> 
> This seems to needlessly tie ndctl to systemd, it should be able to operate without
> requiring systemd. I expect it would be straightforward to copy the configuration file
> implementation from git.

I have read the configuration file implementation of git, my understanding is that git daemon does not
have any options used to override default configuration. 
I want to confirm if the configuration file is only used for ndctl monitor.
If yes, I do not think copy the configuration file implementation from git is a good choice,
because only getting keys and values from configuration file is needed for us and the structure of
configuration file implementation in git is too complexity. 
I prefer to borrow from udev[1], because the implementation in udev is simpler and it seems ndctl also borrows a lot from udev.

[1] https://git.kernel.org/pub/scm/linux/hotplug/udev.git/tree/src/libudev.c
Dan Williams April 6, 2018, 7:02 p.m. UTC | #20
bus="<bus filter option>"


On Thu, Apr 5, 2018 at 4:17 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
[..]
>> This seems to needlessly tie ndctl to systemd, it should be able to operate without
>> requiring systemd. I expect it would be straightforward to copy the configuration file
>> implementation from git.
>
> I have read the configuration file implementation of git, my understanding is that git daemon does not
> have any options used to override default configuration.
> I want to confirm if the configuration file is only used for ndctl monitor.
> If yes, I do not think copy the configuration file implementation from git is a good choice,
> because only getting keys and values from configuration file is needed for us and the structure of
> configuration file implementation in git is too complexity.
> I prefer to borrow from udev[1], because the implementation in udev is simpler and it seems ndctl also borrows a lot from udev.
>
> [1] https://git.kernel.org/pub/scm/linux/hotplug/udev.git/tree/src/libudev.c

Thank you for doing the due diligence on this investigation it is appreciated.

I think the simple udev approach is acceptable. Going back to the
proposed command line options of: --dimm-events, --namespace-events,
--region-events, --bus-events and device filter selectors (like the
ndctl list options) we can just have variables in the config file for
those, so:

dimm-events="<list of dimm events>"
namespace-events=="<list of namespace events>"
region-events=="<list of region events>"
bus-events="<list of bus events>"
dimm="<bus filter option>"
dimm="<dimm filter option>"
region="<region filter option>"
namespace="<namespace filter option>"

Thoughts?
QI Fuli April 9, 2018, 8:38 a.m. UTC | #21
> -----Original Message-----

> From: Dan Williams [mailto:dan.j.williams@intel.com]

> Sent: Saturday, April 7, 2018 4:03 AM

> To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>

> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>

> Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon

> 

> bus="<bus filter option>"

> 

> 

> On Thu, Apr 5, 2018 at 4:17 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:

> [..]

> >> This seems to needlessly tie ndctl to systemd, it should be able to

> >> operate without requiring systemd. I expect it would be

> >> straightforward to copy the configuration file implementation from git.

> >

> > I have read the configuration file implementation of git, my

> > understanding is that git daemon does not have any options used to override

> default configuration.

> > I want to confirm if the configuration file is only used for ndctl monitor.

> > If yes, I do not think copy the configuration file implementation from

> > git is a good choice, because only getting keys and values from

> > configuration file is needed for us and the structure of configuration file

> implementation in git is too complexity.

> > I prefer to borrow from udev[1], because the implementation in udev is simpler

> and it seems ndctl also borrows a lot from udev.

> >

> > [1]

> > https://git.kernel.org/pub/scm/linux/hotplug/udev.git/tree/src/libudev

> > .c

> 

> Thank you for doing the due diligence on this investigation it is appreciated.

> 

> I think the simple udev approach is acceptable. Going back to the proposed

> command line options of: --dimm-events, --namespace-events, --region-events,

> --bus-events and device filter selectors (like the ndctl list options) we can just have

> variables in the config file for those, so:

> 

> dimm-events="<list of dimm events>"

> namespace-events=="<list of namespace events>"

> region-events=="<list of region events>"

> bus-events="<list of bus events>"

> dimm="<bus filter option>"

> dimm="<dimm filter option>"

> region="<region filter option>"

> namespace="<namespace filter option>"

> 

> Thoughts?


Thank you very much.

I think the "logfile=<logfile path>" is also needed in the configuration file.

One more question is that do you think it is necessary for ndctl list to support "one option multiple arguments"?
Currently, only "one option one argument" is allowed in ndctl list like "ndctl list --dimm nmem1 --bus ndbus1".
Due to monitor should support "one option multiple arguments" like "ndctl monitor --dimm nmem1,nmem2",
I am thinking modify the "strut util_filter_params" in util/filter.h, change the "const char *bus" to "char **buses"
or "link_list bus" and refactor the util_filter_walk in util/filter.h, then ndctl list also can support "ndctl list --dimm nmem1,nmem2".
Dan Williams April 9, 2018, 5:45 p.m. UTC | #22
On Mon, Apr 9, 2018 at 1:38 AM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>
>
>> -----Original Message-----
>> From: Dan Williams [mailto:dan.j.williams@intel.com]
>> Sent: Saturday, April 7, 2018 4:03 AM
>> To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>
>> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
>> Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon
>>
>> bus="<bus filter option>"
>>
>>
>> On Thu, Apr 5, 2018 at 4:17 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> [..]
>> >> This seems to needlessly tie ndctl to systemd, it should be able to
>> >> operate without requiring systemd. I expect it would be
>> >> straightforward to copy the configuration file implementation from git.
>> >
>> > I have read the configuration file implementation of git, my
>> > understanding is that git daemon does not have any options used to override
>> default configuration.
>> > I want to confirm if the configuration file is only used for ndctl monitor.
>> > If yes, I do not think copy the configuration file implementation from
>> > git is a good choice, because only getting keys and values from
>> > configuration file is needed for us and the structure of configuration file
>> implementation in git is too complexity.
>> > I prefer to borrow from udev[1], because the implementation in udev is simpler
>> and it seems ndctl also borrows a lot from udev.
>> >
>> > [1]
>> > https://git.kernel.org/pub/scm/linux/hotplug/udev.git/tree/src/libudev
>> > .c
>>
>> Thank you for doing the due diligence on this investigation it is appreciated.
>>
>> I think the simple udev approach is acceptable. Going back to the proposed
>> command line options of: --dimm-events, --namespace-events, --region-events,
>> --bus-events and device filter selectors (like the ndctl list options) we can just have
>> variables in the config file for those, so:
>>
>> dimm-events="<list of dimm events>"
>> namespace-events=="<list of namespace events>"
>> region-events=="<list of region events>"
>> bus-events="<list of bus events>"
>> dimm="<bus filter option>"
>> dimm="<dimm filter option>"
>> region="<region filter option>"
>> namespace="<namespace filter option>"
>>
>> Thoughts?
>
> Thank you very much.
>
> I think the "logfile=<logfile path>" is also needed in the configuration file.
>
> One more question is that do you think it is necessary for ndctl list to support "one option multiple arguments"?
> Currently, only "one option one argument" is allowed in ndctl list like "ndctl list --dimm nmem1 --bus ndbus1".
> Due to monitor should support "one option multiple arguments" like "ndctl monitor --dimm nmem1,nmem2",
> I am thinking modify the "strut util_filter_params" in util/filter.h, change the "const char *bus" to "char **buses"
> or "link_list bus" and refactor the util_filter_walk in util/filter.h, then ndctl list also can support "ndctl list --dimm nmem1,nmem2".

That would be a welcome feature. Given that we already support the
"all" keyword to match multiple objects in a filter it should be
straightforward to support lists of objects.

The only concern is syntax. I think since the label commands
"init-labels, read-labels, write-labels" takes a space separated list
of objects I think we should do the same specifying multiple object
names to the filter.

I.e. a request to list multiple specific dimms would be:

    ndctl list --dimm="nmem1 nmem2 nmem3"

...space separation instead of comma allows you to rewrite that as:

    ndctl list --dimm="$(echo nmem{1,2,3)"
QI Fuli April 10, 2018, 2:15 a.m. UTC | #23
> >
> > Thank you very much.
> >
> > I think the "logfile=<logfile path>" is also needed in the configuration file.
> >
> > One more question is that do you think it is necessary for ndctl list to support "one
> option multiple arguments"?
> > Currently, only "one option one argument" is allowed in ndctl list like "ndctl list
> --dimm nmem1 --bus ndbus1".
> > Due to monitor should support "one option multiple arguments" like
> > "ndctl monitor --dimm nmem1,nmem2", I am thinking modify the "strut
> util_filter_params" in util/filter.h, change the "const char *bus" to "char **buses"
> > or "link_list bus" and refactor the util_filter_walk in util/filter.h, then ndctl list also
> can support "ndctl list --dimm nmem1,nmem2".
> 
> That would be a welcome feature. Given that we already support the "all" keyword to
> match multiple objects in a filter it should be straightforward to support lists of
> objects.
> 
> The only concern is syntax. I think since the label commands "init-labels, read-labels,
> write-labels" takes a space separated list of objects I think we should do the same
> specifying multiple object names to the filter.
> 
> I.e. a request to list multiple specific dimms would be:
> 
>     ndctl list --dimm="nmem1 nmem2 nmem3"
> 
> ...space separation instead of comma allows you to rewrite that as:
> 
>     ndctl list --dimm="$(echo nmem{1,2,3)"

Ok, I see.
I will make a patch to refactor the util_filter_walk() and ndctl list.
Verma, Vishal L April 10, 2018, 3:06 a.m. UTC | #24
On Mon, 2018-04-09 at 10:45 -0700, Dan Williams wrote:
> On Mon, Apr 9, 2018 at 1:38 AM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:

> > 

> > 

> > > -----Original Message-----

> > > From: Dan Williams [mailto:dan.j.williams@intel.com]

> > > Sent: Saturday, April 7, 2018 4:03 AM

> > > To: Qi, Fuli/斉 福利 <qi.fuli@jp.fujitsu.com>

> > > Cc: linux-nvdimm <linux-nvdimm@lists.01.org>

> > > Subject: Re: [RFC PATCH v4] ndctl: monitor: add ndctl monitor daemon

> > > 

> > > bus="<bus filter option>"

> > > 

> > > 

> > > On Thu, Apr 5, 2018 at 4:17 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com>

> > > wrote:

> > > [..]

> > > > > This seems to needlessly tie ndctl to systemd, it should be able

> > > > > to

> > > > > operate without requiring systemd. I expect it would be

> > > > > straightforward to copy the configuration file implementation

> > > > > from git.

> > > > 

> > > > I have read the configuration file implementation of git, my

> > > > understanding is that git daemon does not have any options used to

> > > > override

> > > 

> > > default configuration.

> > > > I want to confirm if the configuration file is only used for ndctl

> > > > monitor.

> > > > If yes, I do not think copy the configuration file implementation

> > > > from

> > > > git is a good choice, because only getting keys and values from

> > > > configuration file is needed for us and the structure of

> > > > configuration file

> > > 

> > > implementation in git is too complexity.

> > > > I prefer to borrow from udev[1], because the implementation in udev

> > > > is simpler

> > > 

> > > and it seems ndctl also borrows a lot from udev.

> > > > 

> > > > [1]

> > > > https://git.kernel.org/pub/scm/linux/hotplug/udev.git/tree/src/libu

> > > > dev

> > > > .c

> > > 

> > > Thank you for doing the due diligence on this investigation it is

> > > appreciated.

> > > 

> > > I think the simple udev approach is acceptable. Going back to the

> > > proposed

> > > command line options of: --dimm-events, --namespace-events, --region-

> > > events,

> > > --bus-events and device filter selectors (like the ndctl list

> > > options) we can just have

> > > variables in the config file for those, so:

> > > 

> > > dimm-events="<list of dimm events>"

> > > namespace-events=="<list of namespace events>"

> > > region-events=="<list of region events>"

> > > bus-events="<list of bus events>"

> > > dimm="<bus filter option>"

> > > dimm="<dimm filter option>"

> > > region="<region filter option>"

> > > namespace="<namespace filter option>"

> > > 

> > > Thoughts?

> > 

> > Thank you very much.

> > 

> > I think the "logfile=<logfile path>" is also needed in the

> > configuration file.

> > 

> > One more question is that do you think it is necessary for ndctl list

> > to support "one option multiple arguments"?

> > Currently, only "one option one argument" is allowed in ndctl list like

> > "ndctl list --dimm nmem1 --bus ndbus1".

> > Due to monitor should support "one option multiple arguments" like

> > "ndctl monitor --dimm nmem1,nmem2",

> > I am thinking modify the "strut util_filter_params" in util/filter.h,

> > change the "const char *bus" to "char **buses"

> > or "link_list bus" and refactor the util_filter_walk in util/filter.h,

> > then ndctl list also can support "ndctl list --dimm nmem1,nmem2".

> 

> That would be a welcome feature. Given that we already support the

> "all" keyword to match multiple objects in a filter it should be

> straightforward to support lists of objects.

> 

> The only concern is syntax. I think since the label commands

> "init-labels, read-labels, write-labels" takes a space separated list

> of objects I think we should do the same specifying multiple object

> names to the filter.


Agreed with space separated arguments. While at it, it would also be nice
to extend the multiple argument syntax to other commands, for example: 

	ndctl disable-namespace namespace1.0 namespace2.0 ...

> 

> I.e. a request to list multiple specific dimms would be:

> 

>     ndctl list --dimm="nmem1 nmem2 nmem3"

> 

> ...space separation instead of comma allows you to rewrite that as:

> 

>     ndctl list --dimm="$(echo nmem{1,2,3)"

> _______________________________________________

> Linux-nvdimm mailing list

> Linux-nvdimm@lists.01.org

> https://lists.01.org/mailman/listinfo/linux-nvdimm
diff mbox

Patch

diff --git a/builtin.h b/builtin.h
index b24fc99..4b908f0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -36,6 +36,7 @@  int cmd_write_labels(int argc, const char **argv, void *ctx);
 int cmd_init_labels(int argc, const char **argv, void *ctx);
 int cmd_check_labels(int argc, const char **argv, void *ctx);
 int cmd_inject_error(int argc, const char **argv, void *ctx);
+int cmd_monitor(int argc, const char **argv, void *ctx);
 int cmd_list(int argc, const char **argv, void *ctx);
 #ifdef ENABLE_TEST
 int cmd_test(int argc, const char **argv, void *ctx);
diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
index e0db97b..e364ef9 100644
--- a/ndctl/Makefile.am
+++ b/ndctl/Makefile.am
@@ -16,7 +16,8 @@  ndctl_SOURCES = ndctl.c \
 		util/json-firmware.c \
 		inject-error.c \
 		update.c \
-		inject-smart.c
+		inject-smart.c \
+		monitor.c
 
 if ENABLE_DESTRUCTIVE
 ndctl_SOURCES += ../test/blk_namespaces.c \
@@ -41,3 +42,13 @@  ndctl_SOURCES += ../test/libndctl.c \
 		 ../test/multi-pmem.c \
 		 ../test/core.c
 endif
+
+unitfiles =\
+	ndctl-monitor.service \
+	ndctl-monitor@.service
+
+unitdir = /usr/lib/systemd/system/
+
+unit_DATA = $(unitfiles)
+
+EXTRA_DIST = $(unitfiles)
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
new file mode 100644
index 0000000..2164f27
--- /dev/null
+++ b/ndctl/monitor.c
@@ -0,0 +1,411 @@ 
+/*
+ * Copyright (c) 2018, FUJITSU LIMITED. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU Lesser General Public License,
+ * version 2.1, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT ANY
+ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+ * more details.
+ */
+#include <stdio.h>
+#include <json-c/json.h>
+#include <signal.h>
+#include <libgen.h>
+#include <dirent.h>
+#include <util/parse-options.h>
+#include <util/log.h>
+#include <util/json.h>
+#include <util/filter.h>
+#include <ndctl/lib/private.h>
+#include <ndctl/libndctl.h>
+#include <sys/stat.h>
+#define NUM_MAX_DIMM 1024
+#define BUF_SIZE 4096
+
+struct monitor_dimm {
+	struct ndctl_dimm *dimm;
+	int health_eventfd;
+	struct list_node list;
+};
+
+struct monitor_filter_arg {
+	struct list_head mdimm;
+	int maxfd;
+	fd_set fds;
+	int num_dimm;
+	unsigned long flags;
+};
+
+struct util_filter_params param;
+
+static char *conf_dir = "/etc/sysconfig/ndctl/";
+static char *def_log_dir = "/var/log/ndctl/";
+
+static char *concat(char *str1, char *str2)
+{
+	char *result = malloc(strlen(str1) + strlen(str2) + 1);
+	strcpy(result, str1);
+	strcat(result, str2);
+	return result;
+}
+
+static bool is_dir(char *filepath)
+{
+	DIR *dir = opendir(filepath);
+	if (dir) {
+		closedir(dir);
+		return true;
+	}
+	return false;
+}
+
+static int recur_mkdir(char *filepath, mode_t mode)
+{
+	char *p;
+	char *buf = (char *)malloc(strlen(filepath) + 4);
+
+	strcpy(buf, filepath);
+	for (p = strchr(buf + 1, '/'); p; p = strchr(p + 1, '/')) {
+		*p = '\0';
+		if (!is_dir(buf)) {
+			if (mkdir(buf, mode) < 0) {
+				free(buf);
+				return -1;
+			}
+		}
+		*p = '/';
+	}
+
+	free(buf);
+	return 0;
+}
+
+static void log_file(struct ndctl_ctx *ctx, int priority, const char *file,
+	int line, const char *fn, const char *format, va_list args)
+{
+	FILE *f;
+	char *log_name, *log_dir, *buf;
+	char *errmsg = (char *)malloc(BUF_SIZE);
+	char *tail = "/";
+
+	log_dir = dirname(strdup(param.log));
+	log_name = basename(strdup(param.log));
+	if (strcmp(log_dir, ".") == 0)
+		log_dir = def_log_dir;
+	else
+		log_dir = concat(log_dir, tail);
+	if (log_dir[0] != '/')
+		log_dir = concat(def_log_dir, log_dir);
+	log_name = concat(log_dir, log_name);
+
+	if (!is_dir(log_dir)) {
+		if (recur_mkdir(log_dir, 0744) != 0) {
+			sprintf(errmsg, "cannot create dir: %s", log_dir);
+			goto out;
+		}
+	}
+
+	f = fopen(log_name, "a+");
+	if (!f) {
+		sprintf(errmsg, "open %s failed", log_name);
+		goto out;
+	}
+
+	buf = (char *)malloc(BUF_SIZE);
+	if (!buf) {
+		sprintf(errmsg, "cannot get memory for log_file");
+		goto out;
+	}
+	vsnprintf(buf, BUF_SIZE, format, args);
+	fprintf(f, "%s\n", buf);
+	free(buf);
+	fclose(f);
+	return;
+out:
+	syslog(LOG_ERR, "%s\n", errmsg);
+	if (!param.fork)
+		error("%s\n", errmsg);
+	free(errmsg);
+	exit(EXIT_FAILURE);
+}
+
+static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
+	int line, const char *fn, const char *format, va_list args)
+{
+	char *buf = (char *)malloc(BUF_SIZE);
+	vsnprintf(buf, BUF_SIZE, format, args);
+	syslog(priority, "%s", buf);
+	free(buf);
+}
+
+#define fail(fmt, ...) \
+do { \
+	err(ctx, "ndctl-%s:%s:%d: " fmt, \
+				VERSION, __func__, __LINE__, ##__VA_ARGS__); \
+	if (!param.fork) \
+		fprintf(stderr, "ndctl-%s:%s:%d: " fmt, \
+				VERSION, __func__, __LINE__, ##__VA_ARGS__); \
+} while (0)
+
+static int notify_json_msg(struct ndctl_ctx *ctx, struct ndctl_dimm *dimm)
+{
+	time_t c_time;
+	char date[32];
+	struct json_object *jmsg, *jdatetime, *jpid, *jdimm, *jhealth;
+
+	jmsg = json_object_new_object();
+	if (!jmsg) {
+		fail("\n");
+		return -1;
+	}
+
+	c_time = time(NULL);
+	strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time));
+	jdatetime = json_object_new_string(date);
+	if (!jdatetime) {
+		fail("\n");
+		return -1;
+	}
+	json_object_object_add(jmsg, "datetime", jdatetime);
+
+	jpid = json_object_new_int((int)getpid());
+	if (!jpid) {
+		fail("\n");
+		return -1;
+	}
+	json_object_object_add(jmsg, "pid", jpid);
+
+	jdimm = util_dimm_to_json(dimm, 0);
+	if (!dimm) {
+		fail("\n");
+		return -1;
+	}
+	json_object_object_add(jmsg, "dimm", jdimm);
+
+	jhealth = util_dimm_health_to_json(dimm);
+	if (!jhealth) {
+		fail("\n");
+		return -1;
+	}
+	json_object_object_add(jdimm, "health", jhealth);
+
+	notice(ctx, "%s",
+		json_object_to_json_string_ext(jmsg, JSON_C_TO_STRING_PLAIN));
+	if (!param.fork)
+		printf("%s\n", json_object_to_json_string_ext(jmsg,
+				JSON_C_TO_STRING_PRETTY));
+	return 0;
+}
+
+static void filter_dimm(struct ndctl_dimm *dimm, struct util_filter_ctx *ctx)
+{
+	struct monitor_filter_arg *mfa = (struct monitor_filter_arg *)ctx->arg;
+	int fd;
+	char buf[BUF_SIZE];
+
+	if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_SMART_THRESHOLD))
+		return;
+
+	struct monitor_dimm *m_dimm = malloc(sizeof(*m_dimm));
+	m_dimm->dimm = dimm;
+	fd = ndctl_dimm_get_health_eventfd(dimm);
+	pread(fd, buf, sizeof(buf), 0);
+	m_dimm->health_eventfd = fd;
+	list_add_tail(&mfa->mdimm, &m_dimm->list);
+	FD_SET(fd, &mfa->fds);
+	if (fd > mfa->maxfd)
+		mfa->maxfd = fd;
+	mfa->num_dimm++;
+}
+
+static int monitor_smart_event(struct ndctl_ctx *ctx)
+{
+	struct util_filter_ctx fctx = { 0 };
+	struct monitor_filter_arg mfa = { 0 };
+	int rc;
+	char buf[BUF_SIZE];
+	char *errmsg = (char *)malloc(BUF_SIZE);
+
+	fctx.filter_dimm = filter_dimm;
+	fctx.arg = &mfa;
+	mfa.flags = 0;
+	list_head_init(&mfa.mdimm);
+	FD_ZERO(&mfa.fds);
+
+	rc = util_filter_walk(ctx, &fctx, &param);
+	if (rc)
+		goto out;
+	if (mfa.num_dimm == 0) {
+		errmsg = "no monitor dimms can be found";
+		goto out;
+	}
+
+	while(1){
+		rc = select(mfa.maxfd + 1, NULL, NULL, &mfa.fds, NULL);
+		if (rc < 1) {
+			errmsg = "select error";
+			if (rc == 0)
+				dbg(ctx, "select unexpected timeout\n");
+			else
+				dbg(ctx, "select %s\n", strerror(errno));
+			goto out;
+		}
+		struct monitor_dimm *m_dimm;
+		list_for_each(&mfa.mdimm, m_dimm, list) {
+			if (!FD_ISSET(m_dimm->health_eventfd, &mfa.fds)) {
+				FD_SET(m_dimm->health_eventfd, &mfa.fds);
+				continue;
+			}
+			if (notify_json_msg(ctx, m_dimm->dimm) != 0)
+				goto out;
+			pread(m_dimm->health_eventfd, buf, sizeof(buf), 0);
+		}
+	}
+	return 0;
+out:
+	if (errmsg) {
+		if (!param.fork)
+			error("%s\n", errmsg);
+		err(ctx, "%s\n", errmsg);
+	}
+	return 1;
+}
+
+static int create_confile(char *conf_path)
+{
+	FILE *f;
+	char *buf;
+
+	if (!is_dir(conf_dir)) {
+		if (recur_mkdir(conf_dir, 0744) != 0) {
+			error("cannot create dir: %s\n", conf_dir);
+			goto out;
+		}
+	}
+
+	f = fopen(conf_path, "w");
+	if (!f) {
+		error("open %s failed\n", conf_path);
+		goto out;
+	}
+
+	buf = (char *)malloc(BUF_SIZE);
+	if (!buf) {
+		error("cannot get memory for daemon config file\n");
+		goto out;
+	}
+	strcpy(buf, "OPTIONS=-f");
+	if (param.bus) {
+		strcat(buf, " -b ");
+		strcat(buf, param.bus);
+	}
+	if (param.dimm) {
+		strcat(buf, " -d ");
+		strcat(buf, param.dimm);
+	}
+	if (param.namespace) {
+		strcat(buf, " -n ");
+		strcat(buf, param.namespace);
+	}
+	if (param.region) {
+		strcat(buf, " -r ");
+		strcat(buf, param.region);
+	}
+	if (param.log) {
+		strcat(buf, " -l ");
+		strcat(buf, param.log);
+	}
+	fprintf(f, "%s", buf);
+	fclose(f);
+	free(buf);
+	return 0;
+out:
+	return 1;
+}
+
+static bool is_monitor_exist(void)
+{
+	char *conf_path = strdup(param.daemon);
+	conf_path = concat(conf_dir, conf_path);
+	FILE *f = fopen(conf_path, "r");
+	if (f) {
+		fclose(f);
+		return true;
+	}
+	return false;
+}
+
+int cmd_monitor(int argc, const char **argv, void *ctx)
+{
+	const struct option options[] = {
+		OPT_STRING('b', "bus", &param.bus, "bus-id", "filter by bus"),
+		OPT_STRING('r', "region", &param.region, "region-id",
+				"filter by region"),
+		OPT_STRING('d', "dimm", &param.dimm, "dimm-id",
+				"filter by dimm"),
+		OPT_STRING('n', "namespace", &param.namespace,
+				"namespace-id", "filter by namespace id"),
+		OPT_STRING('l', "log", &param.log, "log name",
+				"monitor logfile"),
+		OPT_STRING('D',"daemon", &param.daemon, "daemon-name",
+				"run ndctl monitor as a daemon"),
+		OPT_BOOLEAN_HID('f', &param.fork),
+		OPT_END(),
+	};
+	const char * const u[] = {
+		"ndctl monitor [<options>]",
+		NULL
+	};
+	argc = parse_options(argc, argv, options, u, 0);
+	for (int i = 0; i < argc; i++) {
+		error("unknown parameter \"%s\"\n", argv[i]);
+		goto out;
+	}
+	if (argc)
+		usage_with_options(u, options);
+
+	if (param.daemon) {
+		if (is_monitor_exist()) {
+			error("monitor %s is exist\n", param.daemon);
+			goto out;
+		}
+		char *conf_path = strdup(param.daemon);
+		conf_path = concat(conf_dir, conf_path);
+		if (create_confile(conf_path) != 0)
+			goto out;
+		char *sys_cmd = (char *)malloc(BUF_SIZE);
+		sprintf(sys_cmd, "systemctl start ndctl-monitor@%s.service",
+				param.daemon);
+		if (system(sys_cmd) != 0) {
+			free(sys_cmd);
+			remove(conf_path);
+			goto out;
+		}
+		free(sys_cmd);
+		return 0;
+	}
+
+	if (param.fork) {
+		if (daemon(0, 0) != 0) {
+			err((struct ndctl_ctx*)ctx, "daemon start failed\n");
+			goto out;
+		}
+	}
+
+	ndctl_set_log_priority((struct ndctl_ctx*)ctx, LOG_NOTICE);
+
+	if (!param.log || strcmp(param.log, "syslog") == 0)
+		ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_syslog);
+	else
+		ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_file);
+
+	if (monitor_smart_event((struct ndctl_ctx*)ctx) != 0)
+		goto out;
+
+	return 0;
+out:
+	return 1;
+}
diff --git a/ndctl/ndctl-monitor.service b/ndctl/ndctl-monitor.service
new file mode 100644
index 0000000..2f1417b
--- /dev/null
+++ b/ndctl/ndctl-monitor.service
@@ -0,0 +1,7 @@ 
+[Unit]
+Description=Ndctl Monitor Daemon
+
+[Service]
+Type=forking
+ExecStart=/usr/bin/ndctl monitor -f
+ExecStop=/usr/bin/kill ${MAINPID}
diff --git a/ndctl/ndctl-monitor@.service b/ndctl/ndctl-monitor@.service
new file mode 100644
index 0000000..91c85d9
--- /dev/null
+++ b/ndctl/ndctl-monitor@.service
@@ -0,0 +1,9 @@ 
+[Unit]
+Description=Ndctl Monitor Daemon
+
+[Service]
+Type=forking
+EnvironmentFile=/etc/sysconfig/ndctl/%i
+ExecStart=/usr/bin/ndctl monitor $OPTIONS
+ExecStop=/usr/bin/kill ${MAINPID}
+ExecStopPost=/usr/bin/rm /etc/sysconfig/ndctl/%i
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index d3c6db1..8938621 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -86,6 +86,7 @@  static struct cmd_struct commands[] = {
 	{ "inject-error", cmd_inject_error },
 	{ "update-firmware", cmd_update_firmware },
 	{ "inject-smart", cmd_inject_smart },
+	{ "monitor", cmd_monitor },
 	{ "list", cmd_list },
 	{ "help", cmd_help },
 	#ifdef ENABLE_TEST
diff --git a/util/filter.c b/util/filter.c
index b0b7fdf..fba2197 100644
--- a/util/filter.c
+++ b/util/filter.c
@@ -315,7 +315,7 @@  int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
 				|| !util_bus_filter_by_namespace(bus, param->namespace))
 			continue;
 
-		if (!fctx->filter_bus(bus, fctx))
+		if (fctx->filter_bus && !fctx->filter_bus(bus, fctx))
 			continue;
 
 		ndctl_dimm_foreach(bus, dimm) {
@@ -345,7 +345,8 @@  int util_filter_walk(struct ndctl_ctx *ctx, struct util_filter_ctx *fctx,
 			if (type && ndctl_region_get_type(region) != type)
 				continue;
 
-			if (!fctx->filter_region(region, fctx))
+			if (fctx->filter_region &&
+					!fctx->filter_region(region, fctx))
 				continue;
 
 			ndctl_namespace_foreach(region, ndns) {
diff --git a/util/filter.h b/util/filter.h
index aea5a71..82f3b0d 100644
--- a/util/filter.h
+++ b/util/filter.h
@@ -77,6 +77,9 @@  struct util_filter_params {
 	const char *dimm;
 	const char *mode;
 	const char *namespace;
+	const char *log;
+	const char *daemon;
+	bool fork;
 };
 
 struct ndctl_ctx;
diff --git a/util/parse-options.h b/util/parse-options.h
index 6fd6b24..2262c42 100644
--- a/util/parse-options.h
+++ b/util/parse-options.h
@@ -123,6 +123,7 @@  struct option {
 #define OPT_GROUP(h)                { .type = OPTION_GROUP, .help = (h) }
 #define OPT_BIT(s, l, v, h, b)      { .type = OPTION_BIT, .short_name = (s), .long_name = (l), .value = check_vtype(v, int *), .help = (h), .defval = (b) }
 #define OPT_BOOLEAN(s, l, v, h)     { .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), .value = check_vtype(v, bool *), .help = (h) }
+#define OPT_BOOLEAN_HID(s, v)       { .type = OPTION_BOOLEAN, .short_name = (s), .value = check_vtype(v, bool *), .flags = PARSE_OPT_HIDDEN}
 #define OPT_BOOLEAN_SET(s, l, v, os, h) \
 	{ .type = OPTION_BOOLEAN, .short_name = (s), .long_name = (l), \
 	.value = check_vtype(v, bool *), .help = (h), \