[v4,4/7] libxl: add infrastructure to track and query 'recent' domids
diff mbox series

Message ID 20200122144446.919-5-pdurrant@amazon.com
State Superseded
Headers show
Series
  • xl/libxl: domid allocation/preservation changes
Related show

Commit Message

Paul Durrant Jan. 22, 2020, 2:44 p.m. UTC
A domid is considered recent if the domain it represents was destroyed
less than a specified number of seconds ago. The number can be set using
the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does
not exist then a default value of 60s is used.

Whenever a domain is destroyed, a time-stamped record will be written into
a history file (/var/run/xen/domid-history). To avoid the history file
growing too large, any records with time-stamps that indicate that the
age of a domid has exceeded the re-use timeout will also be purged.

A new utility function, libxl__is_recent_domid(), has been added. This
function reads the same history file checking whether a specified domid
has a record that does not exceed the re-use timeout. Since this utility
function does not write to the file, no records are actually purged by it.

NOTE: The history file is purged on boot to it is safe to use
      CLOCK_MONOTONIC as a time source.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>

v4:
 - Use new generalised libxl__flock
 - Don't read and write the same file
 - Use 'recent' rather than 'retired'
 - Add code into xen-init-dom0 to delete an old history file at boot

v2:
 - New in v2
---
 tools/helpers/xen-init-dom0.c |  30 ++++++++
 tools/libxl/libxl.h           |   2 +
 tools/libxl/libxl_domain.c    | 135 ++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.c  |  10 +++
 tools/libxl/libxl_internal.h  |  14 ++++
 5 files changed, 191 insertions(+)

Comments

Ian Jackson Jan. 30, 2020, 5:23 p.m. UTC | #1
Paul Durrant writes ("[PATCH v4 4/7] libxl: add infrastructure to track and query 'recent' domids"):
> A domid is considered recent if the domain it represents was destroyed
> less than a specified number of seconds ago. The number can be set using
> the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does
> not exist then a default value of 60s is used.
> 
> Whenever a domain is destroyed, a time-stamped record will be written into
> a history file (/var/run/xen/domid-history). To avoid the history file
> growing too large, any records with time-stamps that indicate that the
> age of a domid has exceeded the re-use timeout will also be purged.
> 
> A new utility function, libxl__is_recent_domid(), has been added. This
> function reads the same history file checking whether a specified domid
> has a record that does not exceed the re-use timeout. Since this utility
> function does not write to the file, no records are actually purged by it.
> 
> NOTE: The history file is purged on boot to it is safe to use
>       CLOCK_MONOTONIC as a time source.

Thanks.

> diff --git a/tools/helpers/xen-init-dom0.c b/tools/helpers/xen-init-dom0.c
> index a1e5729458..56f69ab66f 100644
> --- a/tools/helpers/xen-init-dom0.c
> +++ b/tools/helpers/xen-init-dom0.c

Thanks.  This part is good.

> +static void libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> +{
...
> +    clock_gettime(CLOCK_MONOTONIC, &ts);
> +
> +    while (of && fgets(line, sizeof(line), of)) {
> +        unsigned long sec;
> +        unsigned int ignored;
> +
> +        if (sscanf(line, "%lu %u", &sec, &ignored) != 2) {
> +            LOGED(ERROR, domid, "ignoring malformed line: %s", line);
> +            continue;
> +        }
> +
> +        if (ts.tv_sec - sec > timeout)
> +            continue; /* Ignore expired entries */

I find this code quite similar to this code

> +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid)
...
> +    while (!feof(f)) {
> +        unsigned long sec;
> +        unsigned int check;
> +
> +        if (fscanf(f, "%lu %u", &sec, &check) != 2)
> +            continue;
> +
> +        if (check == domid && ts.tv_sec - sec <= timeout) {
> +            recent = true;
> +            break;
> +        }

This makes me uncomfortable.  For one thing, it duplicates any bugs
(and there is at least one error handling anomaly here) and duplicates
my review effort looking for those bugs :-).

Do you think this can be combined somehow ?  Possibilities seem to
include: explicit read_recent_{init,entry,finish} functions; a single
function with a "per-entry" callback; same but with a macro.  If you
do that you can also have it have handle the "file does not exist"
special case.

Also, the stdio error handling doesn't seem quite right.  What if
fgets gets a read error ?

While I'm looking at this, I found

> +    while (of && fgets(line, sizeof(line), of)) {

that confusing.  of!=0 is an entry condition, not a termination
condition.  When I first read this I looked for modifications to of in
the loop but of course there aren't any.

If you really want to keep it like this I guess I will tolerate it to
avoid the argument...

> +    fflush(nf);

Missing error check.  Also you should fclose here, not just fflush.
When you do that, set nf to 0 so the out block doesn't re-close it.

> +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid)
> +{
...
> +    name = GCSPRINTF("%s/domid-history", libxl__run_dir_path());
> +    f = fopen(name, "r");
> +    if (!f) {
> +        if (errno != ENOENT) LOGED(WARN, domid, "failed to open %s", name);

I think this (and other unexpected manipulation failures) should be
fatal, rather than merely a warning.  That means your function should
return rc.  Sorry.  Same goes for libxl__mark_domid_recent.

Thanks,
Ian.
Anthony PERARD Jan. 31, 2020, 10:55 a.m. UTC | #2
On Wed, Jan 22, 2020 at 02:44:43PM +0000, Paul Durrant wrote:
> A domid is considered recent if the domain it represents was destroyed
> less than a specified number of seconds ago. The number can be set using
> the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does
> not exist then a default value of 60s is used.

Could you rewrite that part of the commit message? By reading it, it
sounds to me like LIBXL_DOMID_REUSE_TIMEOUT is a configuration variable
that a toolstack may want to set. Whereas the comment in
libxl_internal.h indicates that it's for debuging.  Having env var for
debugging sounds good, but having env var as configuration doesn't.

> +/*
> + * Maximum number of seconds after desctruction then a domid remains
> + * 'recent'. Recent domids are not allowed to be re-used. This can be
> + * overidden, for debugging purposes, by the environment variable of the
> + * same name.
> + */
> +#define LIBXL_DOMID_REUSE_TIMEOUT 60

Thanks,
Durrant, Paul Jan. 31, 2020, 10:57 a.m. UTC | #3
> -----Original Message-----
> From: Anthony PERARD <anthony.perard@citrix.com>
> Sent: 31 January 2020 10:55
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: xen-devel@lists.xenproject.org; Ian Jackson
> <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v4 4/7] libxl: add infrastructure to track and query
> 'recent' domids
> 
> On Wed, Jan 22, 2020 at 02:44:43PM +0000, Paul Durrant wrote:
> > A domid is considered recent if the domain it represents was destroyed
> > less than a specified number of seconds ago. The number can be set using
> > the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does
> > not exist then a default value of 60s is used.
> 
> Could you rewrite that part of the commit message? By reading it, it
> sounds to me like LIBXL_DOMID_REUSE_TIMEOUT is a configuration variable
> that a toolstack may want to set. Whereas the comment in
> libxl_internal.h indicates that it's for debuging.  Having env var for
> debugging sounds good, but having env var as configuration doesn't.
> 

Sure, I'll make the commit comment clearer.

  Paul

Patch
diff mbox series

diff --git a/tools/helpers/xen-init-dom0.c b/tools/helpers/xen-init-dom0.c
index a1e5729458..56f69ab66f 100644
--- a/tools/helpers/xen-init-dom0.c
+++ b/tools/helpers/xen-init-dom0.c
@@ -12,6 +12,32 @@ 
 #define DOMNAME_PATH   "/local/domain/0/name"
 #define DOMID_PATH     "/local/domain/0/domid"
 
+int clear_domid_history(void)
+{
+    int rc = 1;
+    xentoollog_logger_stdiostream *logger;
+    libxl_ctx *ctx;
+
+    logger = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
+    if (!logger)
+        return 1;
+
+    if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
+                        (xentoollog_logger *)logger)) {
+        fprintf(stderr, "cannot init libxl context\n");
+        goto outlog;
+    }
+
+    if (!libxl_clear_domid_history(ctx))
+        rc = 0;
+
+    libxl_ctx_free(ctx);
+
+outlog:
+    xtl_logger_destroy((xentoollog_logger *)logger);
+    return rc;
+}
+
 int main(int argc, char **argv)
 {
     int rc;
@@ -70,6 +96,10 @@  int main(int argc, char **argv)
     if (rc)
         goto out;
 
+    rc = clear_domid_history();
+    if (rc)
+        goto out;
+
     /* Write xenstore entries. */
     if (!xs_write(xsh, XBT_NULL, DOMID_PATH, "0", strlen("0"))) {
         fprintf(stderr, "cannot set domid for Dom0\n");
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 18c1a2d6bf..1d235ecb1c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -2657,6 +2657,8 @@  static inline int libxl_qemu_monitor_command_0x041200(libxl_ctx *ctx,
 
 #include <libxl_event.h>
 
+int libxl_clear_domid_history(libxl_ctx *ctx);
+
 #endif /* LIBXL_H */
 
 /*
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 1bdb1615d8..d424a8542f 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1268,6 +1268,140 @@  static void dm_destroy_cb(libxl__egc *egc,
     libxl__devices_destroy(egc, &dis->drs);
 }
 
+static unsigned int libxl__get_domid_reuse_timeout(void)
+{
+    const char *env_timeout = getenv("LIBXL_DOMID_REUSE_TIMEOUT");
+
+    return env_timeout ? strtol(env_timeout, NULL, 0) :
+        LIBXL_DOMID_REUSE_TIMEOUT;
+}
+
+char *libxl__domid_history_path(libxl__gc *gc, const char *suffix)
+{
+    return GCSPRINTF("%s/domid-history%s", libxl__run_dir_path(),
+                     suffix ?: "");
+}
+
+int libxl_clear_domid_history(libxl_ctx *ctx)
+{
+    GC_INIT(ctx);
+    char *path;
+    int rc = ERROR_FAIL;
+
+    path = libxl__domid_history_path(gc, NULL);
+    if (!path)
+        goto out;
+
+    if (unlink(path) < 0 && errno != ENOENT) {
+        LOGE(ERROR, "failed to remove '%s'\n", path);
+        goto out;
+    }
+
+    rc = 0;
+
+out:
+    GC_FREE;
+    return rc;
+}
+
+static void libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
+{
+    long timeout = libxl__get_domid_reuse_timeout();
+    libxl__flock *lock;
+    char *old, *new;
+    FILE *of = NULL, *nf = NULL;
+    struct timespec ts;
+    char line[64];
+
+    lock = libxl__lock_domid_history(gc);
+    if (!lock) {
+        LOGED(ERROR, domid, "failed to acquire lock");
+        goto out;
+    }
+
+    old = libxl__domid_history_path(gc, NULL);
+    of = fopen(old, "r");
+    if (!of && errno != ENOENT)
+        LOGED(WARN, domid, "failed to open '%s'", old);
+
+    new = libxl__domid_history_path(gc, ".new");
+    nf = fopen(new, "a");
+    if (!nf) {
+        LOGED(ERROR, domid, "failed to open '%s'", new);
+        goto out;
+    }
+
+    clock_gettime(CLOCK_MONOTONIC, &ts);
+
+    while (of && fgets(line, sizeof(line), of)) {
+        unsigned long sec;
+        unsigned int ignored;
+
+        if (sscanf(line, "%lu %u", &sec, &ignored) != 2) {
+            LOGED(ERROR, domid, "ignoring malformed line: %s", line);
+            continue;
+        }
+
+        if (ts.tv_sec - sec > timeout)
+            continue; /* Ignore expired entries */
+
+        if (fputs(line, nf) == EOF) {
+            LOGED(ERROR, domid, "failed to write");
+            goto out;
+        }
+    }
+
+    if (fprintf(nf, "%lu %u\n", ts.tv_sec, domid) < 0) {
+        LOGED(ERROR, domid, "failed to write");
+        goto out;
+    }
+
+    fflush(nf);
+
+    if (rename(new, old) < 0)
+        LOGED(ERROR, domid, "failed to rename '%s' -> '%s'", old, new);
+
+out:
+    if (nf) fclose(nf);
+    if (of) fclose(of);
+    if (lock) libxl__unlock_file(lock);
+}
+
+bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid)
+{
+    long timeout = libxl__get_domid_reuse_timeout();
+    bool recent = false;
+    const char *name;
+    FILE *f;
+    struct timespec ts;
+
+    name = GCSPRINTF("%s/domid-history", libxl__run_dir_path());
+    f = fopen(name, "r");
+    if (!f) {
+        if (errno != ENOENT) LOGED(WARN, domid, "failed to open %s", name);
+        return false;
+    }
+
+    clock_gettime(CLOCK_MONOTONIC, &ts);
+
+    while (!feof(f)) {
+        unsigned long sec;
+        unsigned int check;
+
+        if (fscanf(f, "%lu %u", &sec, &check) != 2)
+            continue;
+
+        if (check == domid && ts.tv_sec - sec <= timeout) {
+            recent = true;
+            break;
+        }
+    }
+
+    fclose(f);
+
+    return recent;
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
                                libxl__devices_remove_state *drs,
                                int rc)
@@ -1331,6 +1465,7 @@  static void devices_destroy_cb(libxl__egc *egc,
         if (!ctx->xch) goto badchild;
 
         if (!dis->soft_reset) {
+            libxl__mark_domid_recent(gc, domid);
             rc = xc_domain_destroy(ctx->xch, domid);
         } else {
             rc = xc_domain_pause(ctx->xch, domid);
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 211236dc99..bbd4c6cba9 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -504,6 +504,16 @@  libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid)
     return lock;
 }
 
+libxl__flock *libxl__lock_domid_history(libxl__gc *gc)
+{
+    const char *lockfile;
+
+    lockfile = libxl__domid_history_path(gc, ".lock");
+    if (!lockfile) return NULL;
+
+    return libxl__lock_file(gc, lockfile);
+}
+
 int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
                                     libxl_domain_config *d_config)
 {
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3fb38220e5..a50d5a2939 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4233,6 +4233,8 @@  _hidden void libxl__remus_teardown(libxl__egc *egc,
 _hidden void libxl__remus_restore_setup(libxl__egc *egc,
                                         libxl__domain_create_state *dcs);
 
+_hidden char *libxl__domid_history_path(libxl__gc *gc,
+                                        const char *suffix);
 
 /*
  * Convenience macros.
@@ -4631,6 +4633,7 @@  libxl__flock *libxl__lock_file(libxl__gc *gc, const char *filename);
 void libxl__unlock_file(libxl__flock *lock);
 
 libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid);
+libxl__flock *libxl__lock_domid_history(libxl__gc *gc);
 
 /*
  * Retrieve / store domain configuration from / to libxl private
@@ -4769,6 +4772,17 @@  _hidden int libxl__domain_pvcontrol(libxl__egc *egc,
                                     libxl__xswait_state *pvcontrol,
                                     domid_t domid, const char *cmd);
 
+/*
+ * Maximum number of seconds after desctruction then a domid remains
+ * 'recent'. Recent domids are not allowed to be re-used. This can be
+ * overidden, for debugging purposes, by the environment variable of the
+ * same name.
+ */
+#define LIBXL_DOMID_REUSE_TIMEOUT 60
+
+/* Check whether a domid is recent */
+bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid);
+
 #endif
 
 /*