Message ID | 20200122144446.919-5-pdurrant@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xl/libxl: domid allocation/preservation changes | expand |
Paul Durrant writes ("[PATCH v4 4/7] libxl: add infrastructure to track and query 'recent' domids"): > A domid is considered recent if the domain it represents was destroyed > less than a specified number of seconds ago. The number can be set using > the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does > not exist then a default value of 60s is used. > > Whenever a domain is destroyed, a time-stamped record will be written into > a history file (/var/run/xen/domid-history). To avoid the history file > growing too large, any records with time-stamps that indicate that the > age of a domid has exceeded the re-use timeout will also be purged. > > A new utility function, libxl__is_recent_domid(), has been added. This > function reads the same history file checking whether a specified domid > has a record that does not exceed the re-use timeout. Since this utility > function does not write to the file, no records are actually purged by it. > > NOTE: The history file is purged on boot to it is safe to use > CLOCK_MONOTONIC as a time source. Thanks. > diff --git a/tools/helpers/xen-init-dom0.c b/tools/helpers/xen-init-dom0.c > index a1e5729458..56f69ab66f 100644 > --- a/tools/helpers/xen-init-dom0.c > +++ b/tools/helpers/xen-init-dom0.c Thanks. This part is good. > +static void libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid) > +{ ... > + clock_gettime(CLOCK_MONOTONIC, &ts); > + > + while (of && fgets(line, sizeof(line), of)) { > + unsigned long sec; > + unsigned int ignored; > + > + if (sscanf(line, "%lu %u", &sec, &ignored) != 2) { > + LOGED(ERROR, domid, "ignoring malformed line: %s", line); > + continue; > + } > + > + if (ts.tv_sec - sec > timeout) > + continue; /* Ignore expired entries */ I find this code quite similar to this code > +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid) ... > + while (!feof(f)) { > + unsigned long sec; > + unsigned int check; > + > + if (fscanf(f, "%lu %u", &sec, &check) != 2) > + continue; > + > + if (check == domid && ts.tv_sec - sec <= timeout) { > + recent = true; > + break; > + } This makes me uncomfortable. For one thing, it duplicates any bugs (and there is at least one error handling anomaly here) and duplicates my review effort looking for those bugs :-). Do you think this can be combined somehow ? Possibilities seem to include: explicit read_recent_{init,entry,finish} functions; a single function with a "per-entry" callback; same but with a macro. If you do that you can also have it have handle the "file does not exist" special case. Also, the stdio error handling doesn't seem quite right. What if fgets gets a read error ? While I'm looking at this, I found > + while (of && fgets(line, sizeof(line), of)) { that confusing. of!=0 is an entry condition, not a termination condition. When I first read this I looked for modifications to of in the loop but of course there aren't any. If you really want to keep it like this I guess I will tolerate it to avoid the argument... > + fflush(nf); Missing error check. Also you should fclose here, not just fflush. When you do that, set nf to 0 so the out block doesn't re-close it. > +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid) > +{ ... > + name = GCSPRINTF("%s/domid-history", libxl__run_dir_path()); > + f = fopen(name, "r"); > + if (!f) { > + if (errno != ENOENT) LOGED(WARN, domid, "failed to open %s", name); I think this (and other unexpected manipulation failures) should be fatal, rather than merely a warning. That means your function should return rc. Sorry. Same goes for libxl__mark_domid_recent. Thanks, Ian.
On Wed, Jan 22, 2020 at 02:44:43PM +0000, Paul Durrant wrote: > A domid is considered recent if the domain it represents was destroyed > less than a specified number of seconds ago. The number can be set using > the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does > not exist then a default value of 60s is used. Could you rewrite that part of the commit message? By reading it, it sounds to me like LIBXL_DOMID_REUSE_TIMEOUT is a configuration variable that a toolstack may want to set. Whereas the comment in libxl_internal.h indicates that it's for debuging. Having env var for debugging sounds good, but having env var as configuration doesn't. > +/* > + * Maximum number of seconds after desctruction then a domid remains > + * 'recent'. Recent domids are not allowed to be re-used. This can be > + * overidden, for debugging purposes, by the environment variable of the > + * same name. > + */ > +#define LIBXL_DOMID_REUSE_TIMEOUT 60 Thanks,
> -----Original Message----- > From: Anthony PERARD <anthony.perard@citrix.com> > Sent: 31 January 2020 10:55 > To: Durrant, Paul <pdurrant@amazon.co.uk> > Cc: xen-devel@lists.xenproject.org; Ian Jackson > <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org> > Subject: Re: [PATCH v4 4/7] libxl: add infrastructure to track and query > 'recent' domids > > On Wed, Jan 22, 2020 at 02:44:43PM +0000, Paul Durrant wrote: > > A domid is considered recent if the domain it represents was destroyed > > less than a specified number of seconds ago. The number can be set using > > the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does > > not exist then a default value of 60s is used. > > Could you rewrite that part of the commit message? By reading it, it > sounds to me like LIBXL_DOMID_REUSE_TIMEOUT is a configuration variable > that a toolstack may want to set. Whereas the comment in > libxl_internal.h indicates that it's for debuging. Having env var for > debugging sounds good, but having env var as configuration doesn't. > Sure, I'll make the commit comment clearer. Paul
diff --git a/tools/helpers/xen-init-dom0.c b/tools/helpers/xen-init-dom0.c index a1e5729458..56f69ab66f 100644 --- a/tools/helpers/xen-init-dom0.c +++ b/tools/helpers/xen-init-dom0.c @@ -12,6 +12,32 @@ #define DOMNAME_PATH "/local/domain/0/name" #define DOMID_PATH "/local/domain/0/domid" +int clear_domid_history(void) +{ + int rc = 1; + xentoollog_logger_stdiostream *logger; + libxl_ctx *ctx; + + logger = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0); + if (!logger) + return 1; + + if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, + (xentoollog_logger *)logger)) { + fprintf(stderr, "cannot init libxl context\n"); + goto outlog; + } + + if (!libxl_clear_domid_history(ctx)) + rc = 0; + + libxl_ctx_free(ctx); + +outlog: + xtl_logger_destroy((xentoollog_logger *)logger); + return rc; +} + int main(int argc, char **argv) { int rc; @@ -70,6 +96,10 @@ int main(int argc, char **argv) if (rc) goto out; + rc = clear_domid_history(); + if (rc) + goto out; + /* Write xenstore entries. */ if (!xs_write(xsh, XBT_NULL, DOMID_PATH, "0", strlen("0"))) { fprintf(stderr, "cannot set domid for Dom0\n"); diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 18c1a2d6bf..1d235ecb1c 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -2657,6 +2657,8 @@ static inline int libxl_qemu_monitor_command_0x041200(libxl_ctx *ctx, #include <libxl_event.h> +int libxl_clear_domid_history(libxl_ctx *ctx); + #endif /* LIBXL_H */ /* diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c index 1bdb1615d8..d424a8542f 100644 --- a/tools/libxl/libxl_domain.c +++ b/tools/libxl/libxl_domain.c @@ -1268,6 +1268,140 @@ static void dm_destroy_cb(libxl__egc *egc, libxl__devices_destroy(egc, &dis->drs); } +static unsigned int libxl__get_domid_reuse_timeout(void) +{ + const char *env_timeout = getenv("LIBXL_DOMID_REUSE_TIMEOUT"); + + return env_timeout ? strtol(env_timeout, NULL, 0) : + LIBXL_DOMID_REUSE_TIMEOUT; +} + +char *libxl__domid_history_path(libxl__gc *gc, const char *suffix) +{ + return GCSPRINTF("%s/domid-history%s", libxl__run_dir_path(), + suffix ?: ""); +} + +int libxl_clear_domid_history(libxl_ctx *ctx) +{ + GC_INIT(ctx); + char *path; + int rc = ERROR_FAIL; + + path = libxl__domid_history_path(gc, NULL); + if (!path) + goto out; + + if (unlink(path) < 0 && errno != ENOENT) { + LOGE(ERROR, "failed to remove '%s'\n", path); + goto out; + } + + rc = 0; + +out: + GC_FREE; + return rc; +} + +static void libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid) +{ + long timeout = libxl__get_domid_reuse_timeout(); + libxl__flock *lock; + char *old, *new; + FILE *of = NULL, *nf = NULL; + struct timespec ts; + char line[64]; + + lock = libxl__lock_domid_history(gc); + if (!lock) { + LOGED(ERROR, domid, "failed to acquire lock"); + goto out; + } + + old = libxl__domid_history_path(gc, NULL); + of = fopen(old, "r"); + if (!of && errno != ENOENT) + LOGED(WARN, domid, "failed to open '%s'", old); + + new = libxl__domid_history_path(gc, ".new"); + nf = fopen(new, "a"); + if (!nf) { + LOGED(ERROR, domid, "failed to open '%s'", new); + goto out; + } + + clock_gettime(CLOCK_MONOTONIC, &ts); + + while (of && fgets(line, sizeof(line), of)) { + unsigned long sec; + unsigned int ignored; + + if (sscanf(line, "%lu %u", &sec, &ignored) != 2) { + LOGED(ERROR, domid, "ignoring malformed line: %s", line); + continue; + } + + if (ts.tv_sec - sec > timeout) + continue; /* Ignore expired entries */ + + if (fputs(line, nf) == EOF) { + LOGED(ERROR, domid, "failed to write"); + goto out; + } + } + + if (fprintf(nf, "%lu %u\n", ts.tv_sec, domid) < 0) { + LOGED(ERROR, domid, "failed to write"); + goto out; + } + + fflush(nf); + + if (rename(new, old) < 0) + LOGED(ERROR, domid, "failed to rename '%s' -> '%s'", old, new); + +out: + if (nf) fclose(nf); + if (of) fclose(of); + if (lock) libxl__unlock_file(lock); +} + +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid) +{ + long timeout = libxl__get_domid_reuse_timeout(); + bool recent = false; + const char *name; + FILE *f; + struct timespec ts; + + name = GCSPRINTF("%s/domid-history", libxl__run_dir_path()); + f = fopen(name, "r"); + if (!f) { + if (errno != ENOENT) LOGED(WARN, domid, "failed to open %s", name); + return false; + } + + clock_gettime(CLOCK_MONOTONIC, &ts); + + while (!feof(f)) { + unsigned long sec; + unsigned int check; + + if (fscanf(f, "%lu %u", &sec, &check) != 2) + continue; + + if (check == domid && ts.tv_sec - sec <= timeout) { + recent = true; + break; + } + } + + fclose(f); + + return recent; +} + static void devices_destroy_cb(libxl__egc *egc, libxl__devices_remove_state *drs, int rc) @@ -1331,6 +1465,7 @@ static void devices_destroy_cb(libxl__egc *egc, if (!ctx->xch) goto badchild; if (!dis->soft_reset) { + libxl__mark_domid_recent(gc, domid); rc = xc_domain_destroy(ctx->xch, domid); } else { rc = xc_domain_pause(ctx->xch, domid); diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c index 211236dc99..bbd4c6cba9 100644 --- a/tools/libxl/libxl_internal.c +++ b/tools/libxl/libxl_internal.c @@ -504,6 +504,16 @@ libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid) return lock; } +libxl__flock *libxl__lock_domid_history(libxl__gc *gc) +{ + const char *lockfile; + + lockfile = libxl__domid_history_path(gc, ".lock"); + if (!lockfile) return NULL; + + return libxl__lock_file(gc, lockfile); +} + int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid, libxl_domain_config *d_config) { diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 3fb38220e5..a50d5a2939 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -4233,6 +4233,8 @@ _hidden void libxl__remus_teardown(libxl__egc *egc, _hidden void libxl__remus_restore_setup(libxl__egc *egc, libxl__domain_create_state *dcs); +_hidden char *libxl__domid_history_path(libxl__gc *gc, + const char *suffix); /* * Convenience macros. @@ -4631,6 +4633,7 @@ libxl__flock *libxl__lock_file(libxl__gc *gc, const char *filename); void libxl__unlock_file(libxl__flock *lock); libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid); +libxl__flock *libxl__lock_domid_history(libxl__gc *gc); /* * Retrieve / store domain configuration from / to libxl private @@ -4769,6 +4772,17 @@ _hidden int libxl__domain_pvcontrol(libxl__egc *egc, libxl__xswait_state *pvcontrol, domid_t domid, const char *cmd); +/* + * Maximum number of seconds after desctruction then a domid remains + * 'recent'. Recent domids are not allowed to be re-used. This can be + * overidden, for debugging purposes, by the environment variable of the + * same name. + */ +#define LIBXL_DOMID_REUSE_TIMEOUT 60 + +/* Check whether a domid is recent */ +bool libxl__is_domid_recent(libxl__gc *gc, uint32_t domid); + #endif /*
A domid is considered recent if the domain it represents was destroyed less than a specified number of seconds ago. The number can be set using the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default value of 60s is used. Whenever a domain is destroyed, a time-stamped record will be written into a history file (/var/run/xen/domid-history). To avoid the history file growing too large, any records with time-stamps that indicate that the age of a domid has exceeded the re-use timeout will also be purged. A new utility function, libxl__is_recent_domid(), has been added. This function reads the same history file checking whether a specified domid has a record that does not exceed the re-use timeout. Since this utility function does not write to the file, no records are actually purged by it. NOTE: The history file is purged on boot to it is safe to use CLOCK_MONOTONIC as a time source. Signed-off-by: Paul Durrant <pdurrant@amazon.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Wei Liu <wl@xen.org> Cc: Anthony PERARD <anthony.perard@citrix.com> v4: - Use new generalised libxl__flock - Don't read and write the same file - Use 'recent' rather than 'retired' - Add code into xen-init-dom0 to delete an old history file at boot v2: - New in v2 --- tools/helpers/xen-init-dom0.c | 30 ++++++++ tools/libxl/libxl.h | 2 + tools/libxl/libxl_domain.c | 135 ++++++++++++++++++++++++++++++++++ tools/libxl/libxl_internal.c | 10 +++ tools/libxl/libxl_internal.h | 14 ++++ 5 files changed, 191 insertions(+)