diff mbox series

[v5,4/7] libxl: add infrastructure to track and query 'recent' domids

Message ID 20200131150149.2008-5-pdurrant@amazon.com (mailing list archive)
State Superseded
Headers show
Series xl/libxl: domid allocation/preservation changes | expand

Commit Message

Paul Durrant Jan. 31, 2020, 3:01 p.m. UTC
A domid is considered recent if the domain it represents was destroyed
less than a specified number of seconds ago. For debugging and/or testing
purposes the number can be set using the environment variable
LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default
value of 60s is used.

Whenever a domain is destroyed, a time-stamped record will be written into
a history file (/var/run/xen/domid-history). To avoid the history file
growing too large, any records with time-stamps that indicate that the
age of a domid has exceeded the re-use timeout will also be purged.

A new utility function, libxl__is_recent_domid(), has been added. This
function reads the same history file checking whether a specified domid
has a record that does not exceed the re-use timeout. Since this utility
function does not write to the file, no records are actually purged by it.

NOTE: The history file is purged on boot to it is safe to use
      CLOCK_MONOTONIC as a time source.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>

v5:
 - Re-work file manipulation some more
 - Add more error checks

v4:
 - Use new generalised libxl__flock
 - Don't read and write the same file
 - Use 'recent' rather than 'retired'
 - Add code into xen-init-dom0 to delete an old history file at boot

v2:
 - New in v2
---
 tools/helpers/xen-init-dom0.c |  30 +++++
 tools/libxl/libxl.h           |   2 +
 tools/libxl/libxl_domain.c    | 204 ++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.c  |  10 ++
 tools/libxl/libxl_internal.h  |  14 +++
 5 files changed, 260 insertions(+)

Comments

Ian Jackson Feb. 17, 2020, 5:42 p.m. UTC | #1
Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track and query 'recent' domids"):
> A domid is considered recent if the domain it represents was destroyed
> less than a specified number of seconds ago. For debugging and/or testing
> purposes the number can be set using the environment variable
> LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default
> value of 60s is used.
> 
> Whenever a domain is destroyed, a time-stamped record will be written into
> a history file (/var/run/xen/domid-history). To avoid the history file
> growing too large, any records with time-stamps that indicate that the
> age of a domid has exceeded the re-use timeout will also be purged.
> 
> A new utility function, libxl__is_recent_domid(), has been added. This
> function reads the same history file checking whether a specified domid
> has a record that does not exceed the re-use timeout. Since this utility
> function does not write to the file, no records are actually purged by it.

Thanks for this.  Sorry for the delay in reviewing it.

I'm afraid I still have some comments about error handling etc.

> +int libxl_clear_domid_history(libxl_ctx *ctx);

I think this needs a clear doc comment saying it is for use in host
initialisation only.  If it is run with any domains running, or
concurrent libxl processes, things may malfunction.

> +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> +                               unsigned int *domid)
> +{
> +    int n;
> +
> +    assert(f);
> +
> +    n = fscanf(f, "%lu %u", sec, domid);
> +    if (n == EOF)
> +        return false;

Missing error handling in case of read error.

> +    else if (n != 2) /* malformed entry */
> +        *domid = INVALID_DOMID;

Both call sites for this function have open-coded checks for this
return case, where they just go round again.  I think
libxl__read_recent should handle this itself, factoring the common
code into this function and avoiding that special case.

> +    return true;

I think this function should return an rc.  It could signal EOF by
setting *domid to INVALID_DOMID maybe, and errors by returning
ERROR_FAIL.

> +static bool libxl__write_recent(FILE *f, unsigned long sec,
> +                                unsigned int domid)
> +{
> +    assert(f);

This is rather pointless.  Please drop it.

> +    assert(libxl_domid_valid_guest(domid));

I doubt this is really needed but I don't mind it if you must.

> +    return fprintf(f, "%lu %u\n", sec, domid) > 0;

Wrong error handling.  This function should return rc.  fprintf
doesn't return a boolean.  Something should log errno (with LOGE
probably) if fprintf fails.

> +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> +{
> +    long timeout = libxl__get_domid_reuse_timeout();
> +    libxl__flock *lock;

Please initialise lock = NULL so that it is easy to see that the out
block is correct.

(See tools/libxl/CODING_STYLE where this is discussed.)

> +    char *old, *new;
> +    FILE *of = NULL, *nf = NULL;
> +    struct timespec ts;
> +    int rc = ERROR_FAIL;

Please do not set rc to ERROR_FAIL like this.  Leave it undefined.
Set it on each exit path.  (If you are calling a function that returns
an rc, you can put it in rc, and then test rc and goto out without
assignment.)

(Again, see tools/libxl/CODING_STYLE where this is discussed.)

> +    lock = libxl__lock_domid_history(gc);
> +    if (!lock) {
> +        LOGED(ERROR, domid, "failed to acquire lock");
> +        goto out;
> +    }
> +
> +    old = libxl__domid_history_path(gc, NULL);
> +    of = fopen(old, "r");
> +    if (!of && errno != ENOENT)
> +        LOGED(WARN, domid, "failed to open '%s'", old);

This fopen code and its error handling is still duplicated between
libxl__mark_domid_recent and libxl__is_domid_recent.  I meant for you
to factor it out.  Likewise the other duplicated code in these two
functions.  I want there to be nothing duplicated that can be written
once.

Also failure to open the file should be an error, resulting failure of
this function and the whole surrounding operation, not simply produce
a warning in some logfile where it will be ignored.

> +        while (libxl__read_recent(of, &sec, &val)) {
> +            if (!libxl_domid_valid_guest(val))
> +                continue; /* Ignore invalid entries */
> +
> +            if (ts.tv_sec - sec > timeout)
> +                continue; /* Ignore expired entries */
> +
> +            if (!libxl__write_recent(nf, sec, val)) {
> +                LOGED(ERROR, domid, "failed to write to '%s'", new);
> +                goto out;
> +            }
> +        }
> +        if (ferror(of)) {
> +            LOGED(ERROR, domid, "failed to read from '%s'", old);
> +            goto out;
> +        }

Oh, wait, here is one of the missing pieces of error handling ?
Please put it where it belongs, next to the corresponding call.

> +    if (of && fclose(of) == EOF) {
> +        LOGED(ERROR, domid, "failed to close '%s'", old);

I don't see how of would be NULL here.

Thanks,
Ian.
Durrant, Paul Feb. 18, 2020, 9:24 a.m. UTC | #2
> -----Original Message-----
> From: Ian Jackson <ian.jackson@citrix.com>
> Sent: 17 February 2020 17:43
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: xen-devel@lists.xenproject.org; Wei Liu <wl@xen.org>; Anthony Perard
> <anthony.perard@citrix.com>
> Subject: Re: [PATCH v5 4/7] libxl: add infrastructure to track and query
> 'recent' domids
> 
> Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track
> and query 'recent' domids"):
> > A domid is considered recent if the domain it represents was destroyed
> > less than a specified number of seconds ago. For debugging and/or
> testing
> > purposes the number can be set using the environment variable
> > LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default
> > value of 60s is used.
> >
> > Whenever a domain is destroyed, a time-stamped record will be written
> into
> > a history file (/var/run/xen/domid-history). To avoid the history file
> > growing too large, any records with time-stamps that indicate that the
> > age of a domid has exceeded the re-use timeout will also be purged.
> >
> > A new utility function, libxl__is_recent_domid(), has been added. This
> > function reads the same history file checking whether a specified domid
> > has a record that does not exceed the re-use timeout. Since this utility
> > function does not write to the file, no records are actually purged by
> it.
> 
> Thanks for this.  Sorry for the delay in reviewing it.
> 
> I'm afraid I still have some comments about error handling etc.
> 
> > +int libxl_clear_domid_history(libxl_ctx *ctx);
> 
> I think this needs a clear doc comment saying it is for use in host
> initialisation only.  If it is run with any domains running, or
> concurrent libxl processes, things may malfunction.

Ok. Not sure precisely what you mean by 'doc comment'... Do mean a comment in the header just above this declaration or elsewhere?

> 
> > +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> > +                               unsigned int *domid)
> > +{
> > +    int n;
> > +
> > +    assert(f);
> > +
> > +    n = fscanf(f, "%lu %u", sec, domid);
> > +    if (n == EOF)
> > +        return false;
> 
> Missing error handling in case of read error.

'man fscanf' tells me:

"The  value EOF is returned if the end of input is reached before either the first suc‐
 cessful conversion or a matching failure occurs.  EOF is also returned if a read error
 occurs,  in  which case the error indicator for the stream (see ferror(3)) is set, and
 errno is set to indicate the error."

So EOF is set in all error cases. What am I missing?

> 
> > +    else if (n != 2) /* malformed entry */
> > +        *domid = INVALID_DOMID;
> 
> Both call sites for this function have open-coded checks for this
> return case, where they just go round again.  I think
> libxl__read_recent should handle this itself, factoring the common
> code into this function and avoiding that special case.

Ok. I thought it was more intuitive to have the function only ever read a single entry from the file, but I can easily add the retry loop if you prefer.

> 
> > +    return true;
> 
> I think this function should return an rc.  It could signal EOF by
> setting *domid to INVALID_DOMID maybe, and errors by returning
> ERROR_FAIL.

Ok. I thought it was slightly pointless to do that.

> 
> > +static bool libxl__write_recent(FILE *f, unsigned long sec,
> > +                                unsigned int domid)
> > +{
> > +    assert(f);
> 
> This is rather pointless.  Please drop it.
> 

If you think so, ok.

> > +    assert(libxl_domid_valid_guest(domid));
> 
> I doubt this is really needed but I don't mind it if you must.
> 
> > +    return fprintf(f, "%lu %u\n", sec, domid) > 0;
> 
> Wrong error handling.  This function should return rc.  fprintf
> doesn't return a boolean.

And nor does this code expect it to (since it tests for '> 0').

>  Something should log errno (with LOGE
> probably) if fprintf fails.

I can see you dislike boolean functions; I'll return an error as you desire.

> 
> > +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> > +{
> > +    long timeout = libxl__get_domid_reuse_timeout();
> > +    libxl__flock *lock;
> 
> Please initialise lock = NULL so that it is easy to see that the out
> block is correct.
> 
> (See tools/libxl/CODING_STYLE where this is discussed.)
> 

Ok. Xen style generally avoids initializers where not strictly necessary.

> > +    char *old, *new;
> > +    FILE *of = NULL, *nf = NULL;
> > +    struct timespec ts;
> > +    int rc = ERROR_FAIL;
> 
> Please do not set rc to ERROR_FAIL like this.  Leave it undefined.
> Set it on each exit path.  (If you are calling a function that returns
> an rc, you can put it in rc, and then test rc and goto out without
> assignment.)
> 
> (Again, see tools/libxl/CODING_STYLE where this is discussed.)
> 

Ok.

> > +    lock = libxl__lock_domid_history(gc);
> > +    if (!lock) {
> > +        LOGED(ERROR, domid, "failed to acquire lock");
> > +        goto out;
> > +    }
> > +
> > +    old = libxl__domid_history_path(gc, NULL);
> > +    of = fopen(old, "r");
> > +    if (!of && errno != ENOENT)
> > +        LOGED(WARN, domid, "failed to open '%s'", old);
> 
> This fopen code and its error handling is still duplicated between
> libxl__mark_domid_recent and libxl__is_domid_recent. 

That's not quite true. The error semantics are different; the former does not tolerate a failure to open the file, the latter does.

> I meant for you
> to factor it out.  Likewise the other duplicated code in these two
> functions.  I want there to be nothing duplicated that can be written
> once.

I'm not sure how you want me to combine them, given the differing semantics.

> 
> Also failure to open the file should be an error, resulting failure of
> this function and the whole surrounding operation, not simply produce
> a warning in some logfile where it will be ignored.

But that will cause a failure when trying to create the first domain after boot, since the file won't exist.

> 
> > +        while (libxl__read_recent(of, &sec, &val)) {
> > +            if (!libxl_domid_valid_guest(val))
> > +                continue; /* Ignore invalid entries */
> > +
> > +            if (ts.tv_sec - sec > timeout)
> > +                continue; /* Ignore expired entries */
> > +
> > +            if (!libxl__write_recent(nf, sec, val)) {
> > +                LOGED(ERROR, domid, "failed to write to '%s'", new);
> > +                goto out;
> > +            }
> > +        }
> > +        if (ferror(of)) {
> > +            LOGED(ERROR, domid, "failed to read from '%s'", old);
> > +            goto out;
> > +        }
> 
> Oh, wait, here is one of the missing pieces of error handling ?
> Please put it where it belongs, next to the corresponding call.
> 
> > +    if (of && fclose(of) == EOF) {
> > +        LOGED(ERROR, domid, "failed to close '%s'", old);
> 
> I don't see how of would be NULL here.
> 

It will be NULL if the file did not exist, which will be the case until the first domain destruction occurs.

  Paul

> Thanks,
> Ian.
Ian Jackson Feb. 18, 2020, 11:38 a.m. UTC | #3
Durrant, Paul writes ("RE: [PATCH v5 4/7] libxl: add infrastructure to track and query 'recent' domids"):
> Ian Jackson <ian.jackson@citrix.com>:
> > Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track
> > > +int libxl_clear_domid_history(libxl_ctx *ctx);
> > 
> > I think this needs a clear doc comment saying it is for use in host
> > initialisation only.  If it is run with any domains running, or
> > concurrent libxl processes, things may malfunction.
> 
> Ok. Not sure precisely what you mean by 'doc comment'... Do mean a
> comment in the header just above this declaration [...] ?

Yes, precisely that.  Thanks.

> > > +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> > > +                               unsigned int *domid)
> > > +{
> > > +    int n;
> > > +
> > > +    assert(f);
> > > +
> > > +    n = fscanf(f, "%lu %u", sec, domid);
> > > +    if (n == EOF)
> > > +        return false;
> > 
> > Missing error handling in case of read error.
> 
> 'man fscanf' tells me:
> 
> "The value EOF is returned if the end of input is reached before
> either the first suc‐ cessful conversion or a matching failure
> occurs.  EOF is also returned if a read error occurs, in which case
> the error indicator for the stream (see ferror(3)) is set, and errno
> is set to indicate the error."
> 
> So EOF is set in all error cases. What am I missing?

I thought it treats read error the same as EOF.  But of course
actually I discovered a ferror() (duplicated) later...

> > > +    else if (n != 2) /* malformed entry */
> > > +        *domid = INVALID_DOMID;
> > 
> > Both call sites for this function have open-coded checks for this
> > return case, where they just go round again.  I think
> > libxl__read_recent should handle this itself, factoring the common
> > code into this function and avoiding that special case.
> 
> Ok. I thought it was more intuitive to have the function only ever
> read a single entry from the file, but I can easily add the retry
> loop if you prefer.

I think the purpose of this function is to contain all the code that
can be shared between the two call sites.

> > > +    return true;
> > 
> > I think this function should return an rc.  It could signal EOF by
> > setting *domid to INVALID_DOMID maybe, and errors by returning
> > ERROR_FAIL.
> 
> Ok. I thought it was slightly pointless to do that.

I don't have a 100% fixed opinion about the precise calling
convention.  But this function needs to be able to report three
distinct conditions, not two:
  - here is the entry you asked for
  - EOF, we have established that there are no more entries
  - failure to read the file, abandon all hope

Elsewhere in libxl the convention is usually to use an rc return value
to signal errors, and signal "no error, but no such thing" by writing
a sentinel rather than a value to an out parameter.

Returning an rc means that in the future if we want better control of
errors (i) this internal api is more like other internal apis (ii) the
exact error code is specified at the point in the code where the error
is recognised.

> > I doubt this is really needed but I don't mind it if you must.
> > 
> > > +    return fprintf(f, "%lu %u\n", sec, domid) > 0;
> > 
> > Wrong error handling.  This function should return rc.  fprintf
> > doesn't return a boolean.
> 
> And nor does this code expect it to (since it tests for '> 0').

Oh.  I didn't spot that.  This is contrary to libxl/CODING_STYLE.

  * Function calls which might fail (ie most function calls) are
    handled by putting the return/status value into a variable, and
    then checking it in a separate statement:
            char *dompath = libxl__xs_get_dompath(gc, bl->domid);
            if (!dompath) { rc = ERROR_FAIL; goto out; }

For precisely this kind of reason.

> >  Something should log errno (with LOGE
> > probably) if fprintf fails.
> 
> I can see you dislike boolean functions; I'll return an error as you desire.

See above about error handling.  Certainly a boolean cannot be used
for a function which might return "yes" or "no" or "argh, can't say".
For a function which might return "ok" or "argh", rc and ERROR_* is
clearly better since you get to invent the error code.

> > > +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> > > +{
> > > +    long timeout = libxl__get_domid_reuse_timeout();
> > > +    libxl__flock *lock;
> > 
> > Please initialise lock = NULL so that it is easy to see that the out
> > block is correct.
> > 
> > (See tools/libxl/CODING_STYLE where this is discussed.)
> 
> Ok. Xen style generally avoids initializers where not strictly necessary.

libxl does not use "Xen style".

If you want to challenge the contents of libxl/CODING_STYLE, that's
fair enough of course, but maybe in the middle of this patch review is
not ideal ?

> > > +    lock = libxl__lock_domid_history(gc);
> > > +    if (!lock) {
> > > +        LOGED(ERROR, domid, "failed to acquire lock");
> > > +        goto out;
> > > +    }
> > > +
> > > +    old = libxl__domid_history_path(gc, NULL);
> > > +    of = fopen(old, "r");
> > > +    if (!of && errno != ENOENT)
> > > +        LOGED(WARN, domid, "failed to open '%s'", old);
> > 
> > This fopen code and its error handling is still duplicated between
> > libxl__mark_domid_recent and libxl__is_domid_recent. 
> 
> That's not quite true. The error semantics are different; the former does not tolerate a failure to open the file, the latter does.

What is the reason for this difference in semantics ?  It seems to me
that either:
 (i) absence of the file means there are no recent domids (eg,
     after boot) and therefore both functions should tolerate it; or
 (ii) absence of the file means a system configuration error
     and therefore neither function should tolerate it.

> > Also failure to open the file should be an error, resulting failure of
> > this function and the whole surrounding operation, not simply produce
> > a warning in some logfile where it will be ignored.
> 
> But that will cause a failure when trying to create the first domain
> after boot, since the file won't exist.

I meant that failure to open *other than ENOENT*.

ISTM that of the two options above, (i) is to be preferred and
therefore that ENOENT should always be tolerated.  But maybe you can
explain to me why that isn't right.

> > > +    if (of && fclose(of) == EOF) {
> > > +        LOGED(ERROR, domid, "failed to close '%s'", old);
> > 
> > I don't see how of would be NULL here.
> 
> It will be NULL if the file did not exist, which will be the case until the first domain destruction occurs.

Oh yes.  I am confused because I keep reading `of' as `output file'.

In which case, please see CODING_STYLE about putting the return value
in a separate statement.  This will also avoid duplicating the
`of=NULL' since it can go right after fclose.

Maybe the closing could be done by libxl__read_recent, if it took a
FILE** ?  That would remove some duplication and leave only an
error-check-free   if (of) fclose(of);   in each out block.

Ian.
diff mbox series

Patch

diff --git a/tools/helpers/xen-init-dom0.c b/tools/helpers/xen-init-dom0.c
index a1e5729458..56f69ab66f 100644
--- a/tools/helpers/xen-init-dom0.c
+++ b/tools/helpers/xen-init-dom0.c
@@ -12,6 +12,32 @@ 
 #define DOMNAME_PATH   "/local/domain/0/name"
 #define DOMID_PATH     "/local/domain/0/domid"
 
+int clear_domid_history(void)
+{
+    int rc = 1;
+    xentoollog_logger_stdiostream *logger;
+    libxl_ctx *ctx;
+
+    logger = xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
+    if (!logger)
+        return 1;
+
+    if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
+                        (xentoollog_logger *)logger)) {
+        fprintf(stderr, "cannot init libxl context\n");
+        goto outlog;
+    }
+
+    if (!libxl_clear_domid_history(ctx))
+        rc = 0;
+
+    libxl_ctx_free(ctx);
+
+outlog:
+    xtl_logger_destroy((xentoollog_logger *)logger);
+    return rc;
+}
+
 int main(int argc, char **argv)
 {
     int rc;
@@ -70,6 +96,10 @@  int main(int argc, char **argv)
     if (rc)
         goto out;
 
+    rc = clear_domid_history();
+    if (rc)
+        goto out;
+
     /* Write xenstore entries. */
     if (!xs_write(xsh, XBT_NULL, DOMID_PATH, "0", strlen("0"))) {
         fprintf(stderr, "cannot set domid for Dom0\n");
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 18c1a2d6bf..1d235ecb1c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -2657,6 +2657,8 @@  static inline int libxl_qemu_monitor_command_0x041200(libxl_ctx *ctx,
 
 #include <libxl_event.h>
 
+int libxl_clear_domid_history(libxl_ctx *ctx);
+
 #endif /* LIBXL_H */
 
 /*
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 973fc1434d..5349defcf0 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1268,6 +1268,208 @@  static void dm_destroy_cb(libxl__egc *egc,
     libxl__devices_destroy(egc, &dis->drs);
 }
 
+static unsigned int libxl__get_domid_reuse_timeout(void)
+{
+    const char *env_timeout = getenv("LIBXL_DOMID_REUSE_TIMEOUT");
+
+    return env_timeout ? strtol(env_timeout, NULL, 0) :
+        LIBXL_DOMID_REUSE_TIMEOUT;
+}
+
+char *libxl__domid_history_path(libxl__gc *gc, const char *suffix)
+{
+    return GCSPRINTF("%s/domid-history%s", libxl__run_dir_path(),
+                     suffix ?: "");
+}
+
+int libxl_clear_domid_history(libxl_ctx *ctx)
+{
+    GC_INIT(ctx);
+    char *path;
+    int rc = ERROR_FAIL;
+
+    path = libxl__domid_history_path(gc, NULL);
+    if (!path)
+        goto out;
+
+    if (unlink(path) < 0 && errno != ENOENT) {
+        LOGE(ERROR, "failed to remove '%s'\n", path);
+        goto out;
+    }
+
+    rc = 0;
+
+out:
+    GC_FREE;
+    return rc;
+}
+
+static bool libxl__read_recent(FILE *f, unsigned long *sec,
+                               unsigned int *domid)
+{
+    int n;
+
+    assert(f);
+
+    n = fscanf(f, "%lu %u", sec, domid);
+    if (n == EOF)
+        return false;
+    else if (n != 2) /* malformed entry */
+        *domid = INVALID_DOMID;
+
+    return true;
+}
+
+static bool libxl__write_recent(FILE *f, unsigned long sec,
+                                unsigned int domid)
+{
+    assert(f);
+    assert(libxl_domid_valid_guest(domid));
+
+    return fprintf(f, "%lu %u\n", sec, domid) > 0;
+}
+
+static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
+{
+    long timeout = libxl__get_domid_reuse_timeout();
+    libxl__flock *lock;
+    char *old, *new;
+    FILE *of = NULL, *nf = NULL;
+    struct timespec ts;
+    int rc = ERROR_FAIL;
+
+    lock = libxl__lock_domid_history(gc);
+    if (!lock) {
+        LOGED(ERROR, domid, "failed to acquire lock");
+        goto out;
+    }
+
+    old = libxl__domid_history_path(gc, NULL);
+    of = fopen(old, "r");
+    if (!of && errno != ENOENT)
+        LOGED(WARN, domid, "failed to open '%s'", old);
+
+    new = libxl__domid_history_path(gc, ".new");
+    nf = fopen(new, "a");
+    if (!nf) {
+        LOGED(ERROR, domid, "failed to open '%s'", new);
+        goto out;
+    }
+
+    if (clock_gettime(CLOCK_MONOTONIC, &ts)) {
+        LOGED(ERROR, domid, "failed to get time");
+        goto out;
+    }
+
+    if (of) {
+        unsigned long sec;
+        unsigned int val;
+
+        while (libxl__read_recent(of, &sec, &val)) {
+            if (!libxl_domid_valid_guest(val))
+                continue; /* Ignore invalid entries */
+
+            if (ts.tv_sec - sec > timeout)
+                continue; /* Ignore expired entries */
+
+            if (!libxl__write_recent(nf, sec, val)) {
+                LOGED(ERROR, domid, "failed to write to '%s'", new);
+                goto out;
+            }
+        }
+        if (ferror(of)) {
+            LOGED(ERROR, domid, "failed to read from '%s'", old);
+            goto out;
+        }
+    }
+
+    if (!libxl__write_recent(nf, ts.tv_sec, domid)) {
+        LOGED(ERROR, domid, "failed to write to '%s'", new);
+        goto out;
+    }
+
+    if (fclose(nf) == EOF) {
+        LOGED(ERROR, domid, "failed to close '%s'", new);
+        nf = NULL;
+        goto out;
+    }
+    nf = NULL;
+
+    if (of && fclose(of) == EOF) {
+        LOGED(ERROR, domid, "failed to close '%s'", old);
+        of = NULL;
+        goto out;
+    }
+    of = NULL;
+
+    if (rename(new, old) < 0) {
+        LOGED(ERROR, domid, "failed to rename '%s' -> '%s'", old, new);
+        goto out;
+    }
+    rc = 0;
+
+out:
+    if (nf) fclose(nf);
+    if (of) fclose(of);
+    if (lock) libxl__unlock_file(lock);
+
+    return rc;
+}
+
+int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent)
+{
+    long timeout = libxl__get_domid_reuse_timeout();
+    const char *name;
+    FILE *f;
+    struct timespec ts;
+    unsigned long sec;
+    unsigned int val;
+    int rc = ERROR_FAIL;
+
+    name = GCSPRINTF("%s/domid-history", libxl__run_dir_path());
+    f = fopen(name, "r");
+    if (!f) {
+        if (errno != ENOENT)
+            LOGED(WARN, domid, "failed to open %s", name);
+        else
+            rc = 0;
+
+        goto out;
+    }
+
+    if (clock_gettime(CLOCK_MONOTONIC, &ts)) {
+        LOGED(ERROR, domid, "failed to get time");
+        goto out;
+    }
+
+    *recent = false;
+    while (libxl__read_recent(f, &sec, &val)) {
+        if (!libxl_domid_valid_guest(val))
+            continue; /* Ignore invalid entries */
+
+        if (val == domid && ts.tv_sec - sec <= timeout) {
+            *recent = true;
+            break;
+        }
+    }
+    if (ferror(f)) {
+        LOGED(ERROR, domid, "failed to read from '%s'", name);
+        goto out;
+    }
+
+    if (fclose(f) == EOF) {
+        LOGED(ERROR, domid, "failed to close '%s'", name);
+        f = NULL;
+        goto out;
+    }
+    f = NULL;
+    rc = 0;
+
+out:
+    if (f) fclose(f);
+    return rc;
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
                                libxl__devices_remove_state *drs,
                                int rc)
@@ -1331,6 +1533,8 @@  static void devices_destroy_cb(libxl__egc *egc,
         if (!ctx->xch) goto badchild;
 
         if (!dis->soft_reset) {
+            rc = libxl__mark_domid_recent(gc, domid);
+            if (rc) goto badchild;
             rc = xc_domain_destroy(ctx->xch, domid);
         } else {
             rc = xc_domain_pause(ctx->xch, domid);
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 211236dc99..bbd4c6cba9 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -504,6 +504,16 @@  libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid)
     return lock;
 }
 
+libxl__flock *libxl__lock_domid_history(libxl__gc *gc)
+{
+    const char *lockfile;
+
+    lockfile = libxl__domid_history_path(gc, ".lock");
+    if (!lockfile) return NULL;
+
+    return libxl__lock_file(gc, lockfile);
+}
+
 int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
                                     libxl_domain_config *d_config)
 {
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index dd3c08bc14..39de2d5910 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4260,6 +4260,8 @@  _hidden void libxl__remus_teardown(libxl__egc *egc,
 _hidden void libxl__remus_restore_setup(libxl__egc *egc,
                                         libxl__domain_create_state *dcs);
 
+_hidden char *libxl__domid_history_path(libxl__gc *gc,
+                                        const char *suffix);
 
 /*
  * Convenience macros.
@@ -4658,6 +4660,7 @@  libxl__flock *libxl__lock_file(libxl__gc *gc, const char *filename);
 void libxl__unlock_file(libxl__flock *lock);
 
 libxl__flock *libxl__lock_domain_userdata(libxl__gc *gc, uint32_t domid);
+libxl__flock *libxl__lock_domid_history(libxl__gc *gc);
 
 /*
  * Retrieve / store domain configuration from / to libxl private
@@ -4796,6 +4799,17 @@  _hidden int libxl__domain_pvcontrol(libxl__egc *egc,
                                     libxl__xswait_state *pvcontrol,
                                     domid_t domid, const char *cmd);
 
+/*
+ * Maximum number of seconds after desctruction then a domid remains
+ * 'recent'. Recent domids are not allowed to be re-used. This can be
+ * overidden, for debugging purposes, by the environment variable of the
+ * same name.
+ */
+#define LIBXL_DOMID_REUSE_TIMEOUT 60
+
+/* Check whether a domid is recent */
+int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent);
+
 #endif
 
 /*