diff mbox

[v2,02/10] block: set snapshot option for block devices in blkreplay module

Message ID 20160915090054.6440.77726.stgit@PASHA-ISP (mailing list archive)
State New, archived
Headers show

Commit Message

Pavel Dovgalyuk Sept. 15, 2016, 9 a.m. UTC
This patch adds overlay option for blkreplay filter.
It allows creating persistent overlay file for saving and reloading
VM snapshots in record/replay modes.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
---
 block/blkreplay.c |  119 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 docs/replay.txt   |    8 ++++
 vl.c              |    2 -
 3 files changed, 128 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini Sept. 15, 2016, 9:25 a.m. UTC | #1
On 15/09/2016 11:00, Pavel Dovgalyuk wrote:
> diff --git a/docs/replay.txt b/docs/replay.txt
> index 347b2ff..5be8f25 100644
> --- a/docs/replay.txt
> +++ b/docs/replay.txt
> @@ -196,6 +196,14 @@ is recorded to the log. In replay phase the queue is matched with
>  events read from the log. Therefore block devices requests are processed
>  deterministically.
>  
> +blkdriver also supports overlay option, which allows creating persistent
> +overlay file for saving and reloading VM snapshots in record/replay modes.
> +Replay mechanism automatically creates one snapshot named 'replay_init' to
> +allow rewinding execution while replaying.
> +Overlay file may be specified as follows:
> + -drive driver=blkreplay,if=none,image=img-direct,
> +        overlay=overlay.qcow2,id=img-blkreplay

So in this case the image is actually overlay.qcow2, and it is created 
with img-direct as the backing file?  Since you have to create 
overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the 
"image".  That is, you could choose between:

   -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
   -rr snapshot=replay_init,...

   -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay

The temporary snapshot would be created if there's no "-rr snapshot" option
on the command line.

Does this make sense?

Paolo

> 
> +static QemuOptsList blkreplay_runtime_opts = {
> +    .name = "quorum",

Pasto. ;)

> +    .head = QTAILQ_HEAD_INITIALIZER(blkreplay_runtime_opts.head),
> +    .desc = {
> +        {
> +            .name = "overlay",
> +            .type = QEMU_OPT_STRING,
> +            .help = "Optional overlay file for snapshots",
> +        },
> +        { /* end of list */ }
> +    },
> +};
Paolo Bonzini Sept. 15, 2016, 9:36 a.m. UTC | #2
On 15/09/2016 11:25, Paolo Bonzini wrote:
> So in this case the image is actually overlay.qcow2, and it is created 
> with img-direct as the backing file?  Since you have to create 
> overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the 
> "image".  That is, you could choose between:
> 
>    -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
>    -rr snapshot=replay_init,...
> 
>    -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
> 
> The temporary snapshot would be created if there's no "-rr snapshot" option
> on the command line.

Sorry, that would be "-icount rrsnapshot=" of course.

Paolo
Pavel Dovgalyuk Sept. 16, 2016, 7:55 a.m. UTC | #3
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 15/09/2016 11:00, Pavel Dovgalyuk wrote:
> > diff --git a/docs/replay.txt b/docs/replay.txt
> > index 347b2ff..5be8f25 100644
> > --- a/docs/replay.txt
> > +++ b/docs/replay.txt
> > @@ -196,6 +196,14 @@ is recorded to the log. In replay phase the queue is matched with
> >  events read from the log. Therefore block devices requests are processed
> >  deterministically.
> >
> > +blkdriver also supports overlay option, which allows creating persistent
> > +overlay file for saving and reloading VM snapshots in record/replay modes.
> > +Replay mechanism automatically creates one snapshot named 'replay_init' to
> > +allow rewinding execution while replaying.
> > +Overlay file may be specified as follows:
> > + -drive driver=blkreplay,if=none,image=img-direct,
> > +        overlay=overlay.qcow2,id=img-blkreplay
> 
> So in this case the image is actually overlay.qcow2, and it is created
> with img-direct as the backing file? 

Right.

> Since you have to create
> overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the
> "image".  That is, you could choose between:
> 
>    -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
>    -rr snapshot=replay_init,...
> 
>    -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
> 
> The temporary snapshot would be created if there's no "-rr snapshot" option
> on the command line.
> 
> Does this make sense?

There are two different parts:
 - creating an overlay
 - creating the snapshot

Overlay is needed to preserve the state of the original backing file.
In the current version temporary overlay is always created at start of qemu.
Then all changes are destroyed at exit and disk remain the same.
It allows replaying the execution from the same disk state.

To allow reverse execution we have to create some snapshots, that will allow
rewinding the execution. Snapshots have to be written on some non-temporary overlay.

I don't think that it is convenient forcing user to create overlay manually.
Common debugging scenario includes multiple recording passes until the bug manifests
itself. Every new execution recorded should be accompanied by creating an overlay
to assure that the execution is started from the same disk state.

Specifying initial snapshot name makes sense if we want to suppress -loadvm option.
Then replay may be started from any state without using -loadvm.

Pavel Dovgalyuk
Paolo Bonzini Sept. 16, 2016, 9:28 a.m. UTC | #4
On 16/09/2016 09:55, Pavel Dovgalyuk wrote:
>> Since you have to create
>> overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the
>> "image".  That is, you could choose between:
>>
>>    -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
>>    -rr snapshot=replay_init,...
>>
>>    -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
>>
>> The temporary snapshot would be created if there's no "-rr snapshot" option
>> on the command line.
>>
>> Does this make sense?
> 
> There are two different parts:
>  - creating an overlay
>  - creating the snapshot
> 
> Overlay is needed to preserve the state of the original backing file.
> In the current version temporary overlay is always created at start of qemu.

Yes, this would still be the default for rr mode.

> I don't think that it is convenient forcing user to create overlay manually.

Note that all I'm only saying that _only for the case where the user
creates the overlay manually anyway_ there is no need to specify both
image and overlay.  (I also don't like particularly the hard-coded
snapshot name replay_init, which can be overridden by -loadvm but not
when saving).

So there are various possibilites:

First proposal:

- automatically created overlay is -icount rr=record|replay (then
snapshot name doesn't matter, it can be replay_init)

- manually created overlay is -icount
rr=record|replay,rrsnapshot=snapname (then snapshot name matters because
you can have different snapshots in the same file)

If rr is enabled but rrsnapshot is absent, configure_icount can just set
"snapshot = 1" to force creation of the temporary overlay.  This
requires no change to the blkreplay driver


Second proposal:

- automatically created overlay is -icount rr=record|replay -snapshot

- manually created overlay is -icount rr=record|replay and an rrsnapshot
suboption can be added anyway if considered useful.

This requires no change to the blkreplay driver either.  It's a little
more verbose in the common case, but perhaps less surprising if you're
already used to -snapshot.

> Common debugging scenario includes multiple recording passes until the bug manifests
> itself. Every new execution recorded should be accompanied by creating an overlay
> to assure that the execution is started from the same disk state.
> 
> Specifying initial snapshot name makes sense if we want to suppress -loadvm option.

My rationale was to have similar command lines between record and replay
modes (just changing rr=record to rr=replay).

Paolo

> Then replay may be started from any state without using -loadvm.
> 
> Pavel Dovgalyuk
> 
> 
>
Pavel Dovgalyuk Sept. 16, 2016, 9:36 a.m. UTC | #5
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
> On 16/09/2016 09:55, Pavel Dovgalyuk wrote:
> >> Since you have to create
> >> overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the
> >> "image".  That is, you could choose between:
> >>
> >>    -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
> >>    -rr snapshot=replay_init,...
> >>
> >>    -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
> >>
> >> The temporary snapshot would be created if there's no "-rr snapshot" option
> >> on the command line.
> >>
> >> Does this make sense?
> >
> > There are two different parts:
> >  - creating an overlay
> >  - creating the snapshot
> >
> > Overlay is needed to preserve the state of the original backing file.
> > In the current version temporary overlay is always created at start of qemu.
> 
> Yes, this would still be the default for rr mode.
> 
> > I don't think that it is convenient forcing user to create overlay manually.
> 
> Note that all I'm only saying that _only for the case where the user
> creates the overlay manually anyway_ there is no need to specify both
> image and overlay.  (I also don't like particularly the hard-coded
> snapshot name replay_init, which can be overridden by -loadvm but not
> when saving).

Ok, this seems reasonable to fix.

> 
> So there are various possibilites:
> 
> First proposal:
> 
> - automatically created overlay is -icount rr=record|replay (then
> snapshot name doesn't matter, it can be replay_init)
> 
> - manually created overlay is -icount
> rr=record|replay,rrsnapshot=snapname (then snapshot name matters because
> you can have different snapshots in the same file)

We can't create overlay with icount suboptions, because there could be several
block devices. Each one needs its own overlay.

> If rr is enabled but rrsnapshot is absent, configure_icount can just set
> "snapshot = 1" to force creation of the temporary overlay.  This
> requires no change to the blkreplay driver
> 
> 
> Second proposal:
> 
> - automatically created overlay is -icount rr=record|replay -snapshot
> 
> - manually created overlay is -icount rr=record|replay and an rrsnapshot
> suboption can be added anyway if considered useful.

See above.

> This requires no change to the blkreplay driver either.  It's a little
> more verbose in the common case, but perhaps less surprising if you're
> already used to -snapshot.

Our internal version automatically creates overlays with generated names.
But now I think, that it is inconvenient and manual name specification is better.

> > Common debugging scenario includes multiple recording passes until the bug manifests
> > itself. Every new execution recorded should be accompanied by creating an overlay
> > to assure that the execution is started from the same disk state.
> >
> > Specifying initial snapshot name makes sense if we want to suppress -loadvm option.
> 
> My rationale was to have similar command lines between record and replay
> modes (just changing rr=record to rr=replay).

This seems reasonable, I'll try to fix this part.

Pavel Dovgalyuk
Paolo Bonzini Sept. 16, 2016, 9:49 a.m. UTC | #6
On 16/09/2016 11:36, Pavel Dovgalyuk wrote:
>> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
>> On 16/09/2016 09:55, Pavel Dovgalyuk wrote:
>>>> Since you have to create
>>>> overlay.qcow2 outside QEMU anyway, overlay.qcow2 might as well be the
>>>> "image".  That is, you could choose between:
>>>>
>>>>    -drive driver=blkreplay,if=none,image=overlay.qcow2,id=img-blkreplay \
>>>>    -rr snapshot=replay_init,...
>>>>
>>>>    -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
>>>>
>>>> The temporary snapshot would be created if there's no "-rr snapshot" option
>>>> on the command line.
>>>>
>>>> Does this make sense?
>>>
>>> There are two different parts:
>>>  - creating an overlay
>>>  - creating the snapshot
>>>
>>> Overlay is needed to preserve the state of the original backing file.
>>> In the current version temporary overlay is always created at start of qemu.
>>
>> Yes, this would still be the default for rr mode.
>>
>>> I don't think that it is convenient forcing user to create overlay manually.
>>
>> Note that all I'm only saying that _only for the case where the user
>> creates the overlay manually anyway_ there is no need to specify both
>> image and overlay.  (I also don't like particularly the hard-coded
>> snapshot name replay_init, which can be overridden by -loadvm but not
>> when saving).
> 
> Ok, this seems reasonable to fix.
> 
>>
>> So there are various possibilites:
>>
>> First proposal:
>>
>> - automatically created overlay is -icount rr=record|replay (then
>> snapshot name doesn't matter, it can be replay_init)
>>
>> - manually created overlay is -icount
>> rr=record|replay,rrsnapshot=snapname (then snapshot name matters because
>> you can have different snapshots in the same file)
> 
> We can't create overlay with icount suboptions, because there could be several
> block devices. Each one needs its own overlay.

Right, so the overlay name is specified in each -drive option the
blkreplay image.  rrsnapshot is just the name of the first snapshot that
is created (for rr=record, instead of requiring manual interaction with
the monitor) or loaded (for rr=replay; in this case it's a convenience
only).

Paolo
diff mbox

Patch

diff --git a/block/blkreplay.c b/block/blkreplay.c
index 30f9d5f..62ae861 100644
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -14,6 +14,7 @@ 
 #include "block/block_int.h"
 #include "sysemu/replay.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qstring.h"
 
 typedef struct Request {
     Coroutine *co;
@@ -25,11 +26,82 @@  typedef struct Request {
    block devices should not get overlapping ids. */
 static uint64_t request_id;
 
+static BlockDriverState *blkreplay_append_snapshot(BlockDriverState *bs,
+                                                   int flags,
+                                                   QDict *snapshot_options,
+                                                   Error **errp)
+{
+    int ret;
+    BlockDriverState *bs_snapshot;
+
+    /* Create temporary file, if needed */
+    if ((flags & BDRV_O_TEMPORARY) || replay_mode == REPLAY_MODE_RECORD) {
+        int64_t total_size;
+        QemuOpts *opts = NULL;
+        const char *tmp_filename = qdict_get_str(snapshot_options,
+                                                 "file.filename");
+
+        /* Get the required size from the image */
+        total_size = bdrv_getlength(bs);
+        if (total_size < 0) {
+            error_setg_errno(errp, -total_size, "Could not get image size");
+            goto out;
+        }
+
+        opts = qemu_opts_create(bdrv_qcow2.create_opts, NULL, 0,
+                                &error_abort);
+        qemu_opt_set_number(opts, BLOCK_OPT_SIZE, total_size, &error_abort);
+        ret = bdrv_create(&bdrv_qcow2, tmp_filename, opts, errp);
+        qemu_opts_del(opts);
+        if (ret < 0) {
+            error_prepend(errp, "Could not create temporary overlay '%s': ",
+                          tmp_filename);
+            goto out;
+        }
+    }
+
+    bs_snapshot = bdrv_open(NULL, NULL, snapshot_options, flags, errp);
+    snapshot_options = NULL;
+    if (!bs_snapshot) {
+        ret = -EINVAL;
+        goto out;
+    }
+
+    /* bdrv_append() consumes a strong reference to bs_snapshot (i.e. it will
+     * call bdrv_unref() on it), so in order to be able to return one, we have
+     * to increase bs_snapshot's refcount here */
+    bdrv_ref(bs_snapshot);
+    bdrv_append(bs_snapshot, bs);
+
+    return bs_snapshot;
+
+out:
+    QDECREF(snapshot_options);
+    return NULL;
+}
+
+static QemuOptsList blkreplay_runtime_opts = {
+    .name = "quorum",
+    .head = QTAILQ_HEAD_INITIALIZER(blkreplay_runtime_opts.head),
+    .desc = {
+        {
+            .name = "overlay",
+            .type = QEMU_OPT_STRING,
+            .help = "Optional overlay file for snapshots",
+        },
+        { /* end of list */ }
+    },
+};
+
 static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
                           Error **errp)
 {
     Error *local_err = NULL;
     int ret;
+    QDict *snapshot_options = qdict_new();
+    int snapshot_flags = BDRV_O_RDWR;
+    const char *snapshot;
+    QemuOpts *opts = NULL;
 
     /* Open the image file */
     bs->file = bdrv_open_child(NULL, options, "image",
@@ -40,6 +112,43 @@  static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
+    opts = qemu_opts_create(&blkreplay_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    /* Prepare options QDict for the overlay file */
+    qdict_put(snapshot_options, "file.driver",
+              qstring_from_str("file"));
+    qdict_put(snapshot_options, "driver",
+              qstring_from_str("qcow2"));
+
+    snapshot = qemu_opt_get(opts, "overlay");
+    if (snapshot) {
+        qdict_put(snapshot_options, "file.filename",
+                  qstring_from_str(snapshot));
+    } else {
+        char tmp_filename[PATH_MAX + 1];
+        ret = get_tmp_filename(tmp_filename, PATH_MAX + 1);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Could not get temporary filename");
+            goto fail;
+        }
+        qdict_put(snapshot_options, "file.filename",
+                  qstring_from_str(tmp_filename));
+        snapshot_flags |= BDRV_O_TEMPORARY;
+    }
+
+    /* Add temporary snapshot to preserve the image */
+    if (!blkreplay_append_snapshot(bs->file->bs, snapshot_flags,
+                      snapshot_options, &local_err)) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        goto fail;
+    }
+
     ret = 0;
 fail:
     if (ret < 0) {
@@ -50,6 +159,7 @@  fail:
 
 static void blkreplay_close(BlockDriverState *bs)
 {
+    bdrv_unref(bs->file->bs);
 }
 
 static int64_t blkreplay_getlength(BlockDriverState *bs)
@@ -135,6 +245,12 @@  static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
     return ret;
 }
 
+static bool blkreplay_recurse_is_first_non_filter(BlockDriverState *bs,
+                                                  BlockDriverState *candidate)
+{
+    return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
+}
+
 static BlockDriver bdrv_blkreplay = {
     .format_name            = "blkreplay",
     .protocol_name          = "blkreplay",
@@ -150,6 +266,9 @@  static BlockDriver bdrv_blkreplay = {
     .bdrv_co_pwrite_zeroes  = blkreplay_co_pwrite_zeroes,
     .bdrv_co_pdiscard       = blkreplay_co_pdiscard,
     .bdrv_co_flush          = blkreplay_co_flush,
+
+    .is_filter              = true,
+    .bdrv_recurse_is_first_non_filter = blkreplay_recurse_is_first_non_filter,
 };
 
 static void bdrv_blkreplay_init(void)
diff --git a/docs/replay.txt b/docs/replay.txt
index 347b2ff..5be8f25 100644
--- a/docs/replay.txt
+++ b/docs/replay.txt
@@ -196,6 +196,14 @@  is recorded to the log. In replay phase the queue is matched with
 events read from the log. Therefore block devices requests are processed
 deterministically.
 
+blkdriver also supports overlay option, which allows creating persistent
+overlay file for saving and reloading VM snapshots in record/replay modes.
+Replay mechanism automatically creates one snapshot named 'replay_init' to
+allow rewinding execution while replaying.
+Overlay file may be specified as follows:
+ -drive driver=blkreplay,if=none,image=img-direct,
+        overlay=overlay.qcow2,id=img-blkreplay
+
 Network devices
 ---------------
 
diff --git a/vl.c b/vl.c
index cb0fedc..1c68779 100644
--- a/vl.c
+++ b/vl.c
@@ -4409,7 +4409,7 @@  int main(int argc, char **argv, char **envp)
     }
 
     /* open the virtual block devices */
-    if (snapshot || replay_mode != REPLAY_MODE_NONE) {
+    if (snapshot) {
         qemu_opts_foreach(qemu_find_opts("drive"), drive_enable_snapshot,
                           NULL, NULL);
     }