From patchwork Mon Mar 14 07:45:10 2016
X-Patchwork-Submitter: Pavel Dovgalyuk
X-Patchwork-Id: 8576971
From: Pavel Dovgalyuk
To: qemu-devel@nongnu.org
Cc: edgar.iglesias@xilinx.com, peter.maydell@linaro.org,
 igor.rubinov@gmail.com, alex.bennee@linaro.org, mark.burton@greensocs.com,
 real@ispras.ru, hines@cert.org, batuzovk@ispras.ru,
 maria.klimushenkova@ispras.ru, pavel.dovgaluk@ispras.ru,
 pbonzini@redhat.com, kwolf@redhat.com, stefanha@redhat.com,
 fred.konrad@greensocs.com
Date: Mon, 14 Mar 2016 10:45:10 +0300
Message-ID: <20160314074510.4980.15612.stgit@PASHA-ISP>
In-Reply-To: <20160314074429.4980.34777.stgit@PASHA-ISP>
References: <20160314074429.4980.34777.stgit@PASHA-ISP>
Subject: [Qemu-devel] [PATCH v5 7/7] replay: introduce block devices
 record/replay

This patch introduces a block driver that implements recording and
replaying of block device operations. All block completion operations
are added to the queue. The queue is flushed at checkpoints, and
information about the processed requests is recorded to the log. In the
replay phase the queue is matched with events read from the log.
Therefore block device requests are processed deterministically.

Signed-off-by: Pavel Dovgalyuk
Acked-by: Kevin Wolf
---
 block/Makefile.objs      |    2 -
 block/blkreplay.c        |  159 ++++++++++++++++++++++++++++++++++++++++++++++
 docs/replay.txt          |   20 ++++++
 include/sysemu/replay.h  |    2 +
 replay/replay-events.c   |   20 ++++++
 replay/replay-internal.h |    1
 replay/replay.c          |    2 -
 stubs/replay.c           |    4 +
 8 files changed, 208 insertions(+), 2 deletions(-)
 create mode 100755 block/blkreplay.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 58ef2ef..38fea16 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -4,7 +4,7 @@ block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += quorum.o
-block-obj-y += parallels.o blkdebug.o blkverify.o
+block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
 block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/blkreplay.c b/block/blkreplay.c
new file mode 100755
index 0000000..56024a6
--- /dev/null
+++ b/block/blkreplay.c
@@ -0,0 +1,159 @@
+/*
+ * Block protocol for record/replay
+ *
+ * Copyright (c) 2010-2016 Institute for System Programming
+ *         of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "block/block_int.h"
+#include "sysemu/replay.h"
+
+typedef struct Request {
+    Coroutine *co;
+    QEMUBH *bh;
+} Request;
+
+/* Next request id.
+   This counter is global, because requests from different
+   block devices should not get overlapping ids. */
+static uint64_t request_id;
+
+static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
+                          Error **errp)
+{
+    Error *local_err = NULL;
+    int ret;
+
+    /* Open the image file */
+    bs->file = bdrv_open_child(NULL, options, "image",
+                               bs, &child_file, false, &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        goto fail;
+    }
+
+    ret = 0;
+fail:
+    if (ret < 0) {
+        bdrv_unref_child(bs, bs->file);
+    }
+    return ret;
+}
+
+static void blkreplay_close(BlockDriverState *bs)
+{
+}
+
+static int64_t blkreplay_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file->bs);
+}
+
+/* This bh is used for synchronization of return from coroutines.
+   It continues yielded coroutine which then finishes its execution.
+   BH is called adjusted to some replay checkpoint, therefore
+   record and replay will always finish coroutines deterministically.
+*/
+static void blkreplay_bh_cb(void *opaque)
+{
+    Request *req = opaque;
+    qemu_coroutine_enter(req->co, NULL);
+    qemu_bh_delete(req->bh);
+    g_free(req);
+}
+
+static void block_request_create(uint64_t reqid, BlockDriverState *bs,
+                                 Coroutine *co)
+{
+    Request *req = g_new(Request, 1);
+    *req = (Request) {
+        .co = co,
+        .bh = aio_bh_new(bdrv_get_aio_context(bs), blkreplay_bh_cb, req),
+    };
+    replay_block_event(req->bh, reqid);
+}
+
+static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
+    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
+{
+    uint64_t reqid = request_id++;
+    int ret = bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
+    block_request_create(reqid, bs, qemu_coroutine_self());
+    qemu_coroutine_yield();
+
+    return ret;
+}
+
+static int coroutine_fn blkreplay_co_writev(BlockDriverState *bs,
+    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
+{
+    uint64_t reqid = request_id++;
+    int ret = bdrv_co_writev(bs->file->bs, sector_num, nb_sectors, qiov);
+    block_request_create(reqid, bs, qemu_coroutine_self());
+    qemu_coroutine_yield();
+
+    return ret;
+}
+
+static int coroutine_fn blkreplay_co_write_zeroes(BlockDriverState *bs,
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+{
+    uint64_t reqid = request_id++;
+    int ret = bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
+    block_request_create(reqid, bs, qemu_coroutine_self());
+    qemu_coroutine_yield();
+
+    return ret;
+}
+
+static int coroutine_fn blkreplay_co_discard(BlockDriverState *bs,
+    int64_t sector_num, int nb_sectors)
+{
+    uint64_t reqid = request_id++;
+    int ret = bdrv_co_discard(bs->file->bs, sector_num, nb_sectors);
+    block_request_create(reqid, bs, qemu_coroutine_self());
+    qemu_coroutine_yield();
+
+    return ret;
+}
+
+static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
+{
+    uint64_t reqid = request_id++;
+    int ret = bdrv_co_flush(bs->file->bs);
+    block_request_create(reqid, bs, qemu_coroutine_self());
+    qemu_coroutine_yield();
+
+    return ret;
+}
+
+static BlockDriver bdrv_blkreplay = {
+    .format_name            = "blkreplay",
+    .protocol_name          = "blkreplay",
+    .instance_size          = 0,
+
+    .bdrv_file_open         = blkreplay_open,
+    .bdrv_close             = blkreplay_close,
+    .bdrv_getlength         = blkreplay_getlength,
+
+    .bdrv_co_readv          = blkreplay_co_readv,
+    .bdrv_co_writev         = blkreplay_co_writev,
+
+    .bdrv_co_write_zeroes   = blkreplay_co_write_zeroes,
+    .bdrv_co_discard        = blkreplay_co_discard,
+    .bdrv_co_flush          = blkreplay_co_flush,
+};
+
+static void bdrv_blkreplay_init(void)
+{
+    bdrv_register(&bdrv_blkreplay);
+}
+
+block_init(bdrv_blkreplay_init);
diff --git a/docs/replay.txt b/docs/replay.txt
index cf13cef..3da0a4f 100644
--- a/docs/replay.txt
+++ b/docs/replay.txt
@@ -173,3 +173,23 @@
 Sometimes the block layer uses asynchronous callbacks for its internal purposes
 (like reading or writing VM snapshots or disk image cluster tables). In this
 case bottom halves are not marked as "replayable" and do not saved into the log.
+
+Block devices
+-------------
+
+Block devices record/replay module intercepts calls of
+bdrv coroutine functions at the top of block drivers stack.
+To record and replay block operations the drive must be configured
+as following:
+ -drive file=disk.qcow,if=none,id=img-direct
+ -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
+ -device ide-hd,drive=img-blkreplay
+
+blkreplay driver should be inserted between disk image and virtual driver
+controller. Therefore all disk requests may be recorded and replayed.
+
+All block completion operations are added to the queue in the coroutines.
+Queue is flushed at checkpoints and information about processed requests
+is recorded to the log. In replay phase the queue is matched with
+events read from the log. Therefore block devices requests are processed
+deterministically.
diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index e798919..57492da 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -114,6 +114,8 @@ void replay_bh_schedule_event(QEMUBH *bh);
 void replay_input_event(QemuConsole *src, InputEvent *evt);
 /*! Adds input sync event to the queue */
 void replay_input_sync_event(void);
+/*! Adds block layer event to the queue */
+void replay_block_event(QEMUBH *bh, uint64_t id);
 
 /* Character device */
 
diff --git a/replay/replay-events.c b/replay/replay-events.c
index b6b8a64..c797489 100644
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -51,6 +51,9 @@ static void replay_run_event(Event *event)
     case REPLAY_ASYNC_EVENT_CHAR_READ:
         replay_event_char_read_run(event->opaque);
         break;
+    case REPLAY_ASYNC_EVENT_BLOCK:
+        aio_bh_call(event->opaque);
+        break;
     default:
         error_report("Replay: invalid async event ID (%d) in the queue",
                     event->event_kind);
@@ -153,6 +156,15 @@ void replay_add_input_sync_event(void)
     replay_add_event(REPLAY_ASYNC_EVENT_INPUT_SYNC, NULL, NULL, 0);
 }
 
+void replay_block_event(QEMUBH *bh, uint64_t id)
+{
+    if (replay_mode != REPLAY_MODE_NONE && events_enabled) {
+        replay_add_event(REPLAY_ASYNC_EVENT_BLOCK, bh, NULL, id);
+    } else {
+        qemu_bh_schedule(bh);
+    }
+}
+
 static void replay_save_event(Event *event, int checkpoint)
 {
     if (replay_mode != REPLAY_MODE_PLAY) {
@@ -174,6 +186,9 @@ static void replay_save_event(Event *event, int checkpoint)
         case REPLAY_ASYNC_EVENT_CHAR_READ:
             replay_event_char_read_save(event->opaque);
             break;
+        case REPLAY_ASYNC_EVENT_BLOCK:
+            replay_put_qword(event->id);
+            break;
         default:
             error_report("Unknown ID %d of replay event", event->id);
             exit(1);
@@ -232,6 +247,11 @@ static Event *replay_read_event(int checkpoint)
         event->event_kind = read_event_kind;
         event->opaque = replay_event_char_read_load();
         return event;
+    case REPLAY_ASYNC_EVENT_BLOCK:
+        if (read_id == -1) {
+            read_id = replay_get_qword();
+        }
+        break;
     default:
         error_report("Unknown ID %d of replay event", read_event_kind);
         exit(1);
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 11f9a85..efbf14c 100644
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -49,6 +49,7 @@ enum ReplayAsyncEventKind {
     REPLAY_ASYNC_EVENT_INPUT,
     REPLAY_ASYNC_EVENT_INPUT_SYNC,
     REPLAY_ASYNC_EVENT_CHAR_READ,
+    REPLAY_ASYNC_EVENT_BLOCK,
     REPLAY_ASYNC_COUNT
 };
 
diff --git a/replay/replay.c b/replay/replay.c
index fcfde4f..810db14 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -20,7 +20,7 @@
 
 /* Current version of the replay mechanism.
    Increase it when file format changes. */
-#define REPLAY_VERSION              0xe02003
+#define REPLAY_VERSION              0xe02004
 /* Size of replay log header */
 #define HEADER_SIZE                 (sizeof(uint32_t) + sizeof(uint64_t))
 
diff --git a/stubs/replay.c b/stubs/replay.c
index 00ca01f..6f4a8e8 100644
--- a/stubs/replay.c
+++ b/stubs/replay.c
@@ -29,3 +29,7 @@ bool replay_events_enabled(void)
 void replay_finish(void)
 {
 }
+
+void replay_block_event(QEMUBH *bh, uint64_t id)
+{
+}
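
Note for readers of the series: the queue/checkpoint scheme described in
the commit message can be illustrated with a small standalone model. This
is only a sketch and not QEMU code. ToyQueue, ToyLog and the toy_*
functions are hypothetical names invented for the illustration, and the
model ignores checkpoint ids, bottom halves and coroutines. It assumes
only what the commit message states: completions are queued, the queue is
flushed at checkpoints with the request ids written to the log, and on
replay the queue is matched against the ids read back from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 16

/* Hypothetical toy types; none of these names exist in QEMU. */
typedef struct ToyLog {
    uint64_t ids[TOY_MAX];
    size_t len;   /* number of ids written while recording */
    size_t pos;   /* next id to consume while replaying */
} ToyLog;

typedef struct ToyQueue {
    uint64_t ids[TOY_MAX];
    size_t len;
} ToyQueue;

/* A request finished: park its completion until the next checkpoint. */
static void toy_add_event(ToyQueue *q, uint64_t id)
{
    assert(q->len < TOY_MAX);
    q->ids[q->len++] = id;
}

/* Record mode: run every queued completion in queue order and write
   its id to the log. Ids that were run are stored in out[]; the
   return value is how many were run. */
static size_t toy_checkpoint_record(ToyQueue *q, ToyLog *rlog, uint64_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < q->len; i++) {
        assert(rlog->len < TOY_MAX);
        rlog->ids[rlog->len++] = q->ids[i];
        out[n++] = q->ids[i];
    }
    q->len = 0;
    return n;
}

/* Replay mode: run queued completions strictly in logged order.
   Stop as soon as the next logged id has not been queued yet, so
   that a later checkpoint can pick it up. */
static size_t toy_checkpoint_replay(ToyQueue *q, ToyLog *rlog, uint64_t *out)
{
    size_t n = 0;
    while (rlog->pos < rlog->len) {
        uint64_t want = rlog->ids[rlog->pos];
        size_t i;
        for (i = 0; i < q->len; i++) {
            if (q->ids[i] == want) {
                break;
            }
        }
        if (i == q->len) {
            break;              /* wanted request has not completed yet */
        }
        q->ids[i] = q->ids[--q->len];   /* unordered removal from the queue */
        out[n++] = want;
        rlog->pos++;
    }
    return n;
}
```

In this model, if requests 2, 0 and 1 complete in that order while
recording, a replay run in which they complete as 1, 0, 2 still delivers
them as 2, 0, 1, because nothing is run until the id at the head of the
log is present in the queue.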