From patchwork Tue Jun 14 13:32:33 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 9175817 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id CBA1160573 for ; Tue, 14 Jun 2016 13:33:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BBF11280E0 for ; Tue, 14 Jun 2016 13:33:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B04CE2824A; Tue, 14 Jun 2016 13:33:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id EC695280E0 for ; Tue, 14 Jun 2016 13:33:39 +0000 (UTC) Received: from localhost ([::1]:35293 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bCoTL-0005Wg-3J for patchwork-qemu-devel@patchwork.kernel.org; Tue, 14 Jun 2016 09:33:39 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59508) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bCoSj-0005Rl-73 for qemu-devel@nongnu.org; Tue, 14 Jun 2016 09:33:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bCoSc-00064K-1Q for qemu-devel@nongnu.org; Tue, 14 Jun 2016 09:33:00 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49386) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bCoSV-0005xs-3B; Tue, 14 Jun 2016 09:32:47 -0400 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A77F57267A; Tue, 14 Jun 2016 13:32:46 +0000 (UTC) Received: from noname.str.redhat.com. (dhcp-192-197.str.redhat.com [10.33.192.197]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u5EDWcZT028927; Tue, 14 Jun 2016 09:32:45 -0400 From: Kevin Wolf To: qemu-block@nongnu.org Date: Tue, 14 Jun 2016 15:32:33 +0200 Message-Id: <1465911155-19002-5-git-send-email-kwolf@redhat.com> In-Reply-To: <1465911155-19002-1-git-send-email-kwolf@redhat.com> References: <1465911155-19002-1-git-send-email-kwolf@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.26 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 14 Jun 2016 13:32:46 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 4/6] raw-posix: Switch to bdrv_co_* interfaces X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP In order to use the modern byte-based .bdrv_co_preadv/pwritev() interface, this patch switches raw-posix to coroutine-based interfaces as a first step. In terms of semantics and performance, it doesn't make a difference with the existing code whether we go from a coroutine to a callback-based interface already in block/io.c or only in linux-aio.c As there have been concerns in the past that this change may be a step in the wrong direction with respect to a possible AIO fast path, the old callback-based interface for linux-aio is left around and can be reactivated when a fast path (e.g. directly from virtio-blk dataplane, bypassing the whole block layer) is implemented. Signed-off-by: Kevin Wolf Reviewed-by: Eric Blake --- block/linux-aio.c | 87 +++++++++++++++++++++++++++++++++++++++++-------------- block/raw-aio.h | 4 +++ block/raw-posix.c | 59 +++++++++++++++++-------------------- 3 files changed, 96 insertions(+), 54 deletions(-) diff --git a/block/linux-aio.c b/block/linux-aio.c index 294b2bf..74d9b33 100644 --- a/block/linux-aio.c +++ b/block/linux-aio.c @@ -11,8 +11,10 @@ #include "qemu-common.h" #include "block/aio.h" #include "qemu/queue.h" +#include "block/block.h" #include "block/raw-aio.h" #include "qemu/event_notifier.h" +#include "qemu/coroutine.h" #include @@ -30,6 +32,7 @@ struct qemu_laiocb { BlockAIOCB common; + Coroutine *co; LinuxAioState *ctx; struct iocb iocb; ssize_t ret; @@ -88,9 +91,14 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb) } } } - laiocb->common.cb(laiocb->common.opaque, ret); - qemu_aio_unref(laiocb); + laiocb->ret = ret; + if (laiocb->co) { + qemu_coroutine_enter(laiocb->co, NULL); + } else { + laiocb->common.cb(laiocb->common.opaque, ret); + qemu_aio_unref(laiocb); + } } /* The completion BH fetches completed I/O requests and invokes their @@ -232,22 +240,12 @@ void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s) } } -BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd, - int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, - BlockCompletionFunc *cb, void *opaque, int type) +static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset, + int type) { - struct qemu_laiocb *laiocb; - struct iocb *iocbs; - off_t offset = sector_num * 512; - - laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque); - laiocb->nbytes = nb_sectors * 512; - laiocb->ctx = s; - laiocb->ret = -EINPROGRESS; - laiocb->is_read = (type == QEMU_AIO_READ); - laiocb->qiov = qiov; - - iocbs = &laiocb->iocb; + LinuxAioState *s = laiocb->ctx; + struct iocb *iocbs = &laiocb->iocb; + QEMUIOVector *qiov = laiocb->qiov; switch (type) { case QEMU_AIO_WRITE: @@ -260,7 +258,7 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd, default: fprintf(stderr, "%s: invalid AIO request type 0x%x.\n", __func__, type); - goto out_free_aiocb; + return -EIO; } io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e)); @@ -270,11 +268,56 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd, (!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) { ioq_submit(s); } - return &laiocb->common; -out_free_aiocb: - qemu_aio_unref(laiocb); - return NULL; + return 0; +} + +int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd, + int64_t sector_num, QEMUIOVector *qiov, + int nb_sectors, int type) +{ + off_t offset = sector_num * BDRV_SECTOR_SIZE; + int ret; + + struct qemu_laiocb laiocb = { + .co = qemu_coroutine_self(), + .nbytes = nb_sectors * BDRV_SECTOR_SIZE, + .ctx = s, + .is_read = (type == QEMU_AIO_READ), + .qiov = qiov, + }; + + ret = laio_do_submit(fd, &laiocb, offset, type); + if (ret < 0) { + return ret; + } + + qemu_coroutine_yield(); + return laiocb.ret; +} + +BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd, + int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, + BlockCompletionFunc *cb, void *opaque, int type) +{ + struct qemu_laiocb *laiocb; + off_t offset = sector_num * BDRV_SECTOR_SIZE; + int ret; + + laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque); + laiocb->nbytes = nb_sectors * BDRV_SECTOR_SIZE; + laiocb->ctx = s; + laiocb->ret = -EINPROGRESS; + laiocb->is_read = (type == QEMU_AIO_READ); + laiocb->qiov = qiov; + + ret = laio_do_submit(fd, laiocb, offset, type); + if (ret < 0) { + qemu_aio_unref(laiocb); + return NULL; + } + + return &laiocb->common; } void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context) diff --git a/block/raw-aio.h b/block/raw-aio.h index 714714e..03bbfba 100644 --- a/block/raw-aio.h +++ b/block/raw-aio.h @@ -15,6 +15,7 @@ #ifndef QEMU_RAW_AIO_H #define QEMU_RAW_AIO_H +#include "qemu/coroutine.h" #include "qemu/iov.h" /* AIO request types */ @@ -38,6 +39,9 @@ typedef struct LinuxAioState LinuxAioState; LinuxAioState *laio_init(void); void laio_cleanup(LinuxAioState *s); +int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd, + int64_t sector_num, QEMUIOVector *qiov, + int nb_sectors, int type); BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockCompletionFunc *cb, void *opaque, int type); diff --git a/block/raw-posix.c b/block/raw-posix.c index ce2e20f..cb98769 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1325,14 +1325,13 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd, return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque); } -static BlockAIOCB *raw_aio_submit(BlockDriverState *bs, - int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, - BlockCompletionFunc *cb, void *opaque, int type) +static int coroutine_fn raw_co_rw(BlockDriverState *bs, int64_t sector_num, + int nb_sectors, QEMUIOVector *qiov, int type) { BDRVRawState *s = bs->opaque; if (fd_open(bs) < 0) - return NULL; + return -EIO; /* * Check if the underlying device requires requests to be aligned, @@ -1345,14 +1344,26 @@ static BlockAIOCB *raw_aio_submit(BlockDriverState *bs, type |= QEMU_AIO_MISALIGNED; #ifdef CONFIG_LINUX_AIO } else if (s->use_aio) { - return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov, - nb_sectors, cb, opaque, type); + return laio_co_submit(bs, s->aio_ctx, s->fd, sector_num, qiov, + nb_sectors, type); #endif } } - return paio_submit(bs, s->fd, sector_num, qiov, nb_sectors, - cb, opaque, type); + return paio_submit_co(bs, s->fd, sector_num * BDRV_SECTOR_SIZE, qiov, + nb_sectors * BDRV_SECTOR_SIZE, type); +} + +static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, + int nb_sectors, QEMUIOVector *qiov) +{ + return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_READ); +} + +static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num, + int nb_sectors, QEMUIOVector *qiov) +{ + return raw_co_rw(bs, sector_num, nb_sectors, qiov, QEMU_AIO_WRITE); } static void raw_aio_plug(BlockDriverState *bs) @@ -1375,22 +1386,6 @@ static void raw_aio_unplug(BlockDriverState *bs) #endif } -static BlockAIOCB *raw_aio_readv(BlockDriverState *bs, - int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, - BlockCompletionFunc *cb, void *opaque) -{ - return raw_aio_submit(bs, sector_num, qiov, nb_sectors, - cb, opaque, QEMU_AIO_READ); -} - -static BlockAIOCB *raw_aio_writev(BlockDriverState *bs, - int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, - BlockCompletionFunc *cb, void *opaque) -{ - return raw_aio_submit(bs, sector_num, qiov, nb_sectors, - cb, opaque, QEMU_AIO_WRITE); -} - static BlockAIOCB *raw_aio_flush(BlockDriverState *bs, BlockCompletionFunc *cb, void *opaque) { @@ -1957,8 +1952,8 @@ BlockDriver bdrv_file = { .bdrv_co_get_block_status = raw_co_get_block_status, .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes, - .bdrv_aio_readv = raw_aio_readv, - .bdrv_aio_writev = raw_aio_writev, + .bdrv_co_readv = raw_co_readv, + .bdrv_co_writev = raw_co_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_aio_discard = raw_aio_discard, .bdrv_refresh_limits = raw_refresh_limits, @@ -2405,8 +2400,8 @@ static BlockDriver bdrv_host_device = { .create_opts = &raw_create_opts, .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes, - .bdrv_aio_readv = raw_aio_readv, - .bdrv_aio_writev = raw_aio_writev, + .bdrv_co_readv = raw_co_readv, + .bdrv_co_writev = raw_co_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_aio_discard = hdev_aio_discard, .bdrv_refresh_limits = raw_refresh_limits, @@ -2535,8 +2530,8 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_create = hdev_create, .create_opts = &raw_create_opts, - .bdrv_aio_readv = raw_aio_readv, - .bdrv_aio_writev = raw_aio_writev, + .bdrv_co_readv = raw_co_readv, + .bdrv_co_writev = raw_co_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug, @@ -2670,8 +2665,8 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_create = hdev_create, .create_opts = &raw_create_opts, - .bdrv_aio_readv = raw_aio_readv, - .bdrv_aio_writev = raw_aio_writev, + .bdrv_co_readv = raw_co_readv, + .bdrv_co_writev = raw_co_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_refresh_limits = raw_refresh_limits, .bdrv_io_plug = raw_aio_plug,