From patchwork Mon Feb 18 13:06:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817969 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0E0A817E9 for ; Mon, 18 Feb 2019 13:09:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F045C2AA5B for ; Mon, 18 Feb 2019 13:09:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E433B2AA67; Mon, 18 Feb 2019 13:09:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 058182AA66 for ; Mon, 18 Feb 2019 13:09:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730892AbfBRNJ4 (ORCPT ); Mon, 18 Feb 2019 08:09:56 -0500 Received: from foss.arm.com ([217.140.101.70]:57976 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730512AbfBRNJ4 (ORCPT ); Mon, 18 Feb 2019 08:09:56 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3EBBE15AB; Mon, 18 Feb 2019 05:09:55 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 6CB5B3F720; Mon, 18 Feb 2019 05:09:54 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 1/9] qcow: Fix qcow1 exit fault Date: Mon, 18 Feb 2019 
13:06:54 +0000 Message-Id: <20190218130702.32575-2-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Even though qcow1 doesn't use the refcount table, the cleanup path still attempts to iterate over its LRU list. Initialize the list to avoid a segfault on exit. Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- disk/qcow.c | 1 + 1 file changed, 1 insertion(+) diff --git a/disk/qcow.c b/disk/qcow.c index 64cf9270a..bed70c65c 100644 --- a/disk/qcow.c +++ b/disk/qcow.c @@ -1437,6 +1437,7 @@ static struct disk_image *qcow1_probe(int fd, bool readonly) l1t->root = (struct rb_root)RB_ROOT; INIT_LIST_HEAD(&l1t->lru_list); + INIT_LIST_HEAD(&q->refcount_table.lru_list); h = q->header = qcow1_read_header(fd); if (!h) From patchwork Mon Feb 18 13:06:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817973 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A15D817FB for ; Mon, 18 Feb 2019 13:09:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 904902AA5B for ; Mon, 18 Feb 2019 13:09:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8477B2AA5F; Mon, 18 Feb 2019 13:09:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: 
from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F093D2AA5B for ; Mon, 18 Feb 2019 13:09:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730908AbfBRNJ6 (ORCPT ); Mon, 18 Feb 2019 08:09:58 -0500 Received: from foss.arm.com ([217.140.101.70]:57982 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730512AbfBRNJ5 (ORCPT ); Mon, 18 Feb 2019 08:09:57 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1804A15AD; Mon, 18 Feb 2019 05:09:57 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7B35C3F720; Mon, 18 Feb 2019 05:09:55 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 2/9] virtio/blk: Set VIRTIO_BLK_F_RO when the disk is read-only Date: Mon, 18 Feb 2019 13:06:55 +0000 Message-Id: <20190218130702.32575-3-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Since we don't currently tell the guest when the disk backend is read-only, it will report any inconsistent read after write as an error. An image may be read-only either because user requested it on the command-line, or because write support isn't implemented. Pass the read-only attribute using the VIRTIO_BLK_F_RO feature. 
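
Illustration only, not part of the patch: a minimal, self-contained sketch of how a read-only flag could be folded into a virtio-blk feature mask. The feature bit numbers are those defined by the virtio specification; struct example_disk and example_host_features() are hypothetical names invented for this sketch and are not kvmtool code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_BLK_F_SEG_MAX        2
#define VIRTIO_BLK_F_RO             5
#define VIRTIO_BLK_F_FLUSH          9
#define VIRTIO_RING_F_INDIRECT_DESC 28
#define VIRTIO_RING_F_EVENT_IDX     29

/* Hypothetical stand-in for the 'readonly' attribute of a disk image. */
struct example_disk {
	bool readonly;
};

/* Build the host feature mask; RO is advertised only for read-only disks. */
static uint32_t example_host_features(const struct example_disk *disk)
{
	uint32_t features;

	assert(disk != NULL);

	features = UINT32_C(1) << VIRTIO_BLK_F_SEG_MAX
		 | UINT32_C(1) << VIRTIO_BLK_F_FLUSH
		 | UINT32_C(1) << VIRTIO_RING_F_EVENT_IDX
		 | UINT32_C(1) << VIRTIO_RING_F_INDIRECT_DESC;

	if (disk->readonly)
		features |= UINT32_C(1) << VIRTIO_BLK_F_RO;

	return features;
}
```

A guest driver that sees the RO bit can mark the block device read-only up front instead of discovering it through failed writes.
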
Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- disk/core.c | 9 +++++++-- include/kvm/disk-image.h | 1 + virtio/blk.c | 5 ++++- 3 files changed, 12 insertions(+), 3 deletions(-) diff --git a/disk/core.c b/disk/core.c index dd2f258b0..4c7c4f030 100644 --- a/disk/core.c +++ b/disk/core.c @@ -139,8 +139,10 @@ static struct disk_image *disk_image__open(const char *filename, bool readonly, /* blk device ?*/ disk = blkdev__probe(filename, flags, &st); - if (!IS_ERR_OR_NULL(disk)) + if (!IS_ERR_OR_NULL(disk)) { + disk->readonly = readonly; return disk; + } fd = open(filename, flags); if (fd < 0) @@ -150,13 +152,16 @@ static struct disk_image *disk_image__open(const char *filename, bool readonly, disk = qcow_probe(fd, true); if (!IS_ERR_OR_NULL(disk)) { pr_warning("Forcing read-only support for QCOW"); + disk->readonly = true; return disk; } /* raw image ?*/ disk = raw_image__probe(fd, &st, readonly); - if (!IS_ERR_OR_NULL(disk)) + if (!IS_ERR_OR_NULL(disk)) { + disk->readonly = readonly; return disk; + } if (close(fd) < 0) pr_warning("close() failed"); diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index b72805242..4746e88c9 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -59,6 +59,7 @@ struct disk_image { void *priv; void *disk_req_cb_param; void (*disk_req_cb)(void *param, long len); + bool readonly; bool async; int evt; #ifdef CONFIG_HAS_AIO diff --git a/virtio/blk.c b/virtio/blk.c index a57df2e96..6e7a1ee36 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -148,10 +148,13 @@ static u8 *get_config(struct kvm *kvm, void *dev) static u32 get_host_features(struct kvm *kvm, void *dev) { + struct blk_dev *bdev = dev; + return 1UL << VIRTIO_BLK_F_SEG_MAX | 1UL << VIRTIO_BLK_F_FLUSH | 1UL << VIRTIO_RING_F_EVENT_IDX - | 1UL << VIRTIO_RING_F_INDIRECT_DESC; + | 1UL << VIRTIO_RING_F_INDIRECT_DESC + | (bdev->disk->readonly ? 
1UL << VIRTIO_BLK_F_RO : 0); } static void set_guest_features(struct kvm *kvm, void *dev, u32 features) From patchwork Mon Feb 18 13:06:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817981 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3EC6317FB for ; Mon, 18 Feb 2019 13:10:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2C1672AA66 for ; Mon, 18 Feb 2019 13:10:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 207102AAA3; Mon, 18 Feb 2019 13:10:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 04FB62AA7A for ; Mon, 18 Feb 2019 13:10:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730906AbfBRNJ7 (ORCPT ); Mon, 18 Feb 2019 08:09:59 -0500 Received: from foss.arm.com ([217.140.101.70]:57986 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730917AbfBRNJ7 (ORCPT ); Mon, 18 Feb 2019 08:09:59 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 160751650; Mon, 18 Feb 2019 05:09:58 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 807543F73F; Mon, 18 Feb 2019 05:09:56 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: 
will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 3/9] guest: sync disk before shutting down Date: Mon, 18 Feb 2019 13:06:56 +0000 Message-Id: <20190218130702.32575-4-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP sync() should be called before reboot(RB_AUTOBOOT), otherwise data written to disks might be lost. Signed-off-by: Jean-Philippe Brucker --- guest/init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/guest/init.c b/guest/init.c index 1f9cd048a..52f6567da 100644 --- a/guest/init.c +++ b/guest/init.c @@ -72,6 +72,7 @@ int main(int argc, char *argv[]) } while (corpse != child); } + sync(); reboot(RB_AUTOBOOT); printf("Init failed: %s\n", strerror(errno)); From patchwork Mon Feb 18 13:06:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817975 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 42A1617E9 for ; Mon, 18 Feb 2019 13:10:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2F87A2AA5B for ; Mon, 18 Feb 2019 13:10:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 220322AA66; Mon, 18 Feb 2019 13:10:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org 
[209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F0E692AA5B for ; Mon, 18 Feb 2019 13:10:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730919AbfBRNKA (ORCPT ); Mon, 18 Feb 2019 08:10:00 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:57988 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730905AbfBRNJ7 (ORCPT ); Mon, 18 Feb 2019 08:09:59 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 977611684; Mon, 18 Feb 2019 05:09:58 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 8F1553F720; Mon, 18 Feb 2019 05:09:57 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 4/9] disk/aio: Refactor AIO code Date: Mon, 18 Feb 2019 13:06:57 +0000 Message-Id: <20190218130702.32575-5-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Move all AIO code to a separate file, disk/aio.c, to remove as many #ifdefs as possible. Split the raw read/write disk ops into async and sync, and choose which ones to use depending on CONFIG_HAS_AIO. Note that we fix raw_image__close(), which incorrectly checked CONFIG_HAS_VIRTIO instead of CONFIG_HAS_AIO, and closed an uninitialized disk->evt. A subsequent commit will complete this refactoring by fixing use of the 'async' disk attribute.
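
Illustration only, not part of the patch: a minimal sketch of the header pattern described above, where both implementations carry distinct names, a macro picks one at build time, and the setup hook collapses to a no-op stub when AIO is unavailable. EXAMPLE_HAS_AIO and every example_* name are hypothetical stand-ins chosen for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: 'async' tells whether completion comes later. */
struct example_result {
	bool async;
	long len;
};

/* Synchronous path, always compiled in. */
static struct example_result example_read_sync(long len)
{
	assert(len >= 0);
	return (struct example_result){ .async = false, .len = len };
}

#ifdef EXAMPLE_HAS_AIO	/* hypothetical stand-in for CONFIG_HAS_AIO */
/* Asynchronous path, only built when the AIO library was detected. */
static struct example_result example_read_async(long len)
{
	assert(len >= 0);
	return (struct example_result){ .async = true, .len = len };
}

/* A real setup would create the eventfd and the completion thread here. */
static int example_aio_setup(void)
{
	return 0;
}
#define example_read example_read_async
#else
/* Without AIO the hook is a no-op, so callers need no #ifdef of their own. */
static inline int example_aio_setup(void)
{
	return 0;
}
#define example_read example_read_sync
#endif
```

Callers use example_read() and example_aio_setup() unconditionally; the only remaining conditional compilation lives next to the declarations.
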
Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- Makefile | 2 + disk/aio.c | 111 +++++++++++++++++++++++++++++++++++++++ disk/core.c | 52 +++++------------- disk/raw.c | 39 +++----------- include/kvm/disk-image.h | 41 ++++++++++++--- include/kvm/read-write.h | 11 ---- util/read-write.c | 36 ------------- 7 files changed, 167 insertions(+), 125 deletions(-) create mode 100644 disk/aio.c diff --git a/Makefile b/Makefile index ec75cd999..89acec2d5 100644 --- a/Makefile +++ b/Makefile @@ -275,10 +275,12 @@ endif ifeq ($(call try-build,$(SOURCE_AIO),$(CFLAGS),$(LDFLAGS) -laio),y) CFLAGS_DYNOPT += -DCONFIG_HAS_AIO LIBS_DYNOPT += -laio + OBJS += disk/aio.o else ifeq ($(call try-build,$(SOURCE_AIO),$(CFLAGS),$(LDFLAGS) -laio -static),y) CFLAGS_STATOPT += -DCONFIG_HAS_AIO LIBS_STATOPT += -laio + OBJS += disk/aio.o else NOTFOUND += aio endif diff --git a/disk/aio.c b/disk/aio.c new file mode 100644 index 000000000..6afcffe5a --- /dev/null +++ b/disk/aio.c @@ -0,0 +1,111 @@ +#include +#include +#include + +#include "kvm/disk-image.h" +#include "kvm/kvm.h" +#include "linux/list.h" + +#define AIO_MAX 256 + +static int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, + const struct iovec *iov, int iovcnt, off_t offset, + int ev, void *param) +{ + struct iocb *ios[1] = { iocb }; + int ret; + + io_prep_pwritev(iocb, fd, iov, iovcnt, offset); + io_set_eventfd(iocb, ev); + iocb->data = param; + +restart: + ret = io_submit(ctx, 1, ios); + if (ret == -EAGAIN) + goto restart; + return ret; +} + +static int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, + const struct iovec *iov, int iovcnt, off_t offset, + int ev, void *param) +{ + struct iocb *ios[1] = { iocb }; + int ret; + + io_prep_preadv(iocb, fd, iov, iovcnt, offset); + io_set_eventfd(iocb, ev); + iocb->data = param; + +restart: + ret = io_submit(ctx, 1, ios); + if (ret == -EAGAIN) + goto restart; + return ret; +} + +ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, + const 
struct iovec *iov, int iovcount, + void *param) +{ + u64 offset = sector << SECTOR_SHIFT; + struct iocb iocb; + + return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, + offset, disk->evt, param); +} + +ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, + void *param) +{ + u64 offset = sector << SECTOR_SHIFT; + struct iocb iocb; + + return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, + offset, disk->evt, param); +} + +static void *disk_aio_thread(void *param) +{ + struct disk_image *disk = param; + struct io_event event[AIO_MAX]; + struct timespec notime = {0}; + int nr, i; + u64 dummy; + + kvm__set_thread_name("disk-image-io"); + + while (read(disk->evt, &dummy, sizeof(dummy)) > 0) { + nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); + for (i = 0; i < nr; i++) + disk->disk_req_cb(event[i].data, event[i].res); + } + + return NULL; +} + +int disk_aio_setup(struct disk_image *disk) +{ + int r; + pthread_t thread; + + disk->evt = eventfd(0, 0); + if (disk->evt < 0) + return -errno; + + io_setup(AIO_MAX, &disk->ctx); + r = pthread_create(&thread, NULL, disk_aio_thread, disk); + if (r) { + r = -errno; + close(disk->evt); + return r; + } + return 0; +} + +void disk_aio_destroy(struct disk_image *disk) +{ + close(disk->evt); + io_destroy(disk->ctx); +} diff --git a/disk/core.c b/disk/core.c index 4c7c4f030..89880703e 100644 --- a/disk/core.c +++ b/disk/core.c @@ -4,11 +4,8 @@ #include "kvm/kvm.h" #include -#include #include -#define AIO_MAX 256 - int debug_iodelay; static int disk_image__close(struct disk_image *disk); @@ -54,27 +51,6 @@ int disk_img_name_parser(const struct option *opt, const char *arg, int unset) return 0; } -#ifdef CONFIG_HAS_AIO -static void *disk_image__thread(void *param) -{ - struct disk_image *disk = param; - struct io_event event[AIO_MAX]; - struct timespec notime = {0}; - int nr, i; - u64 dummy; - - kvm__set_thread_name("disk-image-io"); - - while 
(read(disk->evt, &dummy, sizeof(dummy)) > 0) { - nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); - for (i = 0; i < nr; i++) - disk->disk_req_cb(event[i].data, event[i].res); - } - - return NULL; -} -#endif - struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int use_mmap) @@ -99,26 +75,22 @@ struct disk_image *disk_image__new(int fd, u64 size, disk->priv = mmap(NULL, size, PROT_RW, MAP_PRIVATE | MAP_NORESERVE, fd, 0); if (disk->priv == MAP_FAILED) { r = -errno; - free(disk); - return ERR_PTR(r); + goto err_free_disk; } } -#ifdef CONFIG_HAS_AIO - { - pthread_t thread; + r = disk_aio_setup(disk); + if (r) + goto err_unmap_disk; - disk->evt = eventfd(0, 0); - io_setup(AIO_MAX, &disk->ctx); - r = pthread_create(&thread, NULL, disk_image__thread, disk); - if (r) { - r = -errno; - free(disk); - return ERR_PTR(r); - } - } -#endif return disk; + +err_unmap_disk: + if (disk->priv) + munmap(disk->priv, size); +err_free_disk: + free(disk); + return ERR_PTR(r); } static struct disk_image *disk_image__open(const char *filename, bool readonly, bool direct) @@ -243,6 +215,8 @@ static int disk_image__close(struct disk_image *disk) if (!disk) return 0; + disk_aio_destroy(disk); + if (disk->ops->close) return disk->ops->close(disk); diff --git a/disk/raw.c b/disk/raw.c index 93b2b4e8d..09da7e081 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -2,38 +2,17 @@ #include -#ifdef CONFIG_HAS_AIO -#include -#endif - -ssize_t raw_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, +ssize_t raw_image__read_sync(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; - -#ifdef CONFIG_HAS_AIO - struct iocb iocb; - - return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, offset, - disk->evt, param); -#else - return preadv_in_full(disk->fd, iov, iovcount, offset); -#endif + return preadv_in_full(disk->fd, iov, iovcount, sector << SECTOR_SHIFT); } 
-ssize_t raw_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, - int iovcount, void *param) +ssize_t raw_image__write_sync(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, + void *param) { - u64 offset = sector << SECTOR_SHIFT; - -#ifdef CONFIG_HAS_AIO - struct iocb iocb; - - return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, offset, - disk->evt, param); -#else - return pwritev_in_full(disk->fd, iov, iovcount, offset); -#endif + return pwritev_in_full(disk->fd, iov, iovcount, sector << SECTOR_SHIFT); } ssize_t raw_image__read_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, @@ -79,12 +58,6 @@ int raw_image__close(struct disk_image *disk) if (disk->priv != MAP_FAILED) ret = munmap(disk->priv, disk->size); - close(disk->evt); - -#ifdef CONFIG_HAS_VIRTIO - io_destroy(disk->ctx); -#endif - return ret; } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 4746e88c9..953beb2d5 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -19,6 +19,10 @@ #include #include +#ifdef CONFIG_HAS_AIO +#include +#endif + #define SECTOR_SHIFT 9 #define SECTOR_SIZE (1UL << SECTOR_SHIFT) @@ -61,10 +65,10 @@ struct disk_image { void (*disk_req_cb)(void *param, long len); bool readonly; bool async; - int evt; #ifdef CONFIG_HAS_AIO io_context_t ctx; -#endif + int evt; +#endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; int debug_iodelay; @@ -84,14 +88,39 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly); struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *st); -ssize_t raw_image__read(struct disk_image *disk, u64 sector, - const struct iovec *iov, int iovcount, void *param); -ssize_t raw_image__write(struct disk_image *disk, u64 sector, - const struct iovec *iov, int iovcount, void *param); +ssize_t raw_image__read_sync(struct 
disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); +ssize_t raw_image__write_sync(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__read_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__write_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); int raw_image__close(struct disk_image *disk); void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len)); + +#ifdef CONFIG_HAS_AIO +int disk_aio_setup(struct disk_image *disk); +void disk_aio_destroy(struct disk_image *disk); +ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); +ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); + +#define raw_image__read raw_image__read_async +#define raw_image__write raw_image__write_async + +#else /* !CONFIG_HAS_AIO */ +static inline int disk_aio_setup(struct disk_image *disk) +{ + /* No-op */ + return 0; +} +static inline void disk_aio_destroy(struct disk_image *disk) +{ +} +#define raw_image__read raw_image__read_sync +#define raw_image__write raw_image__write_sync +#endif /* CONFIG_HAS_AIO */ + #endif /* KVM__DISK_IMAGE_H */ diff --git a/include/kvm/read-write.h b/include/kvm/read-write.h index acbd6f0b1..8375d7c7d 100644 --- a/include/kvm/read-write.h +++ b/include/kvm/read-write.h @@ -5,10 +5,6 @@ #include #include -#ifdef CONFIG_HAS_AIO -#include -#endif - ssize_t xread(int fd, void *buf, size_t count); ssize_t xwrite(int fd, const void *buf, size_t count); @@ -35,11 +31,4 @@ ssize_t xpwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset); ssize_t preadv_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset); ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, 
off_t offset); -#ifdef CONFIG_HAS_AIO -int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param); -int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param); -#endif - #endif /* KVM_READ_WRITE_H */ diff --git a/util/read-write.c b/util/read-write.c index bf6fb2fc2..06fc0dfff 100644 --- a/util/read-write.c +++ b/util/read-write.c @@ -337,39 +337,3 @@ ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offse return total; } - -#ifdef CONFIG_HAS_AIO -int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; - - io_prep_pwritev(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; - return ret; -} - -int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; - - io_prep_preadv(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; - return ret; -} -#endif From patchwork Mon Feb 18 13:06:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817977 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 241441390 for ; Mon, 18 Feb 2019 13:10:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 133342AA5B for ; Mon, 18 Feb 2019 13:10:03 
+0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 07BF52AA6E; Mon, 18 Feb 2019 13:10:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 569A42AA5B for ; Mon, 18 Feb 2019 13:10:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730932AbfBRNKB (ORCPT ); Mon, 18 Feb 2019 08:10:01 -0500 Received: from foss.arm.com ([217.140.101.70]:57996 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730922AbfBRNKA (ORCPT ); Mon, 18 Feb 2019 08:10:00 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 833601688; Mon, 18 Feb 2019 05:10:00 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 947B53F73F; Mon, 18 Feb 2019 05:09:58 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 5/9] disk/aio: Fix use of disk->async Date: Mon, 18 Feb 2019 13:06:58 +0000 Message-Id: <20190218130702.32575-6-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add an 'async' attribute to disk_image_operations, that describes if they can submit async I/O or not. disk_image->async is now set iff CONFIG_HAS_AIO and the ops do use AIO. 
This fixes qcow1, which used to set async = 1 even though the qcow operations don't use AIO. The disk core would perform the read/write operation without pushing the completion onto the virtio queue, and the guest would be stuck waiting. Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- disk/aio.c | 9 +++++++++ disk/blk.c | 9 ++------- disk/qcow.c | 2 -- disk/raw.c | 15 +++------------ include/kvm/disk-image.h | 1 + 5 files changed, 15 insertions(+), 21 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 6afcffe5a..007415c69 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -90,6 +90,10 @@ int disk_aio_setup(struct disk_image *disk) int r; pthread_t thread; + /* No need to setup AIO if the disk ops won't make use of it */ + if (!disk->ops->async) + return 0; + disk->evt = eventfd(0, 0); if (disk->evt < 0) return -errno; @@ -101,11 +105,16 @@ int disk_aio_setup(struct disk_image *disk) close(disk->evt); return r; } + + disk->async = true; return 0; } void disk_aio_destroy(struct disk_image *disk) { + if (!disk->async) + return; + close(disk->evt); io_destroy(disk->ctx); } diff --git a/disk/blk.c b/disk/blk.c index 37581d331..48922e028 100644 --- a/disk/blk.c +++ b/disk/blk.c @@ -9,6 +9,7 @@ static struct disk_image_operations blk_dev_ops = { .read = raw_image__read, .write = raw_image__write, + .async = true, }; static bool is_mounted(struct stat *st) @@ -35,7 +36,6 @@ static bool is_mounted(struct stat *st) struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *st) { - struct disk_image *disk; int fd, r; u64 size; @@ -67,10 +67,5 @@ struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *s * mmap large disk. There is not enough virtual address space * in 32-bit host. However, this works on 64-bit host. 
*/ - disk = disk_image__new(fd, size, &blk_dev_ops, DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif - return disk; + return disk_image__new(fd, size, &blk_dev_ops, DISK_IMAGE_REGULAR); } diff --git a/disk/qcow.c b/disk/qcow.c index bed70c65c..dd6be62ee 100644 --- a/disk/qcow.c +++ b/disk/qcow.c @@ -1337,7 +1337,6 @@ static struct disk_image *qcow2_probe(int fd, bool readonly) if (IS_ERR_OR_NULL(disk_image)) goto free_refcount_table; - disk_image->async = 0; disk_image->priv = q; return disk_image; @@ -1474,7 +1473,6 @@ static struct disk_image *qcow1_probe(int fd, bool readonly) if (!disk_image) goto free_l1_table; - disk_image->async = 1; disk_image->priv = q; return disk_image; diff --git a/disk/raw.c b/disk/raw.c index 09da7e081..e869d6cc2 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -67,6 +67,7 @@ int raw_image__close(struct disk_image *disk) static struct disk_image_operations raw_image_regular_ops = { .read = raw_image__read, .write = raw_image__write, + .async = true, }; struct disk_image_operations ro_ops = { @@ -77,12 +78,11 @@ struct disk_image_operations ro_ops = { struct disk_image_operations ro_ops_nowrite = { .read = raw_image__read, + .async = true, }; struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) { - struct disk_image *disk; - if (readonly) { /* * Use mmap's MAP_PRIVATE to implement non-persistent write @@ -93,10 +93,6 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_MMAP); if (IS_ERR_OR_NULL(disk)) { disk = disk_image__new(fd, st->st_size, &ro_ops_nowrite, DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif } return disk; @@ -104,11 +100,6 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) /* * Use read/write instead of mmap */ - disk = disk_image__new(fd, st->st_size, &raw_image_regular_ops, 
DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif - return disk; + return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR); } } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 953beb2d5..adc9fe465 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -42,6 +42,7 @@ struct disk_image_operations { int iovcount, void *param); int (*flush)(struct disk_image *disk); int (*close)(struct disk_image *disk); + bool async; }; struct disk_image_params { From patchwork Mon Feb 18 13:06:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817983 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ACBC51390 for ; Mon, 18 Feb 2019 13:10:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 92A942AAAC for ; Mon, 18 Feb 2019 13:10:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 868652AA80; Mon, 18 Feb 2019 13:10:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0711C2AA5C for ; Mon, 18 Feb 2019 13:10:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730931AbfBRNKG (ORCPT ); Mon, 18 Feb 2019 08:10:06 -0500 Received: from foss.arm.com ([217.140.101.70]:58006 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729621AbfBRNKG (ORCPT ); Mon, 18 Feb 
2019 08:10:06 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1F8F71993; Mon, 18 Feb 2019 05:10:06 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9979A3F720; Mon, 18 Feb 2019 05:09:59 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 6/9] disk/aio: Fix AIO thread Date: Mon, 18 Feb 2019 13:06:59 +0000 Message-Id: <20190218130702.32575-7-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently when the kernel completes a batch of AIO requests and signals it via eventfd, we retrieve at most AIO_MAX events (256), and ignore the rest. Call io_getevents() again in case more events are pending. 
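The difference between reaping once and reaping until the queue is empty can be sketched without libaio. This is only an illustration: fake_queue, fake_getevents(), reap_once() and reap_all() are hypothetical names, and fake_getevents() merely stands in for io_getevents() called with a zero timeout.

```c
#include <assert.h>

#define BATCH_MAX 256	/* mirrors AIO_MAX in disk/aio.c */

/* Hypothetical stand-in for the kernel completion queue. */
struct fake_queue {
	int pending;	/* completions not yet retrieved */
};

/*
 * Hypothetical stand-in for io_getevents() with a zero timeout:
 * hands out at most max completions and returns how many it gave.
 */
static int fake_getevents(struct fake_queue *q, int max)
{
	int nr;

	assert(max > 0);
	nr = q->pending < max ? q->pending : max;
	q->pending -= nr;
	return nr;
}

/* One call per wakeup: anything past BATCH_MAX is left behind. */
static int reap_once(struct fake_queue *q)
{
	return fake_getevents(q, BATCH_MAX);
}

/* Keep calling until the queue reports nothing more. */
static int reap_all(struct fake_queue *q)
{
	int nr, total = 0;

	do {
		nr = fake_getevents(q, BATCH_MAX);
		total += nr;
	} while (nr > 0);

	return total;
}
```

With 600 completions pending, reap_once() retrieves 256 and leaves 344 in the queue, whereas reap_all() retrieves all 600.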
Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- disk/aio.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 007415c69..1fcf36857 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -66,20 +66,31 @@ ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, offset, disk->evt, param); } -static void *disk_aio_thread(void *param) +static int disk_aio_get_events(struct disk_image *disk) { - struct disk_image *disk = param; struct io_event event[AIO_MAX]; struct timespec notime = {0}; int nr, i; + + do { + nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); + for (i = 0; i < nr; i++) + disk->disk_req_cb(event[i].data, event[i].res); + } while (nr > 0); + + return 0; +} + +static void *disk_aio_thread(void *param) +{ + struct disk_image *disk = param; u64 dummy; kvm__set_thread_name("disk-image-io"); while (read(disk->evt, &dummy, sizeof(dummy)) > 0) { - nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); - for (i = 0; i < nr; i++) - disk->disk_req_cb(event[i].data, event[i].res); + if (disk_aio_get_events(disk)) + break; } return NULL; From patchwork Mon Feb 18 13:07:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817979 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0C5BF17E9 for ; Mon, 18 Feb 2019 13:10:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EAA8E2AA5B for ; Mon, 18 Feb 2019 13:10:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DC0CD2AAD7; Mon, 18 Feb 2019 13:10:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: 
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D1B0F2AA6B for ; Mon, 18 Feb 2019 13:10:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730945AbfBRNKC (ORCPT ); Mon, 18 Feb 2019 08:10:02 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:57998 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730928AbfBRNKB (ORCPT ); Mon, 18 Feb 2019 08:10:01 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6829D16A3; Mon, 18 Feb 2019 05:10:01 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9F0CB3F73F; Mon, 18 Feb 2019 05:10:00 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 7/9] disk/aio: Cancel AIO thread on cleanup Date: Mon, 18 Feb 2019 13:07:00 +0000 Message-Id: <20190218130702.32575-8-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If the AIO thread is still calling io_getevents() while the exit path calls io_destroy(), it will segfault. Wait for the thread to finish before destroying the context. 
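The ordering matters more than the specific calls: stop the thread, wait for it, and only then release what it was using. A rough standalone sketch of that ordering follows. It assumes a pipe in place of the eventfd, and struct worker, worker_start() and worker_stop() are hypothetical names, not kvmtool code.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker state; the patch keeps the equivalent in struct disk_image. */
struct worker {
	pthread_t thread;
	int fds[2];	/* pipe standing in for the eventfd */
};

static void *worker_loop(void *param)
{
	struct worker *w = param;
	char byte;

	/* read() is a cancellation point, so pthread_cancel() can interrupt it. */
	while (read(w->fds[0], &byte, 1) > 0)
		;	/* a real worker would reap completions here */

	return NULL;
}

static int worker_start(struct worker *w)
{
	int r;

	if (pipe(w->fds) < 0)
		return -1;

	/* pthread_create() returns the error number directly. */
	r = pthread_create(&w->thread, NULL, worker_loop, w);
	if (r) {
		close(w->fds[0]);
		close(w->fds[1]);
		return -r;
	}
	return 0;
}

/* Returns 1 if the thread ended through cancellation, 0 otherwise. */
static int worker_stop(struct worker *w)
{
	void *ret = NULL;
	int r;

	pthread_cancel(w->thread);
	r = pthread_join(w->thread, &ret);
	assert(r == 0);

	/* Only now is it safe to release what the thread was using. */
	close(w->fds[0]);
	close(w->fds[1]);

	return ret == PTHREAD_CANCELED;
}
```

Calling worker_start() followed by worker_stop() should report a cancelled thread, since nothing ever writes to the pipe and the loop has no other way to exit.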
Signed-off-by: Jean-Philippe Brucker Reviewed-by: Andre Przywara --- disk/aio.c | 5 +++-- include/kvm/disk-image.h | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 1fcf36857..277ddf7c9 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -99,7 +99,6 @@ static void *disk_aio_thread(void *param) int disk_aio_setup(struct disk_image *disk) { int r; - pthread_t thread; /* No need to setup AIO if the disk ops won't make use of it */ if (!disk->ops->async) @@ -110,7 +109,7 @@ int disk_aio_setup(struct disk_image *disk) return -errno; io_setup(AIO_MAX, &disk->ctx); - r = pthread_create(&thread, NULL, disk_aio_thread, disk); + r = pthread_create(&disk->thread, NULL, disk_aio_thread, disk); if (r) { r = -errno; close(disk->evt); @@ -126,6 +125,8 @@ void disk_aio_destroy(struct disk_image *disk) if (!disk->async) return; + pthread_cancel(disk->thread); + pthread_join(disk->thread, NULL); close(disk->evt); io_destroy(disk->ctx); } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index adc9fe465..2275e2343 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -69,6 +69,7 @@ struct disk_image { #ifdef CONFIG_HAS_AIO io_context_t ctx; int evt; + pthread_t thread; #endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; From patchwork Mon Feb 18 13:07:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817985 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69BA71390 for ; Mon, 18 Feb 2019 13:10:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 584652AA9B for ; Mon, 18 Feb 2019 13:10:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4C4622AA80; Mon, 
18 Feb 2019 13:10:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6D2DB2AA5B for ; Mon, 18 Feb 2019 13:10:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730947AbfBRNKL (ORCPT ); Mon, 18 Feb 2019 08:10:11 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:58018 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730652AbfBRNKL (ORCPT ); Mon, 18 Feb 2019 08:10:11 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6B3911650; Mon, 18 Feb 2019 05:10:02 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A46873F73F; Mon, 18 Feb 2019 05:10:01 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 8/9] disk/aio: Add wait() disk operation Date: Mon, 18 Feb 2019 13:07:01 +0000 Message-Id: <20190218130702.32575-9-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a call into the disk layer to synchronize the AIO queue. Wait for all pending requests to complete. This will be necessary when resetting a virtqueue. The wait() operation isn't the same as flush(). 
A VIRTIO_BLK_T_FLUSH request ensures that any write request *that completed before the FLUSH is sent* is committed to permanent storage (e.g. written back from a write cache). But it doesn't do anything for requests that are still pending when the FLUSH is sent. Avoid introducing a mutex on the io_submit() and io_getevents() paths, because it can lead to 30% throughput drop on heavy FIO jobs. Instead manage an inflight counter using compare-and-swap operations, which is simple enough as the caller doesn't submit new requests while it waits for the AIO queue to drain. The __sync_fetch_and_* operations are a bit rough since they use full barriers, but that didn't seem to introduce a performance regression. Signed-off-by: Jean-Philippe Brucker --- disk/aio.c | 82 ++++++++++++++++++++++++---------------- disk/blk.c | 1 + disk/core.c | 8 ++++ disk/raw.c | 2 + include/kvm/disk-image.h | 9 +++++ 5 files changed, 70 insertions(+), 32 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 277ddf7c9..a7418c8c2 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -2,45 +2,31 @@ #include #include +#include "kvm/brlock.h" #include "kvm/disk-image.h" #include "kvm/kvm.h" #include "linux/list.h" #define AIO_MAX 256 -static int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, - const struct iovec *iov, int iovcnt, off_t offset, - int ev, void *param) +static int aio_submit(struct disk_image *disk, int nr, struct iocb **ios) { - struct iocb *ios[1] = { iocb }; int ret; - io_prep_pwritev(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - + __sync_fetch_and_add(&disk->aio_inflight, nr); + /* + * A wmb() is needed here, to ensure disk_aio_thread() sees this + * increase after receiving the events. It is included in the + * __sync_fetch_and_add (as a full barrier). 
+ */ restart: - ret = io_submit(ctx, 1, ios); + ret = io_submit(disk->ctx, nr, ios); if (ret == -EAGAIN) goto restart; - return ret; -} - -static int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, - const struct iovec *iov, int iovcnt, off_t offset, - int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; + else if (ret <= 0) + /* disk_aio_thread() is never going to see those */ + __sync_fetch_and_sub(&disk->aio_inflight, nr); - io_prep_preadv(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; return ret; } @@ -48,22 +34,49 @@ ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; struct iocb iocb; + u64 offset = sector << SECTOR_SHIFT; + struct iocb *ios[1] = { &iocb }; - return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, - offset, disk->evt, param); + io_prep_preadv(&iocb, disk->fd, iov, iovcount, offset); + io_set_eventfd(&iocb, disk->evt); + iocb.data = param; + + return aio_submit(disk, 1, ios); } ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; struct iocb iocb; + u64 offset = sector << SECTOR_SHIFT; + struct iocb *ios[1] = { &iocb }; + + io_prep_pwritev(&iocb, disk->fd, iov, iovcount, offset); + io_set_eventfd(&iocb, disk->evt); + iocb.data = param; + + return aio_submit(disk, 1, ios); +} - return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, - offset, disk->evt, param); +/* + * When this function returns there are no in-flight I/O. Caller ensures that + * io_submit() isn't called concurrently. + * + * Returns an inaccurate number of I/O that was in-flight when the function was + * called. 
+ */ +int raw_image__wait(struct disk_image *disk) +{ + u64 inflight = disk->aio_inflight; + + while (disk->aio_inflight) { + usleep(100); + barrier(); + } + + return inflight; } static int disk_aio_get_events(struct disk_image *disk) @@ -76,6 +89,11 @@ static int disk_aio_get_events(struct disk_image *disk) nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); for (i = 0; i < nr; i++) disk->disk_req_cb(event[i].data, event[i].res); + + /* Pairs with wmb() in aio_submit() */ + rmb(); + __sync_fetch_and_sub(&disk->aio_inflight, nr); + } while (nr > 0); return 0; diff --git a/disk/blk.c b/disk/blk.c index 48922e028..b4c9fba3b 100644 --- a/disk/blk.c +++ b/disk/blk.c @@ -9,6 +9,7 @@ static struct disk_image_operations blk_dev_ops = { .read = raw_image__read, .write = raw_image__write, + .wait = raw_image__wait, .async = true, }; diff --git a/disk/core.c b/disk/core.c index 89880703e..8d95c98e2 100644 --- a/disk/core.c +++ b/disk/core.c @@ -201,6 +201,14 @@ error: return err; } +int disk_image__wait(struct disk_image *disk) +{ + if (disk->ops->wait) + return disk->ops->wait(disk); + + return 0; +} + int disk_image__flush(struct disk_image *disk) { if (disk->ops->flush) diff --git a/disk/raw.c b/disk/raw.c index e869d6cc2..54b4e7408 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -67,6 +67,7 @@ int raw_image__close(struct disk_image *disk) static struct disk_image_operations raw_image_regular_ops = { .read = raw_image__read, .write = raw_image__write, + .wait = raw_image__wait, .async = true, }; @@ -78,6 +79,7 @@ struct disk_image_operations ro_ops = { struct disk_image_operations ro_ops_nowrite = { .read = raw_image__read, + .wait = raw_image__wait, .async = true, }; diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 2275e2343..27d4f7da5 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -41,6 +41,7 @@ struct disk_image_operations { ssize_t (*write)(struct disk_image *disk, u64 sector, const struct iovec *iov, int 
iovcount, void *param); int (*flush)(struct disk_image *disk); + int (*wait)(struct disk_image *disk); int (*close)(struct disk_image *disk); bool async; }; @@ -70,6 +71,7 @@ struct disk_image { io_context_t ctx; int evt; pthread_t thread; + u64 aio_inflight; #endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; @@ -81,6 +83,7 @@ int disk_image__init(struct kvm *kvm); int disk_image__exit(struct kvm *kvm); struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int mmap); int disk_image__flush(struct disk_image *disk); +int disk_image__wait(struct disk_image *disk); ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, @@ -108,6 +111,7 @@ ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); +int raw_image__wait(struct disk_image *disk); #define raw_image__read raw_image__read_async #define raw_image__write raw_image__write_async @@ -121,6 +125,11 @@ static inline int disk_aio_setup(struct disk_image *disk) static inline void disk_aio_destroy(struct disk_image *disk) { } + +static inline int raw_image__wait(struct disk_image *disk) +{ + return 0; +} #define raw_image__read raw_image__read_sync #define raw_image__write raw_image__write_sync #endif /* CONFIG_HAS_AIO */ From patchwork Mon Feb 18 13:07:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10817987 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8DADF17E9 for ; Mon, 18 Feb 2019 13:10:15 +0000 (UTC) 
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7DBE62AA6F for ; Mon, 18 Feb 2019 13:10:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7231C2AA9B; Mon, 18 Feb 2019 13:10:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 03CE12AAD7 for ; Mon, 18 Feb 2019 13:10:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729977AbfBRNKO (ORCPT ); Mon, 18 Feb 2019 08:10:14 -0500 Received: from foss.arm.com ([217.140.101.70]:58022 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730938AbfBRNKO (ORCPT ); Mon, 18 Feb 2019 08:10:14 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 798FD1713; Mon, 18 Feb 2019 05:10:03 -0800 (PST) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.197.2]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A942B3F73F; Mon, 18 Feb 2019 05:10:02 -0800 (PST) From: Jean-Philippe Brucker To: kvm@vger.kernel.org Cc: will.deacon@arm.com, andre.przywara@arm.com Subject: [PATCH kvmtool 9/9] virtio/blk: sync I/O on reset Date: Mon, 18 Feb 2019 13:07:02 +0000 Message-Id: <20190218130702.32575-10-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190218130702.32575-1-jean-philippe.brucker@arm.com> References: <20190218130702.32575-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Ensure that all 
requests are complete when resetting a virtqueue, by draining the AIO queue after stopping the submission thread. Signed-off-by: Jean-Philippe Brucker --- virtio/blk.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/virtio/blk.c b/virtio/blk.c index 6e7a1ee36..50db6f5fc 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -248,6 +248,8 @@ static void exit_vq(struct kvm *kvm, void *dev, u32 vq) close(bdev->io_efd); pthread_cancel(bdev->io_thread); pthread_join(bdev->io_thread, NULL); + + disk_image__wait(bdev->disk); } static int notify_vq(struct kvm *kvm, void *dev, u32 vq)
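The drain in exit_vq() relies on the in-flight counter added in patch 8/9. A reduced sketch of that scheme is given here for illustration only: struct tracker, tracker_submit(), tracker_complete(), tracker_wait() and completer() are hypothetical names, and the wait loop reads the counter with an atomic __sync_fetch_and_add(..., 0) where the patch uses a plain read followed by barrier().

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical miniature of the counter; the patch stores it in struct disk_image. */
struct tracker {
	unsigned long inflight;
};

/* Submission side: account for the requests before handing them over. */
static void tracker_submit(struct tracker *t, unsigned long nr)
{
	assert(nr > 0);
	__sync_fetch_and_add(&t->inflight, nr);	/* full barrier */
}

/* Completion side: called after the callbacks for nr requests have run. */
static void tracker_complete(struct tracker *t, unsigned long nr)
{
	__sync_fetch_and_sub(&t->inflight, nr);	/* full barrier */
}

/*
 * Poll until everything submitted so far has completed. The caller must
 * not submit while waiting. Returns the count observed on entry, which
 * may already be stale.
 */
static unsigned long tracker_wait(struct tracker *t)
{
	unsigned long seen = __sync_fetch_and_add(&t->inflight, 0);

	while (__sync_fetch_and_add(&t->inflight, 0))
		usleep(100);

	return seen;
}

/* Stand-in completion thread: retires three requests, one at a time. */
static void *completer(void *param)
{
	struct tracker *t = param;
	int i;

	for (i = 0; i < 3; i++) {
		usleep(1000);
		tracker_complete(t, 1);
	}
	return NULL;
}
```

If three requests are submitted and completer() runs in a second thread, tracker_wait() returns once the counter has dropped to zero. Polling keeps the submit and completion paths free of locks at the cost of up to 100 microseconds of extra latency per poll on the rarely used reset path.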