From patchwork Thu Apr 4 13:20:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885605 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 76B2117EE for ; Thu, 4 Apr 2019 13:22:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6070E286C0 for ; Thu, 4 Apr 2019 13:22:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 54852286D6; Thu, 4 Apr 2019 13:22:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05597286C0 for ; Thu, 4 Apr 2019 13:22:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728908AbfDDNWQ (ORCPT ); Thu, 4 Apr 2019 09:22:16 -0400 Received: from foss.arm.com ([217.140.101.70]:60200 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727053AbfDDNWP (ORCPT ); Thu, 4 Apr 2019 09:22:15 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 34FA716A3; Thu, 4 Apr 2019 06:22:15 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 637BE3F68F; Thu, 4 Apr 2019 06:22:14 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 1/9] qcow: Fix qcow1 exit fault Date: Thu, 4 Apr 2019 
14:20:42 +0100 Message-Id: <20190404132050.37309-2-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Even though qcow1 doesn't use the refcount table, the cleanup path still attempts to iterate over its LRU list. Initialize the list to avoid a segfault on exit. Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- disk/qcow.c | 1 + 1 file changed, 1 insertion(+) diff --git a/disk/qcow.c b/disk/qcow.c index 64cf9270a..bed70c65c 100644 --- a/disk/qcow.c +++ b/disk/qcow.c @@ -1437,6 +1437,7 @@ static struct disk_image *qcow1_probe(int fd, bool readonly) l1t->root = (struct rb_root)RB_ROOT; INIT_LIST_HEAD(&l1t->lru_list); + INIT_LIST_HEAD(&q->refcount_table.lru_list); h = q->header = qcow1_read_header(fd); if (!h) From patchwork Thu Apr 4 13:20:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885607 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD2051575 for ; Thu, 4 Apr 2019 13:22:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96A5A286C0 for ; Thu, 4 Apr 2019 13:22:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 88B15286D6; Thu, 4 Apr 2019 13:22:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: 
from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 284EA286C0 for ; Thu, 4 Apr 2019 13:22:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729144AbfDDNWR (ORCPT ); Thu, 4 Apr 2019 09:22:17 -0400 Received: from foss.arm.com ([217.140.101.70]:60206 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727053AbfDDNWQ (ORCPT ); Thu, 4 Apr 2019 09:22:16 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3A1121713; Thu, 4 Apr 2019 06:22:16 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 716CC3F68F; Thu, 4 Apr 2019 06:22:15 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 2/9] virtio/blk: Set VIRTIO_BLK_F_RO when the disk is read-only Date: Thu, 4 Apr 2019 14:20:43 +0100 Message-Id: <20190404132050.37309-3-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Since we don't currently tell the guest when the disk backend is read-only, it will report any inconsistent read after write as an error. An image may be read-only either because the user requested it on the command line, or because write support isn't implemented. Pass the read-only attribute using the VIRTIO_BLK_F_RO feature. 
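As a rough illustration of the feature advertisement described above, the sketch below builds a host feature mask and sets the read-only bit only when the backend is read-only. The bit positions are restated locally so the sketch is self-contained (VIRTIO_BLK_F_RO is bit 5 in the virtio block device feature list). `struct fake_disk` and `host_features()` are hypothetical names for this example, not kvmtool code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit positions, restated here so the sketch is self-contained. */
#define VIRTIO_BLK_F_SEG_MAX	2
#define VIRTIO_BLK_F_RO		5
#define VIRTIO_BLK_F_FLUSH	9

/* Hypothetical stand-in for the disk backend state. */
struct fake_disk {
	bool readonly;
};

/* Build the feature mask offered to the guest. */
uint32_t host_features(const struct fake_disk *disk)
{
	uint32_t features;

	assert(disk != NULL);

	features = UINT32_C(1) << VIRTIO_BLK_F_SEG_MAX
		 | UINT32_C(1) << VIRTIO_BLK_F_FLUSH;

	/* Only advertise the read-only bit for read-only backends. */
	if (disk->readonly)
		features |= UINT32_C(1) << VIRTIO_BLK_F_RO;

	return features;
}
```

With these assumed bit positions, a writable disk yields 0x204 and a read-only disk yields 0x224. A guest driver that sees the extra bit can treat the device as read-only up front, for example by refusing a read-write mount, instead of discovering it through failed writes.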
Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- disk/core.c | 9 +++++++-- include/kvm/disk-image.h | 1 + virtio/blk.c | 5 ++++- 3 files changed, 12 insertions(+), 3 deletions(-) diff --git a/disk/core.c b/disk/core.c index dd2f258b0..4c7c4f030 100644 --- a/disk/core.c +++ b/disk/core.c @@ -139,8 +139,10 @@ static struct disk_image *disk_image__open(const char *filename, bool readonly, /* blk device ?*/ disk = blkdev__probe(filename, flags, &st); - if (!IS_ERR_OR_NULL(disk)) + if (!IS_ERR_OR_NULL(disk)) { + disk->readonly = readonly; return disk; + } fd = open(filename, flags); if (fd < 0) @@ -150,13 +152,16 @@ static struct disk_image *disk_image__open(const char *filename, bool readonly, disk = qcow_probe(fd, true); if (!IS_ERR_OR_NULL(disk)) { pr_warning("Forcing read-only support for QCOW"); + disk->readonly = true; return disk; } /* raw image ?*/ disk = raw_image__probe(fd, &st, readonly); - if (!IS_ERR_OR_NULL(disk)) + if (!IS_ERR_OR_NULL(disk)) { + disk->readonly = readonly; return disk; + } if (close(fd) < 0) pr_warning("close() failed"); diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index b72805242..4746e88c9 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -59,6 +59,7 @@ struct disk_image { void *priv; void *disk_req_cb_param; void (*disk_req_cb)(void *param, long len); + bool readonly; bool async; int evt; #ifdef CONFIG_HAS_AIO diff --git a/virtio/blk.c b/virtio/blk.c index a57df2e96..6e7a1ee36 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -148,10 +148,13 @@ static u8 *get_config(struct kvm *kvm, void *dev) static u32 get_host_features(struct kvm *kvm, void *dev) { + struct blk_dev *bdev = dev; + return 1UL << VIRTIO_BLK_F_SEG_MAX | 1UL << VIRTIO_BLK_F_FLUSH | 1UL << VIRTIO_RING_F_EVENT_IDX - | 1UL << VIRTIO_RING_F_INDIRECT_DESC; + | 1UL << VIRTIO_RING_F_INDIRECT_DESC + | (bdev->disk->readonly ? 
1UL << VIRTIO_BLK_F_RO : 0); } static void set_guest_features(struct kvm *kvm, void *dev, u32 features) From patchwork Thu Apr 4 13:20:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885621 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A67471575 for ; Thu, 4 Apr 2019 13:22:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 908B1286C0 for ; Thu, 4 Apr 2019 13:22:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 85228286DD; Thu, 4 Apr 2019 13:22:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4288C286C0 for ; Thu, 4 Apr 2019 13:22:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729445AbfDDNWT (ORCPT ); Thu, 4 Apr 2019 09:22:19 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60212 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729084AbfDDNWR (ORCPT ); Thu, 4 Apr 2019 09:22:17 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 49CDBA78; Thu, 4 Apr 2019 06:22:17 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7673F3F68F; Thu, 4 Apr 2019 06:22:16 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: 
Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 3/9] guest: sync disk before shutting down Date: Thu, 4 Apr 2019 14:20:44 +0100 Message-Id: <20190404132050.37309-4-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP sync() should be called before reboot(RB_AUTOBOOT), otherwise data written to disks might be lost. Signed-off-by: Jean-Philippe Brucker --- guest/init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/guest/init.c b/guest/init.c index 1f9cd048a..52f6567da 100644 --- a/guest/init.c +++ b/guest/init.c @@ -72,6 +72,7 @@ int main(int argc, char *argv[]) } while (corpse != child); } + sync(); reboot(RB_AUTOBOOT); printf("Init failed: %s\n", strerror(errno)); From patchwork Thu Apr 4 13:20:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885609 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BE47B1708 for ; Thu, 4 Apr 2019 13:22:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A506A286C0 for ; Thu, 4 Apr 2019 13:22:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 99054286DD; Thu, 4 Apr 2019 13:22:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org 
[209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AB45C286C0 for ; Thu, 4 Apr 2019 13:22:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729511AbfDDNWT (ORCPT ); Thu, 4 Apr 2019 09:22:19 -0400 Received: from foss.arm.com ([217.140.101.70]:60218 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729275AbfDDNWS (ORCPT ); Thu, 4 Apr 2019 09:22:18 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4E5131715; Thu, 4 Apr 2019 06:22:18 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 855F93F68F; Thu, 4 Apr 2019 06:22:17 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 4/9] disk/aio: Refactor AIO code Date: Thu, 4 Apr 2019 14:20:45 +0100 Message-Id: <20190404132050.37309-5-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Move all AIO code to a separate file, disk/aio.c, to remove as many #ifdefs as possible. Split the raw read/write disk ops into async and sync, and choose which ones to use depending on CONFIG_HAS_AIO. Note that we fix raw_image__close(), which incorrectly checked CONFIG_HAS_VIRTIO instead of CONFIG_HAS_AIO, and closed an uninitialized disk->evt. A subsequent commit will complete this refactoring by fixing use of the 'async' disk attribute. 
Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- Makefile | 2 + disk/aio.c | 111 +++++++++++++++++++++++++++++++++++++++ disk/core.c | 52 +++++------------- disk/raw.c | 39 +++----------- include/kvm/disk-image.h | 41 ++++++++++++--- include/kvm/read-write.h | 11 ---- util/read-write.c | 36 ------------- 7 files changed, 167 insertions(+), 125 deletions(-) create mode 100644 disk/aio.c diff --git a/Makefile b/Makefile index ec75cd999..c0881252a 100644 --- a/Makefile +++ b/Makefile @@ -275,10 +275,12 @@ endif ifeq ($(call try-build,$(SOURCE_AIO),$(CFLAGS),$(LDFLAGS) -laio),y) CFLAGS_DYNOPT += -DCONFIG_HAS_AIO LIBS_DYNOPT += -laio + OBJS_DYNOPT += disk/aio.o else ifeq ($(call try-build,$(SOURCE_AIO),$(CFLAGS),$(LDFLAGS) -laio -static),y) CFLAGS_STATOPT += -DCONFIG_HAS_AIO LIBS_STATOPT += -laio + OBJS_STATOPT += disk/aio.o else NOTFOUND += aio endif diff --git a/disk/aio.c b/disk/aio.c new file mode 100644 index 000000000..6afcffe5a --- /dev/null +++ b/disk/aio.c @@ -0,0 +1,111 @@ +#include +#include +#include + +#include "kvm/disk-image.h" +#include "kvm/kvm.h" +#include "linux/list.h" + +#define AIO_MAX 256 + +static int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, + const struct iovec *iov, int iovcnt, off_t offset, + int ev, void *param) +{ + struct iocb *ios[1] = { iocb }; + int ret; + + io_prep_pwritev(iocb, fd, iov, iovcnt, offset); + io_set_eventfd(iocb, ev); + iocb->data = param; + +restart: + ret = io_submit(ctx, 1, ios); + if (ret == -EAGAIN) + goto restart; + return ret; +} + +static int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, + const struct iovec *iov, int iovcnt, off_t offset, + int ev, void *param) +{ + struct iocb *ios[1] = { iocb }; + int ret; + + io_prep_preadv(iocb, fd, iov, iovcnt, offset); + io_set_eventfd(iocb, ev); + iocb->data = param; + +restart: + ret = io_submit(ctx, 1, ios); + if (ret == -EAGAIN) + goto restart; + return ret; +} + +ssize_t raw_image__read_async(struct disk_image *disk, u64 
sector, + const struct iovec *iov, int iovcount, + void *param) +{ + u64 offset = sector << SECTOR_SHIFT; + struct iocb iocb; + + return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, + offset, disk->evt, param); +} + +ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, + void *param) +{ + u64 offset = sector << SECTOR_SHIFT; + struct iocb iocb; + + return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, + offset, disk->evt, param); +} + +static void *disk_aio_thread(void *param) +{ + struct disk_image *disk = param; + struct io_event event[AIO_MAX]; + struct timespec notime = {0}; + int nr, i; + u64 dummy; + + kvm__set_thread_name("disk-image-io"); + + while (read(disk->evt, &dummy, sizeof(dummy)) > 0) { + nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); + for (i = 0; i < nr; i++) + disk->disk_req_cb(event[i].data, event[i].res); + } + + return NULL; +} + +int disk_aio_setup(struct disk_image *disk) +{ + int r; + pthread_t thread; + + disk->evt = eventfd(0, 0); + if (disk->evt < 0) + return -errno; + + io_setup(AIO_MAX, &disk->ctx); + r = pthread_create(&thread, NULL, disk_aio_thread, disk); + if (r) { + r = -errno; + close(disk->evt); + return r; + } + return 0; +} + +void disk_aio_destroy(struct disk_image *disk) +{ + close(disk->evt); + io_destroy(disk->ctx); +} diff --git a/disk/core.c b/disk/core.c index 4c7c4f030..89880703e 100644 --- a/disk/core.c +++ b/disk/core.c @@ -4,11 +4,8 @@ #include "kvm/kvm.h" #include -#include #include -#define AIO_MAX 256 - int debug_iodelay; static int disk_image__close(struct disk_image *disk); @@ -54,27 +51,6 @@ int disk_img_name_parser(const struct option *opt, const char *arg, int unset) return 0; } -#ifdef CONFIG_HAS_AIO -static void *disk_image__thread(void *param) -{ - struct disk_image *disk = param; - struct io_event event[AIO_MAX]; - struct timespec notime = {0}; - int nr, i; - u64 dummy; - - kvm__set_thread_name("disk-image-io"); - 
- while (read(disk->evt, &dummy, sizeof(dummy)) > 0) { - nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); - for (i = 0; i < nr; i++) - disk->disk_req_cb(event[i].data, event[i].res); - } - - return NULL; -} -#endif - struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int use_mmap) @@ -99,26 +75,22 @@ struct disk_image *disk_image__new(int fd, u64 size, disk->priv = mmap(NULL, size, PROT_RW, MAP_PRIVATE | MAP_NORESERVE, fd, 0); if (disk->priv == MAP_FAILED) { r = -errno; - free(disk); - return ERR_PTR(r); + goto err_free_disk; } } -#ifdef CONFIG_HAS_AIO - { - pthread_t thread; + r = disk_aio_setup(disk); + if (r) + goto err_unmap_disk; - disk->evt = eventfd(0, 0); - io_setup(AIO_MAX, &disk->ctx); - r = pthread_create(&thread, NULL, disk_image__thread, disk); - if (r) { - r = -errno; - free(disk); - return ERR_PTR(r); - } - } -#endif return disk; + +err_unmap_disk: + if (disk->priv) + munmap(disk->priv, size); +err_free_disk: + free(disk); + return ERR_PTR(r); } static struct disk_image *disk_image__open(const char *filename, bool readonly, bool direct) @@ -243,6 +215,8 @@ static int disk_image__close(struct disk_image *disk) if (!disk) return 0; + disk_aio_destroy(disk); + if (disk->ops->close) return disk->ops->close(disk); diff --git a/disk/raw.c b/disk/raw.c index 93b2b4e8d..09da7e081 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -2,38 +2,17 @@ #include -#ifdef CONFIG_HAS_AIO -#include -#endif - -ssize_t raw_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, +ssize_t raw_image__read_sync(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; - -#ifdef CONFIG_HAS_AIO - struct iocb iocb; - - return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, offset, - disk->evt, param); -#else - return preadv_in_full(disk->fd, iov, iovcount, offset); -#endif + return preadv_in_full(disk->fd, iov, iovcount, sector << 
SECTOR_SHIFT); } -ssize_t raw_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, - int iovcount, void *param) +ssize_t raw_image__write_sync(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, + void *param) { - u64 offset = sector << SECTOR_SHIFT; - -#ifdef CONFIG_HAS_AIO - struct iocb iocb; - - return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, offset, - disk->evt, param); -#else - return pwritev_in_full(disk->fd, iov, iovcount, offset); -#endif + return pwritev_in_full(disk->fd, iov, iovcount, sector << SECTOR_SHIFT); } ssize_t raw_image__read_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, @@ -79,12 +58,6 @@ int raw_image__close(struct disk_image *disk) if (disk->priv != MAP_FAILED) ret = munmap(disk->priv, disk->size); - close(disk->evt); - -#ifdef CONFIG_HAS_VIRTIO - io_destroy(disk->ctx); -#endif - return ret; } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 4746e88c9..953beb2d5 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -19,6 +19,10 @@ #include #include +#ifdef CONFIG_HAS_AIO +#include +#endif + #define SECTOR_SHIFT 9 #define SECTOR_SIZE (1UL << SECTOR_SHIFT) @@ -61,10 +65,10 @@ struct disk_image { void (*disk_req_cb)(void *param, long len); bool readonly; bool async; - int evt; #ifdef CONFIG_HAS_AIO io_context_t ctx; -#endif + int evt; +#endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; int debug_iodelay; @@ -84,14 +88,39 @@ ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *l struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly); struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *st); -ssize_t raw_image__read(struct disk_image *disk, u64 sector, - const struct iovec *iov, int iovcount, void *param); -ssize_t raw_image__write(struct disk_image *disk, u64 sector, - const struct iovec *iov, int iovcount, void *param); +ssize_t 
raw_image__read_sync(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); +ssize_t raw_image__write_sync(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__read_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__write_mmap(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); int raw_image__close(struct disk_image *disk); void disk_image__set_callback(struct disk_image *disk, void (*disk_req_cb)(void *param, long len)); + +#ifdef CONFIG_HAS_AIO +int disk_aio_setup(struct disk_image *disk); +void disk_aio_destroy(struct disk_image *disk); +ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); +ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, + const struct iovec *iov, int iovcount, void *param); + +#define raw_image__read raw_image__read_async +#define raw_image__write raw_image__write_async + +#else /* !CONFIG_HAS_AIO */ +static inline int disk_aio_setup(struct disk_image *disk) +{ + /* No-op */ + return 0; +} +static inline void disk_aio_destroy(struct disk_image *disk) +{ +} +#define raw_image__read raw_image__read_sync +#define raw_image__write raw_image__write_sync +#endif /* CONFIG_HAS_AIO */ + #endif /* KVM__DISK_IMAGE_H */ diff --git a/include/kvm/read-write.h b/include/kvm/read-write.h index acbd6f0b1..8375d7c7d 100644 --- a/include/kvm/read-write.h +++ b/include/kvm/read-write.h @@ -5,10 +5,6 @@ #include #include -#ifdef CONFIG_HAS_AIO -#include -#endif - ssize_t xread(int fd, void *buf, size_t count); ssize_t xwrite(int fd, const void *buf, size_t count); @@ -35,11 +31,4 @@ ssize_t xpwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset); ssize_t preadv_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offset); ssize_t pwritev_in_full(int fd, const struct 
iovec *iov, int iovcnt, off_t offset); -#ifdef CONFIG_HAS_AIO -int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param); -int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param); -#endif - #endif /* KVM_READ_WRITE_H */ diff --git a/util/read-write.c b/util/read-write.c index bf6fb2fc2..06fc0dfff 100644 --- a/util/read-write.c +++ b/util/read-write.c @@ -337,39 +337,3 @@ ssize_t pwritev_in_full(int fd, const struct iovec *iov, int iovcnt, off_t offse return total; } - -#ifdef CONFIG_HAS_AIO -int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; - - io_prep_pwritev(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; - return ret; -} - -int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, const struct iovec *iov, int iovcnt, - off_t offset, int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; - - io_prep_preadv(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; - return ret; -} -#endif From patchwork Thu Apr 4 13:20:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885611 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1B0AE17EE for ; Thu, 4 Apr 2019 13:22:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 059F7286C0 for ; Thu, 
4 Apr 2019 13:22:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EE0FA286D6; Thu, 4 Apr 2019 13:22:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 75B2A286C6 for ; Thu, 4 Apr 2019 13:22:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729743AbfDDNWU (ORCPT ); Thu, 4 Apr 2019 09:22:20 -0400 Received: from foss.arm.com ([217.140.101.70]:60222 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729204AbfDDNWT (ORCPT ); Thu, 4 Apr 2019 09:22:19 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5396616A3; Thu, 4 Apr 2019 06:22:19 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 8AB1D3F68F; Thu, 4 Apr 2019 06:22:18 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 5/9] disk/aio: Fix use of disk->async Date: Thu, 4 Apr 2019 14:20:46 +0100 Message-Id: <20190404132050.37309-6-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add an 'async' attribute to disk_image_operations, that describes if they can submit async I/O or not. 
disk_image->async is now set iff CONFIG_HAS_AIO and the ops do use AIO. This fixes qcow1, which used to set async = 1 even though the qcow operations don't use AIO. The disk core would perform the read/write operation without pushing the completion onto the virtio queue, and the guest would be stuck waiting. Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- disk/aio.c | 9 +++++++++ disk/blk.c | 9 ++------- disk/qcow.c | 2 -- disk/raw.c | 15 +++------------ include/kvm/disk-image.h | 1 + 5 files changed, 15 insertions(+), 21 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 6afcffe5a..007415c69 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -90,6 +90,10 @@ int disk_aio_setup(struct disk_image *disk) int r; pthread_t thread; + /* No need to setup AIO if the disk ops won't make use of it */ + if (!disk->ops->async) + return 0; + disk->evt = eventfd(0, 0); if (disk->evt < 0) return -errno; @@ -101,11 +105,16 @@ int disk_aio_setup(struct disk_image *disk) close(disk->evt); return r; } + + disk->async = true; return 0; } void disk_aio_destroy(struct disk_image *disk) { + if (!disk->async) + return; + close(disk->evt); io_destroy(disk->ctx); } diff --git a/disk/blk.c b/disk/blk.c index 37581d331..48922e028 100644 --- a/disk/blk.c +++ b/disk/blk.c @@ -9,6 +9,7 @@ static struct disk_image_operations blk_dev_ops = { .read = raw_image__read, .write = raw_image__write, + .async = true, }; static bool is_mounted(struct stat *st) @@ -35,7 +36,6 @@ static bool is_mounted(struct stat *st) struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *st) { - struct disk_image *disk; int fd, r; u64 size; @@ -67,10 +67,5 @@ struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *s * mmap large disk. There is not enough virtual address space * in 32-bit host. However, this works on 64-bit host. 
*/ - disk = disk_image__new(fd, size, &blk_dev_ops, DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif - return disk; + return disk_image__new(fd, size, &blk_dev_ops, DISK_IMAGE_REGULAR); } diff --git a/disk/qcow.c b/disk/qcow.c index bed70c65c..dd6be62ee 100644 --- a/disk/qcow.c +++ b/disk/qcow.c @@ -1337,7 +1337,6 @@ static struct disk_image *qcow2_probe(int fd, bool readonly) if (IS_ERR_OR_NULL(disk_image)) goto free_refcount_table; - disk_image->async = 0; disk_image->priv = q; return disk_image; @@ -1474,7 +1473,6 @@ static struct disk_image *qcow1_probe(int fd, bool readonly) if (!disk_image) goto free_l1_table; - disk_image->async = 1; disk_image->priv = q; return disk_image; diff --git a/disk/raw.c b/disk/raw.c index 09da7e081..e869d6cc2 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -67,6 +67,7 @@ int raw_image__close(struct disk_image *disk) static struct disk_image_operations raw_image_regular_ops = { .read = raw_image__read, .write = raw_image__write, + .async = true, }; struct disk_image_operations ro_ops = { @@ -77,12 +78,11 @@ struct disk_image_operations ro_ops = { struct disk_image_operations ro_ops_nowrite = { .read = raw_image__read, + .async = true, }; struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) { - struct disk_image *disk; - if (readonly) { /* * Use mmap's MAP_PRIVATE to implement non-persistent write @@ -93,10 +93,6 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) disk = disk_image__new(fd, st->st_size, &ro_ops, DISK_IMAGE_MMAP); if (IS_ERR_OR_NULL(disk)) { disk = disk_image__new(fd, st->st_size, &ro_ops_nowrite, DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif } return disk; @@ -104,11 +100,6 @@ struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly) /* * Use read/write instead of mmap */ - disk = disk_image__new(fd, st->st_size, &raw_image_regular_ops, 
DISK_IMAGE_REGULAR); -#ifdef CONFIG_HAS_AIO - if (!IS_ERR_OR_NULL(disk)) - disk->async = 1; -#endif - return disk; + return disk_image__new(fd, st->st_size, &raw_image_regular_ops, DISK_IMAGE_REGULAR); } } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 953beb2d5..adc9fe465 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -42,6 +42,7 @@ struct disk_image_operations { int iovcount, void *param); int (*flush)(struct disk_image *disk); int (*close)(struct disk_image *disk); + bool async; }; struct disk_image_params { From patchwork Thu Apr 4 13:20:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885619 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 879FE1575 for ; Thu, 4 Apr 2019 13:22:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7187D286C0 for ; Thu, 4 Apr 2019 13:22:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 661DA286DD; Thu, 4 Apr 2019 13:22:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09482286C0 for ; Thu, 4 Apr 2019 13:22:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729320AbfDDNWW (ORCPT ); Thu, 4 Apr 2019 09:22:22 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60226 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729556AbfDDNWU (ORCPT ); 
Thu, 4 Apr 2019 09:22:20 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 59A451713; Thu, 4 Apr 2019 06:22:20 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 8FE7D3F68F; Thu, 4 Apr 2019 06:22:19 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 6/9] disk/aio: Fix AIO thread Date: Thu, 4 Apr 2019 14:20:47 +0100 Message-Id: <20190404132050.37309-7-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently when the kernel completes a batch of AIO requests and signals it via eventfd, we retrieve at most AIO_MAX events (256), and ignore the rest. Call io_getevents() again in case more events are pending. 
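For illustration only (this sketch is not part of the patch and does not use libaio): the difference between reaping once per notification and draining until empty can be modelled with a counter. struct fake_queue, fake_getevents(), reap_once() and reap_all() are hypothetical names, and BATCH_MAX is a small stand-in for AIO_MAX.

```c
#include <assert.h>

#define BATCH_MAX 4	/* hypothetical stand-in for AIO_MAX, kept small */

/* Hypothetical completion queue: only counts pending events. */
struct fake_queue {
	int pending;
};

/* Hypothetical stand-in for io_getevents(): returns at most 'max' events. */
static int fake_getevents(struct fake_queue *q, int max)
{
	int nr;

	assert(max > 0);
	nr = q->pending < max ? q->pending : max;
	q->pending -= nr;
	return nr;
}

/* One call per notification: anything beyond BATCH_MAX stays queued. */
static int reap_once(struct fake_queue *q)
{
	return fake_getevents(q, BATCH_MAX);
}

/* Drain loop: keep calling until a call returns no events. */
static int reap_all(struct fake_queue *q)
{
	int nr, total = 0;

	do {
		nr = fake_getevents(q, BATCH_MAX);
		total += nr;
	} while (nr > 0);

	return total;
}
```

With 10 pending events, reap_once() leaves 6 queued until the next notification, whereas reap_all() leaves none.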
Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- disk/aio.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 007415c69..1fcf36857 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -66,20 +66,31 @@ ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, offset, disk->evt, param); } -static void *disk_aio_thread(void *param) +static int disk_aio_get_events(struct disk_image *disk) { - struct disk_image *disk = param; struct io_event event[AIO_MAX]; struct timespec notime = {0}; int nr, i; + + do { + nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); + for (i = 0; i < nr; i++) + disk->disk_req_cb(event[i].data, event[i].res); + } while (nr > 0); + + return 0; +} + +static void *disk_aio_thread(void *param) +{ + struct disk_image *disk = param; u64 dummy; kvm__set_thread_name("disk-image-io"); while (read(disk->evt, &dummy, sizeof(dummy)) > 0) { - nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); - for (i = 0; i < nr; i++) - disk->disk_req_cb(event[i].data, event[i].res); + if (disk_aio_get_events(disk)) + break; } return NULL; From patchwork Thu Apr 4 13:20:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885613 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 04D1D1575 for ; Thu, 4 Apr 2019 13:22:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E29A4286C0 for ; Thu, 4 Apr 2019 13:22:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D728E286D6; Thu, 4 Apr 2019 13:22:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: 
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 884AD286C0 for ; Thu, 4 Apr 2019 13:22:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729782AbfDDNWW (ORCPT ); Thu, 4 Apr 2019 09:22:22 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60230 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729636AbfDDNWV (ORCPT ); Thu, 4 Apr 2019 09:22:21 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5DF06A78; Thu, 4 Apr 2019 06:22:21 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 94FD93F68F; Thu, 4 Apr 2019 06:22:20 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 7/9] disk/aio: Cancel AIO thread on cleanup Date: Thu, 4 Apr 2019 14:20:48 +0100 Message-Id: <20190404132050.37309-8-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If the AIO thread is still calling io_getevents() while the exit path calls io_destroy(), it will segfault. Wait for the thread to finish before destroying the context. 
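For illustration only (not part of this patch): the teardown order described above, cancel the thread, join it, and only then release what it was using, can be sketched with plain pthreads and a pipe in place of the eventfd and AIO context. struct worker, worker_fn(), worker_start() and worker_stop() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker that blocks on a pipe, standing in for the AIO thread. */
struct worker {
	pthread_t thread;
	int rfd, wfd;
};

static void *worker_fn(void *param)
{
	struct worker *w = param;
	char c;

	/* read() is a cancellation point, so pthread_cancel() can interrupt it */
	while (read(w->rfd, &c, 1) > 0)
		;
	return NULL;
}

static int worker_start(struct worker *w)
{
	int fds[2];
	int r;

	if (pipe(fds))
		return -1;
	w->rfd = fds[0];
	w->wfd = fds[1];

	/* pthread_create() returns an error number rather than setting errno */
	r = pthread_create(&w->thread, NULL, worker_fn, w);
	if (r) {
		close(fds[0]);
		close(fds[1]);
		return -r;
	}
	return 0;
}

/* Returns 1 if the thread was cancelled, 0 if it exited itself, -1 on error. */
static int worker_stop(struct worker *w)
{
	void *ret;

	assert(w != NULL);
	pthread_cancel(w->thread);
	if (pthread_join(w->thread, &ret))
		return -1;

	/* The thread is gone, so its file descriptors can be released safely */
	close(w->rfd);
	close(w->wfd);
	return ret == PTHREAD_CANCELED;
}
```

Closing the descriptors before pthread_join() returns would reintroduce the kind of race this patch removes, since the worker could still be inside read().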
Reviewed-by: Andre Przywara Signed-off-by: Jean-Philippe Brucker --- disk/aio.c | 5 +++-- include/kvm/disk-image.h | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 1fcf36857..277ddf7c9 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -99,7 +99,6 @@ static void *disk_aio_thread(void *param) int disk_aio_setup(struct disk_image *disk) { int r; - pthread_t thread; /* No need to setup AIO if the disk ops won't make use of it */ if (!disk->ops->async) @@ -110,7 +109,7 @@ int disk_aio_setup(struct disk_image *disk) return -errno; io_setup(AIO_MAX, &disk->ctx); - r = pthread_create(&thread, NULL, disk_aio_thread, disk); + r = pthread_create(&disk->thread, NULL, disk_aio_thread, disk); if (r) { r = -errno; close(disk->evt); @@ -126,6 +125,8 @@ void disk_aio_destroy(struct disk_image *disk) if (!disk->async) return; + pthread_cancel(disk->thread); + pthread_join(disk->thread, NULL); close(disk->evt); io_destroy(disk->ctx); } diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index adc9fe465..2275e2343 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -69,6 +69,7 @@ struct disk_image { #ifdef CONFIG_HAS_AIO io_context_t ctx; int evt; + pthread_t thread; #endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; From patchwork Thu Apr 4 13:20:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885615 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A29541575 for ; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8C1E8286C6 for ; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 80BC8286D6; Thu, 4 
Apr 2019 13:22:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CB98F286C0 for ; Thu, 4 Apr 2019 13:22:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729435AbfDDNWY (ORCPT ); Thu, 4 Apr 2019 09:22:24 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:60232 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729406AbfDDNWX (ORCPT ); Thu, 4 Apr 2019 09:22:23 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 65E321715; Thu, 4 Apr 2019 06:22:22 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9A5DB3F68F; Thu, 4 Apr 2019 06:22:21 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 8/9] disk/aio: Add wait() disk operation Date: Thu, 4 Apr 2019 14:20:49 +0100 Message-Id: <20190404132050.37309-9-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a call into the disk layer to synchronize the AIO queue. Wait for all pending requests to complete. This will be necessary when resetting a virtqueue. The wait() operation isn't the same as flush(). 
A VIRTIO_BLK_T_FLUSH request ensures that any write request *that completed before the FLUSH is sent* is committed to permanent storage (e.g. written back from a write cache). But it doesn't do anything for requests that are still pending when the FLUSH is sent. Avoid introducing a mutex on the io_submit() and io_getevents() paths, because it can lead to 30% throughput drop on heavy FIO jobs. Instead manage an inflight counter using compare-and-swap operations, which is simple enough as the caller doesn't submit new requests while it waits for the AIO queue to drain. The __sync_fetch_and_* operations are a bit rough since they use full barriers, but that didn't seem to introduce a performance regression. Signed-off-by: Jean-Philippe Brucker --- disk/aio.c | 82 ++++++++++++++++++++++++---------------- disk/blk.c | 1 + disk/core.c | 8 ++++ disk/raw.c | 2 + include/kvm/disk-image.h | 9 +++++ 5 files changed, 70 insertions(+), 32 deletions(-) diff --git a/disk/aio.c b/disk/aio.c index 277ddf7c9..a7418c8c2 100644 --- a/disk/aio.c +++ b/disk/aio.c @@ -2,45 +2,31 @@ #include #include +#include "kvm/brlock.h" #include "kvm/disk-image.h" #include "kvm/kvm.h" #include "linux/list.h" #define AIO_MAX 256 -static int aio_pwritev(io_context_t ctx, struct iocb *iocb, int fd, - const struct iovec *iov, int iovcnt, off_t offset, - int ev, void *param) +static int aio_submit(struct disk_image *disk, int nr, struct iocb **ios) { - struct iocb *ios[1] = { iocb }; int ret; - io_prep_pwritev(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - + __sync_fetch_and_add(&disk->aio_inflight, nr); + /* + * A wmb() is needed here, to ensure disk_aio_thread() sees this + * increase after receiving the events. It is included in the + * __sync_fetch_and_add (as a full barrier). 
+ */ restart: - ret = io_submit(ctx, 1, ios); + ret = io_submit(disk->ctx, nr, ios); if (ret == -EAGAIN) goto restart; - return ret; -} - -static int aio_preadv(io_context_t ctx, struct iocb *iocb, int fd, - const struct iovec *iov, int iovcnt, off_t offset, - int ev, void *param) -{ - struct iocb *ios[1] = { iocb }; - int ret; + else if (ret <= 0) + /* disk_aio_thread() is never going to see those */ + __sync_fetch_and_sub(&disk->aio_inflight, nr); - io_prep_preadv(iocb, fd, iov, iovcnt, offset); - io_set_eventfd(iocb, ev); - iocb->data = param; - -restart: - ret = io_submit(ctx, 1, ios); - if (ret == -EAGAIN) - goto restart; return ret; } @@ -48,22 +34,49 @@ ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; struct iocb iocb; + u64 offset = sector << SECTOR_SHIFT; + struct iocb *ios[1] = { &iocb }; - return aio_preadv(disk->ctx, &iocb, disk->fd, iov, iovcount, - offset, disk->evt, param); + io_prep_preadv(&iocb, disk->fd, iov, iovcount, offset); + io_set_eventfd(&iocb, disk->evt); + iocb.data = param; + + return aio_submit(disk, 1, ios); } ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param) { - u64 offset = sector << SECTOR_SHIFT; struct iocb iocb; + u64 offset = sector << SECTOR_SHIFT; + struct iocb *ios[1] = { &iocb }; + + io_prep_pwritev(&iocb, disk->fd, iov, iovcount, offset); + io_set_eventfd(&iocb, disk->evt); + iocb.data = param; + + return aio_submit(disk, 1, ios); +} - return aio_pwritev(disk->ctx, &iocb, disk->fd, iov, iovcount, - offset, disk->evt, param); +/* + * When this function returns there are no in-flight I/O. Caller ensures that + * io_submit() isn't called concurrently. + * + * Returns an inaccurate number of I/O that was in-flight when the function was + * called. 
+ */ +int raw_image__wait(struct disk_image *disk) +{ + u64 inflight = disk->aio_inflight; + + while (disk->aio_inflight) { + usleep(100); + barrier(); + } + + return inflight; } static int disk_aio_get_events(struct disk_image *disk) @@ -76,6 +89,11 @@ static int disk_aio_get_events(struct disk_image *disk) nr = io_getevents(disk->ctx, 1, ARRAY_SIZE(event), event, &notime); for (i = 0; i < nr; i++) disk->disk_req_cb(event[i].data, event[i].res); + + /* Pairs with wmb() in aio_submit() */ + rmb(); + __sync_fetch_and_sub(&disk->aio_inflight, nr); + } while (nr > 0); return 0; diff --git a/disk/blk.c b/disk/blk.c index 48922e028..b4c9fba3b 100644 --- a/disk/blk.c +++ b/disk/blk.c @@ -9,6 +9,7 @@ static struct disk_image_operations blk_dev_ops = { .read = raw_image__read, .write = raw_image__write, + .wait = raw_image__wait, .async = true, }; diff --git a/disk/core.c b/disk/core.c index 89880703e..8d95c98e2 100644 --- a/disk/core.c +++ b/disk/core.c @@ -201,6 +201,14 @@ error: return err; } +int disk_image__wait(struct disk_image *disk) +{ + if (disk->ops->wait) + return disk->ops->wait(disk); + + return 0; +} + int disk_image__flush(struct disk_image *disk) { if (disk->ops->flush) diff --git a/disk/raw.c b/disk/raw.c index e869d6cc2..54b4e7408 100644 --- a/disk/raw.c +++ b/disk/raw.c @@ -67,6 +67,7 @@ int raw_image__close(struct disk_image *disk) static struct disk_image_operations raw_image_regular_ops = { .read = raw_image__read, .write = raw_image__write, + .wait = raw_image__wait, .async = true, }; @@ -78,6 +79,7 @@ struct disk_image_operations ro_ops = { struct disk_image_operations ro_ops_nowrite = { .read = raw_image__read, + .wait = raw_image__wait, .async = true, }; diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h index 2275e2343..27d4f7da5 100644 --- a/include/kvm/disk-image.h +++ b/include/kvm/disk-image.h @@ -41,6 +41,7 @@ struct disk_image_operations { ssize_t (*write)(struct disk_image *disk, u64 sector, const struct iovec *iov, int 
iovcount, void *param); int (*flush)(struct disk_image *disk); + int (*wait)(struct disk_image *disk); int (*close)(struct disk_image *disk); bool async; }; @@ -70,6 +71,7 @@ struct disk_image { io_context_t ctx; int evt; pthread_t thread; + u64 aio_inflight; #endif /* CONFIG_HAS_AIO */ const char *wwpn; const char *tpgt; @@ -81,6 +83,7 @@ int disk_image__init(struct kvm *kvm); int disk_image__exit(struct kvm *kvm); struct disk_image *disk_image__new(int fd, u64 size, struct disk_image_operations *ops, int mmap); int disk_image__flush(struct disk_image *disk); +int disk_image__wait(struct disk_image *disk); ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov, @@ -108,6 +111,7 @@ ssize_t raw_image__read_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); ssize_t raw_image__write_async(struct disk_image *disk, u64 sector, const struct iovec *iov, int iovcount, void *param); +int raw_image__wait(struct disk_image *disk); #define raw_image__read raw_image__read_async #define raw_image__write raw_image__write_async @@ -121,6 +125,11 @@ static inline int disk_aio_setup(struct disk_image *disk) static inline void disk_aio_destroy(struct disk_image *disk) { } + +static inline int raw_image__wait(struct disk_image *disk) +{ + return 0; +} #define raw_image__read raw_image__read_sync #define raw_image__write raw_image__write_sync #endif /* CONFIG_HAS_AIO */ From patchwork Thu Apr 4 13:20:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 10885617 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EADC117EE for ; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) 
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D56F1286C0 for ; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C98C6286D6; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8561D286C0 for ; Thu, 4 Apr 2019 13:22:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729878AbfDDNWY (ORCPT ); Thu, 4 Apr 2019 09:22:24 -0400 Received: from foss.arm.com ([217.140.101.70]:60238 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729828AbfDDNWX (ORCPT ); Thu, 4 Apr 2019 09:22:23 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7368316A3; Thu, 4 Apr 2019 06:22:23 -0700 (PDT) Received: from ostrya.cambridge.arm.com (ostrya.cambridge.arm.com [10.1.196.129]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A25253F68F; Thu, 4 Apr 2019 06:22:22 -0700 (PDT) From: Jean-Philippe Brucker To: Will.Deacon@arm.com Cc: Andre.Przywara@arm.com, kvm@vger.kernel.org Subject: [PATCH kvmtool v2 9/9] virtio/blk: sync I/O on reset Date: Thu, 4 Apr 2019 14:20:50 +0100 Message-Id: <20190404132050.37309-10-jean-philippe.brucker@arm.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190404132050.37309-1-jean-philippe.brucker@arm.com> References: <20190404132050.37309-1-jean-philippe.brucker@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Ensure that all requests 
are complete when resetting a virtqueue, by draining the AIO queue after stopping the submission thread. Signed-off-by: Jean-Philippe Brucker --- virtio/blk.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/virtio/blk.c b/virtio/blk.c index 6e7a1ee36..50db6f5fc 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -248,6 +248,8 @@ static void exit_vq(struct kvm *kvm, void *dev, u32 vq) close(bdev->io_efd); pthread_cancel(bdev->io_thread); pthread_join(bdev->io_thread, NULL); + + disk_image__wait(bdev->disk); } static int notify_vq(struct kvm *kvm, void *dev, u32 vq)