From patchwork Fri Jul 2 09:09:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF38C11F68 for ; Fri, 2 Jul 2021 09:13:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 06EDB6140C for ; Fri, 2 Jul 2021 09:13:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 06EDB6140C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60196 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFFB-0002WK-3J for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:13:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36250) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBO-0002u7-BY for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:58 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54450) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBK-0006Y2-Fp for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:58 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:41 +0200 Received: (qmail 37493 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id B677F13DD23; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 1/6] block/rbd: bump librbd requirement to luminous release Date: Fri, 2 Jul 2021 11:09:30 +0200 Message-Id: <20210702090935.15300-2-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" even luminous (version 12.2) is unmaintained for over 3 years now. Bump the requirement to get rid of the ifdef'ry in the code. Qemu 6.1 dropped the support for RHEL-7 which was the last supported OS that required an older librbd. Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 120 ++++------------------------------------------------ meson.build | 7 ++- 2 files changed, 13 insertions(+), 114 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index 26f64cce7c..6b1cbe1d75 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -55,24 +55,10 @@ * leading "\". */ -/* rbd_aio_discard added in 0.1.2 */ -#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 2) -#define LIBRBD_SUPPORTS_DISCARD -#else -#undef LIBRBD_SUPPORTS_DISCARD -#endif - #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER) #define RBD_MAX_SNAPS 100 -/* The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */ -#ifdef LIBRBD_SUPPORTS_IOVEC -#define LIBRBD_USE_IOVEC 1 -#else -#define LIBRBD_USE_IOVEC 0 -#endif - typedef enum { RBD_AIO_READ, RBD_AIO_WRITE, @@ -84,7 +70,6 @@ typedef struct RBDAIOCB { BlockAIOCB common; int64_t ret; QEMUIOVector *qiov; - char *bounce; RBDAIOCmd cmd; int error; struct BDRVRBDState *s; @@ -94,7 +79,6 @@ typedef struct RADOSCB { RBDAIOCB *acb; struct BDRVRBDState *s; int64_t size; - char *buf; int64_t ret; } RADOSCB; @@ -342,13 +326,9 @@ static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json, static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs) { - if (LIBRBD_USE_IOVEC) { - RBDAIOCB *acb = rcb->acb; - iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0, - acb->qiov->size - offs); - } else { - memset(rcb->buf + offs, 0, rcb->size - offs); - } + RBDAIOCB *acb = rcb->acb; + iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0, + acb->qiov->size - offs); } /* FIXME Deprecate and remove keypairs or make it available in QMP. */ @@ -504,13 +484,6 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb) g_free(rcb); - if (!LIBRBD_USE_IOVEC) { - if (acb->cmd == RBD_AIO_READ) { - qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size); - } - qemu_vfree(acb->bounce); - } - acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret)); qemu_aio_unref(acb); @@ -878,28 +851,6 @@ static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb) rbd_finish_bh, rcb); } -static int rbd_aio_discard_wrapper(rbd_image_t image, - uint64_t off, - uint64_t len, - rbd_completion_t comp) -{ -#ifdef LIBRBD_SUPPORTS_DISCARD - return rbd_aio_discard(image, off, len, comp); -#else - return -ENOTSUP; -#endif -} - -static int rbd_aio_flush_wrapper(rbd_image_t image, - rbd_completion_t comp) -{ -#ifdef LIBRBD_SUPPORTS_AIO_FLUSH - return rbd_aio_flush(image, comp); -#else - return -ENOTSUP; -#endif -} - static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, int64_t off, QEMUIOVector *qiov, @@ -922,21 +873,6 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, rcb = g_new(RADOSCB, 1); - if (!LIBRBD_USE_IOVEC) { - if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) { - acb->bounce = NULL; - } else { - acb->bounce = qemu_try_blockalign(bs, qiov->size); - if (acb->bounce == NULL) { - goto failed; - } - } - if (cmd == RBD_AIO_WRITE) { - qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size); - } - rcb->buf = acb->bounce; - } - acb->ret = 0; acb->error = 0; acb->s = s; @@ -950,7 +886,7 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, } switch (cmd) { - case RBD_AIO_WRITE: { + case RBD_AIO_WRITE: /* * RBD APIs don't allow us to write more than actual size, so in order * to support growing images, we resize the image before write @@ -962,25 +898,16 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, goto failed_completion; } } -#ifdef LIBRBD_SUPPORTS_IOVEC - r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c); -#else - r = rbd_aio_write(s->image, off, size, rcb->buf, c); -#endif + r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c); break; - } case RBD_AIO_READ: -#ifdef LIBRBD_SUPPORTS_IOVEC - r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c); -#else - r = rbd_aio_read(s->image, off, size, rcb->buf, c); -#endif + r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c); break; case RBD_AIO_DISCARD: - r = rbd_aio_discard_wrapper(s->image, off, size, c); + r = rbd_aio_discard(s->image, off, size, c); break; case RBD_AIO_FLUSH: - r = rbd_aio_flush_wrapper(s->image, c); + r = rbd_aio_flush(s->image, c); break; default: r = -EINVAL; @@ -995,9 +922,6 @@ failed_completion: rbd_aio_release(c); failed: g_free(rcb); - if (!LIBRBD_USE_IOVEC) { - qemu_vfree(acb->bounce); - } qemu_aio_unref(acb); return NULL; @@ -1023,7 +947,6 @@ static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs, RBD_AIO_WRITE); } -#ifdef LIBRBD_SUPPORTS_AIO_FLUSH static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs, BlockCompletionFunc *cb, void *opaque) @@ -1031,20 +954,6 @@ static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs, return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH); } -#else - -static int qemu_rbd_co_flush(BlockDriverState *bs) -{ -#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 1) - /* rbd_flush added in 0.1.1 */ - BDRVRBDState *s = bs->opaque; - return rbd_flush(s->image); -#else - return 0; -#endif -} -#endif - static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) { BDRVRBDState *s = bs->opaque; @@ -1210,7 +1119,6 @@ static int qemu_rbd_snap_list(BlockDriverState *bs, return snap_count; } -#ifdef LIBRBD_SUPPORTS_DISCARD static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs, int64_t offset, int bytes, @@ -1220,9 +1128,7 @@ static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs, return rbd_start_aio(bs, offset, NULL, bytes, cb, opaque, RBD_AIO_DISCARD); } -#endif -#ifdef LIBRBD_SUPPORTS_INVALIDATE static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs, Error **errp) { @@ -1232,7 +1138,6 @@ static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs, error_setg_errno(errp, -r, "Failed to invalidate the cache"); } } -#endif static QemuOptsList qemu_rbd_create_opts = { .name = "rbd-create-opts", @@ -1290,23 +1195,14 @@ static BlockDriver bdrv_rbd = { .bdrv_aio_preadv = qemu_rbd_aio_preadv, .bdrv_aio_pwritev = qemu_rbd_aio_pwritev, -#ifdef LIBRBD_SUPPORTS_AIO_FLUSH .bdrv_aio_flush = qemu_rbd_aio_flush, -#else - .bdrv_co_flush_to_disk = qemu_rbd_co_flush, -#endif - -#ifdef LIBRBD_SUPPORTS_DISCARD .bdrv_aio_pdiscard = qemu_rbd_aio_pdiscard, -#endif .bdrv_snapshot_create = qemu_rbd_snap_create, .bdrv_snapshot_delete = qemu_rbd_snap_remove, .bdrv_snapshot_list = qemu_rbd_snap_list, .bdrv_snapshot_goto = qemu_rbd_snap_rollback, -#ifdef LIBRBD_SUPPORTS_INVALIDATE .bdrv_co_invalidate_cache = qemu_rbd_co_invalidate_cache, -#endif .strong_runtime_opts = qemu_rbd_strong_runtime_opts, }; diff --git a/meson.build b/meson.build index db6789af9c..eb150c70f7 100644 --- a/meson.build +++ b/meson.build @@ -706,13 +706,16 @@ if not get_option('rbd').auto() or have_block int main(void) { rados_t cluster; rados_create(&cluster, NULL); + #if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 12, 0) + #error + #endif return 0; }''', dependencies: [librbd, librados]) rbd = declare_dependency(dependencies: [librbd, librados]) elif get_option('rbd').enabled() - error('could not link librados') + error('librbd >= 1.12.0 required') else - warning('could not link librados, disabling') + warning('librbd >= 1.12.0 not found, disabling rbd support') endif endif endif From patchwork Fri Jul 2 09:09:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51CFAC11F68 for ; Fri, 2 Jul 2021 09:11:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EC04261407 for ; Fri, 2 Jul 2021 09:11:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC04261407 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52996 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFCe-0005ty-5u for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:11:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36294) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBS-00036T-Th for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:10:02 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54480) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBR-0006bQ-1k for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:10:02 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:42 +0200 Received: (qmail 37495 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id BACB513DD26; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 2/6] block/rbd: store object_size in BDRVRBDState Date: Fri, 2 Jul 2021 11:09:31 +0200 Message-Id: <20210702090935.15300-3-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index 6b1cbe1d75..b4caea4f1b 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -90,6 +90,7 @@ typedef struct BDRVRBDState { char *snap; char *namespace; uint64_t image_size; + uint64_t object_size; } BDRVRBDState; static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx, @@ -675,6 +676,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags, const QDictEntry *e; Error *local_err = NULL; char *keypairs, *secretid; + rbd_image_info_t info; int r; keypairs = g_strdup(qdict_get_try_str(options, "=keyvalue-pairs")); @@ -739,13 +741,15 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags, goto failed_open; } - r = rbd_get_size(s->image, &s->image_size); + r = rbd_stat(s->image, &info, sizeof(info)); if (r < 0) { - error_setg_errno(errp, -r, "error getting image size from %s", + error_setg_errno(errp, -r, "error getting image info from %s", s->image_name); rbd_close(s->image); goto failed_open; } + s->image_size = info.size; + s->object_size = info.obj_size; /* If we are using an rbd snapshot, we must be r/o, otherwise * leave as-is */ @@ -957,15 +961,7 @@ static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs, static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) { BDRVRBDState *s = bs->opaque; - rbd_image_info_t info; - int r; - - r = rbd_stat(s->image, &info, sizeof(info)); - if (r < 0) { - return r; - } - - bdi->cluster_size = info.obj_size; + bdi->cluster_size = s->object_size; return 0; } From patchwork Fri Jul 2 09:09:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2D4CC11F68 for ; Fri, 2 Jul 2021 09:11:08 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1557C61413 for ; Fri, 2 Jul 2021 09:11:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1557C61413 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52376 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFCV-0005Uv-42 for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:11:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36212) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBM-0002tM-G8 for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:56 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54449) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBK-0006Y1-Gk for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:56 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:42 +0200 Received: (qmail 37498 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id BE86413DD27; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 3/6] block/rbd: update s->image_size in qemu_rbd_getlength Date: Fri, 2 Jul 2021 11:09:32 +0200 Message-Id: <20210702090935.15300-4-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" while at it just call rbd_get_size and avoid rbd_stat. Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index b4caea4f1b..1f8dc84079 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -968,15 +968,14 @@ static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) static int64_t qemu_rbd_getlength(BlockDriverState *bs) { BDRVRBDState *s = bs->opaque; - rbd_image_info_t info; int r; - r = rbd_stat(s->image, &info, sizeof(info)); + r = rbd_get_size(s->image, &s->image_size); if (r < 0) { return r; } - return info.size; + return s->image_size; } static int coroutine_fn qemu_rbd_co_truncate(BlockDriverState *bs, From patchwork Fri Jul 2 09:09:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10832C11F68 for ; Fri, 2 Jul 2021 09:11:11 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 799346140C for ; Fri, 2 Jul 2021 09:11:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 799346140C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52678 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFCX-0005h4-FW for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:11:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36252) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBP-0002w7-9Y for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:59 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54451) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBK-0006Y4-Fp for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:09:59 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:42 +0200 Received: (qmail 37499 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id C127813DD28; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 4/6] block/rbd: migrate from aio to coroutines Date: Fri, 2 Jul 2021 11:09:33 +0200 Message-Id: <20210702090935.15300-5-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 252 +++++++++++++++++++--------------------------------- 1 file changed, 90 insertions(+), 162 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index 1f8dc84079..be0471944a 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -66,22 +66,6 @@ typedef enum { RBD_AIO_FLUSH } RBDAIOCmd; -typedef struct RBDAIOCB { - BlockAIOCB common; - int64_t ret; - QEMUIOVector *qiov; - RBDAIOCmd cmd; - int error; - struct BDRVRBDState *s; -} RBDAIOCB; - -typedef struct RADOSCB { - RBDAIOCB *acb; - struct BDRVRBDState *s; - int64_t size; - int64_t ret; -} RADOSCB; - typedef struct BDRVRBDState { rados_t cluster; rados_ioctx_t io_ctx; @@ -93,6 +77,13 @@ typedef struct BDRVRBDState { uint64_t object_size; } BDRVRBDState; +typedef struct RBDTask { + BlockDriverState *bs; + Coroutine *co; + bool complete; + int64_t ret; +} RBDTask; + static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx, BlockdevOptionsRbd *opts, bool cache, const char *keypairs, const char *secretid, @@ -325,13 +316,6 @@ static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json, return ret; } -static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs) -{ - RBDAIOCB *acb = rcb->acb; - iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0, - acb->qiov->size - offs); -} - /* FIXME Deprecate and remove keypairs or make it available in QMP. */ static int qemu_rbd_do_create(BlockdevCreateOptions *options, const char *keypairs, const char *password_secret, @@ -450,46 +434,6 @@ exit: return ret; } -/* - * This aio completion is being called from rbd_finish_bh() and runs in qemu - * BH context. - */ -static void qemu_rbd_complete_aio(RADOSCB *rcb) -{ - RBDAIOCB *acb = rcb->acb; - int64_t r; - - r = rcb->ret; - - if (acb->cmd != RBD_AIO_READ) { - if (r < 0) { - acb->ret = r; - acb->error = 1; - } else if (!acb->error) { - acb->ret = rcb->size; - } - } else { - if (r < 0) { - qemu_rbd_memset(rcb, 0); - acb->ret = r; - acb->error = 1; - } else if (r < rcb->size) { - qemu_rbd_memset(rcb, r); - if (!acb->error) { - acb->ret = rcb->size; - } - } else if (!acb->error) { - acb->ret = r; - } - } - - g_free(rcb); - - acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret)); - - qemu_aio_unref(acb); -} - static char *qemu_rbd_mon_host(BlockdevOptionsRbd *opts, Error **errp) { const char **vals; @@ -826,89 +770,59 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size) return 0; } -static const AIOCBInfo rbd_aiocb_info = { - .aiocb_size = sizeof(RBDAIOCB), -}; - -static void rbd_finish_bh(void *opaque) +static void qemu_rbd_finish_bh(void *opaque) { - RADOSCB *rcb = opaque; - qemu_rbd_complete_aio(rcb); + RBDTask *task = opaque; + task->complete = 1; + aio_co_wake(task->co); } /* - * This is the callback function for rbd_aio_read and _write + * This is the completion callback function for all rbd aio calls + * started from qemu_rbd_start_co(). * * Note: this function is being called from a non qemu thread so * we need to be careful about what we do here. Generally we only * schedule a BH, and do the rest of the io completion handling - * from rbd_finish_bh() which runs in a qemu context. + * from qemu_rbd_finish_bh() which runs in a qemu context. */ -static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb) +static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task) { - RBDAIOCB *acb = rcb->acb; - - rcb->ret = rbd_aio_get_return_value(c); + task->ret = rbd_aio_get_return_value(c); rbd_aio_release(c); - - replay_bh_schedule_oneshot_event(bdrv_get_aio_context(acb->common.bs), - rbd_finish_bh, rcb); + aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs), + qemu_rbd_finish_bh, task); } -static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, - int64_t off, - QEMUIOVector *qiov, - int64_t size, - BlockCompletionFunc *cb, - void *opaque, - RBDAIOCmd cmd) +static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs, + uint64_t offset, + uint64_t bytes, + QEMUIOVector *qiov, + int flags, + RBDAIOCmd cmd) { - RBDAIOCB *acb; - RADOSCB *rcb = NULL; + BDRVRBDState *s = bs->opaque; + RBDTask task = { .bs = bs, .co = qemu_coroutine_self() }; rbd_completion_t c; int r; - BDRVRBDState *s = bs->opaque; - - acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque); - acb->cmd = cmd; - acb->qiov = qiov; - assert(!qiov || qiov->size == size); + assert(!qiov || qiov->size == bytes); - rcb = g_new(RADOSCB, 1); - - acb->ret = 0; - acb->error = 0; - acb->s = s; - - rcb->acb = acb; - rcb->s = acb->s; - rcb->size = size; - r = rbd_aio_create_completion(rcb, (rbd_callback_t) rbd_finish_aiocb, &c); + r = rbd_aio_create_completion(&task, + (rbd_callback_t) qemu_rbd_completion_cb, &c); if (r < 0) { - goto failed; + return r; } switch (cmd) { - case RBD_AIO_WRITE: - /* - * RBD APIs don't allow us to write more than actual size, so in order - * to support growing images, we resize the image before write - * operations that exceed the current size. - */ - if (off + size > s->image_size) { - r = qemu_rbd_resize(bs, off + size); - if (r < 0) { - goto failed_completion; - } - } - r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c); - break; case RBD_AIO_READ: - r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c); + r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, offset, c); + break; + case RBD_AIO_WRITE: + r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, offset, c); break; case RBD_AIO_DISCARD: - r = rbd_aio_discard(s->image, off, size, c); + r = rbd_aio_discard(s->image, offset, bytes, c); break; case RBD_AIO_FLUSH: r = rbd_aio_flush(s->image, c); @@ -918,44 +832,69 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, } if (r < 0) { - goto failed_completion; + error_report("rbd request failed early: cmd %d offset %" PRIu64 + " bytes %" PRIu64 " flags %d r %d (%s)", cmd, offset, + bytes, flags, r, strerror(-r)); + rbd_aio_release(c); + return r; } - return &acb->common; -failed_completion: - rbd_aio_release(c); -failed: - g_free(rcb); + while (!task.complete) { + qemu_coroutine_yield(); + } - qemu_aio_unref(acb); - return NULL; + if (task.ret < 0) { + error_report("rbd request failed: cmd %d offset %" PRIu64 " bytes %" + PRIu64 " flags %d task.ret %" PRIi64 " (%s)", cmd, offset, + bytes, flags, task.ret, strerror(-task.ret)); + return task.ret; + } + + /* zero pad short reads */ + if (cmd == RBD_AIO_READ && task.ret < qiov->size) { + qemu_iovec_memset(qiov, task.ret, 0, qiov->size - task.ret); + } + + return 0; +} + +static int +coroutine_fn qemu_rbd_co_preadv(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, + int flags) +{ + return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_READ); } -static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, - BlockCompletionFunc *cb, - void *opaque) +static int +coroutine_fn qemu_rbd_co_pwritev(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, QEMUIOVector *qiov, + int flags) { - return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque, - RBD_AIO_READ); + BDRVRBDState *s = bs->opaque; + /* + * RBD APIs don't allow us to write more than actual size, so in order + * to support growing images, we resize the image before write + * operations that exceed the current size. + */ + if (offset + bytes > s->image_size) { + int r = qemu_rbd_resize(bs, offset + bytes); + if (r < 0) { + return r; + } + } + return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_WRITE); } -static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, - BlockCompletionFunc *cb, - void *opaque) +static int coroutine_fn qemu_rbd_co_flush(BlockDriverState *bs) { - return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque, - RBD_AIO_WRITE); + return qemu_rbd_start_co(bs, 0, 0, NULL, 0, RBD_AIO_FLUSH); } -static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs, - BlockCompletionFunc *cb, - void *opaque) +static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs, + int64_t offset, int count) { - return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH); + return qemu_rbd_start_co(bs, offset, count, NULL, 0, RBD_AIO_DISCARD); } static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) @@ -1114,16 +1053,6 @@ static int qemu_rbd_snap_list(BlockDriverState *bs, return snap_count; } -static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs, - int64_t offset, - int bytes, - BlockCompletionFunc *cb, - void *opaque) -{ - return rbd_start_aio(bs, offset, NULL, bytes, cb, opaque, - RBD_AIO_DISCARD); -} - static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs, Error **errp) { @@ -1187,11 +1116,10 @@ static BlockDriver bdrv_rbd = { .bdrv_co_truncate = qemu_rbd_co_truncate, .protocol_name = "rbd", - .bdrv_aio_preadv = qemu_rbd_aio_preadv, - .bdrv_aio_pwritev = qemu_rbd_aio_pwritev, - - .bdrv_aio_flush = qemu_rbd_aio_flush, - .bdrv_aio_pdiscard = qemu_rbd_aio_pdiscard, + .bdrv_co_preadv = qemu_rbd_co_preadv, + .bdrv_co_pwritev = qemu_rbd_co_pwritev, + .bdrv_co_flush_to_disk = qemu_rbd_co_flush, + .bdrv_co_pdiscard = qemu_rbd_co_pdiscard, .bdrv_snapshot_create = qemu_rbd_snap_create, .bdrv_snapshot_delete = qemu_rbd_snap_remove, From patchwork Fri Jul 2 09:09:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE4E6C11F68 for ; Fri, 2 Jul 2021 09:22:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 761B161090 for ; Fri, 2 Jul 2021 09:22:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 761B161090 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:43874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFN4-0002Zh-IL for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:22:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39590) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFM4-0001QS-Dk for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:21:00 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54614) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFM0-0007FF-Jr for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:21:00 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:42 +0200 Received: (qmail 37506 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id C405713DD29; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 5/6] block/rbd: add write zeroes support Date: Fri, 2 Jul 2021 11:09:34 +0200 Message-Id: <20210702090935.15300-6-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" this patch wittingly sets BDRV_REQ_NO_FALLBACK and silently ignores BDRV_REQ_MAY_UNMAP for older librbd versions. The rationale for this is as following (citing Ilya Dryomov current RBD maintainer): ---8<--- a) remove the BDRV_REQ_MAY_UNMAP check in qemu_rbd_co_pwrite_zeroes() and as a consequence always unmap if librbd is too old It's not clear what qemu's expectation is but in general Write Zeroes is allowed to unmap. The only guarantee is that subsequent reads return zeroes, everything else is a hint. This is how it is specified in the kernel and in the NVMe spec. In particular, block/nvme.c implements it as follows: if (flags & BDRV_REQ_MAY_UNMAP) { cdw12 |= (1 << 25); } This sets the Deallocate bit. But if it's not set, the device may still deallocate: """ If the Deallocate bit (CDW12.DEAC) is set to '1' in a Write Zeroes command, and the namespace supports clearing all bytes to 0h in the values read (e.g., bits 2:0 in the DLFEAT field are set to 001b) from a deallocated logical block and its metadata (excluding protection information), then for each specified logical block, the controller: - should deallocate that logical block; ... If the Deallocate bit is cleared to '0' in a Write Zeroes command, and the namespace supports clearing all bytes to 0h in the values read (e.g., bits 2:0 in the DLFEAT field are set to 001b) from a deallocated logical block and its metadata (excluding protection information), then, for each specified logical block, the controller: - may deallocate that logical block; """ https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-2021.06.02-Ratified-1.pdf b) set BDRV_REQ_NO_FALLBACK in supported_zero_flags Again, it's not clear what qemu expects here, but without it we end up in a ridiculous situation where specifying the "don't allow slow fallback" switch immediately fails all efficient zeroing requests on a device where Write Zeroes is always efficient: $ qemu-io -c 'help write' | grep -- '-[zun]' -n, -- with -z, don't allow slow fallback -u, -- with -z, allow unmapping -z, -- write zeroes using blk_co_pwrite_zeroes $ qemu-io -f rbd -c 'write -z -u -n 0 1M' rbd:foo/bar write failed: Operation not supported --->8--- Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/block/rbd.c b/block/rbd.c index be0471944a..149317d33c 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -63,7 +63,8 @@ typedef enum { RBD_AIO_READ, RBD_AIO_WRITE, RBD_AIO_DISCARD, - RBD_AIO_FLUSH + RBD_AIO_FLUSH, + RBD_AIO_WRITE_ZEROES } RBDAIOCmd; typedef struct BDRVRBDState { @@ -705,6 +706,10 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags, } } +#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES + bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK; +#endif + /* When extending regular files, we get zeros from the OS */ bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE; @@ -827,6 +832,18 @@ static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs, case RBD_AIO_FLUSH: r = rbd_aio_flush(s->image, c); break; +#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES + case RBD_AIO_WRITE_ZEROES: { + int zero_flags = 0; +#ifdef RBD_WRITE_ZEROES_FLAG_THICK_PROVISION + if (!(flags & BDRV_REQ_MAY_UNMAP)) { + zero_flags = RBD_WRITE_ZEROES_FLAG_THICK_PROVISION; + } +#endif + r = rbd_aio_write_zeroes(s->image, offset, bytes, c, zero_flags, 0); + break; + } +#endif default: r = -EINVAL; } @@ -897,6 +914,16 @@ static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs, return qemu_rbd_start_co(bs, offset, count, NULL, 0, RBD_AIO_DISCARD); } +#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES +static int +coroutine_fn qemu_rbd_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, + int count, BdrvRequestFlags flags) +{ + return qemu_rbd_start_co(bs, offset, count, NULL, flags, + RBD_AIO_WRITE_ZEROES); +} +#endif + static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) { BDRVRBDState *s = bs->opaque; @@ -1120,6 +1147,9 @@ static BlockDriver bdrv_rbd = { .bdrv_co_pwritev = qemu_rbd_co_pwritev, .bdrv_co_flush_to_disk = qemu_rbd_co_flush, .bdrv_co_pdiscard = qemu_rbd_co_pdiscard, +#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES + .bdrv_co_pwrite_zeroes = qemu_rbd_co_pwrite_zeroes, +#endif .bdrv_snapshot_create = qemu_rbd_snap_create, .bdrv_snapshot_delete = qemu_rbd_snap_remove, From patchwork Fri Jul 2 09:09:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Lieven X-Patchwork-Id: 12355677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22371C11F69 for ; Fri, 2 Jul 2021 09:13:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D7E6561413 for ; Fri, 2 Jul 2021 09:13:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D7E6561413 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kamp.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60348 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lzFFD-0002cz-09 for qemu-devel@archiver.kernel.org; Fri, 02 Jul 2021 05:13:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36298) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBT-00039a-Vt for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:10:04 -0400 Received: from kerio.kamp.de ([195.62.97.192]:54485) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lzFBR-0006bT-6c for qemu-devel@nongnu.org; Fri, 02 Jul 2021 05:10:03 -0400 X-Footer: a2FtcC5kZQ== Received: from submission.kamp.de ([195.62.97.28]) by kerio.kamp.de with ESMTPS (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256 bits)) for qemu-devel@nongnu.org; Fri, 2 Jul 2021 11:09:42 +0200 Received: (qmail 37505 invoked from network); 2 Jul 2021 09:09:43 -0000 Received: from lieven-pc.kamp-intra.net (HELO lieven-pc) (relay@kamp.de@::ffff:172.21.12.60) by submission.kamp.de with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted) ESMTPA; 2 Jul 2021 09:09:43 -0000 Received: by lieven-pc (Postfix, from userid 1060) id C711513DD2A; Fri, 2 Jul 2021 11:09:43 +0200 (CEST) From: Peter Lieven To: qemu-block@nongnu.org Subject: [PATCH V4 6/6] block/rbd: drop qemu_rbd_refresh_limits Date: Fri, 2 Jul 2021 11:09:35 +0200 Message-Id: <20210702090935.15300-7-pl@kamp.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210702090935.15300-1-pl@kamp.de> References: <20210702090935.15300-1-pl@kamp.de> Received-SPF: pass client-ip=195.62.97.192; envelope-from=pl@kamp.de; helo=kerio.kamp.de X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, idryomov@redhat.com, berrange@redhat.com, Peter Lieven , qemu-devel@nongnu.org, ct@flyingcircus.io, pbonzini@redhat.com, idryomov@gmail.com, mreitz@redhat.com, dillaman@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" librbd supports 1 byte alignment for all aio operations. Currently, there is no API call to query limits from the ceph backend. So drop the bdrv_refresh_limits completely until there is such an API call. Signed-off-by: Peter Lieven Reviewed-by: Ilya Dryomov --- block/rbd.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/block/rbd.c b/block/rbd.c index 149317d33c..93f4bc8b93 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -228,14 +228,6 @@ done: return; } - -static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp) -{ - /* XXX Does RBD support AIO on less than 512-byte alignment? */ - bs->bl.request_alignment = 512; -} - - static int qemu_rbd_set_auth(rados_t cluster, BlockdevOptionsRbd *opts, Error **errp) { @@ -1130,7 +1122,6 @@ static BlockDriver bdrv_rbd = { .format_name = "rbd", .instance_size = sizeof(BDRVRBDState), .bdrv_parse_filename = qemu_rbd_parse_filename, - .bdrv_refresh_limits = qemu_rbd_refresh_limits, .bdrv_file_open = qemu_rbd_open, .bdrv_close = qemu_rbd_close, .bdrv_reopen_prepare = qemu_rbd_reopen_prepare,