From patchwork Sat Jun 18 02:28:44 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaibhav Bhembre
X-Patchwork-Id: 9185435
From: Vaibhav Bhembre
To: qemu-devel@nongnu.org
Date: Fri, 17 Jun 2016 22:28:44 -0400
Message-Id: <1466216924-22172-1-git-send-email-vaibhav@digitalocean.com>
X-Mailer: git-send-email 1.9.1
Subject: [Qemu-devel] [PATCH] rbd: reload ceph config for block device
Cc: Vaibhav Bhembre, Josh Durgin, Jeff Cody

This patch adds the ability to reload the ceph configuration for an
attached RBD block device. This is necessary for cases where rebooting
a VM and/or detaching and reattaching an RBD drive is not an easy
option.

The reload mechanism relies on the bdrv_reopen_* calls to provide a
transactional guarantee (using 2PC) for pulling in new configuration
parameters. In the _prepare phase we do the grunt work of creating and
establishing a new connection and opening another instance of the same
RBD image. If any issues are observed while creating a connection using
the new parameters, we _abort the reload. The original connection to
the cluster is kept available and all ongoing I/O on it should be fine.

Once the _prepare phase completes successfully we enter the _commit
phase. In this phase we simply move the I/O over to the new fd for the
corresponding image we have already created in the _prepare phase and
reclaim the old rados I/O context and connection.

It is important to note that because we want to use this feature when a
QEMU VM is already running, we need to switch the logic so that, for
keys present in both places, values in ceph.conf override the ones
present in the -drive file=* string. Otherwise new changes would not
take effect.

Signed-off-by: Vaibhav Bhembre
---
 block/rbd.c      | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hmp-commands.hx  |  14 +++++++
 hmp.c            |  13 ++++++
 hmp.h            |   1 +
 qapi-schema.json |  13 ++++++
 qmp-commands.hx  |  21 ++++++++++
 qmp.c            |  31 ++++++++++++++
 7 files changed, 215 insertions(+)

diff --git a/block/rbd.c b/block/rbd.c
index 5226b6f..605f531 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -932,6 +932,125 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
     return snap_count;
 }
 
+static int qemu_rbd_reopen_prepare(BDRVReopenState *reopen_state,
+                                   BlockReopenQueue *queue, Error **errp)
+{
+    BDRVRBDState *new_s;
+    rados_t c;
+    rados_ioctx_t io_ctx;
+    char pool[RBD_MAX_POOL_NAME_SIZE];
+    char snap_buf[RBD_MAX_SNAP_NAME_SIZE];
+    char conf[RBD_MAX_CONF_SIZE];
+    char clientname_buf[RBD_MAX_CONF_VAL_SIZE];
+    char *clientname;
+    int r;
+
+    new_s = reopen_state->opaque = g_new0(BDRVRBDState, 1);
+
+    r = qemu_rbd_parsename(reopen_state->bs->filename,
+                           pool, sizeof pool,
+                           snap_buf, sizeof snap_buf,
+                           new_s->name, sizeof new_s->name,
+                           conf, sizeof conf,
+                           errp);
+    if (r < 0) {
+        return r;
+    }
+
+    if (snap_buf[0] != '\0') {
+        new_s->snap = g_strdup(snap_buf);
+    }
+
+    clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
+    r = rados_create(&c, clientname);
+    if (r < 0) {
+        error_setg_errno(errp, -r, "error creating cluster from config");
+        return r;
+    }
+    new_s->cluster = c;
+
+    if (conf[0] != '\0') {
+        r = qemu_rbd_set_conf(c, conf, false, errp);
+        if (r < 0) {
+            error_setg_errno(errp, -r, "error setting config");
+            return r;
+        }
+    }
+
+    if (strstr(conf, "conf=") == NULL) {
+        r = rados_conf_read_file(c, NULL);
+    } else if (conf[0] != '\0') {
+        r = qemu_rbd_set_conf(c, conf, true, errp);
+    }
+
+    if (r < 0) {
+        error_setg_errno(errp, -r, "error parsing config");
+        return r;
+    }
+
+    r = rados_connect(c);
+    if (r < 0) {
+        error_setg_errno(errp, -r, "error connecting");
+        return r;
+    }
+
+    r = rados_ioctx_create(c, pool, &io_ctx);
+    if (r < 0) {
+        error_setg_errno(errp, -r, "error creating ioctx");
+        return r;
+    }
+    new_s->io_ctx = io_ctx;
+
+    r = rbd_open(io_ctx, new_s->name, &new_s->image, new_s->snap);
+    if (r < 0) {
+        error_setg_errno(errp, -r, "error opening rbd");
+        return r;
+    }
+
+    return 0;
+}
+
+static void qemu_rbd_reopen_abort(BDRVReopenState *reopen_state)
+{
+    BDRVRBDState *new_s = reopen_state->opaque;
+
+    if (new_s->io_ctx) {
+        rados_ioctx_destroy(new_s->io_ctx);
+    }
+
+    if (new_s->cluster) {
+        rados_shutdown(new_s->cluster);
+    }
+
+    g_free(new_s->snap);
+    g_free(reopen_state->opaque);
+    reopen_state->opaque = NULL;
+}
+
+static void qemu_rbd_reopen_commit(BDRVReopenState *reopen_state)
+{
+    BDRVRBDState *s, *new_s;
+
+    s = reopen_state->bs->opaque;
+    new_s = reopen_state->opaque;
+
+    rados_aio_flush(s->io_ctx);
+
+    rbd_close(s->image);
+    rados_ioctx_destroy(s->io_ctx);
+    g_free(s->snap);
+    rados_shutdown(s->cluster);
+
+    s->io_ctx = new_s->io_ctx;
+    s->cluster = new_s->cluster;
+    s->image = new_s->image;
+    s->snap = new_s->snap;
+    reopen_state->bs->read_only = (s->snap != NULL);
+
+    g_free(reopen_state->opaque);
+    reopen_state->opaque = NULL;
+}
+
 #ifdef LIBRBD_SUPPORTS_DISCARD
 static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
                                         int64_t sector_num,
@@ -991,6 +1110,9 @@ static BlockDriver bdrv_rbd = {
     .create_opts = &qemu_rbd_create_opts,
     .bdrv_getlength = qemu_rbd_getlength,
     .bdrv_truncate = qemu_rbd_truncate,
+    .bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
+    .bdrv_reopen_commit = qemu_rbd_reopen_commit,
+    .bdrv_reopen_abort = qemu_rbd_reopen_abort,
     .protocol_name = "rbd",
 
     .bdrv_aio_readv = qemu_rbd_aio_readv,
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 98b4b1a..583c4a9 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1759,3 +1759,17 @@ ETEXI
 STEXI
 @end table
 ETEXI
+
+
+    {
+        .name = "reload-rbd-config",
+        .args_type = "device:s",
+        .params = "device",
+        .help = "reload rbd ceph config live",
+        .mhandler.cmd = hmp_reload_rbd_config,
+    },
+
+STEXI
+@item reload-rbd-config
+Reload ceph config for RBD image.
+ETEXI
diff --git a/hmp.c b/hmp.c
index 997a768..597fe74 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2475,3 +2475,16 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
 
     qapi_free_HotpluggableCPUList(saved);
 }
+
+void hmp_reload_rbd_config(Monitor *mon, const QDict *qdict)
+{
+    const char *device = qdict_get_str(qdict, "device");
+    Error *err = NULL;
+
+    qmp_reload_rbd_config(device, &err);
+    if (err) {
+        monitor_printf(mon, "%s\n", error_get_pretty(err));
+        error_free(err);
+        return;
+    }
+}
diff --git a/hmp.h b/hmp.h
index f5d9749..8d2edf7 100644
--- a/hmp.h
+++ b/hmp.h
@@ -133,5 +133,6 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
+void hmp_reload_rbd_config(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 0964eec..2a30cc7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4308,3 +4308,16 @@
 # Since: 2.7
 ##
 { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'] }
+
+##
+# @reload-rbd-config
+#
+# Reload the ceph config for a given RBD block device attached to the VM.
+#
+# @device: Name of the device.
+#
+# Returns: nothing on success.
+#
+# Since: 2.7
+##
+{'command': 'reload-rbd-config', 'data': { 'device': 'str' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index b444c20..6db6775 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -4983,3 +4983,24 @@ Example for pseries machine type started with
      { "props": { "core": 0 }, "type": "POWER8-spapr-cpu-core",
        "vcpus-count": 1, "qom-path": "/machine/unattached/device[0]"}
    ]}'
+
+EQMP
+
+    {
+        .name = "reload-rbd-config",
+        .args_type = "device:s",
+        .mhandler.cmd_new = qmp_marshal_reload_rbd_config,
+    },
+
+SQMP
+reload-rbd-config
+-----------------------------------------
+
+Reload the ceph config for an RBD block device.
+
+Arguments: "device": name of the RBD block device (json-string)
+
+Example:
+
+-> { "execute": "reload-rbd-config", "arguments": { "device": "drive-virtio-disk0" } }
+<- { "return": {} }
diff --git a/qmp.c b/qmp.c
index 7df6543..d1205ac 100644
--- a/qmp.c
+++ b/qmp.c
@@ -708,3 +708,34 @@ ACPIOSTInfoList *qmp_query_acpi_ospm_status(Error **errp)
 
     return head;
 }
+
+void qmp_reload_rbd_config(const char *device, Error **errp)
+{
+    BlockBackend *blk;
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+    int ret;
+
+    blk = blk_by_name(device);
+    if (!blk) {
+        error_setg(errp, QERR_INVALID_PARAMETER, "device");
+        return;
+    }
+
+    bs = blk_bs(blk);
+    if (!bs) {
+        error_setg(errp, "no BDS found");
+        return;
+    }
+
+    ret = bdrv_reopen(bs, bdrv_get_flags(bs), &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    if (ret) {
+        error_setg_errno(errp, -ret, "failed reopening device");
+        return;
+    }
+}
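
Note for reviewers: the prepare/commit/abort sequence described in the
commit message can be sketched in isolation. This is a simplified
illustration under stated assumptions, not the code from this patch:
Conn, State, conn_open() and reload() are hypothetical stand-ins, and no
librados, librbd or QEMU block-layer call is used.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backend connection; not a librados type. */
typedef struct Conn {
    char target[64];
} Conn;

/* Hypothetical driver state: the connection currently serving I/O. */
typedef struct State {
    Conn *active;
} State;

/* Pretend to connect; an empty target simulates a connection failure. */
static Conn *conn_open(const char *target)
{
    Conn *c;

    if (target == NULL || target[0] == '\0') {
        return NULL;
    }
    c = calloc(1, sizeof(*c));
    if (c != NULL) {
        snprintf(c->target, sizeof(c->target), "%s", target);
    }
    return c;
}

/* Phase 1: build the replacement on the side; live state is untouched. */
static int reopen_prepare(const char *target, Conn **staged)
{
    *staged = conn_open(target);
    return *staged != NULL ? 0 : -1;
}

/* Phase 2a: must not fail; release the old connection, adopt the new. */
static void reopen_commit(State *s, Conn *staged)
{
    assert(staged != NULL);
    free(s->active);
    s->active = staged;
}

/* Phase 2b: discard whatever prepare staged; the old connection stays. */
static void reopen_abort(Conn *staged)
{
    free(staged);
}

/* Drive the transaction: commit only if prepare succeeded. */
static int reload(State *s, const char *target)
{
    Conn *staged = NULL;

    if (reopen_prepare(target, &staged) < 0) {
        reopen_abort(staged);
        return -1;
    }
    reopen_commit(s, staged);
    return 0;
}
```

The point of the split is that everything that can fail happens in
reopen_prepare(), so a failed reload leaves s->active exactly as it was,
while reopen_commit() only swaps pointers and frees the old state.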