From patchwork Mon Apr 25 21:50:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12826282
From: Leonardo Bras
To: Marc-André Lureau, Paolo Bonzini, Elena Ufimtseva, Jagannathan Raman, John G Johnson, Daniel P. Berrangé, Juan Quintela, "Dr.
 David Alan Gilbert", Eric Blake, Markus Armbruster, Fam Zheng, Peter Xu
Cc: Leonardo Bras, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: [PATCH v9 1/7] QIOChannel: Add flags on io_writev and introduce io_flush callback
Date: Mon, 25 Apr 2022 18:50:50 -0300
Message-Id: <20220425215055.611825-2-leobras@redhat.com>
X-Mailer: git-send-email 2.36.0
In-Reply-To: <20220425215055.611825-1-leobras@redhat.com>
References: <20220425215055.611825-1-leobras@redhat.com>

Add flags to io_writev and introduce io_flush as an optional callback to
QIOChannelClass, allowing subclasses to implement zero copy writes.

How to use them:
- Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
- Wait for write completion with qio_channel_flush().

Notes:
As some zero copy write implementations work asynchronously, it's
recommended to keep the write buffer untouched until the return of
qio_channel_flush(), to avoid the risk of sending an updated buffer
instead of the buffer state during write.

As the io_flush callback is optional, if a subclass does not implement it,
then:
- qio_channel_flush() will return 0 without changing anything.

Also, some functions like qio_channel_writev_full_all() were adapted to
receive a flag parameter.
That allows shared code between zero copy and non-zero copy writev, and
also an easier implementation of new flags.

Signed-off-by: Leonardo Bras
Reviewed-by: Daniel P. Berrangé
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
---
 include/io/channel.h                | 38 +++++++++++++++++++++-
 chardev/char-io.c                   |  2 +-
 hw/remote/mpqemu-link.c             |  2 +-
 io/channel-buffer.c                 |  1 +
 io/channel-command.c                |  1 +
 io/channel-file.c                   |  1 +
 io/channel-socket.c                 |  2 ++
 io/channel-tls.c                    |  1 +
 io/channel-websock.c                |  1 +
 io/channel.c                        | 49 +++++++++++++++++++++++------
 migration/rdma.c                    |  1 +
 scsi/pr-manager-helper.c            |  2 +-
 tests/unit/test-io-channel-socket.c |  1 +
 13 files changed, 88 insertions(+), 14 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h index 88988979f8..c680ee7480 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -32,12 +32,15 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass, #define QIO_CHANNEL_ERR_BLOCK -2 +#define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1 + typedef enum QIOChannelFeature QIOChannelFeature; enum QIOChannelFeature { QIO_CHANNEL_FEATURE_FD_PASS, QIO_CHANNEL_FEATURE_SHUTDOWN, QIO_CHANNEL_FEATURE_LISTEN, + QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY, }; @@ -104,6 +107,7 @@ struct QIOChannelClass { size_t niov, int *fds, size_t nfds, + int flags, Error **errp); ssize_t (*io_readv)(QIOChannel *ioc, const struct iovec *iov, @@ -136,6 +140,8 @@ struct QIOChannelClass { IOHandler *io_read, IOHandler *io_write, void *opaque); + int (*io_flush)(QIOChannel *ioc, + Error **errp); }; /* General I/O handling functions */ @@ -228,6 +234,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc, * @niov: the length of the @iov array * @fds: an array of file handles to send * @nfds: number of file handles in @fds + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * Write data to the IO channel, reading it from the @@ -260,6 +267,7 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, 
+ int flags, Error **errp); /** @@ -837,6 +845,7 @@ int qio_channel_readv_full_all(QIOChannel *ioc, * @niov: the length of the @iov array * @fds: an array of file handles to send * @nfds: number of file handles in @fds + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * @@ -846,6 +855,14 @@ int qio_channel_readv_full_all(QIOChannel *ioc, * to be written, yielding from the current coroutine * if required. * + * If QIO_CHANNEL_WRITE_FLAG_ZERO_COPY is passed in flags, + * instead of waiting for all requested data to be written, + * this function will wait until it's all queued for writing. + * In this case, if the buffer gets changed between queueing and + * sending, the updated buffer will be sent. If this is not a + * desired behavior, it's suggested to call qio_channel_flush() + * before reusing the buffer. + * * Returns: 0 if all bytes were written, or -1 on error */ @@ -853,6 +870,25 @@ int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, - Error **errp); + int flags, Error **errp); + +/** + * qio_channel_flush: + * @ioc: the channel object + * @errp: pointer to a NULL-initialized error object + * + * Will block until every packet queued with + * qio_channel_writev_full() + QIO_CHANNEL_WRITE_FLAG_ZERO_COPY + * is sent, or return in case of any error. + * + * If not implemented, acts as a no-op, and returns 0. + * + * Returns -1 if any error is found, + * 1 if every send failed to use zero copy. + * 0 otherwise. 
+ */ + +int qio_channel_flush(QIOChannel *ioc, + Error **errp); #endif /* QIO_CHANNEL_H */ diff --git a/chardev/char-io.c b/chardev/char-io.c index 8ced184160..4451128cba 100644 --- a/chardev/char-io.c +++ b/chardev/char-io.c @@ -122,7 +122,7 @@ int io_channel_send_full(QIOChannel *ioc, ret = qio_channel_writev_full( ioc, &iov, 1, - fds, nfds, NULL); + fds, nfds, 0, NULL); if (ret == QIO_CHANNEL_ERR_BLOCK) { if (offset) { return offset; diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c index 2a4aa651ca..9bd98e8219 100644 --- a/hw/remote/mpqemu-link.c +++ b/hw/remote/mpqemu-link.c @@ -68,7 +68,7 @@ bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp) } if (!qio_channel_writev_full_all(ioc, send, G_N_ELEMENTS(send), - fds, nfds, errp)) { + fds, nfds, 0, errp)) { ret = true; } else { trace_mpqemu_send_io_error(msg->cmd, msg->size, nfds); diff --git a/io/channel-buffer.c b/io/channel-buffer.c index baa4e2b089..bf52011be2 100644 --- a/io/channel-buffer.c +++ b/io/channel-buffer.c @@ -81,6 +81,7 @@ static ssize_t qio_channel_buffer_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc); diff --git a/io/channel-command.c b/io/channel-command.c index 338da73ade..54560464ae 100644 --- a/io/channel-command.c +++ b/io/channel-command.c @@ -258,6 +258,7 @@ static ssize_t qio_channel_command_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc); diff --git a/io/channel-file.c b/io/channel-file.c index d7cf6d278f..ef6807a6be 100644 --- a/io/channel-file.c +++ b/io/channel-file.c @@ -114,6 +114,7 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc); diff --git a/io/channel-socket.c b/io/channel-socket.c index 9f5ddf68b6..696a04dc9c 100644 --- a/io/channel-socket.c +++ 
b/io/channel-socket.c @@ -524,6 +524,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); @@ -619,6 +620,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); diff --git a/io/channel-tls.c b/io/channel-tls.c index 2ae1b92fc0..4ce890a538 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -301,6 +301,7 @@ static ssize_t qio_channel_tls_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc); diff --git a/io/channel-websock.c b/io/channel-websock.c index 55145a6a8c..9619906ac3 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -1127,6 +1127,7 @@ static ssize_t qio_channel_websock_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelWebsock *wioc = QIO_CHANNEL_WEBSOCK(ioc); diff --git a/io/channel.c b/io/channel.c index e8b019dc36..0640941ac5 100644 --- a/io/channel.c +++ b/io/channel.c @@ -72,18 +72,32 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); - if ((fds || nfds) && - !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) { + if (fds || nfds) { + if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) { + error_setg_errno(errp, EINVAL, + "Channel does not support file descriptor passing"); + return -1; + } + if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) { + error_setg_errno(errp, EINVAL, + "Zero Copy does not support file descriptor passing"); + return -1; + } + } + + if ((flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) && + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) { error_setg_errno(errp, EINVAL, - "Channel does not support 
file descriptor passing"); + "Requested Zero Copy feature is not available"); return -1; } - return klass->io_writev(ioc, iov, niov, fds, nfds, errp); + return klass->io_writev(ioc, iov, niov, fds, nfds, flags, errp); } @@ -217,14 +231,14 @@ int qio_channel_writev_all(QIOChannel *ioc, size_t niov, Error **errp) { - return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, 0, errp); } int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, - Error **errp) + int flags, Error **errp) { int ret = -1; struct iovec *local_iov = g_new(struct iovec, niov); @@ -237,8 +251,10 @@ int qio_channel_writev_full_all(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; - len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, - errp); + + len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, + nfds, flags, errp); + if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_OUT); @@ -277,7 +293,7 @@ ssize_t qio_channel_writev(QIOChannel *ioc, size_t niov, Error **errp) { - return qio_channel_writev_full(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full(ioc, iov, niov, NULL, 0, 0, errp); } @@ -297,7 +313,7 @@ ssize_t qio_channel_write(QIOChannel *ioc, Error **errp) { struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen }; - return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, errp); + return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, 0, errp); } @@ -473,6 +489,19 @@ off_t qio_channel_io_seek(QIOChannel *ioc, return klass->io_seek(ioc, offset, whence, errp); } +int qio_channel_flush(QIOChannel *ioc, + Error **errp) +{ + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); + + if (!klass->io_flush || + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) { + return 0; + } + + return klass->io_flush(ioc, errp); +} + static void qio_channel_restart_read(void 
*opaque) { diff --git a/migration/rdma.c b/migration/rdma.c index ef1e65ec36..672d1958a9 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2840,6 +2840,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); diff --git a/scsi/pr-manager-helper.c b/scsi/pr-manager-helper.c index 451c7631b7..3be52a98d5 100644 --- a/scsi/pr-manager-helper.c +++ b/scsi/pr-manager-helper.c @@ -77,7 +77,7 @@ static int pr_manager_helper_write(PRManagerHelper *pr_mgr, iov.iov_base = (void *)buf; iov.iov_len = sz; n_written = qio_channel_writev_full(QIO_CHANNEL(pr_mgr->ioc), &iov, 1, - nfds ? &fd : NULL, nfds, errp); + nfds ? &fd : NULL, nfds, 0, errp); if (n_written <= 0) { assert(n_written != QIO_CHANNEL_ERR_BLOCK); diff --git a/tests/unit/test-io-channel-socket.c b/tests/unit/test-io-channel-socket.c index c49eec1f03..6713886d02 100644 --- a/tests/unit/test-io-channel-socket.c +++ b/tests/unit/test-io-channel-socket.c @@ -444,6 +444,7 @@ static void test_io_channel_unix_fd_pass(void) G_N_ELEMENTS(iosend), fdsend, G_N_ELEMENTS(fdsend), + 0, &error_abort); qio_channel_readv_full(dst,

From patchwork Mon Apr 25 21:50:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12826279
From: Leonardo Bras
To: Marc-André Lureau, Paolo Bonzini, Elena Ufimtseva, Jagannathan Raman, John G Johnson, Daniel P. Berrangé, Juan Quintela, "Dr.
 David Alan Gilbert", Eric Blake, Markus Armbruster, Fam Zheng, Peter Xu
Cc: Leonardo Bras, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: [PATCH v9 2/7] QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX
Date: Mon, 25 Apr 2022 18:50:51 -0300
Message-Id: <20220425215055.611825-3-leobras@redhat.com>
X-Mailer: git-send-email 2.36.0
In-Reply-To: <20220425215055.611825-1-leobras@redhat.com>
References: <20220425215055.611825-1-leobras@redhat.com>

For CONFIG_LINUX, implement the new zero copy flag and the optional
io_flush callback on QIOChannelSocket, but enable them only when the
MSG_ZEROCOPY feature is available in the host kernel, which is checked
in qio_channel_socket_connect_sync().

qio_channel_socket_flush() is implemented by counting how many times
sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
socket's error queue to find how many of them finished sending.
Flush loops until those counters are the same, or until some error occurs.

Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
1: Buffer
- As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid
  copying, some caution is necessary to avoid overwriting any buffer
  before it's sent.
  If something like this happens, a newer version of the buffer may be
  sent instead.
- If this is a problem, it's recommended to call qio_channel_flush()
  before freeing or re-using the buffer.
2: Locked memory
- When using MSG_ZEROCOPY, the buffer memory will be locked after being
  queued, and unlocked after it's sent.
- Depending on the size of each buffer, and how often it's sent, it may
  require a larger amount of locked memory than is usually available to
  a non-root user.
- If the required amount of locked memory is not available, a writev with
  QIO_CHANNEL_WRITE_FLAG_ZERO_COPY will return an error, which can abort
  an operation like migration.
- Because of this, when user code wants to add zero copy as a feature,
  it requires a mechanism to disable it, so it can still be accessible
  to less privileged users.

Signed-off-by: Leonardo Bras
Reviewed-by: Peter Xu
Reviewed-by: Daniel P. Berrangé
Reviewed-by: Juan Quintela
---
 include/io/channel-socket.h |   2 +
 io/channel-socket.c         | 108 ++++++++++++++++++++++++++++++++++--
 2 files changed, 106 insertions(+), 4 deletions(-)

diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h index e747e63514..513c428fe4 100644 --- a/include/io/channel-socket.h +++ b/include/io/channel-socket.h @@ -47,6 +47,8 @@ struct QIOChannelSocket { socklen_t localAddrLen; struct sockaddr_storage remoteAddr; socklen_t remoteAddrLen; + ssize_t zero_copy_queued; + ssize_t zero_copy_sent; }; diff --git a/io/channel-socket.c b/io/channel-socket.c index 696a04dc9c..1dd85fc1ef 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -25,6 +25,10 @@ #include "io/channel-watch.h" #include "trace.h" #include "qapi/clone-visitor.h" +#ifdef CONFIG_LINUX +#include +#include +#endif #define SOCKET_MAX_FDS 16 @@ -54,6 +58,8 @@ qio_channel_socket_new(void) sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)); sioc->fd = -1; + sioc->zero_copy_queued = 0; + sioc->zero_copy_sent = 0; ioc = QIO_CHANNEL(sioc); qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); @@ -153,6 
+159,16 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc, return -1; } +#ifdef CONFIG_LINUX + int ret, v = 1; + ret = setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)); + if (ret == 0) { + /* Zero copy available on host */ + qio_channel_set_feature(QIO_CHANNEL(ioc), + QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY); + } +#endif + return 0; } @@ -533,6 +549,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)]; size_t fdsize = sizeof(int) * nfds; struct cmsghdr *cmsg; + int sflags = 0; memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)); @@ -557,15 +574,27 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, memcpy(CMSG_DATA(cmsg), fds, fdsize); } + if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) { + sflags = MSG_ZEROCOPY; + } + retry: - ret = sendmsg(sioc->fd, &msg, 0); + ret = sendmsg(sioc->fd, &msg, sflags); if (ret <= 0) { - if (errno == EAGAIN) { + switch (errno) { + case EAGAIN: return QIO_CHANNEL_ERR_BLOCK; - } - if (errno == EINTR) { + case EINTR: goto retry; + case ENOBUFS: + if (sflags & MSG_ZEROCOPY) { + error_setg_errno(errp, errno, + "Process can't lock enough memory for using MSG_ZEROCOPY"); + return -1; + } + break; } + error_setg_errno(errp, errno, "Unable to write to socket"); return -1; @@ -659,6 +688,74 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } #endif /* WIN32 */ + +#ifdef CONFIG_LINUX +static int qio_channel_socket_flush(QIOChannel *ioc, + Error **errp) +{ + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); + struct msghdr msg = {}; + struct sock_extended_err *serr; + struct cmsghdr *cm; + char control[CMSG_SPACE(sizeof(*serr))]; + int received; + int ret = 1; + + msg.msg_control = control; + msg.msg_controllen = sizeof(control); + memset(control, 0, sizeof(control)); + + while (sioc->zero_copy_sent < sioc->zero_copy_queued) { + received = recvmsg(sioc->fd, &msg, MSG_ERRQUEUE); + if (received < 0) { + switch (errno) { + case EAGAIN: + /* 
Nothing on errqueue, wait until something is available */ + qio_channel_wait(ioc, G_IO_ERR); + continue; + case EINTR: + continue; + default: + error_setg_errno(errp, errno, + "Unable to read errqueue"); + return -1; + } + } + + cm = CMSG_FIRSTHDR(&msg); + if (cm->cmsg_level != SOL_IP && + cm->cmsg_type != IP_RECVERR) { + error_setg_errno(errp, EPROTOTYPE, + "Wrong cmsg in errqueue"); + return -1; + } + + serr = (void *) CMSG_DATA(cm); + if (serr->ee_errno != SO_EE_ORIGIN_NONE) { + error_setg_errno(errp, serr->ee_errno, + "Error on socket"); + return -1; + } + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + error_setg_errno(errp, serr->ee_origin, + "Error not from zero copy"); + return -1; + } + + /* No errors, count successfully finished sendmsg()*/ + sioc->zero_copy_sent += serr->ee_data - serr->ee_info + 1; + + /* If any sendmsg() succeeded using zero copy, return 0 at the end */ + if (serr->ee_code != SO_EE_CODE_ZEROCOPY_COPIED) { + ret = 0; + } + } + + return ret; +} + +#endif /* CONFIG_LINUX */ + static int qio_channel_socket_set_blocking(QIOChannel *ioc, bool enabled, @@ -789,6 +886,9 @@ static void qio_channel_socket_class_init(ObjectClass *klass, ioc_klass->io_set_delay = qio_channel_socket_set_delay; ioc_klass->io_create_watch = qio_channel_socket_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_socket_set_aio_fd_handler; +#ifdef CONFIG_LINUX + ioc_klass->io_flush = qio_channel_socket_flush; +#endif } static const TypeInfo qio_channel_socket_info = {

From patchwork Mon Apr 25 21:50:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12826280
From: Leonardo Bras
To: Marc-André Lureau, Paolo Bonzini, Elena Ufimtseva, Jagannathan Raman, John G Johnson, Daniel P. Berrangé, Juan Quintela, "Dr.
David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v9 3/7] migration: Add zero-copy-send parameter for QMP/HMP for Linux Date: Mon, 25 Apr 2022 18:50:52 -0300 Message-Id: <20220425215055.611825-4-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220425215055.611825-1-leobras@redhat.com> References: <20220425215055.611825-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add a property that allows zero-copy migration of memory pages on the sending side, and also include a helper function migrate_use_zero_copy_send() to check if it's enabled. No code is introduced to actually do the migration, but it allows future implementations to enable/disable this feature. On non-Linux builds this parameter is compiled out. Signed-off-by: Leonardo Bras Reviewed-by: Peter Xu Reviewed-by: Daniel P. 
Berrangé Reviewed-by: Juan Quintela Acked-by: Markus Armbruster --- qapi/migration.json | 24 ++++++++++++++++++++++++ migration/migration.h | 5 +++++ migration/migration.c | 32 ++++++++++++++++++++++++++++++++ migration/socket.c | 11 +++++++++-- monitor/hmp-cmds.c | 6 ++++++ 5 files changed, 76 insertions(+), 2 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 409eb086a2..04246481ce 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -741,6 +741,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -780,6 +787,7 @@ 'xbzrle-cache-size', 'max-postcopy-bandwidth', 'max-cpu-throttle', 'multifd-compression', 'multifd-zlib-level' ,'multifd-zstd-level', + { 'name': 'zero-copy-send', 'if' : 'CONFIG_LINUX'}, 'block-bitmap-mapping' ] } ## @@ -906,6 +914,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. 
Such # aliases may for example be the corresponding names on the @@ -960,6 +975,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zero-copy-send': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## @@ -1106,6 +1122,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -1158,6 +1181,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zero-copy-send': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## diff --git a/migration/migration.h b/migration/migration.h index a863032b71..e8f2941a55 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -375,6 +375,11 @@ MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +#ifdef CONFIG_LINUX +bool migrate_use_zero_copy_send(void); +#else +#define migrate_use_zero_copy_send() (false) +#endif int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/migration.c b/migration/migration.c index 5a31b23bd6..3e91f4b5e2 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -910,6 +910,10 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->multifd_zlib_level = s->parameters.multifd_zlib_level; params->has_multifd_zstd_level = true; 
params->multifd_zstd_level = s->parameters.multifd_zstd_level; +#ifdef CONFIG_LINUX + params->has_zero_copy_send = true; + params->zero_copy_send = s->parameters.zero_copy_send; +#endif params->has_xbzrle_cache_size = true; params->xbzrle_cache_size = s->parameters.xbzrle_cache_size; params->has_max_postcopy_bandwidth = true; @@ -1567,6 +1571,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_multifd_compression) { dest->multifd_compression = params->multifd_compression; } +#ifdef CONFIG_LINUX + if (params->has_zero_copy_send) { + dest->zero_copy_send = params->zero_copy_send; + } +#endif if (params->has_xbzrle_cache_size) { dest->xbzrle_cache_size = params->xbzrle_cache_size; } @@ -1679,6 +1688,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_multifd_compression) { s->parameters.multifd_compression = params->multifd_compression; } +#ifdef CONFIG_LINUX + if (params->has_zero_copy_send) { + s->parameters.zero_copy_send = params->zero_copy_send; + } +#endif if (params->has_xbzrle_cache_size) { s->parameters.xbzrle_cache_size = params->xbzrle_cache_size; xbzrle_cache_resize(params->xbzrle_cache_size, errp); @@ -2563,6 +2577,17 @@ int migrate_multifd_zstd_level(void) return s->parameters.multifd_zstd_level; } +#ifdef CONFIG_LINUX +bool migrate_use_zero_copy_send(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.zero_copy_send; +} +#endif + int migrate_use_xbzrle(void) { MigrationState *s; @@ -4206,6 +4231,10 @@ static Property migration_properties[] = { DEFINE_PROP_UINT8("multifd-zstd-level", MigrationState, parameters.multifd_zstd_level, DEFAULT_MIGRATE_MULTIFD_ZSTD_LEVEL), +#ifdef CONFIG_LINUX + DEFINE_PROP_BOOL("zero_copy_send", MigrationState, + parameters.zero_copy_send, false), +#endif DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState, parameters.xbzrle_cache_size, DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE), @@ -4303,6 +4332,9 @@ static void 
migration_instance_init(Object *obj) params->has_multifd_compression = true; params->has_multifd_zlib_level = true; params->has_multifd_zstd_level = true; +#ifdef CONFIG_LINUX + params->has_zero_copy_send = true; +#endif params->has_xbzrle_cache_size = true; params->has_max_postcopy_bandwidth = true; params->has_max_cpu_throttle = true; diff --git a/migration/socket.c b/migration/socket.c index 05705a32d8..3754d8f72c 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -74,9 +74,16 @@ static void socket_outgoing_migration(QIOTask *task, if (qio_task_propagate_error(task, &err)) { trace_migration_socket_outgoing_error(error_get_pretty(err)); - } else { - trace_migration_socket_outgoing_connected(data->hostname); + goto out; } + + trace_migration_socket_outgoing_connected(data->hostname); + + if (migrate_use_zero_copy_send()) { + error_setg(&err, "Zero copy send not available in migration"); + } + +out: migration_channel_connect(data->s, sioc, data->hostname, err); object_unref(OBJECT(sioc)); } diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 634968498b..55b48d3733 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -1309,6 +1309,12 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_multifd_zstd_level = true; visit_type_uint8(v, param, &p->multifd_zstd_level, &err); break; +#ifdef CONFIG_LINUX + case MIGRATION_PARAMETER_ZERO_COPY_SEND: + p->has_zero_copy_send = true; + visit_type_bool(v, param, &p->zero_copy_send, &err); + break; +#endif case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE: p->has_xbzrle_cache_size = true; if (!visit_type_size(v, param, &cache_size, &err)) { From patchwork Mon Apr 25 21:50:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12826285 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 75C09C433F5 for ; Mon, 25 Apr 2022 21:58:29 +0000 (UTC) Received: from localhost ([::1]:36796 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nj6iy-00019Z-KC for qemu-devel@archiver.kernel.org; Mon, 25 Apr 2022 17:58:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57220) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cc-0005hX-3l for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:51:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:35525) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cZ-00014H-CK for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:51:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1650923510; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OCN4rR4bAIHd7dZhYCUf3JnbZmufgbcGPMyBtVIgbmE=; b=ZotaaWku+rTv82kpAo0PbraBAeXQDHMaZY7kDkVHQ9bg258uQpYq4Jt4gcqp7ifbNczCMk V7Dycfz4ieagOomexpEvnVj29WXz515orgAj0b7zyiCb3A9Q//CaNdkuHqtMJrjejQdWfF hytJj5W9ZajSlRGyrw7AzMb2tXEpGTQ= Received: from mail-ot1-f70.google.com (mail-ot1-f70.google.com [209.85.210.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-149-IC7ewV_ZP1SZog2EXnGsCw-1; Mon, 25 Apr 2022 17:51:48 -0400 X-MC-Unique: IC7ewV_ZP1SZog2EXnGsCw-1 Received: by mail-ot1-f70.google.com with SMTP id x97-20020a9d20ea000000b00604724bc745so6838034ota.3 for ; Mon, 25 Apr 2022 14:51:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OCN4rR4bAIHd7dZhYCUf3JnbZmufgbcGPMyBtVIgbmE=; b=yCoqgAZ37H4nCt43UtCI1qH+7nx92Op2UBaU4f2oJQb1KH1LmOyiIS4lExLZicbbv3 /EkRi1LpzSV6zKHYREs4SH/SwZ1kBCm9T5SKPlbPAMqYGWRSGTA+xWhmY1NjYAhoNe1a nKmZDAGUM03dMJ+Mf0jpbZC98tL/bmueEqanalh0FWazQ58zTLJFm8nDWFEaQKExcFPD PCja22xVc1WlmJGVHR2WST4I54nuyQV1cU51xtDaUmOq+LqT9u11Zuz1XVr9ft4AvLnj /ApORxAE64IwolopQMlAZ4IZvi6tnUEBDqjD2NuoDnRKr+pgYoUTXuxQy5W7fLJFfypp lYMg== X-Gm-Message-State: AOAM530IqGlC9QgW1fG/fNligIx6MZlhGUx0e2OIayDdV8KijKzdx0N+ iPChTDQRN4aCzrk/UvckxK80b/dS8nyJErKdyNV2+tpubiAWRZZHn/LJjdHq4wcdhrSOulQ9YAt NC6Kp5hxrX2Mc1EI= X-Received: by 2002:a9d:1702:0:b0:604:f0f2:bea2 with SMTP id i2-20020a9d1702000000b00604f0f2bea2mr7192670ota.362.1650923507887; Mon, 25 Apr 2022 14:51:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxbLfrZ95+XVYfEKJk0pQeBg1Q03FfZjNkcneKwphUXgEueWhIWDqBcFrWoADQ0jeBzMaBrkw== X-Received: by 2002:a9d:1702:0:b0:604:f0f2:bea2 with SMTP id i2-20020a9d1702000000b00604f0f2bea2mr7192653ota.362.1650923507689; Mon, 25 Apr 2022 14:51:47 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q7-20020a056870e60700b000e686d13878sm156807oag.18.2022.04.25.14.51.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Apr 2022 14:51:47 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v9 4/7] migration: Add migrate_use_tls() helper Date: Mon, 25 Apr 2022 18:50:53 -0300 Message-Id: <20220425215055.611825-5-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220425215055.611825-1-leobras@redhat.com> References: <20220425215055.611825-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" A lot of places check parameters.tls_creds in order to evaluate if TLS is in use, and sometimes call migrate_get_current() just for that test. Add new helper function migrate_use_tls() in order to simplify testing for TLS usage. Signed-off-by: Leonardo Bras Reviewed-by: Juan Quintela Reviewed-by: Peter Xu Reviewed-by: Daniel P. 
Berrangé --- migration/migration.h | 1 + migration/channel.c | 3 +-- migration/migration.c | 9 +++++++++ migration/multifd.c | 5 +---- 4 files changed, 12 insertions(+), 6 deletions(-) diff --git a/migration/migration.h b/migration/migration.h index e8f2941a55..485d58b95f 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -380,6 +380,7 @@ bool migrate_use_zero_copy_send(void); #else #define migrate_use_zero_copy_send() (false) #endif +int migrate_use_tls(void); int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/channel.c b/migration/channel.c index c6a8dcf1d7..a162d00fea 100644 --- a/migration/channel.c +++ b/migration/channel.c @@ -38,8 +38,7 @@ void migration_channel_process_incoming(QIOChannel *ioc) trace_migration_set_incoming_channel( ioc, object_get_typename(OBJECT(ioc))); - if (s->parameters.tls_creds && - *s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { migration_tls_channel_process_incoming(s, ioc, &local_err); diff --git a/migration/migration.c b/migration/migration.c index 3e91f4b5e2..4b6df2eb5e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2588,6 +2588,15 @@ bool migrate_use_zero_copy_send(void) } #endif +int migrate_use_tls(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.tls_creds && *s->parameters.tls_creds; +} + int migrate_use_xbzrle(void) { MigrationState *s; diff --git a/migration/multifd.c b/migration/multifd.c index 9ea4f581e2..2a8c8570c3 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -782,15 +782,12 @@ static bool multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc, Error *error) { - MigrationState *s = migrate_get_current(); - trace_multifd_set_outgoing_channel( ioc, object_get_typename(OBJECT(ioc)), migrate_get_current()->hostname, error); if (!error) { - if (s->parameters.tls_creds && - 
*s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { multifd_tls_channel_connect(p, ioc, &error); From patchwork Mon Apr 25 21:50:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12826281 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B1C91C433FE for ; Mon, 25 Apr 2022 21:54:25 +0000 (UTC) Received: from localhost ([::1]:52486 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nj6f2-00015r-JO for qemu-devel@archiver.kernel.org; Mon, 25 Apr 2022 17:54:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57252) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6ce-0005ha-L8 for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:51:57 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:23572) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cc-00016C-TH for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:51:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1650923514; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RQFjS0vcfoRuqRYCfaPRvlIl2V4fbZjAVowp1pGheKU=; b=ilXo0Hc0FXrl5Hcvy6YdrU9X6fOt5a5UfajPDRsDnQRLn40JdvS0ZDsd41ZyAOMoCzmgbo 9CTrDZJ+tiYs64Yhm8zq9uVtHjiXcvn51aSiYQWAuXY2MQ4/95bJbbU33SbXsLHvOkmv1v QANMU2wGX6YGcH5Wjm49nL+SjZsKTDU= Received: from 
mail-oa1-f69.google.com (mail-oa1-f69.google.com [209.85.160.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-126-me_ZPjp_NomF3ZYkJH1CeA-1; Mon, 25 Apr 2022 17:51:52 -0400 X-MC-Unique: me_ZPjp_NomF3ZYkJH1CeA-1 Received: by mail-oa1-f69.google.com with SMTP id 586e51a60fabf-e630d5d6easo5522481fac.10 for ; Mon, 25 Apr 2022 14:51:52 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RQFjS0vcfoRuqRYCfaPRvlIl2V4fbZjAVowp1pGheKU=; b=TcyK9+dzdBGo+lZwv6ofN3F1/e46RhcaK09HeJCs+IgsTQ4H49E0iyrJmy1cG9G/wN LuDgiSeUT0Bx8zYggDLA8BiLft9QGlX6+9vB/eiFdNWyXxAdV4ljvONcbPDgLWqLaFKw vtxJcb+qtw0F+ikxKAhOlP3MQgBbg4g2ZJhIcEIGIzmit4n5FpyHABCcnApM8fmbJdje qw4o+yQJbaKmMDYNmoj1qHyR18Kd1C4u7bbuO5fRHxmJFkn4dYGbZdELgYFjoMjCjzhi Ua7m8Yye7LwJ3O0FGazh9MSvmwAiejZvDoYIU08XcSgX1WuI9iPyx0Zr/P9mccm8fGFM j2cw== X-Gm-Message-State: AOAM533HFrzQMdO5z4560f7qWvlDZS/v7doJhD7kjsbiAz1KLgi7zbUU cbGqKA//ZKQYVNG0kt4lAr7XV8JpPLzGaTLrZwllJIZDzo40FFVGAaZeOwMiffpXajdX5LYNM0J ZBSDp+abUKzQ/C5I= X-Received: by 2002:a05:6870:ec8c:b0:e9:365:7a53 with SMTP id eo12-20020a056870ec8c00b000e903657a53mr7220150oab.269.1650923512052; Mon, 25 Apr 2022 14:51:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJynrdD0HIc9Um81wGUkYntFadtCDhfO3FDO9RX4lDB16UBDyNwMoZhhmJk8UEc1pab7wAz9bQ== X-Received: by 2002:a05:6870:ec8c:b0:e9:365:7a53 with SMTP id eo12-20020a056870ec8c00b000e903657a53mr7220138oab.269.1650923511814; Mon, 25 Apr 2022 14:51:51 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q7-20020a056870e60700b000e686d13878sm156807oag.18.2022.04.25.14.51.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Apr 2022 14:51:51 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , 
Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v9 5/7] multifd: multifd_send_sync_main now returns negative on error Date: Mon, 25 Apr 2022 18:50:54 -0300 Message-Id: <20220425215055.611825-6-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220425215055.611825-1-leobras@redhat.com> References: <20220425215055.611825-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.133.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Even though multifd_send_sync_main() currently emits error_reports, its callers don't really check it before continuing. Change multifd_send_sync_main() to return -1 on error and 0 on success. Also change all its callers to make use of this change and possibly fail earlier. (This change is important to the next patch on the multifd zero-copy implementation, to make sure an error in zero-copy flush does not go unnoticed.) Signed-off-by: Leonardo Bras Reviewed-by: Daniel P. 
Berrangé Reviewed-by: Peter Xu --- migration/multifd.h | 2 +- migration/multifd.c | 10 ++++++---- migration/ram.c | 29 ++++++++++++++++++++++------- 3 files changed, 29 insertions(+), 12 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index 7d0effcb03..bcf5992945 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -20,7 +20,7 @@ int multifd_load_cleanup(Error **errp); bool multifd_recv_all_channels_created(void); bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp); void multifd_recv_sync_main(void); -void multifd_send_sync_main(QEMUFile *f); +int multifd_send_sync_main(QEMUFile *f); int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset); /* Multifd Compression flags */ diff --git a/migration/multifd.c b/migration/multifd.c index 2a8c8570c3..15fb668e64 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -566,17 +566,17 @@ void multifd_save_cleanup(void) multifd_send_state = NULL; } -void multifd_send_sync_main(QEMUFile *f) +int multifd_send_sync_main(QEMUFile *f) { int i; if (!migrate_use_multifd()) { - return; + return 0; } if (multifd_send_state->pages->num) { if (multifd_send_pages(f) < 0) { error_report("%s: multifd_send_pages fail", __func__); - return; + return -1; } } for (i = 0; i < migrate_multifd_channels(); i++) { @@ -589,7 +589,7 @@ void multifd_send_sync_main(QEMUFile *f) if (p->quit) { error_report("%s: channel %d has already quit", __func__, i); qemu_mutex_unlock(&p->mutex); - return; + return -1; } p->packet_num = multifd_send_state->packet_num++; @@ -608,6 +608,8 @@ void multifd_send_sync_main(QEMUFile *f) qemu_sem_wait(&p->sem_sync); } trace_multifd_send_sync_main(multifd_send_state->packet_num); + + return 0; } static void *multifd_send_thread(void *opaque) diff --git a/migration/ram.c b/migration/ram.c index a2489a2699..5f5e37f64d 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2909,6 +2909,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque) { RAMState **rsp = opaque; 
RAMBlock *block; + int ret; if (compress_threads_save_setup()) { return -1; @@ -2943,7 +2944,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque) ram_control_before_iterate(f, RAM_CONTROL_SETUP); ram_control_after_iterate(f, RAM_CONTROL_SETUP); - multifd_send_sync_main(f); + ret = multifd_send_sync_main(f); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); @@ -3052,7 +3057,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) out: if (ret >= 0 && migration_is_setup_or_active(migrate_get_current()->state)) { - multifd_send_sync_main(rs->f); + ret = multifd_send_sync_main(rs->f); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); ram_transferred_add(8); @@ -3112,13 +3121,19 @@ static int ram_save_complete(QEMUFile *f, void *opaque) ram_control_after_iterate(f, RAM_CONTROL_FINISH); } - if (ret >= 0) { - multifd_send_sync_main(rs->f); - qemu_put_be64(f, RAM_SAVE_FLAG_EOS); - qemu_fflush(f); + if (ret < 0) { + return ret; } - return ret; + ret = multifd_send_sync_main(rs->f); + if (ret < 0) { + return ret; + } + + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); + qemu_fflush(f); + + return 0; } static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, From patchwork Mon Apr 25 21:50:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12826283 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0808CC433FE for ; Mon, 25 Apr 2022 21:56:26 +0000 (UTC) Received: from localhost ([::1]:59086 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nj6gz-0005Z5-39 for 
qemu-devel@archiver.kernel.org; Mon, 25 Apr 2022 17:56:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57288) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cl-0005yW-Me for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:52:03 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:39382) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cj-00016s-WB for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:52:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1650923521; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hbPgZABDK158e31/mU/kG6uBy4IBjH2iSa4i92TRHlM=; b=N96wmm9Wc05Mc535IWWZ5h30BFhZtOOLgnsWJHT9JfcBwE7/8O1wvXT/CtcMcNEjENo0hv tTm8oiVl92jnAe53sF13jZwcOQ8cPm68d1jqYGEd0Jjadpw9n+xqUlUn1PdKo5Uev1Dipk LgNdBh50xlRGT78u+0m3fW3LiEd2GtQ= Received: from mail-oa1-f71.google.com (mail-oa1-f71.google.com [209.85.160.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-665-PEUL7AWuO9aTrjwAz5Zszw-1; Mon, 25 Apr 2022 17:52:00 -0400 X-MC-Unique: PEUL7AWuO9aTrjwAz5Zszw-1 Received: by mail-oa1-f71.google.com with SMTP id 586e51a60fabf-e9114e0121so1966150fac.17 for ; Mon, 25 Apr 2022 14:52:00 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hbPgZABDK158e31/mU/kG6uBy4IBjH2iSa4i92TRHlM=; b=Jm9JW/KMJA2j19WeJ4mDOQcDjCpAoClwSqmixEAVM0EoG7z9EKEskQh0DrCccSyS5r CL2WzsJ4kICgy6UI0pbwYmUoXyeV76gR63j4f/+AsvZ/lX5fpOJCE7LU2oV20Pmo6eqk DQIpNGBct8l26iPatTFCbiq/BEysKFLPjidCdfSIto7/SyjUBHsLmKU3ecinZbGW6x8k 
FukQCj82NAkWJnyHjMTwSFxCLbyDxAttub/Zumz4pXrAjwoC/37TsVq/hWmdHeRUl97m HR9SNUSzBAumHWyIeZUtHq4wUr9n9uPtVLdrnre9w++7OXlZCBULTGpCi6niXo63tIIb rYEg== X-Gm-Message-State: AOAM531qsH94BhZKyNPmVvLCsxqQpOwTKyDnjrV7OELKZUkdkdkFCHOo AYcIAyM6ERGnzsgEo4IBlRd23AfJPODJ32V7OLIxUGjLuhj3ISk6PywITNJjTb9slqfz14yOhoy +EHC28rFsUnmwCw0= X-Received: by 2002:a05:6808:2193:b0:322:ce9a:f444 with SMTP id be19-20020a056808219300b00322ce9af444mr13792598oib.245.1650923517777; Mon, 25 Apr 2022 14:51:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzAhHXFkTmVhuc30rCqihEfTzUz+f68kUvPGbQB1/HpA4VSCUhfLaWG7eRGeeOYzv5ZsM0etA== X-Received: by 2002:a05:6808:2193:b0:322:ce9a:f444 with SMTP id be19-20020a056808219300b00322ce9af444mr13792544oib.245.1650923516071; Mon, 25 Apr 2022 14:51:56 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q7-20020a056870e60700b000e686d13878sm156807oag.18.2022.04.25.14.51.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Apr 2022 14:51:55 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v9 6/7] multifd: Send header packet without flags if zero-copy-send is enabled Date: Mon, 25 Apr 2022 18:50:55 -0300 Message-Id: <20220425215055.611825-7-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220425215055.611825-1-leobras@redhat.com> References: <20220425215055.611825-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Since d48c3a0445 ("multifd: Use a single writev on the send side"), sending the header packet and the memory pages happens in the same writev, which can potentially make the migration faster. Using channel-socket as an example, this works well with the default copying mechanism of sendmsg(), but with zero-copy-send=true, it will cause the migration to often break. This happens because the header packet buffer gets reused quite often, and there is a high chance that by the time the MSG_ZEROCOPY mechanism gets to send the buffer, it has already changed, sending the wrong data and causing the migration to abort. It means that, as it is, the buffer for the header packet is not suitable for sending with MSG_ZEROCOPY. 
In order to enable zero copy for multifd, send the header packet on an individual write(), without any flags, and the remaining pages with a writev(), as it was happening before. This only changes how a migration with zero-copy-send=true works, not changing any current behavior for migrations with zero-copy-send=false. Signed-off-by: Leonardo Bras --- migration/multifd.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 15fb668e64..6c940aaa98 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -639,6 +639,8 @@ static void *multifd_send_thread(void *opaque) if (p->pending_job) { uint64_t packet_num = p->packet_num; uint32_t flags = p->flags; + int iov_offset = 0; + p->iovs_num = 1; p->normal_num = 0; @@ -665,15 +667,36 @@ static void *multifd_send_thread(void *opaque) trace_multifd_send(p->id, packet_num, p->normal_num, flags, p->next_packet_size); - p->iov[0].iov_len = p->packet_len; - p->iov[0].iov_base = p->packet; + if (migrate_use_zero_copy_send()) { + /* Send header without zerocopy */ + ret = qio_channel_write_all(p->c, (void *)p->packet, + p->packet_len, &local_err); + if (ret != 0) { + break; + } + + if (!p->normal_num) { + /* No pages will be sent */ + goto skip_send; + } - ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num, + /* Skip first iov : header */ + iov_offset = 1; + } else { + /* Send header using the same writev call */ + p->iov[0].iov_len = p->packet_len; + p->iov[0].iov_base = p->packet; + } + + ret = qio_channel_writev_all(p->c, p->iov + iov_offset, + p->iovs_num - iov_offset, &local_err); + if (ret != 0) { break; } +skip_send: qemu_mutex_lock(&p->mutex); p->pending_job--; qemu_mutex_unlock(&p->mutex); From patchwork Mon Apr 25 21:50:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12826284 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E580C433F5 for ; Mon, 25 Apr 2022 21:57:00 +0000 (UTC) Received: from localhost ([::1]:60794 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nj6hX-0006gl-JL for qemu-devel@archiver.kernel.org; Mon, 25 Apr 2022 17:56:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57310) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cn-00064u-5V for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:52:05 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:41880) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nj6cl-000175-4K for qemu-devel@nongnu.org; Mon, 25 Apr 2022 17:52:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1650923522; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Qb6tF6CJQjonKv1oVyihDvSgZidXotakUOfZADwTqiE=; b=LLF55KoQolC9eeQVeui9FrUJjX/GoU16+nHYzsTgqYjqpF13HjJxC22fvXNRXHGfGOeZfM ImHRcNqQ6Mq3kfsPwasHOU4ihwADV6Or3f6dtKeUSgv2P9POgFIJk8An9CjenUjUzLMPHr 1kK2tYdw4i6Lv35gbd/txS8+G6WHKrc= Received: from mail-oi1-f200.google.com (mail-oi1-f200.google.com [209.85.167.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-588-DR5Bk8cFMD6pG4gsh74qdw-1; Mon, 25 Apr 2022 17:52:01 -0400 X-MC-Unique: DR5Bk8cFMD6pG4gsh74qdw-1 Received: by mail-oi1-f200.google.com with SMTP id f2-20020aca3802000000b0032303e77135so5061152oia.2 for ; 
Mon, 25 Apr 2022 14:52:01 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Qb6tF6CJQjonKv1oVyihDvSgZidXotakUOfZADwTqiE=; b=nguDQn6JIOqR+7bem3ft369nI6AcOV5nYqT3TATPKggJmpsUfXyAJcXEvEw+JckfOF eYRGt1+lXZnlQiRoRy9sJ2DzPEqthyZtH4Fz4VMVydP0E/IlDzo/15JAOmYJnc0hjrfX SBm2H6Pm3OArBCdavUxKcevf9VEhFWjL4TPzUy4oOrMCFAs9EgqD9ZAS4PJAKizrcjZw no8/fwbthj87xodegrCTyx5Dm+ymmJAZ4B3mJj8XHm9yD4f48PGChVjcTECwBh1RFD+S ZstgskaqWvShd4yLoyNdAEeDUDiY92a/+5maAF0eTyPPMoDpGhVX7SMMWMdhswHykHFr cEGA== X-Gm-Message-State: AOAM530VzjEhvoJYwsJ2jT3KUriXKtAkhGt6p3op5hPx3FTYQ7ghCsCs 4pTE3kqw8AUyZSlf5fAEgg+eAUeusMmG1iqnJ1RYSQRjnyDjmg+x+DoERUVBASWIskRhtYhuNLM yViG/wiQ5nfxnNA0= X-Received: by 2002:a05:6870:15d3:b0:e5:bae5:4db with SMTP id k19-20020a05687015d300b000e5bae504dbmr12307640oad.245.1650923520379; Mon, 25 Apr 2022 14:52:00 -0700 (PDT) X-Google-Smtp-Source: ABdhPJySMZDNszJ3W99uhJa1xpaleIH+A4RDFW1jCK8CNx4HCawMJIqJAOnUPBmXseI9OzT7W+e2mA== X-Received: by 2002:a05:6870:15d3:b0:e5:bae5:4db with SMTP id k19-20020a05687015d300b000e5bae504dbmr12307629oad.245.1650923520198; Mon, 25 Apr 2022 14:52:00 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q7-20020a056870e60700b000e686d13878sm156807oag.18.2022.04.25.14.51.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Apr 2022 14:51:59 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v9 7/7] multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Date: Mon, 25 Apr 2022 18:50:56 -0300 Message-Id: <20220425215055.611825-8-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220425215055.611825-1-leobras@redhat.com> References: <20220425215055.611825-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.133.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Implement zero copy send on nocomp_send_write(), by making use of QIOChannel writev + flags & flush interface. Change multifd_send_sync_main() so flush_zero_copy() can be called after each iteration in order to make sure all dirty pages are sent before a new iteration is started. It will also flush at the beginning and at the end of migration. Also make it return -1 if flush_zero_copy() fails, in order to cancel the migration process, and avoid resuming the guest in the target host without receiving all current RAM. 
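The flush-and-abort behaviour described above can be sketched in isolation. This is a toy model under stated assumptions, not the QEMU code: ToyChannel, toy_flush() and toy_sync_main() are hypothetical names, and the "pending" counter merely stands in for zero-copy writes whose completion has not been confirmed yet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy channel: 'pending' counts zero-copy writes not yet confirmed,
 * 'fail_flush' lets a caller simulate a flush error. */
typedef struct {
    int pending;
    int fail_flush;
} ToyChannel;

/* Wait for all pending writes on one channel; -1 on failure. */
static int toy_flush(ToyChannel *c)
{
    if (c->fail_flush) {
        return -1;          /* leave 'pending' untouched on error */
    }
    c->pending = 0;
    return 0;
}

/* End-of-iteration sync: flush every channel and give up on the first
 * failure, so the caller can cancel instead of resuming with stale RAM. */
static int toy_sync_main(ToyChannel *chans, size_t n)
{
    assert(chans != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (toy_flush(&chans[i]) < 0) {
            return -1;
        }
    }
    return 0;
}
```

A caller would treat a -1 from toy_sync_main() as a reason to cancel the whole operation, mirroring how a failed flush is meant to cancel the migration rather than let the guest resume on the target without all of its RAM.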
Zero copy works for RAM migration because RAM pages are not usually
freed, and it is not a problem if a page's content changes between
writev_zero_copy() and the actual sending of the buffer: the change
dirties the page, so it is re-sent on a later iteration anyway.

A lot of locked memory may be needed in order to use multifd migration
with zero-copy enabled, so low-privileged users trying to perform
multifd migrations may need to disable the feature.

Signed-off-by: Leonardo Bras
---
 migration/multifd.h   |  2 ++
 migration/migration.c | 11 ++++++++++-
 migration/multifd.c   | 34 ++++++++++++++++++++++++++++++----
 migration/socket.c    |  5 +++--
 4 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index bcf5992945..4d8d89e5e5 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -92,6 +92,8 @@ typedef struct {
     uint32_t packet_len;
     /* pointer to the packet */
     MultiFDPacket_t *packet;
+    /* multifd flags for sending ram */
+    int write_flags;
     /* multifd flags for each packet */
     uint32_t flags;
     /* size of the next packet that contains pages */
diff --git a/migration/migration.c b/migration/migration.c
index 4b6df2eb5e..31739b2af9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1497,7 +1497,16 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
         error_prepend(errp, "Invalid mapping given for block-bitmap-mapping: ");
         return false;
     }
-
+#ifdef CONFIG_LINUX
+    if (params->zero_copy_send &&
+        (!migrate_use_multifd() ||
+         params->multifd_compression != MULTIFD_COMPRESSION_NONE ||
+         (params->tls_creds && *params->tls_creds))) {
+        error_setg(errp,
+                   "Zero copy only available for non-compressed non-TLS multifd migration");
+        return false;
+    }
+#endif
     return true;
 }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 6c940aaa98..e37cc6e0d9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -569,6 +569,7 @@ void multifd_save_cleanup(void)
 int multifd_send_sync_main(QEMUFile *f)
 {
     int i;
+    bool flush_zero_copy;
 
     if (!migrate_use_multifd()) {
         return 0;
@@ -579,6 +580,14 @@ int multifd_send_sync_main(QEMUFile *f)
             return -1;
         }
     }
+
+    /*
+     * When using zero-copy, it's necessary to flush after each iteration to
+     * make sure pages from earlier iterations don't end up replacing newer
+     * pages.
+     */
+    flush_zero_copy = migrate_use_zero_copy_send();
+
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
@@ -600,6 +609,17 @@ int multifd_send_sync_main(QEMUFile *f)
         ram_counters.transferred += p->packet_len;
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_post(&p->sem);
+
+        if (flush_zero_copy && p->c) {
+            int ret;
+            Error *err = NULL;
+
+            ret = qio_channel_flush(p->c, &err);
+            if (ret < 0) {
+                error_report_err(err);
+                return -1;
+            }
+        }
     }
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
@@ -688,10 +708,9 @@ static void *multifd_send_thread(void *opaque)
                 p->iov[0].iov_base = p->packet;
             }
 
-            ret = qio_channel_writev_all(p->c, p->iov + iov_offset,
-                                         p->iovs_num - iov_offset,
-                                         &local_err);
-
+            ret = qio_channel_writev_full_all(p->c, p->iov + iov_offset,
+                                              p->iovs_num - iov_offset, NULL,
+                                              0, p->write_flags, &local_err);
             if (ret != 0) {
                 break;
             }
@@ -920,6 +939,13 @@ int multifd_save_setup(Error **errp)
         /* We need one extra place for the packet header */
         p->iov = g_new0(struct iovec, page_count + 1);
         p->normal = g_new0(ram_addr_t, page_count);
+
+        if (migrate_use_zero_copy_send()) {
+            p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY;
+        } else {
+            p->write_flags = 0;
+        }
+
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
 
diff --git a/migration/socket.c b/migration/socket.c
index 3754d8f72c..4fd5e85f50 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -79,8 +79,9 @@ static void socket_outgoing_migration(QIOTask *task,
 
     trace_migration_socket_outgoing_connected(data->hostname);
 
-    if (migrate_use_zero_copy_send()) {
-        error_setg(&err, "Zero copy send not available in migration");
+    if (migrate_use_zero_copy_send() &&
+        !qio_channel_has_feature(sioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) {
+        error_setg(&err, "Zero copy send feature not detected in host kernel");
     }
 
 out: