From patchwork Tue Apr 26 23:06:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 89B81C433EF for ; Tue, 26 Apr 2022 23:09:02 +0000 (UTC) Received: from localhost ([::1]:37646 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njUIn-0004Ey-DZ for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:09:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45176) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHQ-0001aE-N7 for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:60722) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHN-00008s-UV for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ukjPhv1Z+2rF9p/VCJwfH3dzg1DJLH5Q5ghzWxxWgDM=; b=Jo9inax3pmUfsqD7Q6L+RaNgs74nyYlitUxvaWJDhMohs/7XMWkenDdwO+/RcGWNUetNyA R5jnAlWq9Wz0BOCobMv+S1yt1xUpz7hvZ+bRKTXveLzD/KlieGkkLs3tATDeP4Hl/sTfMf +cmoUy5+D39RMvHkbWMBL5ID/CYpx7A= Received: from mail-oa1-f69.google.com (mail-oa1-f69.google.com [209.85.160.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-650-TMR-8niXOMCNs6XVrqdfog-1; Tue, 26 Apr 2022 19:07:32 -0400 X-MC-Unique: TMR-8niXOMCNs6XVrqdfog-1 Received: by mail-oa1-f69.google.com with SMTP id 586e51a60fabf-e90d2b84b5so100039fac.9 for ; Tue, 26 Apr 2022 16:07:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ukjPhv1Z+2rF9p/VCJwfH3dzg1DJLH5Q5ghzWxxWgDM=; b=SLU2EI039/yYbEvrbUOuYtCa/KQI/AHtRE65u1Unw0rRt7pCNsr4TjMGlRI2kldAxa /SvKZBrVicpKAcgnBZa7/sov2q5cMxhDcvEWiORNbav/LlXfsbL8Hpgme/YMLQa3m37v ZqCVUbjxsKOTCWivtrNeqxzmRN5A75CRZJFrM6aVRhtkjjqqrCpsCCPPPky1iZdPYM7A m/fsgmmhd7XWU6tUNEMdoKMN0eW9E0UUUd6AsiZxiJlnMfWUsvoj4KKXYY9dK5oTkzkO Otvbp2vYVZ1DIKI/x0H+dVC8UW2thExjH+uG1gXKzCalkUKDIU2YNMbeklF1wxR902na Qpsg== X-Gm-Message-State: AOAM533Lb+0gatrL4/kx/mTYuYxR3L8z1u3sr2I6wUWq/rEzj2gaApZ3 YWGjImB28iCPV7/0h9eAEPlzn/Sf+yJslBI5GHJB4kil0IPBAWiCc32S1QcUtXCACnhFBgcQwkJ ZWfF3HN6GIpQoCyo= X-Received: by 2002:a4a:3795:0:b0:35e:7ec1:9de2 with SMTP id r143-20020a4a3795000000b0035e7ec19de2mr3144214oor.24.1651014450995; Tue, 26 Apr 2022 16:07:30 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwdek45tuev01pDPNj79k3SAW0atvGaZflpZ85eAiroSh2HEIqdPjBdSv+B2wj9fY6EFXDiEg== X-Received: by 2002:a4a:3795:0:b0:35e:7ec1:9de2 with SMTP id r143-20020a4a3795000000b0035e7ec19de2mr3144206oor.24.1651014450708; Tue, 26 Apr 2022 16:07:30 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.07.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:07:30 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 1/7] QIOChannel: Add flags on io_writev and introduce io_flush callback Date: Tue, 26 Apr 2022 20:06:48 -0300 Message-Id: <20220426230654.637939-2-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.133.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add flags to io_writev and introduce io_flush as optional callback to QIOChannelClass, allowing the implementation of zero copy writes by subclasses. How to use them: - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY), - Wait write completion with qio_channel_flush(). Notes: As some zero copy write implementations work asynchronously, it's recommended to keep the write buffer untouched until the return of qio_channel_flush(), to avoid the risk of sending an updated buffer instead of the buffer state during write. As io_flush callback is optional, if a subclass does not implement it, then: - io_flush will return 0 without changing anything. Also, some functions like qio_channel_writev_full_all() were adapted to receive a flag parameter. That allows shared code between zero copy and non-zero copy writev, and also an easier implementation on new flags. Signed-off-by: Leonardo Bras Reviewed-by: Daniel P. Berrangé Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- include/io/channel.h | 38 +++++++++++++++++++++- chardev/char-io.c | 2 +- hw/remote/mpqemu-link.c | 2 +- io/channel-buffer.c | 1 + io/channel-command.c | 1 + io/channel-file.c | 1 + io/channel-socket.c | 2 ++ io/channel-tls.c | 1 + io/channel-websock.c | 1 + io/channel.c | 49 +++++++++++++++++++++++------ migration/rdma.c | 1 + scsi/pr-manager-helper.c | 2 +- tests/unit/test-io-channel-socket.c | 1 + 13 files changed, 88 insertions(+), 14 deletions(-) diff --git a/include/io/channel.h b/include/io/channel.h index 88988979f8..c680ee7480 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -32,12 +32,15 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass, #define QIO_CHANNEL_ERR_BLOCK -2 +#define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1 + typedef enum QIOChannelFeature QIOChannelFeature; enum QIOChannelFeature { QIO_CHANNEL_FEATURE_FD_PASS, QIO_CHANNEL_FEATURE_SHUTDOWN, QIO_CHANNEL_FEATURE_LISTEN, + QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY, }; @@ -104,6 +107,7 @@ struct QIOChannelClass { size_t niov, int *fds, size_t nfds, + int flags, Error **errp); ssize_t (*io_readv)(QIOChannel *ioc, const struct iovec *iov, @@ -136,6 +140,8 @@ struct QIOChannelClass { IOHandler *io_read, IOHandler *io_write, void *opaque); + int (*io_flush)(QIOChannel *ioc, + Error **errp); }; /* General I/O handling functions */ @@ -228,6 +234,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc, * @niov: the length of the @iov array * @fds: an array of file handles to send * @nfds: number of file handles in @fds + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * Write data to the IO channel, reading it from the @@ -260,6 +267,7 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp); /** @@ -837,6 +845,7 @@ int qio_channel_readv_full_all(QIOChannel *ioc, * @niov: the length of the @iov array * @fds: an array of file handles to send * @nfds: number of file handles in @fds + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * @@ -846,6 +855,14 @@ int qio_channel_readv_full_all(QIOChannel *ioc, * to be written, yielding from the current coroutine * if required. * + * If QIO_CHANNEL_WRITE_FLAG_ZERO_COPY is passed in flags, + * instead of waiting for all requested data to be written, + * this function will wait until it's all queued for writing. + * In this case, if the buffer gets changed between queueing and + * sending, the updated buffer will be sent. If this is not a + * desired behavior, it's suggested to call qio_channel_flush() + * before reusing the buffer. + * * Returns: 0 if all bytes were written, or -1 on error */ @@ -853,6 +870,25 @@ int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, - Error **errp); + int flags, Error **errp); + +/** + * qio_channel_flush: + * @ioc: the channel object + * @errp: pointer to a NULL-initialized error object + * + * Will block until every packet queued with + * qio_channel_writev_full() + QIO_CHANNEL_WRITE_FLAG_ZERO_COPY + * is sent, or return in case of any error. + * + * If not implemented, acts as a no-op, and returns 0. + * + * Returns -1 if any error is found, + * 1 if every send failed to use zero copy. + * 0 otherwise. + */ + +int qio_channel_flush(QIOChannel *ioc, + Error **errp); #endif /* QIO_CHANNEL_H */ diff --git a/chardev/char-io.c b/chardev/char-io.c index 8ced184160..4451128cba 100644 --- a/chardev/char-io.c +++ b/chardev/char-io.c @@ -122,7 +122,7 @@ int io_channel_send_full(QIOChannel *ioc, ret = qio_channel_writev_full( ioc, &iov, 1, - fds, nfds, NULL); + fds, nfds, 0, NULL); if (ret == QIO_CHANNEL_ERR_BLOCK) { if (offset) { return offset; diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c index 2a4aa651ca..9bd98e8219 100644 --- a/hw/remote/mpqemu-link.c +++ b/hw/remote/mpqemu-link.c @@ -68,7 +68,7 @@ bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp) } if (!qio_channel_writev_full_all(ioc, send, G_N_ELEMENTS(send), - fds, nfds, errp)) { + fds, nfds, 0, errp)) { ret = true; } else { trace_mpqemu_send_io_error(msg->cmd, msg->size, nfds); diff --git a/io/channel-buffer.c b/io/channel-buffer.c index baa4e2b089..bf52011be2 100644 --- a/io/channel-buffer.c +++ b/io/channel-buffer.c @@ -81,6 +81,7 @@ static ssize_t qio_channel_buffer_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc); diff --git a/io/channel-command.c b/io/channel-command.c index 338da73ade..54560464ae 100644 --- a/io/channel-command.c +++ b/io/channel-command.c @@ -258,6 +258,7 @@ static ssize_t qio_channel_command_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc); diff --git a/io/channel-file.c b/io/channel-file.c index d7cf6d278f..ef6807a6be 100644 --- a/io/channel-file.c +++ b/io/channel-file.c @@ -114,6 +114,7 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc); diff --git a/io/channel-socket.c b/io/channel-socket.c index 9f5ddf68b6..696a04dc9c 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -524,6 +524,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); @@ -619,6 +620,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); diff --git a/io/channel-tls.c b/io/channel-tls.c index 2ae1b92fc0..4ce890a538 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -301,6 +301,7 @@ static ssize_t qio_channel_tls_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc); diff --git a/io/channel-websock.c b/io/channel-websock.c index 55145a6a8c..9619906ac3 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -1127,6 +1127,7 @@ static ssize_t qio_channel_websock_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelWebsock *wioc = QIO_CHANNEL_WEBSOCK(ioc); diff --git a/io/channel.c b/io/channel.c index e8b019dc36..0640941ac5 100644 --- a/io/channel.c +++ b/io/channel.c @@ -72,18 +72,32 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); - if ((fds || nfds) && - !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) { + if (fds || nfds) { + if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) { + error_setg_errno(errp, EINVAL, + "Channel does not support file descriptor passing"); + return -1; + } + if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) { + error_setg_errno(errp, EINVAL, + "Zero Copy does not support file descriptor passing"); + return -1; + } + } + + if ((flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) && + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) { error_setg_errno(errp, EINVAL, - "Channel does not support file descriptor passing"); + "Requested Zero Copy feature is not available"); return -1; } - return klass->io_writev(ioc, iov, niov, fds, nfds, errp); + return klass->io_writev(ioc, iov, niov, fds, nfds, flags, errp); } @@ -217,14 +231,14 @@ int qio_channel_writev_all(QIOChannel *ioc, size_t niov, Error **errp) { - return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, 0, errp); } int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, - Error **errp) + int flags, Error **errp) { int ret = -1; struct iovec *local_iov = g_new(struct iovec, niov); @@ -237,8 +251,10 @@ int qio_channel_writev_full_all(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; - len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, - errp); + + len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, + nfds, flags, errp); + if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_OUT); @@ -277,7 +293,7 @@ ssize_t qio_channel_writev(QIOChannel *ioc, size_t niov, Error **errp) { - return qio_channel_writev_full(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full(ioc, iov, niov, NULL, 0, 0, errp); } @@ -297,7 +313,7 @@ ssize_t qio_channel_write(QIOChannel *ioc, Error **errp) { struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen }; - return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, errp); + return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, 0, errp); } @@ -473,6 +489,19 @@ off_t qio_channel_io_seek(QIOChannel *ioc, return klass->io_seek(ioc, offset, whence, errp); } +int qio_channel_flush(QIOChannel *ioc, + Error **errp) +{ + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); + + if (!klass->io_flush || + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) { + return 0; + } + + return klass->io_flush(ioc, errp); +} + static void qio_channel_restart_read(void *opaque) { diff --git a/migration/rdma.c b/migration/rdma.c index ef1e65ec36..672d1958a9 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2840,6 +2840,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); diff --git a/scsi/pr-manager-helper.c b/scsi/pr-manager-helper.c index 451c7631b7..3be52a98d5 100644 --- a/scsi/pr-manager-helper.c +++ b/scsi/pr-manager-helper.c @@ -77,7 +77,7 @@ static int pr_manager_helper_write(PRManagerHelper *pr_mgr, iov.iov_base = (void *)buf; iov.iov_len = sz; n_written = qio_channel_writev_full(QIO_CHANNEL(pr_mgr->ioc), &iov, 1, - nfds ? &fd : NULL, nfds, errp); + nfds ? &fd : NULL, nfds, 0, errp); if (n_written <= 0) { assert(n_written != QIO_CHANNEL_ERR_BLOCK); diff --git a/tests/unit/test-io-channel-socket.c b/tests/unit/test-io-channel-socket.c index c49eec1f03..6713886d02 100644 --- a/tests/unit/test-io-channel-socket.c +++ b/tests/unit/test-io-channel-socket.c @@ -444,6 +444,7 @@ static void test_io_channel_unix_fd_pass(void) G_N_ELEMENTS(iosend), fdsend, G_N_ELEMENTS(fdsend), + 0, &error_abort); qio_channel_readv_full(dst, From patchwork Tue Apr 26 23:06:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 16E33C433EF for ; Tue, 26 Apr 2022 23:09:16 +0000 (UTC) Received: from localhost ([::1]:38554 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njUJ1-0004qd-2r for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:09:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45208) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHU-0001oM-9H for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:40 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:54465) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHS-0000AQ-GA for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014458; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uIqEFdqWEF4dCTXgfI+IVJbQ/j6rHuIRMDolbwB0Ie8=; b=VUj4GCtZBIVHEnRZrW67QWdGLTD1b5feNi0Nm9gFKM8eLa/Y/zfpLp89HjE7pPoc3zB1pc 9NM0LPgpqDguYT51IbnCmI37lNaDDXI3VqKOReAor+XtUwCaBuXVKqPjNeZPrMGzd3KnNk S3kTshV9QwwhqovaR1G5kcIfTIiKxm4= Received: from mail-ot1-f69.google.com (mail-ot1-f69.google.com [209.85.210.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-541-7aZHy9vGNkigxZDlH8GI1Q-1; Tue, 26 Apr 2022 19:07:37 -0400 X-MC-Unique: 7aZHy9vGNkigxZDlH8GI1Q-1 Received: by mail-ot1-f69.google.com with SMTP id l20-20020a056830269400b00605523dff90so8326197otu.22 for ; Tue, 26 Apr 2022 16:07:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uIqEFdqWEF4dCTXgfI+IVJbQ/j6rHuIRMDolbwB0Ie8=; b=Jt5rPPUi4YB0b+6vmlxs72IUUw+sZrrhWIfejZTpQRG8f33EfocQdguek+7kM3BTCF 29/vBWWt1fx+r85GNWb+3w5vWN7iNuJJCzQpqz9OkcEdk58tkvrDVQ9JwD0b3WssNeX/ jymXlwLBcZUGGxpQGjJTAHyoG6jKxel8/o7SuODNJgIOysJfjiDYHRcoC0xP5eJ8BF1y h3HhWWwlp7arfJSxv0A393zEGt4lQAJMmiBXUFHngY/nR6OMZCm/AVNP/dZWhKT11Teu gt+Wr0Y5NhmmWs5yDj8q/wjPzK72JzasKiSJSRuHEnL3zQgQKVWVqoqUOLqwaowWaAI7 WdCw== X-Gm-Message-State: AOAM531z9ZLsxFxEvF9r4SnvCK3VRP2QH1BWFGBnfS2k1rk4kwrggLrd DPoohnbyB6aeglCZlGVpRowmrUSXxI39AdMEt3ULCC7XcNQCD7vN27NXlfRuGfXYPm6QyhJVH/r yK5gp4e7lyxRAdIk= X-Received: by 2002:a9d:2f05:0:b0:605:49ae:ecbe with SMTP id h5-20020a9d2f05000000b0060549aeecbemr9054316otb.167.1651014456023; Tue, 26 Apr 2022 16:07:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx0WjnjgDwN2Zb3wHK8Uw76GbPn0ZKLEJtpY5AgXwuGnIB6JT7Ek4PsMrjauTfo5cRVZOoFLw== X-Received: by 2002:a9d:2f05:0:b0:605:49ae:ecbe with SMTP id h5-20020a9d2f05000000b0060549aeecbemr9054314otb.167.1651014455803; Tue, 26 Apr 2022 16:07:35 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.07.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:07:35 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 2/7] QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX Date: Tue, 26 Apr 2022 20:06:49 -0300 Message-Id: <20220426230654.637939-3-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For CONFIG_LINUX, implement the new zero copy flag and the optional callback io_flush on QIOChannelSocket, but enables it only when MSG_ZEROCOPY feature is available in the host kernel, which is checked on qio_channel_socket_connect_sync() qio_channel_socket_flush() was implemented by counting how many times sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the socket's error queue, in order to find how many of them finished sending. Flush will loop until those counters are the same, or until some error occurs. Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY: 1: Buffer - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying, some caution is necessary to avoid overwriting any buffer before it's sent. If something like this happen, a newer version of the buffer may be sent instead. - If this is a problem, it's recommended to call qio_channel_flush() before freeing or re-using the buffer. 2: Locked memory - When using MSG_ZERCOCOPY, the buffer memory will be locked after queued, and unlocked after it's sent. - Depending on the size of each buffer, and how often it's sent, it may require a larger amount of locked memory than usually available to non-root user. - If the required amount of locked memory is not available, writev_zero_copy will return an error, which can abort an operation like migration, - Because of this, when an user code wants to add zero copy as a feature, it requires a mechanism to disable it, so it can still be accessible to less privileged users. Signed-off-by: Leonardo Bras Reviewed-by: Peter Xu Reviewed-by: Daniel P. Berrangé Reviewed-by: Juan Quintela --- include/io/channel-socket.h | 2 + io/channel-socket.c | 108 ++++++++++++++++++++++++++++++++++-- 2 files changed, 106 insertions(+), 4 deletions(-) diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h index e747e63514..513c428fe4 100644 --- a/include/io/channel-socket.h +++ b/include/io/channel-socket.h @@ -47,6 +47,8 @@ struct QIOChannelSocket { socklen_t localAddrLen; struct sockaddr_storage remoteAddr; socklen_t remoteAddrLen; + ssize_t zero_copy_queued; + ssize_t zero_copy_sent; }; diff --git a/io/channel-socket.c b/io/channel-socket.c index 696a04dc9c..1dd85fc1ef 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -25,6 +25,10 @@ #include "io/channel-watch.h" #include "trace.h" #include "qapi/clone-visitor.h" +#ifdef CONFIG_LINUX +#include +#include +#endif #define SOCKET_MAX_FDS 16 @@ -54,6 +58,8 @@ qio_channel_socket_new(void) sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)); sioc->fd = -1; + sioc->zero_copy_queued = 0; + sioc->zero_copy_sent = 0; ioc = QIO_CHANNEL(sioc); qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); @@ -153,6 +159,16 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc, return -1; } +#ifdef CONFIG_LINUX + int ret, v = 1; + ret = setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)); + if (ret == 0) { + /* Zero copy available on host */ + qio_channel_set_feature(QIO_CHANNEL(ioc), + QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY); + } +#endif + return 0; } @@ -533,6 +549,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)]; size_t fdsize = sizeof(int) * nfds; struct cmsghdr *cmsg; + int sflags = 0; memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)); @@ -557,15 +574,27 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, memcpy(CMSG_DATA(cmsg), fds, fdsize); } + if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) { + sflags = MSG_ZEROCOPY; + } + retry: - ret = sendmsg(sioc->fd, &msg, 0); + ret = sendmsg(sioc->fd, &msg, sflags); if (ret <= 0) { - if (errno == EAGAIN) { + switch (errno) { + case EAGAIN: return QIO_CHANNEL_ERR_BLOCK; - } - if (errno == EINTR) { + case EINTR: goto retry; + case ENOBUFS: + if (sflags & MSG_ZEROCOPY) { + error_setg_errno(errp, errno, + "Process can't lock enough memory for using MSG_ZEROCOPY"); + return -1; + } + break; } + error_setg_errno(errp, errno, "Unable to write to socket"); return -1; @@ -659,6 +688,74 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } #endif /* WIN32 */ + +#ifdef CONFIG_LINUX +static int qio_channel_socket_flush(QIOChannel *ioc, + Error **errp) +{ + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); + struct msghdr msg = {}; + struct sock_extended_err *serr; + struct cmsghdr *cm; + char control[CMSG_SPACE(sizeof(*serr))]; + int received; + int ret = 1; + + msg.msg_control = control; + msg.msg_controllen = sizeof(control); + memset(control, 0, sizeof(control)); + + while (sioc->zero_copy_sent < sioc->zero_copy_queued) { + received = recvmsg(sioc->fd, &msg, MSG_ERRQUEUE); + if (received < 0) { + switch (errno) { + case EAGAIN: + /* Nothing on errqueue, wait until something is available */ + qio_channel_wait(ioc, G_IO_ERR); + continue; + case EINTR: + continue; + default: + error_setg_errno(errp, errno, + "Unable to read errqueue"); + return -1; + } + } + + cm = CMSG_FIRSTHDR(&msg); + if (cm->cmsg_level != SOL_IP && + cm->cmsg_type != IP_RECVERR) { + error_setg_errno(errp, EPROTOTYPE, + "Wrong cmsg in errqueue"); + return -1; + } + + serr = (void *) CMSG_DATA(cm); + if (serr->ee_errno != SO_EE_ORIGIN_NONE) { + error_setg_errno(errp, serr->ee_errno, + "Error on socket"); + return -1; + } + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + error_setg_errno(errp, serr->ee_origin, + "Error not from zero copy"); + return -1; + } + + /* No errors, count successfully finished sendmsg()*/ + sioc->zero_copy_sent += serr->ee_data - serr->ee_info + 1; + + /* If any sendmsg() succeeded using zero copy, return 0 at the end */ + if (serr->ee_code != SO_EE_CODE_ZEROCOPY_COPIED) { + ret = 0; + } + } + + return ret; +} + +#endif /* CONFIG_LINUX */ + static int qio_channel_socket_set_blocking(QIOChannel *ioc, bool enabled, @@ -789,6 +886,9 @@ static void qio_channel_socket_class_init(ObjectClass *klass, ioc_klass->io_set_delay = qio_channel_socket_set_delay; ioc_klass->io_create_watch = qio_channel_socket_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_socket_set_aio_fd_handler; +#ifdef CONFIG_LINUX + ioc_klass->io_flush = qio_channel_socket_flush; +#endif } static const TypeInfo qio_channel_socket_info = { From patchwork Tue Apr 26 23:06:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1F751C433EF for ; Tue, 26 Apr 2022 23:12:26 +0000 (UTC) Received: from localhost ([::1]:48294 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njULx-00034h-6y for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:12:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45238) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHZ-00024S-3k for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:53758) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHX-0000At-7L for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014462; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nukln1m3F357YvTDPENXPAFylyoyrfes2qpG25R7VCs=; b=aZ1/5+jHpnWZXvlWTue4kFtdcmH4SeqxJiRGUjJ0UEHvLan/arhTudZxwL1d46S9wCbY43 gjS1uF8F/fmvmpGQFRh4gBI6Hs9F9ZGHFrV3ehcxDWIFQQCZYORmZ5gkFETWoG7KtK7I8L 65tcRcprQktBOMOtyyJbghBNJTj4YbI= Received: from mail-ot1-f72.google.com (mail-ot1-f72.google.com [209.85.210.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-589-y7zMl1IOPO6UUEzYgV2Iug-1; Tue, 26 Apr 2022 19:07:41 -0400 X-MC-Unique: y7zMl1IOPO6UUEzYgV2Iug-1 Received: by mail-ot1-f72.google.com with SMTP id e90-20020a9d01e3000000b0060579b0abe3so2081068ote.10 for ; Tue, 26 Apr 2022 16:07:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nukln1m3F357YvTDPENXPAFylyoyrfes2qpG25R7VCs=; b=rfm15ycXqXz13rnba6y0WDQ7REyAw6iRGuhNV/NXQWk1bg1qV6tbPKxSh8II1QrRbL CjOlWrOk2XlfdCQikvUxnjVWiv97cGAXZbeNndHsCQwN2oKmmoklQX8DAenAQ3n7lnth TNcUm91t/57H+HEi7+1v2MQp5tdUIi6ydh+0OWXq49Nfy5FJrPoL/fXMcLceR+VG4DCe +j2wEKeg4exEaQOWaBvjG6CriF0d+/MpWWOESk3KRJ1ub3ghNl1D0MkL6j4El94glZpc ySywVjwhPgOimhkLxr1QBIExZ5tGhEsrW52mCaPSn/aGwgEGehdo5LriIZ5t7umkeplg H7Tw== X-Gm-Message-State: AOAM5315mjWNELJQJ50zcnA/YH2Hf5ftE5ixR2/4IVeOVcC2D8la0IHo bSYBeEJ2UfhaowMfN9dNnBrvZ/m7wlJxrEcb4VSXfvWk+GvYs3CV4cjqxIhx0Hkvh/PJkQ20bLV XLIQRdByrL20rynM= X-Received: by 2002:a05:6870:24a1:b0:e9:2282:614d with SMTP id s33-20020a05687024a100b000e92282614dmr6782965oaq.250.1651014460913; Tue, 26 Apr 2022 16:07:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwvz6Z+h8td3fdDsXGvV2OT5CgB8IUZLY3/CvyZpHpX4utvHyg7tMgzGFEzGtnLBC9fuJqGdw== X-Received: by 2002:a05:6870:24a1:b0:e9:2282:614d with SMTP id s33-20020a05687024a100b000e92282614dmr6782942oaq.250.1651014460615; Tue, 26 Apr 2022 16:07:40 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.07.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:07:40 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 3/7] migration: Add zero-copy-send parameter for QMP/HMP for Linux Date: Tue, 26 Apr 2022 20:06:50 -0300 Message-Id: <20220426230654.637939-4-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add property that allows zero-copy migration of memory pages on the sending side, and also includes a helper function migrate_use_zero_copy_send() to check if it's enabled. No code is introduced to actually do the migration, but it allow future implementations to enable/disable this feature. On non-Linux builds this parameter is compiled-out. Signed-off-by: Leonardo Bras Reviewed-by: Peter Xu Reviewed-by: Daniel P. Berrangé Reviewed-by: Juan Quintela --- qapi/migration.json | 24 ++++++++++++++++++++++++ migration/migration.h | 5 +++++ migration/migration.c | 32 ++++++++++++++++++++++++++++++++ migration/socket.c | 11 +++++++++-- monitor/hmp-cmds.c | 6 ++++++ 5 files changed, 76 insertions(+), 2 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 409eb086a2..04246481ce 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -741,6 +741,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -780,6 +787,7 @@ 'xbzrle-cache-size', 'max-postcopy-bandwidth', 'max-cpu-throttle', 'multifd-compression', 'multifd-zlib-level' ,'multifd-zstd-level', + { 'name': 'zero-copy-send', 'if' : 'CONFIG_LINUX'}, 'block-bitmap-mapping' ] } ## @@ -906,6 +914,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -960,6 +975,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zero-copy-send': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## @@ -1106,6 +1122,13 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zero-copy-send: Controls behavior on sending memory pages on migration. +# When true, enables a zero-copy mechanism for sending memory +# pages, if host supports it. +# Requires that QEMU be permitted to use locked memory for guest +# RAM pages. +# Defaults to false. (Since 7.1) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -1158,6 +1181,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zero-copy-send': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## diff --git a/migration/migration.h b/migration/migration.h index a863032b71..e8f2941a55 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -375,6 +375,11 @@ MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +#ifdef CONFIG_LINUX +bool migrate_use_zero_copy_send(void); +#else +#define migrate_use_zero_copy_send() (false) +#endif int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/migration.c b/migration/migration.c index 5a31b23bd6..3e91f4b5e2 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -910,6 +910,10 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->multifd_zlib_level = s->parameters.multifd_zlib_level; params->has_multifd_zstd_level = true; params->multifd_zstd_level = s->parameters.multifd_zstd_level; +#ifdef CONFIG_LINUX + params->has_zero_copy_send = true; + params->zero_copy_send = s->parameters.zero_copy_send; +#endif params->has_xbzrle_cache_size = true; params->xbzrle_cache_size = s->parameters.xbzrle_cache_size; params->has_max_postcopy_bandwidth = true; @@ -1567,6 +1571,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_multifd_compression) { dest->multifd_compression = params->multifd_compression; } +#ifdef CONFIG_LINUX + if (params->has_zero_copy_send) { + dest->zero_copy_send = params->zero_copy_send; + } +#endif if (params->has_xbzrle_cache_size) { dest->xbzrle_cache_size = params->xbzrle_cache_size; } @@ -1679,6 +1688,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_multifd_compression) { s->parameters.multifd_compression = params->multifd_compression; } +#ifdef CONFIG_LINUX + if (params->has_zero_copy_send) { + s->parameters.zero_copy_send = params->zero_copy_send; + } +#endif if (params->has_xbzrle_cache_size) { s->parameters.xbzrle_cache_size = params->xbzrle_cache_size; xbzrle_cache_resize(params->xbzrle_cache_size, errp); @@ -2563,6 +2577,17 @@ int migrate_multifd_zstd_level(void) return s->parameters.multifd_zstd_level; } +#ifdef CONFIG_LINUX +bool migrate_use_zero_copy_send(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.zero_copy_send; +} +#endif + int migrate_use_xbzrle(void) { MigrationState *s; @@ -4206,6 +4231,10 @@ static Property migration_properties[] = { DEFINE_PROP_UINT8("multifd-zstd-level", MigrationState, parameters.multifd_zstd_level, DEFAULT_MIGRATE_MULTIFD_ZSTD_LEVEL), +#ifdef CONFIG_LINUX + DEFINE_PROP_BOOL("zero_copy_send", MigrationState, + parameters.zero_copy_send, false), +#endif DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState, parameters.xbzrle_cache_size, DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE), @@ -4303,6 +4332,9 @@ static void migration_instance_init(Object *obj) params->has_multifd_compression = true; params->has_multifd_zlib_level = true; params->has_multifd_zstd_level = true; +#ifdef CONFIG_LINUX + params->has_zero_copy_send = true; +#endif params->has_xbzrle_cache_size = true; params->has_max_postcopy_bandwidth = true; params->has_max_cpu_throttle = true; diff --git a/migration/socket.c b/migration/socket.c index 05705a32d8..3754d8f72c 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -74,9 +74,16 @@ static void socket_outgoing_migration(QIOTask *task, if (qio_task_propagate_error(task, &err)) { trace_migration_socket_outgoing_error(error_get_pretty(err)); - } else { - trace_migration_socket_outgoing_connected(data->hostname); + goto out; } + + trace_migration_socket_outgoing_connected(data->hostname); + + if (migrate_use_zero_copy_send()) { + error_setg(&err, "Zero copy send not available in migration"); + } + +out: migration_channel_connect(data->s, sioc, data->hostname, err); object_unref(OBJECT(sioc)); } diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 634968498b..55b48d3733 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -1309,6 +1309,12 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_multifd_zstd_level = true; visit_type_uint8(v, param, &p->multifd_zstd_level, &err); break; +#ifdef CONFIG_LINUX + case MIGRATION_PARAMETER_ZERO_COPY_SEND: + p->has_zero_copy_send = true; + visit_type_bool(v, param, &p->zero_copy_send, &err); + break; +#endif case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE: p->has_xbzrle_cache_size = true; if (!visit_type_size(v, param, &cache_size, &err)) { From patchwork Tue Apr 26 23:06:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828061 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 17421C433EF for ; Tue, 26 Apr 2022 23:15:54 +0000 (UTC) Received: from localhost ([::1]:56516 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njUPR-00007b-2B for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:15:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45256) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHg-0002Qf-5G for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:52 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:51633) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHe-0000Bq-FJ for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:07:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014469; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OCN4rR4bAIHd7dZhYCUf3JnbZmufgbcGPMyBtVIgbmE=; b=MzitEgln1iNEWSAtDIWbu5CSiYQRSqOEmK22c9uQ1cxGk3sS+CqYhAAXYV2KZCWf8oAzeq 6zpkfHDxRrMdFqmt5CEt0U8TmNnjRizEIMI91xNqq3w8J0xvI+IVy7mMgG+HYqJOsPzzXG UjXIvGXEyOGzd9rZblyeI8Er9TLdQHs= Received: from mail-oo1-f69.google.com (mail-oo1-f69.google.com [209.85.161.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-73-J_uZfuBINOSERuvosx5IsQ-1; Tue, 26 Apr 2022 19:07:48 -0400 X-MC-Unique: J_uZfuBINOSERuvosx5IsQ-1 Received: by mail-oo1-f69.google.com with SMTP id i12-20020a4ad08c000000b0035e7ff33045so205336oor.0 for ; Tue, 26 Apr 2022 16:07:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OCN4rR4bAIHd7dZhYCUf3JnbZmufgbcGPMyBtVIgbmE=; b=72Nl33udTO8KPOVEw3T47OHnm8wiJoOGBFpK08/tWvbswimJrgglW41C6MCKj7Hv+s p7X4hfcqPALybfFXHXLgsG5efFsXajtf+FLjhxH+jY77+FmbGTIKXpFh8sKmPTxXCDZg 8Ud5s0zVTs+xjOK1610Qgko+xO10oy8O39yMOTXDK5++Tmvxdo9DUtCQySkluKdm3CJz Upr0ezbF+xq9mqC+s/mHhMdsFLreigBIASmnpTBZlaRTPJrIY2AhMA0Ny4oaoCftnxNS im8PkQV+NozlnkgDJxnJxWpQDerh84bIU7E2xPpnBP2ppL3IFhbaQvA6ygcv+GznEeNz GbVA== X-Gm-Message-State: AOAM533xSFYj4xv3bu3PXWCjHvIL039yLY1dT05QPutlqUDBTz32RxuJ kDS5VJ/GtJA22vpApnTsVeaFOyz7tRGGIa6uGpohIgdqgFTEhRx4wlVHb2J9unqciYi6SYftYWi 4cKSqfPop/rYH2WY= X-Received: by 2002:a05:6808:f06:b0:324:f7bd:c3a with SMTP id m6-20020a0568080f0600b00324f7bd0c3amr9935725oiw.25.1651014468214; Tue, 26 Apr 2022 16:07:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxO/36uEc66sBabrP0V7pzKYtV+/jNDRWVxOhJVyoa1EtckIAt0pYAEKS4HqBGnrEY8O/HwVg== X-Received: by 2002:a05:6808:f06:b0:324:f7bd:c3a with SMTP id m6-20020a0568080f0600b00324f7bd0c3amr9935704oiw.25.1651014467943; Tue, 26 Apr 2022 16:07:47 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.07.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:07:47 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 4/7] migration: Add migrate_use_tls() helper Date: Tue, 26 Apr 2022 20:06:51 -0300 Message-Id: <20220426230654.637939-5-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" A lot of places check parameters.tls_creds in order to evaluate if TLS is in use, and sometimes call migrate_get_current() just for that test. Add new helper function migrate_use_tls() in order to simplify testing for TLS usage. Signed-off-by: Leonardo Bras Reviewed-by: Juan Quintela Reviewed-by: Peter Xu Reviewed-by: Daniel P. Berrangé --- migration/migration.h | 1 + migration/channel.c | 3 +-- migration/migration.c | 9 +++++++++ migration/multifd.c | 5 +---- 4 files changed, 12 insertions(+), 6 deletions(-) diff --git a/migration/migration.h b/migration/migration.h index e8f2941a55..485d58b95f 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -380,6 +380,7 @@ bool migrate_use_zero_copy_send(void); #else #define migrate_use_zero_copy_send() (false) #endif +int migrate_use_tls(void); int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/channel.c b/migration/channel.c index c6a8dcf1d7..a162d00fea 100644 --- a/migration/channel.c +++ b/migration/channel.c @@ -38,8 +38,7 @@ void migration_channel_process_incoming(QIOChannel *ioc) trace_migration_set_incoming_channel( ioc, object_get_typename(OBJECT(ioc))); - if (s->parameters.tls_creds && - *s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { migration_tls_channel_process_incoming(s, ioc, &local_err); diff --git a/migration/migration.c b/migration/migration.c index 3e91f4b5e2..4b6df2eb5e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2588,6 +2588,15 @@ bool migrate_use_zero_copy_send(void) } #endif +int migrate_use_tls(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.tls_creds && *s->parameters.tls_creds; +} + int migrate_use_xbzrle(void) { MigrationState *s; diff --git a/migration/multifd.c b/migration/multifd.c index 9ea4f581e2..2a8c8570c3 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -782,15 +782,12 @@ static bool multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc, Error *error) { - MigrationState *s = migrate_get_current(); - trace_multifd_set_outgoing_channel( ioc, object_get_typename(OBJECT(ioc)), migrate_get_current()->hostname, error); if (!error) { - if (s->parameters.tls_creds && - *s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { multifd_tls_channel_connect(p, ioc, &error); From patchwork Tue Apr 26 23:06:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828056 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0EE23C433F5 for ; Tue, 26 Apr 2022 23:11:34 +0000 (UTC) Received: from localhost ([::1]:45348 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njULF-00016j-6F for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:11:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45330) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUI0-0003Jc-8r for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:12 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:28166) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUHy-0000Cn-Hj for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014490; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OldqSj6ZhjcEWqr35wQqdzeCrylOxSMHkGFbo7CK6Qo=; b=Eckgb7egdgAgYKBo7acPTe/amhOvENZujr5R8/wTKfCzLBF2ChrgoQuripID5h7IdaX2qV E+ZHFFwzYFtleWUb3awpFkk4Zt0Y67VOdn2JDDebW2EROpJK6d0kbgZ9jeUErQs/ciaUOZ 2pzmGjfStgS/a4FvyIo2UqBJ8x5pziY= Received: from mail-ot1-f69.google.com (mail-ot1-f69.google.com [209.85.210.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-56-I7-nyvJ8MyWmhATTiRciCQ-1; Tue, 26 Apr 2022 19:08:08 -0400 X-MC-Unique: I7-nyvJ8MyWmhATTiRciCQ-1 Received: by mail-ot1-f69.google.com with SMTP id q11-20020a9d57cb000000b006059c31dc26so4936836oti.6 for ; Tue, 26 Apr 2022 16:08:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OldqSj6ZhjcEWqr35wQqdzeCrylOxSMHkGFbo7CK6Qo=; b=LVDZeBlDsWIZ1/HVYbqD5epDX0+DmfGqphgD3q9iZ5e+vZ9grJpe1oUpZ/DvLxtnPU cWfcBp25Zfps6Zm5ES8vYsJB8u/eTR30Op+MEoIV4nj97mZyRClOBNIzjKHQrUk0qqrW e6rOxBGDQuV4zeBwAh7r1ulBB4xY12EzL08OGc2bNLOcbLfrTL7IzXxvmWhJjyWaBQ4l 0DPYbC8o8J5dF9iDO5iZAVSG6NGUNSXxT6Fm7hHMKt+BpgZvE5lpS2j4D2qxCo2i7DrP mgyqsm423uEocsN/QRAzi6k5O5U1rI0dEPdE7kDzKCN1QKFwlxYP3j0xMoa4cv7G6xHt knpQ== X-Gm-Message-State: AOAM5322SAo31cGi44uz3zPyaIx9Y3zWD88QZ//sZyhgDFnA4fmKByrI 4Pt0sbwZ0gOfdTy/L9SndauKFG24xuBdz8V+bG0B47uV9+VfdNLnWCuZZQ1LO3AlXRb9G8OF6kz khtzRdK/DuGEUnA4= X-Received: by 2002:a05:6808:2228:b0:322:8cd4:874 with SMTP id bd40-20020a056808222800b003228cd40874mr16186054oib.123.1651014487966; Tue, 26 Apr 2022 16:08:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzYTH/ZWjjDC5cBA1CeK+Ee7dY2ty4IO+4LXCfpIZ4G1ohtWyvfOMWSJxC0uDruXKpgO0MG6A== X-Received: by 2002:a05:6808:2228:b0:322:8cd4:874 with SMTP id bd40-20020a056808222800b003228cd40874mr16186045oib.123.1651014487787; Tue, 26 Apr 2022 16:08:07 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.08.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:08:07 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 5/7] multifd: multifd_send_sync_main now returns negative on error Date: Tue, 26 Apr 2022 20:06:54 -0300 Message-Id: <20220426230654.637939-6-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.133.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Even though multifd_send_sync_main() currently emits error_reports, it's callers don't really check it before continuing. Change multifd_send_sync_main() to return -1 on error and 0 on success. Also change all it's callers to make use of this change and possibly fail earlier. (This change is important to next patch on multifd zero copy implementation, to make it sure an error in zero-copy flush does not go unnoticed. Signed-off-by: Leonardo Bras Reviewed-by: Daniel P. Berrangé Reviewed-by: Peter Xu --- migration/multifd.h | 2 +- migration/multifd.c | 10 ++++++---- migration/ram.c | 29 ++++++++++++++++++++++------- 3 files changed, 29 insertions(+), 12 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index 7d0effcb03..bcf5992945 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -20,7 +20,7 @@ int multifd_load_cleanup(Error **errp); bool multifd_recv_all_channels_created(void); bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp); void multifd_recv_sync_main(void); -void multifd_send_sync_main(QEMUFile *f); +int multifd_send_sync_main(QEMUFile *f); int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset); /* Multifd Compression flags */ diff --git a/migration/multifd.c b/migration/multifd.c index 2a8c8570c3..15fb668e64 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -566,17 +566,17 @@ void multifd_save_cleanup(void) multifd_send_state = NULL; } -void multifd_send_sync_main(QEMUFile *f) +int multifd_send_sync_main(QEMUFile *f) { int i; if (!migrate_use_multifd()) { - return; + return 0; } if (multifd_send_state->pages->num) { if (multifd_send_pages(f) < 0) { error_report("%s: multifd_send_pages fail", __func__); - return; + return -1; } } for (i = 0; i < migrate_multifd_channels(); i++) { @@ -589,7 +589,7 @@ void multifd_send_sync_main(QEMUFile *f) if (p->quit) { error_report("%s: channel %d has already quit", __func__, i); qemu_mutex_unlock(&p->mutex); - return; + return -1; } p->packet_num = multifd_send_state->packet_num++; @@ -608,6 +608,8 @@ void multifd_send_sync_main(QEMUFile *f) qemu_sem_wait(&p->sem_sync); } trace_multifd_send_sync_main(multifd_send_state->packet_num); + + return 0; } static void *multifd_send_thread(void *opaque) diff --git a/migration/ram.c b/migration/ram.c index a2489a2699..5f5e37f64d 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2909,6 +2909,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque) { RAMState **rsp = opaque; RAMBlock *block; + int ret; if (compress_threads_save_setup()) { return -1; @@ -2943,7 +2944,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque) ram_control_before_iterate(f, RAM_CONTROL_SETUP); ram_control_after_iterate(f, RAM_CONTROL_SETUP); - multifd_send_sync_main(f); + ret = multifd_send_sync_main(f); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); @@ -3052,7 +3057,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) out: if (ret >= 0 && migration_is_setup_or_active(migrate_get_current()->state)) { - multifd_send_sync_main(rs->f); + ret = multifd_send_sync_main(rs->f); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); ram_transferred_add(8); @@ -3112,13 +3121,19 @@ static int ram_save_complete(QEMUFile *f, void *opaque) ram_control_after_iterate(f, RAM_CONTROL_FINISH); } - if (ret >= 0) { - multifd_send_sync_main(rs->f); - qemu_put_be64(f, RAM_SAVE_FLAG_EOS); - qemu_fflush(f); + if (ret < 0) { + return ret; } - return ret; + ret = multifd_send_sync_main(rs->f); + if (ret < 0) { + return ret; + } + + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); + qemu_fflush(f); + + return 0; } static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, From patchwork Tue Apr 26 23:06:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 44F6DC433EF for ; Tue, 26 Apr 2022 23:17:54 +0000 (UTC) Received: from localhost ([::1]:35022 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njURN-0004rt-EQ for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:17:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45362) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUI4-0003UW-Eu for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:16 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:56395) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUI2-0000D8-P8 for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014494; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JghpxiChl6cuKnY7pXCN68XGYDcXsIdBNsxh/g1mscg=; b=OAbqqOVARiGZyfHJgbzeDzts2M8zbQgGQIW8JFh90eoRCuVZg20vafKvU+xep8H3ritv6S iJ6U7+nurBwoFv73v5cRu17MT7WJhbtBRf5eXMyEoQbNsQ3aM4Otp/aRx88k+J2OZ30lvB FN6bExpczLIkRwLghJC5myXWc0wpwro= Received: from mail-oa1-f70.google.com (mail-oa1-f70.google.com [209.85.160.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-467-bePjE5sNNxSzyB9gu7MgdQ-1; Tue, 26 Apr 2022 19:08:12 -0400 X-MC-Unique: bePjE5sNNxSzyB9gu7MgdQ-1 Received: by mail-oa1-f70.google.com with SMTP id 586e51a60fabf-db2a59fdc8so104406fac.6 for ; Tue, 26 Apr 2022 16:08:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JghpxiChl6cuKnY7pXCN68XGYDcXsIdBNsxh/g1mscg=; b=Js2edTvKO1Fy88gr1XnakQMIHaJbHGAf0DM9D4wwUH0lnttVoB+DbvAahefI1tZiA+ Tcg4zuHOSCANYjiGgEIxITG7A1VOujiYH7sbx3u3lPcMv/Wc4EfXeswTaCIvSfeOPPKQ Fm0sZyDjiDw35X3SLK+oSXrOBhDszvF9UWT8VdMuRK0EiAAvf0EAm+/G4WtdqQasO7KL TySxe5Bbu4Q5z3n1zqbPl9+/w8IzyvtImJ7zIfT7bS2kQH3hHNtOBRO3sz6nx50mra6p UzjIG21gOR4RLVWg1JJFZY3pT4D/+17JB/cKhONstX0Pwt7U4qbavKmB0A/nGgSz9eWl iNuA== X-Gm-Message-State: AOAM530l3rHMdwX6E/LwQGUQUx+9sEbIda6AdZwzTh9U7yJ607Gt4sri TX4zhYb2fdshFgc78uZN4xEEZBdkFUbeF7kfyfieQS+vEI4qdaXSTrtkCIT7rNK+CHWR37ffQ/2 bQPnEQ5YQdCOBEQU= X-Received: by 2002:a05:6871:282:b0:e9:113a:fc6a with SMTP id i2-20020a056871028200b000e9113afc6amr8465393oae.200.1651014492058; Tue, 26 Apr 2022 16:08:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzeuJlB+YjSAOART0rrbDUjrsQF4Rgu7Zrl6DcSlZkTWKqf9xLL3iyvc9W7h6XJUNcJu1uA/g== X-Received: by 2002:a05:6871:282:b0:e9:113a:fc6a with SMTP id i2-20020a056871028200b000e9113afc6amr8465386oae.200.1651014491883; Tue, 26 Apr 2022 16:08:11 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.08.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:08:11 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 6/7] multifd: Send header packet without flags if zero-copy-send is enabled Date: Tue, 26 Apr 2022 20:06:55 -0300 Message-Id: <20220426230654.637939-7-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Since d48c3a0445 ("multifd: Use a single writev on the send side"), sending the header packet and the memory pages happens in the same writev, which can potentially make the migration faster. Using channel-socket as example, this works well with the default copying mechanism of sendmsg(), but with zero-copy-send=true, it will cause the migration to often break. This happens because the header packet buffer gets reused quite often, and there is a high chance that by the time the MSG_ZEROCOPY mechanism get to send the buffer, it has already changed, sending the wrong data and causing the migration to abort. It means that, as it is, the buffer for the header packet is not suitable for sending with MSG_ZEROCOPY. In order to enable zero copy for multifd, send the header packet on an individual write(), without any flags, and the remanining pages with a writev(), as it was happening before. This only changes how a migration with zero-copy-send=true works, not changing any current behavior for migrations with zero-copy-send=false. Signed-off-by: Leonardo Bras Reviewed-by: Peter Xu Reviewed-by: Daniel P. Berrangé --- migration/multifd.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 15fb668e64..07b2e92d8d 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -617,6 +617,7 @@ static void *multifd_send_thread(void *opaque) MultiFDSendParams *p = opaque; Error *local_err = NULL; int ret = 0; + bool use_zero_copy_send = migrate_use_zero_copy_send(); trace_multifd_send_thread_start(p->id); rcu_register_thread(); @@ -639,9 +640,14 @@ static void *multifd_send_thread(void *opaque) if (p->pending_job) { uint64_t packet_num = p->packet_num; uint32_t flags = p->flags; - p->iovs_num = 1; p->normal_num = 0; + if (use_zero_copy_send) { + p->iovs_num = 0; + } else { + p->iovs_num = 1; + } + for (int i = 0; i < p->pages->num; i++) { p->normal[p->normal_num] = p->pages->offset[i]; p->normal_num++; @@ -665,8 +671,19 @@ static void *multifd_send_thread(void *opaque) trace_multifd_send(p->id, packet_num, p->normal_num, flags, p->next_packet_size); - p->iov[0].iov_len = p->packet_len; - p->iov[0].iov_base = p->packet; + if (use_zero_copy_send) { + /* Send header first, without zerocopy */ + ret = qio_channel_write_all(p->c, (void *)p->packet, + p->packet_len, &local_err); + if (ret != 0) { + break; + } + + } else { + /* Send header using the same writev call */ + p->iov[0].iov_len = p->packet_len; + p->iov[0].iov_base = p->packet; + } ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num, &local_err); From patchwork Tue Apr 26 23:06:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12828058 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0B2DDC433EF for ; Tue, 26 Apr 2022 23:12:01 +0000 (UTC) Received: from localhost ([::1]:47094 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njULg-0002Hc-4N for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 19:12:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45426) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUI9-0003km-JP for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:52417) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njUI7-0000E3-Lm for qemu-devel@nongnu.org; Tue, 26 Apr 2022 19:08:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1651014499; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qv4CArNih3imukHMfqH5z6FvPrBaLOtFUJ4rxF55FdE=; b=dZ2w4gsS3cOzS8O1Gysy3xex4eKTdrhDvCG7bzowtdDccVVCKukIYV6AXQAFgZmkNbutam jbIr1m5KALozj+1UAPcjhmPlydX5YiIK4bUgocpa0sH/jrlYx+pb+Sv3HQaDbdt4hZOvlR 3ZtLFwmDxsYSvQ8zUgxD0drOfc1olB0= Received: from mail-oi1-f199.google.com (mail-oi1-f199.google.com [209.85.167.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-195-yl7PSPfzPnul3lROBiJdNg-1; Tue, 26 Apr 2022 19:08:17 -0400 X-MC-Unique: yl7PSPfzPnul3lROBiJdNg-1 Received: by mail-oi1-f199.google.com with SMTP id r65-20020acada44000000b002f9c653b942so97367oig.16 for ; Tue, 26 Apr 2022 16:08:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qv4CArNih3imukHMfqH5z6FvPrBaLOtFUJ4rxF55FdE=; b=MnAgS+l7obRRAY9b5m1a31VxqT3t0ZYZ50i33C7kUHX2fhbepxH/TnTWU3Smi0Rhgn 9afT3qMShTdLacYIenQGYSe/1jNYlrvEwonPpDYPKVIEWOibwAMyw3KxxSjkKl+ZJqPD Z2hgeP1q7E0ZEiKmVd6aCsge+6CWFjcLhjv5IjrF1vFMKYQIMDbNakh0xi8qOsciTzJf kaqRn47OxvND2qLAcLO4iENSOF5ArgOZOyKKyWiz9JfPtUfmcgRZv4bvZ0lGb7Z6wKWo sUgNIrUN3+eIydK+4ETwXgiKwgoXeyoCnHNw5BuQkTKALZ5oyFvo8HDXgnEAp/zUWdTF V2Jg== X-Gm-Message-State: AOAM53354jeKNxhBbBXW00BHnabLPcgPZukk1qMfhb+Pxhjqs3ZooH8l 6LFYWgo48o857J4low/nM2N0jvYIW7WZMkFrO1GzWMfyeWCvd97Sxh/jXpyrrx2GThhj9+gwXR4 Z0rcsmv+ATwtPdHI= X-Received: by 2002:a4a:94cc:0:b0:332:9bdf:8176 with SMTP id l12-20020a4a94cc000000b003329bdf8176mr9233556ooi.63.1651014496260; Tue, 26 Apr 2022 16:08:16 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxrS6TUxKSG5uXUmbkF0zNLr2U5Ua/FTtz0JSIsw/BYoxrAvfz0oARdjFdaQoGLs8E2YcS79A== X-Received: by 2002:a4a:94cc:0:b0:332:9bdf:8176 with SMTP id l12-20020a4a94cc000000b003329bdf8176mr9233539ooi.63.1651014496029; Tue, 26 Apr 2022 16:08:16 -0700 (PDT) Received: from LeoBras.redhat.com ([2804:431:c7f0:2ba0:92e8:26c9:ce7e:f03e]) by smtp.gmail.com with ESMTPSA id q4-20020a4a3004000000b0035e974ec923sm419855oof.2.2022.04.26.16.08.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Apr 2022 16:08:15 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster , Fam Zheng , Peter Xu Subject: [PATCH v10 7/7] multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Date: Tue, 26 Apr 2022 20:06:56 -0300 Message-Id: <20220426230654.637939-8-leobras@redhat.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220426230654.637939-1-leobras@redhat.com> References: <20220426230654.637939-1-leobras@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Implement zero copy send on nocomp_send_write(), by making use of QIOChannel writev + flags & flush interface. Change multifd_send_sync_main() so flush_zero_copy() can be called after each iteration in order to make sure all dirty pages are sent before a new iteration is started. It will also flush at the beginning and at the end of migration. Also make it return -1 if flush_zero_copy() fails, in order to cancel the migration process, and avoid resuming the guest in the target host without receiving all current RAM. This will work fine on RAM migration because the RAM pages are not usually freed, and there is no problem on changing the pages content between writev_zero_copy() and the actual sending of the buffer, because this change will dirty the page and cause it to be re-sent on a next iteration anyway. A lot of locked memory may be needed in order to use multifd migration with zero-copy enabled, so disabling the feature should be necessary for low-privileged users trying to perform multifd migrations. Signed-off-by: Leonardo Bras Reviewed-by: Peter Xu Reviewed-by: Daniel P. Berrangé --- migration/multifd.h | 2 ++ migration/migration.c | 11 ++++++++++- migration/multifd.c | 37 +++++++++++++++++++++++++++++++++++-- migration/socket.c | 5 +++-- 4 files changed, 50 insertions(+), 5 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index bcf5992945..4d8d89e5e5 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -92,6 +92,8 @@ typedef struct { uint32_t packet_len; /* pointer to the packet */ MultiFDPacket_t *packet; + /* multifd flags for sending ram */ + int write_flags; /* multifd flags for each packet */ uint32_t flags; /* size of the next packet that contains pages */ diff --git a/migration/migration.c b/migration/migration.c index 4b6df2eb5e..31739b2af9 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1497,7 +1497,16 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp) error_prepend(errp, "Invalid mapping given for block-bitmap-mapping: "); return false; } - +#ifdef CONFIG_LINUX + if (params->zero_copy_send && + (!migrate_use_multifd() || + params->multifd_compression != MULTIFD_COMPRESSION_NONE || + (params->tls_creds && *params->tls_creds))) { + error_setg(errp, + "Zero copy only available for non-compressed non-TLS multifd migration"); + return false; + } +#endif return true; } diff --git a/migration/multifd.c b/migration/multifd.c index 07b2e92d8d..00b4040eca 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -569,6 +569,7 @@ void multifd_save_cleanup(void) int multifd_send_sync_main(QEMUFile *f) { int i; + bool flush_zero_copy; if (!migrate_use_multifd()) { return 0; @@ -579,6 +580,20 @@ int multifd_send_sync_main(QEMUFile *f) return -1; } } + + /* + * When using zero-copy, it's necessary to flush the pages before any of + * the pages can be sent again, so we'll make sure the new version of the + * pages will always arrive _later_ than the old pages. + * + * Currently we achieve this by flushing the zero-page requested writes + * per ram iteration, but in the future we could potentially optimize it + * to be less frequent, e.g. only after we finished one whole scanning of + * all the dirty bitmaps. + */ + + flush_zero_copy = migrate_use_zero_copy_send(); + for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; @@ -600,6 +615,17 @@ int multifd_send_sync_main(QEMUFile *f) ram_counters.transferred += p->packet_len; qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); + + if (flush_zero_copy && p->c) { + int ret; + Error *err = NULL; + + ret = qio_channel_flush(p->c, &err); + if (ret < 0) { + error_report_err(err); + return -1; + } + } } for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; @@ -685,8 +711,8 @@ static void *multifd_send_thread(void *opaque) p->iov[0].iov_base = p->packet; } - ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num, - &local_err); + ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL, + 0, p->write_flags, &local_err); if (ret != 0) { break; } @@ -914,6 +940,13 @@ int multifd_save_setup(Error **errp) /* We need one extra place for the packet header */ p->iov = g_new0(struct iovec, page_count + 1); p->normal = g_new0(ram_addr_t, page_count); + + if (migrate_use_zero_copy_send()) { + p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY; + } else { + p->write_flags = 0; + } + socket_send_channel_create(multifd_new_send_channel_async, p); } diff --git a/migration/socket.c b/migration/socket.c index 3754d8f72c..4fd5e85f50 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -79,8 +79,9 @@ static void socket_outgoing_migration(QIOTask *task, trace_migration_socket_outgoing_connected(data->hostname); - if (migrate_use_zero_copy_send()) { - error_setg(&err, "Zero copy send not available in migration"); + if (migrate_use_zero_copy_send() && + !qio_channel_has_feature(sioc, QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY)) { + error_setg(&err, "Zero copy send feature not detected in host kernel"); } out: