From patchwork Wed Sep 22 22:24:21 2021
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12511527
From: Leonardo Bras
To: Daniel P. Berrangé, Juan Quintela, "Dr. David Alan Gilbert", Peter Xu, Jason Wang
Cc: Leonardo Bras, qemu-devel@nongnu.org
Subject: [PATCH v3 1/3] QIOChannel: Add io_async_writev & io_async_flush callbacks
Date: Wed, 22 Sep 2021 19:24:21 -0300
Message-Id: <20210922222423.644444-2-leobras@redhat.com>
In-Reply-To: <20210922222423.644444-1-leobras@redhat.com>
References: <20210922222423.644444-1-leobras@redhat.com>

Add io_async_writev and io_async_flush as optional callbacks in
QIOChannelClass, allowing subclasses to implement asynchronous writes.

How to use them:
- Write data using qio_channel_async_writev(),
- Wait for write completion with qio_channel_async_flush().
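
The write-then-flush usage above, together with the optional-callback dispatch this patch introduces, can be sketched with a toy channel. This is only an illustration under stated assumptions: ToyChannel and every toy_* name are hypothetical stand-ins, not part of QEMU, and the "asynchronous" backend simply queues byte counts until it is flushed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a channel class: one mandatory, two optional callbacks. */
typedef struct ToyChannel ToyChannel;
struct ToyChannel {
    long (*writev)(ToyChannel *c, const char *buf, size_t len);       /* mandatory */
    long (*async_writev)(ToyChannel *c, const char *buf, size_t len); /* optional, may be NULL */
    void (*async_flush)(ToyChannel *c);                               /* optional, may be NULL */
    size_t sent;    /* bytes known to have left the channel */
    size_t pending; /* bytes queued but not yet confirmed */
};

/* Synchronous write: the data counts as sent when the call returns. */
static long toy_sync_writev(ToyChannel *c, const char *buf, size_t len)
{
    (void)buf;
    c->sent += len;
    return (long)len;
}

/* Asynchronous write: the data is only queued; the buffer must stay untouched. */
static long toy_queue_writev(ToyChannel *c, const char *buf, size_t len)
{
    (void)buf;
    c->pending += len;
    return (long)len;
}

/* Flush: everything queued so far is confirmed as sent. */
static void toy_flush(ToyChannel *c)
{
    c->sent += c->pending;
    c->pending = 0;
}

/* Dispatch: use the async callback if present, else fall back to the sync one. */
static long toy_channel_async_writev(ToyChannel *c, const char *buf, size_t len)
{
    assert(c->writev != NULL);
    if (!c->async_writev) {
        return c->writev(c, buf, len);
    }
    return c->async_writev(c, buf, len);
}

/* Dispatch: a missing flush callback means there is nothing to wait for. */
static void toy_channel_async_flush(ToyChannel *c)
{
    if (c->async_flush) {
        c->async_flush(c);
    }
}

/* Usage pattern: write, then flush before the buffer is reused or freed. */
static size_t toy_send_and_flush(ToyChannel *c, const char *msg)
{
    toy_channel_async_writev(c, msg, strlen(msg));
    toy_channel_async_flush(c);
    return c->sent;
}
```

A caller written against the async entry points behaves the same on a channel that only provides the synchronous callback, which is the point of making the new callbacks optional.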

Notes:
Some asynchronous implementations may benefit from zerocopy mechanisms, so
it's recommended to keep the write buffer untouched until
qio_channel_async_flush() returns.

As the new callbacks are optional, if a subclass does not implement them
there will be a fallback to the mandatory synchronous implementation:
- io_async_writev will fall back to io_writev,
- io_async_flush will return without changing anything.
This makes it simpler for the user to make use of the asynchronous
implementation.

Also, some functions like qio_channel_writev_full_all() were adapted to
offer an async version, and make better use of the new callbacks.

Signed-off-by: Leonardo Bras
---
 include/io/channel.h | 93 +++++++++++++++++++++++++++++++++++++-------
 io/channel.c         | 66 ++++++++++++++++++++++++-------
 2 files changed, 129 insertions(+), 30 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 88988979f8..74f2e3ae8a 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -136,6 +136,14 @@ struct QIOChannelClass {
                                   IOHandler *io_read,
                                   IOHandler *io_write,
                                   void *opaque);
+    ssize_t (*io_async_writev)(QIOChannel *ioc,
+                               const struct iovec *iov,
+                               size_t niov,
+                               int *fds,
+                               size_t nfds,
+                               Error **errp);
+    void (*io_async_flush)(QIOChannel *ioc,
+                           Error **errp);
 };
 
 /* General I/O handling functions */
@@ -255,12 +263,17 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
  * or QIO_CHANNEL_ERR_BLOCK if no data is can be sent
  * and the channel is non-blocking
  */
-ssize_t qio_channel_writev_full(QIOChannel *ioc,
-                                const struct iovec *iov,
-                                size_t niov,
-                                int *fds,
-                                size_t nfds,
-                                Error **errp);
+ssize_t __qio_channel_writev_full(QIOChannel *ioc,
+                                  const struct iovec *iov,
+                                  size_t niov,
+                                  int *fds,
+                                  size_t nfds,
+                                  bool async,
+                                  Error **errp);
+#define qio_channel_writev_full(ioc, iov, niov, fds, nfds, errp) \
+    __qio_channel_writev_full(ioc, iov, niov, fds, nfds, false, errp)
+#define qio_channel_async_writev_full(ioc, iov, niov, fds, nfds, errp) \
+    __qio_channel_writev_full(ioc, iov, niov, fds, nfds, true, errp)
 
 /**
  * qio_channel_readv_all_eof:
@@ -339,10 +352,15 @@ int qio_channel_readv_all(QIOChannel *ioc,
  *
  * Returns: 0 if all bytes were written, or -1 on error
  */
-int qio_channel_writev_all(QIOChannel *ioc,
-                           const struct iovec *iov,
-                           size_t niov,
-                           Error **erp);
+int __qio_channel_writev_all(QIOChannel *ioc,
+                             const struct iovec *iov,
+                             size_t niov,
+                             bool async,
+                             Error **erp);
+#define qio_channel_writev_all(ioc, iov, niov, erp) \
+    __qio_channel_writev_all(ioc, iov, niov, false, erp)
+#define qio_channel_async_writev_all(ioc, iov, niov, erp) \
+    __qio_channel_writev_all(ioc, iov, niov, true, erp)
 
 /**
  * qio_channel_readv:
@@ -849,10 +867,55 @@ int qio_channel_readv_full_all(QIOChannel *ioc,
  * Returns: 0 if all bytes were written, or -1 on error
  */
 
-int qio_channel_writev_full_all(QIOChannel *ioc,
-                                const struct iovec *iov,
-                                size_t niov,
-                                int *fds, size_t nfds,
-                                Error **errp);
+int __qio_channel_writev_full_all(QIOChannel *ioc,
+                                  const struct iovec *iov,
+                                  size_t niov,
+                                  int *fds, size_t nfds,
+                                  bool async, Error **errp);
+#define qio_channel_writev_full_all(ioc, iov, niov, fds, nfds, errp) \
+    __qio_channel_writev_full_all(ioc, iov, niov, fds, nfds, false, errp)
+#define qio_channel_async_writev_full_all(ioc, iov, niov, fds, nfds, errp) \
+    __qio_channel_writev_full_all(ioc, iov, niov, fds, nfds, true, errp)
+
+/**
+ * qio_channel_async_writev:
+ * @ioc: the channel object
+ * @iov: the array of memory regions to write data from
+ * @niov: the length of the @iov array
+ * @fds: an array of file handles to send
+ * @nfds: number of file handles in @fds
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Behaves like qio_channel_writev_full, but will send
+ * data asynchronously, meaning this function
+ * may return before the data is actually sent.
+ *
+ * If at some point it's necessary to wait for all data to be
+ * sent, use qio_channel_async_flush().
+ *
+ * If not implemented, falls back to the default writev
+ */
+
+ssize_t qio_channel_async_writev(QIOChannel *ioc,
+                                 const struct iovec *iov,
+                                 size_t niov,
+                                 int *fds,
+                                 size_t nfds,
+                                 Error **errp);
+
+/**
+ * qio_channel_async_flush:
+ * @ioc: the channel object
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Will block until every packet queued with qio_channel_async_writev()
+ * is sent.
+ *
+ * If not implemented, returns without changing anything.
+ */
+
+void qio_channel_async_flush(QIOChannel *ioc,
+                             Error **errp);
+
 
 #endif /* QIO_CHANNEL_H */
diff --git a/io/channel.c b/io/channel.c
index e8b019dc36..c4819b922f 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -67,12 +67,13 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
 }
 
 
-ssize_t qio_channel_writev_full(QIOChannel *ioc,
-                                const struct iovec *iov,
-                                size_t niov,
-                                int *fds,
-                                size_t nfds,
-                                Error **errp)
+ssize_t __qio_channel_writev_full(QIOChannel *ioc,
+                                  const struct iovec *iov,
+                                  size_t niov,
+                                  int *fds,
+                                  size_t nfds,
+                                  bool async,
+                                  Error **errp)
 {
     QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
 
@@ -83,6 +84,10 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
         return -1;
     }
 
+    if (async && klass->io_async_writev) {
+        return klass->io_async_writev(ioc, iov, niov, fds, nfds, errp);
+    }
+
     return klass->io_writev(ioc, iov, niov, fds, nfds, errp);
 }
 
@@ -212,19 +217,20 @@ int qio_channel_readv_full_all(QIOChannel *ioc,
     return ret;
 }
 
-int qio_channel_writev_all(QIOChannel *ioc,
-                           const struct iovec *iov,
-                           size_t niov,
-                           Error **errp)
+int __qio_channel_writev_all(QIOChannel *ioc,
+                             const struct iovec *iov,
+                             size_t niov,
+                             bool async,
+                             Error **errp)
 {
-    return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp);
+    return __qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, async, errp);
 }
 
-int qio_channel_writev_full_all(QIOChannel *ioc,
+int __qio_channel_writev_full_all(QIOChannel *ioc,
                                 const struct iovec *iov,
                                 size_t niov,
                                 int *fds, size_t nfds,
-                                Error **errp)
+                                bool async, Error **errp)
 {
     int ret = -1;
     struct iovec *local_iov = g_new(struct iovec, niov);
@@ -237,8 +243,8 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
 
     while (nlocal_iov > 0) {
         ssize_t len;
-        len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds,
-                                      errp);
+        len = __qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds,
+                                        async, errp);
         if (len == QIO_CHANNEL_ERR_BLOCK) {
             if (qemu_in_coroutine()) {
                 qio_channel_yield(ioc, G_IO_OUT);
@@ -474,6 +480,36 @@ off_t qio_channel_io_seek(QIOChannel *ioc,
 }
 
 
+ssize_t qio_channel_async_writev(QIOChannel *ioc,
+                                 const struct iovec *iov,
+                                 size_t niov,
+                                 int *fds,
+                                 size_t nfds,
+                                 Error **errp)
+{
+    QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+    if (!klass->io_async_writev) {
+        return klass->io_writev(ioc, iov, niov, fds, nfds, errp);
+    }
+
+    return klass->io_async_writev(ioc, iov, niov, fds, nfds, errp);
+}
+
+
+void qio_channel_async_flush(QIOChannel *ioc,
+                             Error **errp)
+{
+    QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+    if (!klass->io_async_flush) {
+        return;
+    }
+
+    klass->io_async_flush(ioc, errp);
+}
+
+
 static void qio_channel_restart_read(void *opaque)
 {
     QIOChannel *ioc = opaque;

From patchwork Wed Sep 22 22:24:22 2021
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12511531
From: Leonardo Bras
To: Daniel P. Berrangé, Juan Quintela, "Dr. David Alan Gilbert", Peter Xu, Jason Wang
Cc: Leonardo Bras, qemu-devel@nongnu.org
Subject: [PATCH v3 2/3] QIOChannelSocket: Implement io_async_write & io_async_flush
Date: Wed, 22 Sep 2021 19:24:22 -0300
Message-Id: <20210922222423.644444-3-leobras@redhat.com>
In-Reply-To: <20210922222423.644444-1-leobras@redhat.com>
References: <20210922222423.644444-1-leobras@redhat.com>

Implement the new optional callbacks io_async_writev and io_async_flush on
QIOChannelSocket, but enable them only when the MSG_ZEROCOPY feature is
available in the host kernel and TCP sockets are used.

The contents of qio_channel_socket_writev() were moved to a helper function,
__qio_channel_socket_writev(), which accepts an extra 'flags' argument. This
helper is used to implement qio_channel_socket_writev() with flags = 0,
keeping its behavior unchanged, and qio_channel_socket_async_writev() with
flags = MSG_ZEROCOPY.

qio_channel_socket_async_flush() is implemented by reading the socket's
error queue, which carries information on MSG_ZEROCOPY send completion.
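
The completion accounting behind the flush can be sketched in isolation. Per the kernel's MSG_ZEROCOPY documentation, each error-queue notification reports an inclusive range of completed send calls as 32-bit counters in ee_info (low) and ee_data (high), and the counter may wrap. The sketch below is an assumption-labeled illustration, not the patch's code: ZcTracker and the zc_* helpers are hypothetical names, and no socket is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker mirroring the idea of async_queued / async_sent. */
typedef struct {
    uint64_t queued; /* zerocopy sends issued so far */
    uint64_t sent;   /* zerocopy sends confirmed via the error queue */
} ZcTracker;

/*
 * Number of sends covered by one notification whose inclusive range is
 * [lo, hi]. Unsigned 32-bit arithmetic keeps the result correct even when
 * the kernel counter wraps between lo and hi.
 */
static uint32_t zc_range_count(uint32_t lo, uint32_t hi)
{
    return (uint32_t)(hi - lo + 1u);
}

/* Record that one more zerocopy send was issued. */
static void zc_note_queued(ZcTracker *t)
{
    t->queued++;
}

/* Record one completion notification covering sends lo..hi. */
static void zc_note_completed(ZcTracker *t, uint32_t lo, uint32_t hi)
{
    t->sent += zc_range_count(lo, hi);
    assert(t->sent <= t->queued);
}

/* A flush may stop waiting once every queued send has been confirmed. */
static bool zc_flush_done(const ZcTracker *t)
{
    return t->sent >= t->queued;
}
```

Counting by range matters because the kernel may coalesce several completions into a single notification, so one recvmsg() on the error queue can confirm more than one send.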

There is no need to worry about re-sending packets in case any error
happens, as MSG_ZEROCOPY only works with TCP and it will re-transmit if any
error occurs.

Notes on using async_writev():
- As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid
  copying, some caution is necessary to avoid overwriting any buffer before
  it's sent. If something like this happens, a newer version of the buffer
  may be sent instead.
- If this is a problem, it's recommended to use async_flush() before freeing
  or re-using the buffer.
- When using MSG_ZEROCOPY, the buffer memory will be locked, so it may
  require a larger amount than is usually available to a non-root user.
- If the required amount of locked memory is not available, it falls back to
  buffer copying behavior, and synchronous sending.

Signed-off-by: Leonardo Bras
---
 include/io/channel-socket.h |   2 +
 include/io/channel.h        |   1 +
 io/channel-socket.c         | 176 ++++++++++++++++++++++++++++++++++--
 3 files changed, 169 insertions(+), 10 deletions(-)

diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
index e747e63514..4d1be0637a 100644
--- a/include/io/channel-socket.h
+++ b/include/io/channel-socket.h
@@ -47,6 +47,8 @@ struct QIOChannelSocket {
     socklen_t localAddrLen;
     struct sockaddr_storage remoteAddr;
     socklen_t remoteAddrLen;
+    ssize_t async_queued;
+    ssize_t async_sent;
 };
 
 
diff --git a/include/io/channel.h b/include/io/channel.h
index 74f2e3ae8a..611bb2ea26 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -31,6 +31,7 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
 
 
 #define QIO_CHANNEL_ERR_BLOCK -2
+#define QIO_CHANNEL_ERR_NOBUFS -3
 
 typedef enum QIOChannelFeature QIOChannelFeature;
 
diff --git a/io/channel-socket.c b/io/channel-socket.c
index 606ec97cf7..c67832d0bb 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -26,9 +26,23 @@
 #include "io/channel-watch.h"
 #include "trace.h"
 #include "qapi/clone-visitor.h"
+#ifdef CONFIG_LINUX
+#include
+#include
+#endif
 
 #define SOCKET_MAX_FDS 16
 
+static ssize_t qio_channel_socket_async_writev(QIOChannel *ioc,
+                                               const struct iovec *iov,
+                                               size_t niov,
+                                               int *fds,
+                                               size_t nfds,
+                                               Error **errp);
+
+static void qio_channel_socket_async_flush(QIOChannel *ioc,
+                                           Error **errp);
+
 SocketAddress *
 qio_channel_socket_get_local_address(QIOChannelSocket *ioc,
                                      Error **errp)
@@ -55,6 +69,8 @@ qio_channel_socket_new(void)
 
     sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET));
     sioc->fd = -1;
+    sioc->async_queued = 0;
+    sioc->async_sent = 0;
 
     ioc = QIO_CHANNEL(sioc);
     qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
@@ -140,6 +156,7 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
                                     Error **errp)
 {
     int fd;
+    int ret, v = 1;
 
     trace_qio_channel_socket_connect_sync(ioc, addr);
     fd = socket_connect(addr, errp);
@@ -154,6 +171,19 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
         return -1;
     }
 
+#ifdef CONFIG_LINUX
+    if (addr->type != SOCKET_ADDRESS_TYPE_INET) {
+        return 0;
+    }
+
+    ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
+    if (ret >= 0) {
+        QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+        klass->io_async_writev = qio_channel_socket_async_writev;
+        klass->io_async_flush = qio_channel_socket_async_flush;
+    }
+#endif
+
     return 0;
 }
 
@@ -520,12 +550,13 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
     return ret;
 }
 
-static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
-                                         const struct iovec *iov,
-                                         size_t niov,
-                                         int *fds,
-                                         size_t nfds,
-                                         Error **errp)
+static ssize_t __qio_channel_socket_writev(QIOChannel *ioc,
+                                           const struct iovec *iov,
+                                           size_t niov,
+                                           int *fds,
+                                           size_t nfds,
+                                           int flags,
+                                           Error **errp)
 {
     QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
     ssize_t ret;
@@ -558,20 +589,145 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
     }
 
  retry:
-    ret = sendmsg(sioc->fd, &msg, 0);
+    ret = sendmsg(sioc->fd, &msg, flags);
     if (ret <= 0) {
-        if (errno == EAGAIN) {
+        switch (errno) {
+        case EAGAIN:
             return QIO_CHANNEL_ERR_BLOCK;
-        }
-        if (errno == EINTR) {
+        case EINTR:
             goto retry;
+        case ENOBUFS:
+            return QIO_CHANNEL_ERR_NOBUFS;
         }
+
         error_setg_errno(errp, errno,
                          "Unable to write to socket");
         return -1;
     }
     return ret;
 }
+
+static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
+                                         const struct iovec *iov,
+                                         size_t niov,
+                                         int *fds,
+                                         size_t nfds,
+                                         Error **errp)
+{
+    return __qio_channel_socket_writev(ioc, iov, niov, fds, nfds, 0, errp);
+}
+
+static ssize_t qio_channel_socket_async_writev(QIOChannel *ioc,
+                                               const struct iovec *iov,
+                                               size_t niov,
+                                               int *fds,
+                                               size_t nfds,
+                                               Error **errp)
+{
+    QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+    ssize_t ret;
+
+    sioc->async_queued++;
+
+    ret = __qio_channel_socket_writev(ioc, iov, niov, fds, nfds, MSG_ZEROCOPY,
+                                      errp);
+    if (ret == QIO_CHANNEL_ERR_NOBUFS) {
+        /*
+         * Not enough locked memory available to the process.
+         * Fall back to default sync callback.
+         */
+
+        if (errp && *errp) {
+            warn_reportf_err(*errp,
+                "Process can't lock enough memory for using MSG_ZEROCOPY, "
+                "falling back to non-zerocopy");
+        }
+
+        QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+        klass->io_async_writev = NULL;
+        klass->io_async_flush = NULL;
+
+        /* Re-send current buffer */
+        ret = qio_channel_socket_writev(ioc, iov, niov, fds, nfds, errp);
+    }
+
+    return ret;
+}
+
+
+static void qio_channel_socket_async_flush(QIOChannel *ioc,
+                                           Error **errp)
+{
+    QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+    struct msghdr msg = {};
+    struct pollfd pfd;
+    struct sock_extended_err *serr;
+    struct cmsghdr *cm;
+    char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)];
+    int ret;
+
+    memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS));
+    msg.msg_control = control;
+    msg.msg_controllen = sizeof(control);
+
+    while (sioc->async_sent < sioc->async_queued) {
+        ret = recvmsg(sioc->fd, &msg, MSG_ERRQUEUE);
+        if (ret < 0) {
+            if (errno == EAGAIN) {
+                /* Nothing on errqueue, wait */
+                pfd.fd = sioc->fd;
+                pfd.events = 0;
+                ret = poll(&pfd, 1, 250);
+                if (ret == 0) {
+                    /*
+                     * Timeout: After 250ms without receiving any zerocopy
+                     * notification, consider all data as sent.
+                     */
+                    break;
+                } else if (ret < 0 ||
+                           (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))) {
+                    error_setg_errno(errp, errno,
+                                     "Poll error");
+                    break;
+                } else {
+                    continue;
+                }
+            }
+            if (errno == EINTR) {
+                continue;
+            }
+
+            error_setg_errno(errp, errno,
+                             "Unable to read errqueue");
+            break;
+        }
+
+        cm = CMSG_FIRSTHDR(&msg);
+        if (cm->cmsg_level != SOL_IP ||
+            cm->cmsg_type != IP_RECVERR) {
+            error_setg_errno(errp, EPROTOTYPE,
+                             "Wrong cmsg in errqueue");
+            break;
+        }
+
+        serr = (void *) CMSG_DATA(cm);
+        if (serr->ee_errno != 0) {
+            error_setg_errno(errp, serr->ee_errno,
+                             "Error on socket");
+            break;
+        }
+        if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) {
+            error_setg_errno(errp, serr->ee_origin,
+                             "Error not from zerocopy");
+            break;
+        }
+
+        /* No errors, count sent ids */
+        sioc->async_sent += serr->ee_data - serr->ee_info + 1;
+    }
+}
+
+
 #else /* WIN32 */
 static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
                                         const struct iovec *iov,

From patchwork Wed Sep 22 22:24:23 2021
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12511529
From: Leonardo Bras
To: Daniel P. Berrangé, Juan Quintela, "Dr. David Alan Gilbert", Peter Xu, Jason Wang
Cc: Leonardo Bras, qemu-devel@nongnu.org
Subject: [PATCH v3 3/3] multifd: Send using asynchronous write on nocomp to send RAM pages.
Date: Wed, 22 Sep 2021 19:24:23 -0300
Message-Id: <20210922222423.644444-4-leobras@redhat.com>
In-Reply-To: <20210922222423.644444-1-leobras@redhat.com>
References: <20210922222423.644444-1-leobras@redhat.com>

Change the multifd nocomp version to use asynchronous writes for RAM pages,
and benefit from MSG_ZEROCOPY when it's available.

The asynchronous flush happens on cleanup only, before destroying the
QIOChannel.

This will work fine on RAM migration because the RAM pages are not usually
freed, and there is no problem in changing a page's content between
async_send() and the actual sending of the buffer, because this change will
dirty the page and cause it to be re-sent in a later iteration anyway.

Signed-off-by: Leonardo Bras
---
 migration/multifd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 377da78f5b..d247207a0a 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -105,7 +105,7 @@ static int nocomp_send_prepare(MultiFDSendParams *p, uint32_t used,
  */
 static int nocomp_send_write(MultiFDSendParams *p, uint32_t used, Error **errp)
 {
-    return qio_channel_writev_all(p->c, p->pages->iov, used, errp);
+    return qio_channel_async_writev_all(p->c, p->pages->iov, used, errp);
 }
 
 /**
@@ -546,6 +546,7 @@ void multifd_save_cleanup(void)
         MultiFDSendParams *p = &multifd_send_state->params[i];
         Error *local_err = NULL;
 
+        qio_channel_async_flush(p->c, NULL);
         socket_send_channel_destroy(p->c);
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);