From patchwork Mon Feb 6 06:53:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36AE7C61DA4 for ; Mon, 6 Feb 2023 06:53:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229488AbjBFGxe (ORCPT ); Mon, 6 Feb 2023 01:53:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229796AbjBFGxa (ORCPT ); Mon, 6 Feb 2023 01:53:30 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6CAD125AF; Sun, 5 Feb 2023 22:53:24 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 0C29E5FD03; Mon, 6 Feb 2023 09:53:23 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666403; bh=b+WXxyeJ81HGb6IFMi06G8yU3v92PAG7EK/RD2ptgY8=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=Aa8jT9wNyLowGiBAYqbhMsZncW4BQel/ZsVMj+rtsxCalYLEn+9h3IzlomxOeIl6Q xWBkNEpPXf0bWbhmWWddEVEB1EmihB60lEkScFU0rwbXILzycJ41n2ALtQ89G8xd8A iw0ZCvQ2BPBT44W5w8P3tqws/cF1KgXd5QaVrEqJ3qIvEaJ7waEXRkjGMsglXbuuSA Oc+zFjuqgiS4M/YKm4aQ+pr3vgT/M1gEqChKMwkaYXXK1pawB31PcWb8MmUq2q9obK rPJN3HehJ3K4xYotwZewWOqxWM0/vHuBusxFgAP+ZAtmKvvrCI/DbGnZATHN28V3sU NLfj5iCOypX5A== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:53:22 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 01/12] vsock: check error queue to set EPOLLERR Thread-Topic: [RFC PATCH v1 01/12] vsock: check error queue to set EPOLLERR Thread-Index: AQHZOfe0xY93Ok9p00a/1VHxin2e2Q== Date: Mon, 6 Feb 2023 06:53:22 +0000 Message-ID: <17a276d3-1112-3431-2a33-c17f3da67470@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <0000EB679691B848B3AE40385576936D@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If socket's error queue is not empty, EPOLLERR must be set. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/af_vsock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 19aea7cba26e..b5e51ef4a74c 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1026,7 +1026,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock, poll_wait(file, sk_sleep(sk), wait); mask = 0; - if (sk->sk_err) + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) /* Signify that there has been an error on this socket. */ mask |= EPOLLERR; From patchwork Mon Feb 6 06:54:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B22CBC61DA4 for ; Mon, 6 Feb 2023 06:56:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229717AbjBFG4Y (ORCPT ); Mon, 6 Feb 2023 01:56:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229490AbjBFG4X (ORCPT ); Mon, 6 Feb 2023 01:56:23 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B268A1C334; Sun, 5 Feb 2023 22:55:41 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id EFD595FD03; Mon, 6 Feb 2023 09:54:51 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666492; bh=O6cNhFQfaQVbXViC73z339ui0JNnZlXbWLyqE8Ijrl8=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=iyMC2qGOlxJHnyrtSHhiLmBbkFt1ReQMWF6JCPtKx1U3BHXEZzSg/vvvrhTRxOBHQ SARUIdXkMVs4hgCp+RjZT3HFbSIn7thgeGXq1S7Q9/73C3ae4bGoc8m7qroQIysiXe gUK6ZTwdE0IeB4CNRQJ6YiovOZSIIl+4gD00vpovkGqSVFToVlvhdrt2csc81sOvfr ahuKwjdBHwzkx5hsGImwK2Z1f+/R5LgAldHJTjzI7pnuQ5xSG8cKpKwFDqWWkvNfK4 9vAtQnpkZQf9Rsyol7bryGyBe8zVDCwJ7uqjcdUvzHYFT/+q1Ke+wyYMh5fkRM0BjV qe9nS0kFeRteA== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:54:51 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 02/12] vsock: read from socket's error queue Thread-Topic: [RFC PATCH v1 02/12] vsock: read from socket's error queue Thread-Index: AQHZOffoRZzBrg9vIUiTEqCzeANM9A== Date: Mon, 6 Feb 2023 06:54:51 +0000 Message-ID: <7b2f00ce-296c-3f59-9861-468c6340300e@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <2E161FFF127651478A2DC3B377174E34@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds handling of MSG_ERRQUEUE input flag for receive call, thus skb from socket's error queue is read. Signed-off-by: Arseniy Krasnov --- include/linux/socket.h | 1 + net/vmw_vsock/af_vsock.c | 26 ++++++++++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/include/linux/socket.h b/include/linux/socket.h index 13c3a237b9c9..19a6f39fa014 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -379,6 +379,7 @@ struct ucred { #define SOL_MPTCP 284 #define SOL_MCTP 285 #define SOL_SMC 286 +#define SOL_VSOCK 287 /* IPX options */ #define IPX_TYPE 1 diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index b5e51ef4a74c..f752b30b71d6 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -110,6 +110,7 @@ #include #include #include +#include static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr); static void vsock_sk_destruct(struct sock *sk); @@ -2086,6 +2087,27 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg, return err; } +static int vsock_err_recvmsg(struct sock *sk, struct msghdr *msg) +{ + struct sock_extended_err *ee; + struct sk_buff *skb; + int err; + + lock_sock(sk); + skb = sock_dequeue_err_skb(sk); + release_sock(sk); + + if (!skb) + return -EAGAIN; + + ee = &SKB_EXT_ERR(skb)->ee; + err = put_cmsg(msg, SOL_VSOCK, 0, sizeof(*ee), ee); + msg->msg_flags |= MSG_ERRQUEUE; + consume_skb(skb); + + return err; +} + static int vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) @@ -2096,6 +2118,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int err; sk = sock->sk; + + if (unlikely(flags & MSG_ERRQUEUE)) + return vsock_err_recvmsg(sk, msg); + vsk = vsock_sk(sk); err = 0; From patchwork Mon Feb 6 06:55:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129374 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47208C61DA4 for ; Mon, 6 Feb 2023 06:56:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229797AbjBFG4s (ORCPT ); Mon, 6 Feb 2023 01:56:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229750AbjBFG4r (ORCPT ); Mon, 6 Feb 2023 01:56:47 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB7F01BAFE; Sun, 5 Feb 2023 22:56:17 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 0A8535FD0A; Mon, 6 Feb 2023 09:55:47 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666547; bh=jKp9H6Gl8+dpKJMrNsPP4bgIPg00ur5gUBmdQBd30+A=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=myKw0kJTJWPZcUUzr+ic9NSsNN78CsbnOLPu9EvN6QY+EwKU30KYqP3PP3UQnydSy 2AtycgAaILRUybm61fyYNyv22cgvUKRD0Pgj+4LVIv4QllAEu8YnC/QxAiw6Q7AOne K7MR6Ha/2gzXYIizt54FPfllLYLM1mtXEo4a6w0Fv5d+2QmzcOYA1AkRtOfwvCVdvl Tia/7spzmxrC4p9vYDQ71nha5tRl84aPNaVpQGxNOfzLfl3IZQ/UPvl5YWMSWsWl/K ITraHe4ZRnWomtJCGXATOP9LcBTEUMIbUWQEyXeDOGUzzkX7RC+j4hHaIhHjflUdbp LlhPhj0eaufxw== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:55:47 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 03/12] vsock: check for MSG_ZEROCOPY support Thread-Topic: [RFC PATCH v1 03/12] vsock: check for MSG_ZEROCOPY support Thread-Index: AQHZOfgJ1YmcF1NMGkqzPZVVHI90wA== Date: Mon, 6 Feb 2023 06:55:46 +0000 Message-ID: In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <6875DB21EF0B2A4184A9CDEF0D4F4475@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This feature totally depends on transport, so if transport doesn't support it, return error. Signed-off-by: Arseniy Krasnov --- include/net/af_vsock.h | 2 ++ net/vmw_vsock/af_vsock.c | 7 +++++++ 2 files changed, 9 insertions(+) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 568a87c5e0d0..96d829004c81 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -173,6 +173,8 @@ struct vsock_transport { /* Addressing. */ u32 (*get_local_cid)(void); + + bool (*msgzerocopy_allow)(void); }; /**** CORE ****/ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f752b30b71d6..fb0fcb390113 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1788,6 +1788,13 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg, goto out; } + if (msg->msg_flags & MSG_ZEROCOPY && + (!transport->msgzerocopy_allow || + !transport->msgzerocopy_allow())) { + err = -EOPNOTSUPP; + goto out; + } + /* Wait for room in the produce queue to enqueue our user's data. */ timeout = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); From patchwork Mon Feb 6 06:57:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129375 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D05DC636D3 for ; Mon, 6 Feb 2023 06:58:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229808AbjBFG56 (ORCPT ); Mon, 6 Feb 2023 01:57:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229740AbjBFG54 (ORCPT ); Mon, 6 Feb 2023 01:57:56 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6BB421C586; Sun, 5 Feb 2023 22:57:24 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 8ADD55FD03; Mon, 6 Feb 2023 09:57:17 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666637; bh=Ech27EtHT9PBdnbJqZP/Jq1qwSOgf4mVJsXc+SOg5H0=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=ebrYb8QM7X3OghlBstj9jgA7q5ncRJHFpQxNUCWYHr0uC3tNjnt+KugnSSAxgWlpm ZFTgPn9gDaiwd4i25AurBM1c3zQKbZLmWelX8qMLY4QW/8SEma+uC6gXWUFCYq1oWg fQL61/Av9OvklhTGWnEuqrLlY5XuKMpfdvaB0eBfanZjP9uVZw5sA96oQS+7pZ6ol9 YepHowbDEEIKDZldGF0xieXaGnUoBTId/lnkHZA1cUgZ1MEPPK9NFuwNPE6Rlb5GjE VM10qT7GAcL4CSS97X4VYvwh9wOceKij3r6+yQkRwC5YdelzZ6NDJw3RhIERxsAdLx zeFIHvR8PCUEg== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:57:17 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 04/12] vhost/vsock: non-linear skb handling support Thread-Topic: [RFC PATCH v1 04/12] vhost/vsock: non-linear skb handling support Thread-Index: AQHZOfg/4Da0EZnl9kKVLokjsbh9oA== Date: Mon, 6 Feb 2023 06:57:16 +0000 Message-ID: In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <35D67462721A884E87BD8B0DE1EA7C7D@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds copying to guest's virtio buffers from non-linear skbs. Such skbs are created by protocol layer when MSG_ZEROCOPY flags is used. Signed-off-by: Arseniy Krasnov --- drivers/vhost/vsock.c | 56 ++++++++++++++++++++++++++++++++---- include/linux/virtio_vsock.h | 12 ++++++++ 2 files changed, 63 insertions(+), 5 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 1f3b89c885cc..60b9cafa3e31 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -86,6 +86,44 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid) return NULL; } +static int vhost_transport_copy_nonlinear_skb(struct sk_buff *skb, + struct iov_iter *iov_iter, + size_t len) +{ + size_t rest_len = len; + + while (rest_len && virtio_vsock_skb_has_frags(skb)) { + struct bio_vec *curr_vec; + size_t curr_vec_end; + size_t to_copy; + int curr_frag; + int curr_offs; + + curr_frag = VIRTIO_VSOCK_SKB_CB(skb)->curr_frag; + curr_offs = VIRTIO_VSOCK_SKB_CB(skb)->frag_off; + curr_vec = &skb_shinfo(skb)->frags[curr_frag]; + + curr_vec_end = curr_vec->bv_offset + curr_vec->bv_len; + to_copy = min(rest_len, (size_t)(curr_vec_end - curr_offs)); + + if (copy_page_to_iter(curr_vec->bv_page, curr_offs, + to_copy, iov_iter) != to_copy) + return -1; + + rest_len -= to_copy; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off += to_copy; + + if (VIRTIO_VSOCK_SKB_CB(skb)->frag_off == (curr_vec_end)) { + VIRTIO_VSOCK_SKB_CB(skb)->curr_frag++; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off = 0; + } + } + + skb->data_len -= len; + + return 0; +} + static void vhost_transport_do_send_pkt(struct vhost_vsock *vsock, struct vhost_virtqueue *vq) @@ -197,11 +235,19 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, break; } - nbytes = copy_to_iter(skb->data, payload_len, &iov_iter); - if (nbytes != payload_len) { - kfree_skb(skb); - vq_err(vq, "Faulted on copying pkt buf\n"); - break; + if (skb_is_nonlinear(skb)) { + if (vhost_transport_copy_nonlinear_skb(skb, &iov_iter, + payload_len)) { + vq_err(vq, "Faulted on copying pkt buf from page\n"); + break; + } + } else { + nbytes = copy_to_iter(skb->data, payload_len, &iov_iter); + if (nbytes != payload_len) { + kfree_skb(skb); + vq_err(vq, "Faulted on copying pkt buf\n"); + break; + } } /* Deliver to monitoring devices all packets that we diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index 3f9c16611306..e7efdb78ce6e 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -12,6 +12,10 @@ struct virtio_vsock_skb_cb { bool reply; bool tap_delivered; + /* Current fragment in 'frags' of skb. */ + u32 curr_frag; + /* Offset from 0 in current fragment. */ + u32 frag_off; }; #define VIRTIO_VSOCK_SKB_CB(skb) ((struct virtio_vsock_skb_cb *)((skb)->cb)) @@ -46,6 +50,14 @@ static inline void virtio_vsock_skb_clear_tap_delivered(struct sk_buff *skb) VIRTIO_VSOCK_SKB_CB(skb)->tap_delivered = false; } +static inline bool virtio_vsock_skb_has_frags(struct sk_buff *skb) +{ + if (!skb_is_nonlinear(skb)) + return false; + + return VIRTIO_VSOCK_SKB_CB(skb)->curr_frag != skb_shinfo(skb)->nr_frags; +} + static inline void virtio_vsock_skb_rx_put(struct sk_buff *skb) { u32 len; From patchwork Mon Feb 6 06:58:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129376 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A66FC61DA4 for ; Mon, 6 Feb 2023 06:58:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229815AbjBFG6y (ORCPT ); Mon, 6 Feb 2023 01:58:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229733AbjBFG6w (ORCPT ); Mon, 6 Feb 2023 01:58:52 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 412CF193F7; Sun, 5 Feb 2023 22:58:26 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 9C6BE5FD03; Mon, 6 Feb 2023 09:58:24 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666704; bh=gM89Sds61a/Ke4wkFmjf14TzPo7acoM8k0Ezfh/OfTE=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=c6de/MpSk/4xBtTgl+FzdDqL4UQ8O5WOgcgBI394fValuTI9p58AwowKSqDOZ/lXn QD9TyM4RcoP9LztoDZWlf7cdBMILvlcrQla9VlTO4SLx3WmmUGBPZBNTDJy6rT+kVB 3gvXKIeKzZYiyr3Dwr7FXUVthyCyugQnK8PP8VHd3IzPkaI4oaVdv/BtJ/nZusHnob NyyIO67bd2t0wrPJagdL8EuwKxYscc3iwc2Ag2i1zN5U56brTP7DoI4I9NmGpPSZdP Uryemcx0D+wdJq3E9LQH6QGAwheuKy+lLP08eXqQ8tucs4vQoxZ0StrIytrXrUOEpZ 9cQ86pcyNzbGg== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:58:24 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 05/12] vsock/virtio: non-linear skb support Thread-Topic: [RFC PATCH v1 05/12] vsock/virtio: non-linear skb support Thread-Index: AQHZOfhn3a4rtLPBJ0ikS/zrL9u2kw== Date: Mon, 6 Feb 2023 06:58:24 +0000 Message-ID: In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <4C7B0332DEEF004C81506DE1E7C21622@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use pages of non-linear skb as buffers in virtio tx queue. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport.c | 31 +++++++++++++++++++++++++------ 1 file changed, 25 insertions(+), 6 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 28b5a8e8e094..b8a7d6dc9f46 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -100,7 +100,8 @@ virtio_transport_send_pkt_work(struct work_struct *work) vq = vsock->vqs[VSOCK_VQ_TX]; for (;;) { - struct scatterlist hdr, buf, *sgs[2]; + struct scatterlist *sgs[MAX_SKB_FRAGS + 1]; + struct scatterlist bufs[MAX_SKB_FRAGS + 1]; int ret, in_sg = 0, out_sg = 0; struct sk_buff *skb; bool reply; @@ -111,12 +112,30 @@ virtio_transport_send_pkt_work(struct work_struct *work) virtio_transport_deliver_tap_pkt(skb); reply = virtio_vsock_skb_reply(skb); + sg_init_one(&bufs[0], virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb))); + sgs[out_sg++] = &bufs[0]; + + if (skb_is_nonlinear(skb)) { + int i; + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + struct page *data_page = skb_shinfo(skb)->frags[i].bv_page; + + /* We will use 'page_to_virt()' for userspace page here, + * because virtio layer will call 'virt_to_phys()' later + * to fill buffer descriptor. We don't touch memory at + * "virtual" address of this page. + */ + sg_init_one(&bufs[i + 1], + page_to_virt(data_page), PAGE_SIZE); + sgs[out_sg++] = &bufs[i + 1]; + } + } else { + if (skb->len > 0) { + sg_init_one(&bufs[1], skb->data, skb->len); + sgs[out_sg++] = &bufs[1]; + } - sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb))); - sgs[out_sg++] = &hdr; - if (skb->len > 0) { - sg_init_one(&buf, skb->data, skb->len); - sgs[out_sg++] = &buf; } ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL); From patchwork Mon Feb 6 06:59:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5681BC05027 for ; Mon, 6 Feb 2023 06:59:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229831AbjBFG72 (ORCPT ); Mon, 6 Feb 2023 01:59:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229745AbjBFG71 (ORCPT ); Mon, 6 Feb 2023 01:59:27 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2BFB130F0; Sun, 5 Feb 2023 22:59:23 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 934FF5FD03; Mon, 6 Feb 2023 09:59:21 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666761; bh=iyACDNtFEIsHltnXG+ySQX2fasSDQiF7Yuq0gz9yqsQ=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=d6MsAvATUavP20XbY45f8BnjkeT4HelwAa47K+EmXLQJBgIPLkEBtTxBabq9Wb49O CkAwduOWqsKVrRISbwq9xq1bI0hhhsrhmkX3Syapz6xLLI0Nj+JbXDD8e/JmhMqWOu BdjD/ZS58bZ3gFSGhn1MpjkSG2pN+WlC81X4zGqtisXSGQVgg+7mNSaxX9KJdRxQlX dVRNc5GTARksDDgyJG/FubDwmAeZ7qgOBtiq25zWRRHy8YiASBw0/IcsXSLb9q6fWH EP/qq0QIST6Pg+gjES8znN4UGtS/PxEUwUnVuH/q/AMnuEARRt/hKbMdH4/S8mOS07 GSq8FB6ac6cmg== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 09:59:21 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 06/12] vsock/virtio: non-linear skb handling for TAP dev Thread-Topic: [RFC PATCH v1 06/12] vsock/virtio: non-linear skb handling for TAP dev Thread-Index: AQHZOfiJxMkNoIhc8E+iz5K/W09Rxg== Date: Mon, 6 Feb 2023 06:59:21 +0000 Message-ID: In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <7A34F74EA281224884E6A4D1F6C61275@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org For TAP device new skb is created and data from the current skb is copied to it. This adds copying data from non-linear skb to new the skb. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport_common.c | 43 +++++++++++++++++++++++-- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index a1581c77cf84..05ce97b967ad 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -101,6 +101,39 @@ virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, return NULL; } +static void virtio_transport_copy_nonlinear_skb(struct sk_buff *skb, + void *dst, + size_t len) +{ + size_t rest_len = len; + + while (rest_len && virtio_vsock_skb_has_frags(skb)) { + struct bio_vec *curr_vec; + size_t curr_vec_end; + size_t to_copy; + int curr_frag; + int curr_offs; + + curr_frag = VIRTIO_VSOCK_SKB_CB(skb)->curr_frag; + curr_offs = VIRTIO_VSOCK_SKB_CB(skb)->frag_off; + curr_vec = &skb_shinfo(skb)->frags[curr_frag]; + + curr_vec_end = curr_vec->bv_offset + curr_vec->bv_len; + to_copy = min(rest_len, (size_t)(curr_vec_end - curr_offs)); + + memcpy(dst, page_to_virt(curr_vec->bv_page) + curr_offs, + to_copy); + + rest_len -= to_copy; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off += to_copy; + + if (VIRTIO_VSOCK_SKB_CB(skb)->frag_off == (curr_vec_end)) { + VIRTIO_VSOCK_SKB_CB(skb)->curr_frag++; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off = 0; + } + } +} + /* Packet capture */ static struct sk_buff *virtio_transport_build_skb(void *opaque) { @@ -109,7 +142,6 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) struct af_vsockmon_hdr *hdr; struct sk_buff *skb; size_t payload_len; - void *payload_buf; /* A packet could be split to fit the RX buffer, so we can retrieve * the payload length from the header and the buffer pointer taking @@ -117,7 +149,6 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) */ pkt_hdr = virtio_vsock_hdr(pkt); payload_len = pkt->len; - payload_buf = pkt->data; skb = alloc_skb(sizeof(*hdr) + sizeof(*pkt_hdr) + payload_len, GFP_ATOMIC); @@ -160,7 +191,13 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) skb_put_data(skb, pkt_hdr, sizeof(*pkt_hdr)); if (payload_len) { - skb_put_data(skb, payload_buf, payload_len); + if (skb_is_nonlinear(skb)) { + void *data = skb_put(skb, payload_len); + + virtio_transport_copy_nonlinear_skb(skb, data, payload_len); + } else { + skb_put_data(skb, pkt->data, payload_len); + } } return skb; From patchwork Mon Feb 6 07:00:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4552C05027 for ; Mon, 6 Feb 2023 07:00:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229880AbjBFHAl (ORCPT ); Mon, 6 Feb 2023 02:00:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55798 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229752AbjBFHAk (ORCPT ); Mon, 6 Feb 2023 02:00:40 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0C56B18F; Sun, 5 Feb 2023 23:00:38 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 55EDC5FD0A; Mon, 6 Feb 2023 10:00:36 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666836; bh=dB91PAv4fB7dF1IMO28IW2BHOJO6fnHOBgW4PbMZC18=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=ecKf+cpDCpdKt/SZPoMSmYWgo8ZPCG2ymHoQJfnFYA4HUXju785a/HqEOY3JiqPKs PH1m+9dgViO8Yn0wMzG+32mM3Vk/40rBX8eGY0zVV4VtjUn/FaU42kgi6WdEwluhTk Grbx1W1L5SSVJuC0CgtH1O78iWWd3GfiCuSWh65tsfqA+0eQpv9ggW5E3jeLHH4+oi HpEcSDWWyJbm9ndl2h/aousTiuRV0I7usP3j/XE30SAXRiz7z+8HKtvpmf3/eowj2c kn5ztr/9yA5LmHq0bjxq3SwzI+EXe9QiQDptFum6pLHBZKCLj3+dw9W/6skqjCMWwA gmAksoPm0LJOQ== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:00:36 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 07/12] vsock/virtio: MGS_ZEROCOPY flag support Thread-Topic: [RFC PATCH v1 07/12] vsock/virtio: MGS_ZEROCOPY flag support Thread-Index: AQHZOfi2sn4SgiUsYkKcLOif1R6btw== Date: Mon, 6 Feb 2023 07:00:35 +0000 Message-ID: <716333a1-d6d1-3dde-d04a-365d4a361bfe@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds main logic of MSG_ZEROCOPY flag processing for packet creation. When this flag is set and user's iov iterator fits for zerocopy transmission, call 'get_user_pages()' and add returned pages to the newly created skb. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport_common.c | 212 ++++++++++++++++++++++-- 1 file changed, 195 insertions(+), 17 deletions(-) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 05ce97b967ad..69e37f8a68a6 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -37,6 +37,169 @@ virtio_transport_get_ops(struct vsock_sock *vsk) return container_of(t, struct virtio_transport, transport); } +static int virtio_transport_can_zcopy(struct iov_iter *iov_iter, + size_t free_space) +{ + size_t pages; + int i; + + if (!iter_is_iovec(iov_iter)) + return -1; + + if (iov_iter->iov_offset) + return -1; + + /* We can't send whole iov. */ + if (free_space < iov_iter->count) + return -1; + + for (pages = 0, i = 0; i < iov_iter->nr_segs; i++) { + const struct iovec *iovec; + int pages_in_elem; + + iovec = &iov_iter->iov[i]; + + /* Base must be page aligned. */ + if (offset_in_page(iovec->iov_base)) + return -1; + + /* Only last element could have not page aligned size. */ + if (i != (iov_iter->nr_segs - 1)) { + if (offset_in_page(iovec->iov_len)) + return -1; + + pages_in_elem = iovec->iov_len >> PAGE_SHIFT; + } else { + pages_in_elem = round_up(iovec->iov_len, PAGE_SIZE); + pages_in_elem >>= PAGE_SHIFT; + } + + /* In case of user's pages - one page is one frag. */ + if (pages + pages_in_elem > MAX_SKB_FRAGS) + return -1; + + pages += pages_in_elem; + } + + return 0; +} + +static int virtio_transport_init_zcopy_skb(struct vsock_sock *vsk, + struct sk_buff *skb, + struct iov_iter *iter, + bool zerocopy) +{ + struct ubuf_info_msgzc *uarg_zc; + struct ubuf_info *uarg; + + uarg = msg_zerocopy_realloc(sk_vsock(vsk), + iov_length(iter->iov, iter->nr_segs), + NULL); + + if (!uarg) + return -1; + + uarg_zc = uarg_to_msgzc(uarg); + uarg_zc->zerocopy = zerocopy ? 1 : 0; + + skb_zcopy_init(skb, uarg); + + return 0; +} + +static int virtio_transport_fill_nonlinear_skb(struct sk_buff *skb, + struct vsock_sock *vsk, + struct virtio_vsock_pkt_info *info) +{ + struct iov_iter *iter; + int frag_idx; + int seg_idx; + + iter = &info->msg->msg_iter; + frag_idx = 0; + VIRTIO_VSOCK_SKB_CB(skb)->curr_frag = 0; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off = 0; + + /* At this moment: + * 1) 'iov_offset' is zero. + * 2) Every 'iov_base' and 'iov_len' are also page aligned + * (except length of the last element). + * 3) Number of pages in this iov <= MAX_SKB_FRAGS. + * 4) Length of the data fits in current credit space. + */ + for (seg_idx = 0; seg_idx < iter->nr_segs; seg_idx++) { + struct page *user_pages[MAX_SKB_FRAGS]; + const struct iovec *iovec; + size_t last_frag_len; + size_t pages_in_seg; + int page_idx; + + iovec = &iter->iov[seg_idx]; + pages_in_seg = iovec->iov_len >> PAGE_SHIFT; + + if (iovec->iov_len % PAGE_SIZE) { + last_frag_len = iovec->iov_len % PAGE_SIZE; + pages_in_seg++; + } else { + last_frag_len = PAGE_SIZE; + } + + if (get_user_pages((unsigned long)iovec->iov_base, + pages_in_seg, FOLL_GET, user_pages, + NULL) != pages_in_seg) + return -1; + + for (page_idx = 0; page_idx < pages_in_seg; page_idx++) { + int frag_len = PAGE_SIZE; + + if (page_idx == (pages_in_seg - 1)) + frag_len = last_frag_len; + + skb_fill_page_desc(skb, frag_idx++, + user_pages[page_idx], 0, + frag_len); + skb_len_add(skb, frag_len); + } + } + + return virtio_transport_init_zcopy_skb(vsk, skb, iter, true); +} + +static int virtio_transport_copy_payload(struct sk_buff *skb, + struct vsock_sock *vsk, + struct virtio_vsock_pkt_info *info, + size_t len) +{ + void *payload; + int err; + + payload = skb_put(skb, len); + err = memcpy_from_msg(payload, info->msg, len); + if (err) + return -1; + + if (msg_data_left(info->msg)) + return 0; + + if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET) { + struct virtio_vsock_hdr *hdr; + + hdr = virtio_vsock_hdr(skb); + + hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOM); + + if (info->msg->msg_flags & MSG_EOR) + hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOR); + } + + if (info->flags & MSG_ZEROCOPY) + return virtio_transport_init_zcopy_skb(vsk, skb, + &info->msg->msg_iter, + false); + + return 0; +} + /* Returns a new packet on success, otherwise returns NULL. * * If NULL is returned, errp is set to a negative errno. @@ -47,15 +210,31 @@ virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, u32 src_cid, u32 src_port, u32 dst_cid, - u32 dst_port) + u32 dst_port, + struct vsock_sock *vsk) { - const size_t skb_len = VIRTIO_VSOCK_SKB_HEADROOM + len; + const size_t skb_len = VIRTIO_VSOCK_SKB_HEADROOM; struct virtio_vsock_hdr *hdr; struct sk_buff *skb; - void *payload; - int err; + bool use_zcopy = false; + + if (info->msg) { + /* If SOCK_ZEROCOPY is not enabled, ignore MSG_ZEROCOPY + * flag later and continue in classic way(e.g. without + * completion). + */ + if (!sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) { + info->flags &= ~MSG_ZEROCOPY; + } else { + if ((info->flags & MSG_ZEROCOPY) && + !virtio_transport_can_zcopy(&info->msg->msg_iter, len)) { + use_zcopy = true; + } + } + } - skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL); + /* For MSG_ZEROCOPY length will be added later. */ + skb = virtio_vsock_alloc_skb(skb_len + (use_zcopy ? 0 : len), GFP_KERNEL); if (!skb) return NULL; @@ -70,18 +249,15 @@ virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, hdr->len = cpu_to_le32(len); if (info->msg && len > 0) { - payload = skb_put(skb, len); - err = memcpy_from_msg(payload, info->msg, len); - if (err) - goto out; + int err; - if (msg_data_left(info->msg) == 0 && - info->type == VIRTIO_VSOCK_TYPE_SEQPACKET) { - hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOM); + if (use_zcopy) + err = virtio_transport_fill_nonlinear_skb(skb, vsk, info); + else + err = virtio_transport_copy_payload(skb, vsk, info, len); - if (info->msg->msg_flags & MSG_EOR) - hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOR); - } + if (err) + goto out; } if (info->reply) @@ -266,7 +442,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, skb = virtio_transport_alloc_skb(info, pkt_len, src_cid, src_port, - dst_cid, dst_port); + dst_cid, dst_port, + vsk); if (!skb) { virtio_transport_put_credit(vvs, pkt_len); return -ENOMEM; @@ -842,6 +1019,7 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk, .msg = msg, .pkt_len = len, .vsk = vsk, + .flags = msg->msg_flags, }; return virtio_transport_send_pkt_info(vsk, &info); @@ -894,7 +1072,7 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t, le64_to_cpu(hdr->dst_cid), le32_to_cpu(hdr->dst_port), le64_to_cpu(hdr->src_cid), - le32_to_cpu(hdr->src_port)); + le32_to_cpu(hdr->src_port), NULL); if (!reply) return -ENOMEM; From patchwork Mon Feb 6 07:01:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129385 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FBF3C61DA4 for ; Mon, 6 Feb 2023 07:01:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229821AbjBFHBj (ORCPT ); Mon, 6 Feb 2023 02:01:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229738AbjBFHBf (ORCPT ); Mon, 6 Feb 2023 02:01:35 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D10845249; Sun, 5 Feb 2023 23:01:33 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 398595FD03; Mon, 6 Feb 2023 10:01:32 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666892; bh=3pLjGLtvlR8PtX7rjjJnNKnHtt2owD8Qpbd28G3wgTc=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=onzOMuKAQfwMyuJ/bb2FZ3lNAZFz2PRq07Mwmq6Wec3iQ4Iw8DZeRMefd0EXQcks2 fbpXuATimxr/R9HoeqFOV8SsthAKGxyyepzTv0mm7JPyOYc24m0ze7eI1q9IWFzuxO YWCv/4RTYITWqs9Du2Fgt3m8i16JFmEdIcFu83Sb7IGm8HfjfkbWTZk96boS3RuxtN 0AqtBQzyOffEt15g+e0qDrcJA6aTVvuJ/DqX8fL2ClxcS9X9RVNwICFT0BQoylgmII ElRDaXZl2Q5RbUpfzT9R21wAt6bGS2G28RStcasQv8ixx1tYBDrDlzW6ps6sk1Y8KW ZggL4iFV9cWDQ== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:01:32 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 08/12] vhost/vsock: support MSG_ZEROCOPY for transport Thread-Topic: [RFC PATCH v1 08/12] vhost/vsock: support MSG_ZEROCOPY for transport Thread-Index: AQHZOfjX+En7ow/L9km3iEiQmRP5IA== Date: Mon, 6 Feb 2023 07:01:31 +0000 Message-ID: In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <52482C1304B05940BFFBE0D5E6351FA8@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Signed-off-by: Arseniy Krasnov --- drivers/vhost/vsock.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 60b9cafa3e31..afaa80203528 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -440,6 +440,11 @@ static bool vhost_vsock_more_replies(struct vhost_vsock *vsock) return val < vq->num; } +static bool vhost_transport_msgzerocopy_allow(void) +{ + return true; +} + static bool vhost_transport_seqpacket_allow(u32 remote_cid); static struct virtio_transport vhost_transport = { @@ -485,6 +490,7 @@ static struct virtio_transport vhost_transport = { .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue, .notify_buffer_size = virtio_transport_notify_buffer_size, + .msgzerocopy_allow = vhost_transport_msgzerocopy_allow, }, .send_pkt = vhost_transport_send_pkt, From patchwork Mon Feb 6 07:02:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129386 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ACED0C636D3 for ; Mon, 6 Feb 2023 07:02:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229854AbjBFHCq (ORCPT ); Mon, 6 Feb 2023 02:02:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229510AbjBFHCp (ORCPT ); Mon, 6 Feb 2023 02:02:45 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C45D5249; Sun, 5 Feb 2023 23:02:43 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id B9F9B5FD03; Mon, 6 Feb 2023 10:02:41 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675666961; bh=ks9MrMlvcf/mwHZsnaRwSHAcjAk0PfgI9BorakSSCfA=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=iLsm55ho5IXRcNXv68DhERTg+d4OMF7NPcOecauZcewk3vfvfRvXkm+CCBo+f+Lr7 4PWeXQ4mHJWjdmNgDIJUC6Cr5n9uh2HQ/3oj/9hcuOpeLQLD7URWLQ9wLf/VZp0RwH pJ69tjHmrmtQ7zf8H3LP3Zwy3QgQ/6er7RXm1fO99VSInfVsd+FJDIH4VPT8ESFEjI E1ojrsUHWdsLc5MWlcigrOHg2K1/U+LMLv8MeJ8tmtQDZ0ZzC1aoIf1rCArMOwRHiJ uzad5TpxjzePg7C3yqxVJ2szjUOWWdWWrwysq25okBvRpoEpKPMNlBrDBm3yKXLzRy YtHQUinlwpEUg== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:02:41 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 09/12] vsock/virtio: support MSG_ZEROCOPY for transport Thread-Topic: [RFC PATCH v1 09/12] vsock/virtio: support MSG_ZEROCOPY for transport Thread-Index: AQHZOfkBclfV0p5QrkafxT61PfKC7g== Date: Mon, 6 Feb 2023 07:02:41 +0000 Message-ID: <787cee22-43b7-82d9-1273-27a2d5544282@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index b8a7d6dc9f46..4c5f19015df7 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -432,6 +432,11 @@ static void virtio_vsock_rx_done(struct virtqueue *vq) queue_work(virtio_vsock_workqueue, &vsock->rx_work); } +static bool virtio_transport_msgzerocopy_allow(void) +{ + return true; +} + static bool virtio_transport_seqpacket_allow(u32 remote_cid); static struct virtio_transport virtio_transport = { @@ -476,6 +481,8 @@ static struct virtio_transport virtio_transport = { .notify_send_pre_enqueue = virtio_transport_notify_send_pre_enqueue, .notify_send_post_enqueue = virtio_transport_notify_send_post_enqueue, .notify_buffer_size = virtio_transport_notify_buffer_size, + + .msgzerocopy_allow = virtio_transport_msgzerocopy_allow, }, .send_pkt = virtio_transport_send_pkt, From patchwork Mon Feb 6 07:03:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129387 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FE4FC05027 for ; Mon, 6 Feb 2023 07:03:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229872AbjBFHDi (ORCPT ); Mon, 6 Feb 2023 02:03:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58206 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229488AbjBFHDh (ORCPT ); Mon, 6 Feb 2023 02:03:37 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E18FA6EAB; Sun, 5 Feb 2023 23:03:35 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 4C3175FD03; Mon, 6 Feb 2023 10:03:34 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675667014; bh=B+7PRVOOaI2lvcTHzeaPsvVUOzr1Zjh3HhP1m2qoBFo=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=l4uIPL1D9fJB9FFBwrk1cI7jtL23oGVWWZooJQ1YQ4irLYuQtpGJBAjh/4NGrEQZf ewRwJZTqzIyslWE/AZ8nvxzyXe+Z39p2cFOp5lLQac0IyVlDskHY4A/sZ51465bgxz uzriKAYIyOaixZxClp+/G/tDAwsk5mgU+OJlrzqssvopbJO1ai8s5kmpRBiIFn40sB YOfJaTSjDQ6XBnqf2hcnExUD89blFvCvGZy9FIVo+llsC2QHQYP8NRLFXyzimWw//Y CZF6R6Xl8O7hF7n58x4McXuDa8JDslAAKgEAcztPuTkr+C7aZZQcumvtpblpONTLxW e/SfuHiuAXTtw== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:03:34 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 10/12] net/sock: enable setting SO_ZEROCOPY for PF_VSOCK Thread-Topic: [RFC PATCH v1 10/12] net/sock: enable setting SO_ZEROCOPY for PF_VSOCK Thread-Index: AQHZOfkgd8nezKwWuky7m6WWWc9yqw== Date: Mon, 6 Feb 2023 07:03:33 +0000 Message-ID: <0ad20428-b93a-b1e5-ea10-60449d25e0b4@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <6B6138F15C68B647A82F25C1F6E32944@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org PF_VSOCK supports MSG_ZEROCOPY transmission, so SO_ZEROCOPY could be enabled. Signed-off-by: Arseniy Krasnov --- net/core/sock.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/core/sock.c b/net/core/sock.c index 7ba4891460ad..61f15de84e82 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1457,9 +1457,11 @@ int sk_setsockopt(struct sock *sk, int level, int optname, (sk->sk_type == SOCK_DGRAM && sk->sk_protocol == IPPROTO_UDP))) ret = -EOPNOTSUPP; - } else if (sk->sk_family != PF_RDS) { + } else if (sk->sk_family != PF_RDS && + sk->sk_family != PF_VSOCK) { ret = -EOPNOTSUPP; } + if (!ret) { if (val < 0 || val > 1) ret = -EINVAL; From patchwork Mon Feb 6 07:05:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129390 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AEA1C05027 for ; Mon, 6 Feb 2023 07:05:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229892AbjBFHFJ (ORCPT ); Mon, 6 Feb 2023 02:05:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229571AbjBFHFH (ORCPT ); Mon, 6 Feb 2023 02:05:07 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 110CC6EAB; Sun, 5 Feb 2023 23:05:05 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 6297F5FD03; Mon, 6 Feb 2023 10:05:03 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675667103; bh=HKZuWYpAONF2bZLUFJ5qd1aGEDhbxe5McJq626GydAw=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=QPQ5xcONM9CJw3iDubG+RM4mDGrhZDXI2o4vHSWMaSGQIC+sWWwUID3n4vQB/HIhT 36br3xTdfvzE4rklSLke4bLNjLUu6iOd5iyyizzjixFRISCyYOvfVjgdu9vwft//vz OT0tYTdcuFGLTFgTT/sOLBbBvD0/tXr6UDIwe2G4fFpnRxZP321TFAHCj2n415wK9K +78xeHcwjiMz3cPnmhJjy3yTTY7m3SVFigeGoBmMVGNQjy7RefCpCRCaOeWnToeHIZ q66xULlaCLz9AfEPIHDin6Ox2rTTh9rvM/p0Iabxch79vUqhNzp3/gwxIy6Ev5iJAO +UPRmdNiAO+eA== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:05:03 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 11/12] test/vsock: MSG_ZEROCOPY flag tests Thread-Topic: [RFC PATCH v1 11/12] test/vsock: MSG_ZEROCOPY flag tests Thread-Index: AQHZOflVUJdjiggaWkGCbvkRZHpuwQ== Date: Mon, 6 Feb 2023 07:05:02 +0000 Message-ID: <7f0d1d9b-d9f8-2d79-adb6-9c38d39a12fd@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <59E2B6639B553F4883321661267C890B@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds set of tests for MSG_ZEROCOPY flag. Signed-off-by: Arseniy Krasnov --- tools/testing/vsock/Makefile | 2 +- tools/testing/vsock/util.h | 1 + tools/testing/vsock/vsock_test.c | 11 + tools/testing/vsock/vsock_test_zerocopy.c | 470 ++++++++++++++++++++++ tools/testing/vsock/vsock_test_zerocopy.h | 12 + 5 files changed, 495 insertions(+), 1 deletion(-) create mode 100644 tools/testing/vsock/vsock_test_zerocopy.c create mode 100644 tools/testing/vsock/vsock_test_zerocopy.h diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile index 43a254f0e14d..0a78787d1d92 100644 --- a/tools/testing/vsock/Makefile +++ b/tools/testing/vsock/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only all: test vsock_perf test: vsock_test vsock_diag_test -vsock_test: vsock_test.o timeout.o control.o util.o +vsock_test: vsock_test.o vsock_test_zerocopy.o timeout.o control.o util.o vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o vsock_perf: vsock_perf.o diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h index fb99208a95ea..46ba1d3202b8 100644 --- a/tools/testing/vsock/util.h +++ b/tools/testing/vsock/util.h @@ -2,6 +2,7 @@ #ifndef UTIL_H #define UTIL_H +#include #include #include diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index 67e9f9df3a8c..25373cd18ef1 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -20,6 +20,7 @@ #include #include +#include "vsock_test_zerocopy.h" #include "timeout.h" #include "control.h" #include "util.h" @@ -920,6 +921,16 @@ static struct test_case test_cases[] = { .run_client = test_seqpacket_bigmsg_client, .run_server = test_seqpacket_bigmsg_server, }, + { + .name = "SOCK_STREAM MSG_ZEROCOPY", + .run_client = test_stream_msg_zcopy_client, + .run_server = test_stream_msg_zcopy_server, + }, + { + .name = "SOCK_STREAM MSG_ZEROCOPY empty MSG_ERRQUEUE", + .run_client = test_stream_msg_zcopy_empty_errq_client, + .run_server = test_stream_msg_zcopy_empty_errq_server, + }, {}, }; diff --git a/tools/testing/vsock/vsock_test_zerocopy.c b/tools/testing/vsock/vsock_test_zerocopy.c new file mode 100644 index 000000000000..1214391d3c99 --- /dev/null +++ b/tools/testing/vsock/vsock_test_zerocopy.c @@ -0,0 +1,470 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* MSG_ZEROCOPY feature tests for vsock + * + * Copyright (C) 2023 SberDevices. + * + * Author: Arseniy Krasnov + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "control.h" +#include "vsock_test_zerocopy.h" + +#ifndef SOL_VSOCK +#define SOL_VSOCK 287 +#endif + +#define PAGE_SIZE 4096 +#define POLL_TIMEOUT_MS 2000 +#define SENDMSG_RES_IOV_LEN (-2) + +struct zerocopy_test_data { + bool zerocopied; + bool completion; + int sendmsg_errno; + ssize_t sendmsg_res; + int vecs_cnt; + struct iovec vecs[3]; +}; + +static void do_recv_completion(int fd, bool zerocopied, bool completion) +{ + struct sock_extended_err *serr; + struct msghdr msg = { 0 }; + struct pollfd fds = { 0 }; + char cmsg_data[128]; + struct cmsghdr *cm; + uint32_t hi, lo; + + fds.fd = fd; + fds.events = 0; + + if (poll(&fds, 1, POLL_TIMEOUT_MS) < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + if (!(fds.revents & POLLERR)) { + if (completion) { + fprintf(stderr, "POLLERR expected\n"); + exit(EXIT_FAILURE); + } else { + return; + } + } + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + recvmsg(fd, &msg, MSG_ERRQUEUE); + + cm = CMSG_FIRSTHDR(&msg); + if (!cm) { + fprintf(stderr, "cmsg: no cmsg\n"); + exit(EXIT_FAILURE); + } + + if (cm->cmsg_level != SOL_VSOCK) { + fprintf(stderr, "cmsg: unexpected 'cmsg_level'\n"); + exit(EXIT_FAILURE); + } + + if (cm->cmsg_type != 0) { + fprintf(stderr, "cmsg: unexpected 'cmsg_type'\n"); + exit(EXIT_FAILURE); + } + + serr = (void *)CMSG_DATA(cm); + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + printf("serr: wrong origin: %u\n", serr->ee_origin); + exit(EXIT_FAILURE); + } + + if (serr->ee_errno) { + printf("serr: wrong error code: %u\n", serr->ee_errno); + exit(EXIT_FAILURE); + } + + hi = serr->ee_data; + lo = serr->ee_info; + if (hi != lo) { + fprintf(stderr, "serr: expected hi == lo\n"); + exit(EXIT_FAILURE); + } + + if (hi) { + fprintf(stderr, "serr: expected hi == lo == 0\n"); + exit(EXIT_FAILURE); + } + + if (zerocopied && (serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) { + fprintf(stderr, "serr: was copy instead of zerocopy\n"); + exit(EXIT_FAILURE); + } + + if (!zerocopied && !(serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) { + fprintf(stderr, "serr: was zerocopy instead of copy\n"); + exit(EXIT_FAILURE); + } +} + +static void enable_so_zerocopy(int fd) +{ + int val = 1; + + if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &val, sizeof(val))) + error(1, errno, "setsockopt"); +} + +static void *mmap_no_fail(size_t bytes) +{ + void *res; + + res = mmap(NULL, bytes, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0); + if (res == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + + return res; +} + +static size_t iovec_bytes(const struct iovec *iov, size_t iovnum) +{ + size_t bytes; + int i; + + for (bytes = 0, i = 0; i < iovnum; i++) + bytes += iov[i].iov_len; + + return bytes; +} + +static void iovec_random_init(struct iovec *iov, + const struct zerocopy_test_data *test_data) +{ + int i; + + for (i = 0; i < test_data->vecs_cnt; i++) { + int j; + + if (test_data->vecs[i].iov_base == MAP_FAILED) + continue; + + for (j = 0; j < iov[i].iov_len; j++) + ((uint8_t *)iov[i].iov_base)[j] = rand() & 0xff; + } +} + +static unsigned long iovec_hash_djb2(struct iovec *iov, size_t iovnum) +{ + unsigned long hash; + size_t iov_bytes; + size_t offs; + void *tmp; + int i; + + iov_bytes = iovec_bytes(iov, iovnum); + + tmp = malloc(iov_bytes); + if (!tmp) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + for (offs = 0, i = 0; i < iovnum; i++) { + memcpy(tmp + offs, iov[i].iov_base, iov[i].iov_len); + offs += iov[i].iov_len; + } + + hash = hash_djb2(tmp, iov_bytes); + free(tmp); + + return hash; +} + +static struct zerocopy_test_data test_data_array[] = { + /* Last element has non-page aligned size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE }, + { NULL, 200 } + } + }, + /* All elements have page aligned base and size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE * 2 }, + { NULL, PAGE_SIZE * 3 } + } + }, + /* All elements have page aligned base and size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE } + } + }, + /* Middle element has non-page aligned size. */ + { + .zerocopied = false, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, 100 }, + { NULL, PAGE_SIZE } + } + }, + /* Middle element has both non-page aligned base and size. */ + { + .zerocopied = false, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { (void *)1, 100 }, + { NULL, PAGE_SIZE } + } + }, + /* One element has invalid base. */ + { + .zerocopied = false, + .completion = false, + .sendmsg_errno = ENOMEM, + .sendmsg_res = -1, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { MAP_FAILED, PAGE_SIZE }, + { NULL, PAGE_SIZE } + } + }, + /* Valid data, but SO_ZEROCOPY is off. */ + { + .zerocopied = true, + .completion = false, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 1, + { + { NULL, PAGE_SIZE } + } + }, +}; + +static void __test_stream_msg_zerocopy_client(const struct test_opts *opts, + const struct zerocopy_test_data *test_data) +{ + struct msghdr msg = { 0 }; + ssize_t sendmsg_res; + struct iovec *iovec; + int fd; + int i; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + if (test_data->completion) + enable_so_zerocopy(fd); + + iovec = malloc(sizeof(*iovec) * test_data->vecs_cnt); + if (!iovec) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + for (i = 0; i < test_data->vecs_cnt; i++) { + iovec[i].iov_len = test_data->vecs[i].iov_len; + iovec[i].iov_base = mmap_no_fail(test_data->vecs[i].iov_len); + } + + for (i = 0; i < test_data->vecs_cnt; i++) { + if (test_data->vecs[i].iov_base == MAP_FAILED) { + if (munmap(iovec[i].iov_base, iovec[i].iov_len)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + } + } + + iovec_random_init(iovec, test_data); + + msg.msg_iov = iovec; + msg.msg_iovlen = test_data->vecs_cnt; + + errno = 0; + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + sendmsg_res = iovec_bytes(iovec, test_data->vecs_cnt); + else + sendmsg_res = test_data->sendmsg_res; + + if (sendmsg(fd, &msg, MSG_ZEROCOPY) != sendmsg_res) { + perror("send"); + exit(EXIT_FAILURE); + } + + if (errno != test_data->sendmsg_errno) { + fprintf(stderr, "expected 'errno' == %i, got %i\n", + test_data->sendmsg_errno, errno); + exit(EXIT_FAILURE); + } + + do_recv_completion(fd, test_data->zerocopied, test_data->completion); + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + control_writeulong(iovec_hash_djb2(iovec, test_data->vecs_cnt)); + else + control_writeulong(0); + + free(iovec); + close(fd); + control_writeln("DONE"); +} + +static void __test_stream_msg_zerocopy_server(const struct test_opts *opts, + const struct zerocopy_test_data *test_data) +{ + unsigned long remote_hash; + unsigned long local_hash; + ssize_t total_bytes_rec; + unsigned char *data; + size_t data_len; + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + data_len = iovec_bytes(test_data->vecs, test_data->vecs_cnt); + + data = malloc(data_len); + if (!data) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + total_bytes_rec = 0; + + while (total_bytes_rec != data_len) { + ssize_t bytes_rec; + + bytes_rec = read(fd, data + total_bytes_rec, + data_len - total_bytes_rec); + if (bytes_rec <= 0) + break; + + total_bytes_rec += bytes_rec; + } + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + local_hash = hash_djb2(data, data_len); + else + local_hash = 0; + + /* Waiting for some result. */ + remote_hash = control_readulong(); + if (remote_hash != local_hash) { + fprintf(stderr, "hash mismatch\n"); + exit(EXIT_FAILURE); + } + + close(fd); + control_expectln("DONE"); +} + +void test_stream_msg_zcopy_client(const struct test_opts *opts) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(test_data_array); i++) + __test_stream_msg_zerocopy_client(opts, &test_data_array[i]); +} + +void test_stream_msg_zcopy_server(const struct test_opts *opts) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(test_data_array); i++) + __test_stream_msg_zerocopy_server(opts, &test_data_array[i]); +} + +void test_stream_msg_zcopy_empty_errq_client(const struct test_opts *opts) +{ + struct msghdr msg = { 0 }; + char cmsg_data[128]; + ssize_t res; + int fd; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + res = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (res != -1) { + fprintf(stderr, "expected 'recvmsg(2)' failure, got %zi\n", + res); + exit(EXIT_FAILURE); + } + + control_writeln("DONE"); + close(fd); +} + +void test_stream_msg_zcopy_empty_errq_server(const struct test_opts *opts) +{ + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + control_expectln("DONE"); + close(fd); +} diff --git a/tools/testing/vsock/vsock_test_zerocopy.h b/tools/testing/vsock/vsock_test_zerocopy.h new file mode 100644 index 000000000000..705a1e90f41a --- /dev/null +++ b/tools/testing/vsock/vsock_test_zerocopy.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef VSOCK_TEST_ZEROCOPY_H +#define VSOCK_TEST_ZEROCOPY_H +#include "util.h" + +void test_stream_msg_zcopy_client(const struct test_opts *opts); +void test_stream_msg_zcopy_server(const struct test_opts *opts); + +void test_stream_msg_zcopy_empty_errq_client(const struct test_opts *opts); +void test_stream_msg_zcopy_empty_errq_server(const struct test_opts *opts); + +#endif /* VSOCK_TEST_ZEROCOPY_H */ From patchwork Mon Feb 6 07:06:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13129391 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1637FC05027 for ; Mon, 6 Feb 2023 07:06:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229719AbjBFHGl (ORCPT ); Mon, 6 Feb 2023 02:06:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229864AbjBFHGk (ORCPT ); Mon, 6 Feb 2023 02:06:40 -0500 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E6361ABC3; Sun, 5 Feb 2023 23:06:34 -0800 (PST) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 0B0025FD03; Mon, 6 Feb 2023 10:06:33 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1675667193; bh=6pMqrPyHGPQW1VgQ8Akh86QM+ETIOMVEMR7fJ2NeMG8=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=IGO63hiBq7CW+U4nicOdwQ0xr1f/HcwilSpOFPbuMFJ/sZGQEMyzvz3Blh72Mexz1 9Mb6QcD4SPSQdhnvIl0NaeBT7++gaG/ZOkTuxrLv/sVIroJfl0QcppQTMw/X5IajvL aueaHKIeYE+L0RARt5or1Jqk7G8NJHCzXV1gSg+9ogb0n6/jwOdGhEZ4wIm4GPG0xW T3hJhbs68KjhurIknbSfklgLiEiLzwsAm7E1OtmFvwXa4l5P1z4oQk3ZhThlVvKRWp zwATTx+GyuxEZqYrlb/czAJRol2CX5zbbC9i0JaSZWEjV+7qCSrVWYxhFWncXGwu8Y ofKoMfgqI+UUA== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mx.sberdevices.ru (Postfix) with ESMTP; Mon, 6 Feb 2023 10:06:32 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Arseniy Krasnov , "Krasnov Arseniy" CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel Subject: [RFC PATCH v1 12/12] test/vsock: MSG_ZEROCOPY support for vsock_perf Thread-Topic: [RFC PATCH v1 12/12] test/vsock: MSG_ZEROCOPY support for vsock_perf Thread-Index: AQHZOfmKbgKypo6Z8EmH8tGoyYFWVw== Date: Mon, 6 Feb 2023 07:06:32 +0000 Message-ID: <03570f48-f56a-2af4-9579-15a685127aeb@sberdevices.ru> In-Reply-To: <0e7c6fc4-b4a6-a27b-36e9-359597bba2b5@sberdevices.ru> Accept-Language: en-US, ru-RU Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <26536C37BB13B643B0FDC846754C9890@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/02/06 01:18:00 #20834045 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To use this option pass '--zc' parameter: ./vsock_perf --zc --sender --port --bytes With this option MSG_ZEROCOPY flag will be passed to the 'send()' call. Signed-off-by: Arseniy Krasnov --- tools/testing/vsock/vsock_perf.c | 127 +++++++++++++++++++++++++++++-- 1 file changed, 120 insertions(+), 7 deletions(-) diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c index a72520338f84..1d435be9b48e 100644 --- a/tools/testing/vsock/vsock_perf.c +++ b/tools/testing/vsock/vsock_perf.c @@ -18,6 +18,8 @@ #include #include #include +#include +#include #define DEFAULT_BUF_SIZE_BYTES (128 * 1024) #define DEFAULT_TO_SEND_BYTES (64 * 1024) @@ -28,9 +30,14 @@ #define BYTES_PER_GB (1024 * 1024 * 1024ULL) #define NSEC_PER_SEC (1000000000ULL) +#ifndef SOL_VSOCK +#define SOL_VSOCK 287 +#endif + static unsigned int port = DEFAULT_PORT; static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES; static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES; +static bool zerocopy; static void error(const char *s) { @@ -247,15 +254,74 @@ static void run_receiver(unsigned long rcvlowat_bytes) close(fd); } +static void recv_completion(int fd) +{ + struct sock_extended_err *serr; + char cmsg_data[128]; + struct cmsghdr *cm; + struct msghdr msg; + int ret; + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + ret = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (ret == -1) + return; + + cm = CMSG_FIRSTHDR(&msg); + if (!cm) { + fprintf(stderr, "cmsg: no cmsg\n"); + return; + } + + if (cm->cmsg_level != SOL_VSOCK) { + fprintf(stderr, "cmsg: unexpected 'cmsg_level'\n"); + return; + } + + if (cm->cmsg_type) { + fprintf(stderr, "cmsg: unexpected 'cmsg_type'\n"); + return; + } + + serr = (void *)CMSG_DATA(cm); + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + fprintf(stderr, "serr: wrong origin\n"); + return; + } + + if (serr->ee_errno) { + fprintf(stderr, "serr: wrong error code\n"); + return; + } + + if (zerocopy && (serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) + fprintf(stderr, "warning: copy instead of zerocopy\n"); +} + +static void enable_so_zerocopy(int fd) +{ + int val = 1; + + if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &val, sizeof(val))) + error("setsockopt(SO_ZEROCOPY)"); +} + static void run_sender(int peer_cid, unsigned long to_send_bytes) { time_t tx_begin_ns; time_t tx_total_ns; size_t total_send; + time_t time_in_send; void *data; int fd; - printf("Run as sender\n"); + if (zerocopy) + printf("Run as sender MSG_ZEROCOPY\n"); + else + printf("Run as sender\n"); + printf("Connect to %i:%u\n", peer_cid, port); printf("Send %lu bytes\n", to_send_bytes); printf("TX buffer %lu bytes\n", buf_size_bytes); @@ -265,25 +331,58 @@ static void run_sender(int peer_cid, unsigned long to_send_bytes) if (fd < 0) exit(EXIT_FAILURE); - data = malloc(buf_size_bytes); + if (zerocopy) { + enable_so_zerocopy(fd); - if (!data) { - fprintf(stderr, "'malloc()' failed\n"); - exit(EXIT_FAILURE); + data = mmap(NULL, buf_size_bytes, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (data == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + } else { + data = malloc(buf_size_bytes); + + if (!data) { + fprintf(stderr, "'malloc()' failed\n"); + exit(EXIT_FAILURE); + } } memset(data, 0, buf_size_bytes); total_send = 0; + time_in_send = 0; tx_begin_ns = current_nsec(); while (total_send < to_send_bytes) { ssize_t sent; + size_t rest_bytes; + time_t before; + + rest_bytes = to_send_bytes - total_send; - sent = write(fd, data, buf_size_bytes); + before = current_nsec(); + sent = send(fd, data, (rest_bytes > buf_size_bytes) ? + buf_size_bytes : rest_bytes, + zerocopy ? MSG_ZEROCOPY : 0); + time_in_send += (current_nsec() - before); if (sent <= 0) error("write"); + if (zerocopy) { + struct pollfd fds = { 0 }; + + fds.fd = fd; + + if (poll(&fds, 1, -1) < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + recv_completion(fd); + } + total_send += sent; } @@ -294,9 +393,14 @@ static void run_sender(int peer_cid, unsigned long to_send_bytes) get_gbps(total_send * 8, tx_total_ns)); printf("total time in 'write()': %f sec\n", (float)tx_total_ns / NSEC_PER_SEC); + printf("time in send %f\n", (float)time_in_send / NSEC_PER_SEC); close(fd); - free(data); + + if (zerocopy) + munmap(data, buf_size_bytes); + else + free(data); } static const char optstring[] = ""; @@ -336,6 +440,11 @@ static const struct option longopts[] = { .has_arg = required_argument, .val = 'R', }, + { + .name = "zc", + .has_arg = no_argument, + .val = 'Z', + }, {}, }; @@ -351,6 +460,7 @@ static void usage(void) " --help This message\n" " --sender Sender mode (receiver default)\n" " of the receiver to connect to\n" + " --zc Enable zerocopy\n" " --port Port (default %d)\n" " --bytes KMG Bytes to send (default %d)\n" " --buf-size KMG Data buffer size (default %d). In sender mode\n" @@ -413,6 +523,9 @@ int main(int argc, char **argv) case 'H': /* Help. */ usage(); break; + case 'Z': /* Zerocopy. */ + zerocopy = true; + break; default: usage(); }